The Big Data Africa School aims to introduce fundamental data science tools and techniques to talented young science and engineering graduates across a range of disciplines who are interested in developing their skills and knowledge in working efficiently with extremely large datasets in any research environment.
The 4th Big Data Africa School will give students the opportunity to work on real-life healthcare data sets, with a focus on biomedical imaging, and to tackle some of the biggest challenges facing the African continent.
Who can apply?
Open to all African students, including those from South Africa. Students must be permanently based in an African country.
Students currently undertaking a Master of Science or PhD degree in
Bioinformatics/Computational Biology/Computer Science/Radiography
Students currently in their final (4th) year of a BSc Honours or BEng degree in
Bioinformatics/Computational Biology/Computer Science/Diagnostic Radiography/Diagnostic Ultrasound/Nuclear Medicine Technology/Radiation Therapy/Computer/Biomedical Engineering
Students in disciplines outside of the above domains are welcome to apply.
Intermediate to advanced programming skills will be an advantage.
Application enquiries can be emailed to bigdataschool@ska.ac.za
For more information contact:
Dr Bonita de Swardt
Programme Manager: Strategic Partnerships for Human Capital Development
Email: bonita@sarao.ac.za
Big Data Africa School – Projects
Identifying out-of-distribution samples in healthcare: skin cancer and malaria as use cases
Recent advances in deep learning have led to breakthroughs in automated medical disease diagnosis. As interest in these models grows in the healthcare space, it is crucial to address their robustness to variations in the input data, i.e., data distribution shifts. Current models tend to make incorrect inferences for test samples that are out-of-distribution (OOD) with respect to the training samples, such as samples acquired with different hardware devices, in different clinical settings, or showing unknown conditions.
In this project, we will explore multiple machine learning approaches for detecting these OOD samples before any decision is made.
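As an illustration of what detecting OOD samples can look like, the sketch below uses the maximum softmax probability as a confidence score and flags low-confidence inputs. This is one common baseline, not necessarily the approach the project will use. The function names, the example logits and the threshold of 0.7 are assumptions made for the example.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def flag_ood(logits, threshold=0.7):
    # Confidence = highest class probability; low confidence suggests OOD
    # (threshold is a hypothetical value, normally tuned on validation data)
    confidence = softmax(logits).max(axis=1)
    return confidence < threshold

# Hypothetical classifier outputs for two test images
logits = np.array([
    [6.0, 0.5, 0.2],   # one class clearly dominates
    [1.1, 1.0, 0.9],   # no class stands out
])
flags = flag_ood(logits)
print(flags)
```

Samples that are flagged could then be passed to a clinician rather than classified automatically.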
Mentors: Girmaw Abebe Tadesse & Celia Cintas
Generalisable cardiac image segmentation using deep learning and transfer learning
Accurate segmentation of cardiovascular magnetic resonance (CMR) images is an important prerequisite in cardiology for reliably assessing and diagnosing a number of major cardiovascular diseases. Deep learning techniques are currently the state of the art in automatic CMR segmentation. However, these models are commonly trained and validated using datasets collected in single clinical centres or with homogeneous imaging protocols, limiting the development of models that are generalisable across different clinical centres or different scanners.
In this project, the participants will be trained to implement generalisable deep learning-based segmentation models using the Python programming language. First, they will implement baseline convolutional neural networks trained with images from a single domain (e.g. from a single hospital) and validated on new unseen domains (e.g. other hospitals). Second, they will implement several strategies such as data augmentation, domain adaptation or transfer learning to obtain generalisable models across hospitals, scanners and populations.
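Of the strategies listed above, data augmentation is the simplest to sketch. The example below is a minimal illustration that assumes images are 2D arrays scaled to [0, 1]. Geometric changes are applied to both the image and its segmentation mask, while intensity changes (imitating scanner differences) are applied to the image only. The function name and parameter ranges are assumptions, not values from the project.

```python
import numpy as np

def augment(image, mask, rng):
    """Randomly flip an image/mask pair and perturb the image intensities."""
    # Geometric transform: must be applied identically to image and mask
    if rng.random() < 0.5:
        image, mask = np.flip(image, axis=1), np.flip(mask, axis=1)
    # Intensity transform: image only, the labels are unaffected
    gain = rng.uniform(0.8, 1.2)    # hypothetical brightness range
    gamma = rng.uniform(0.7, 1.5)   # hypothetical contrast range
    image = np.clip(gain * image ** gamma, 0.0, 1.0)
    return image, mask

rng = np.random.default_rng(0)
image = rng.random((4, 4))              # stand-in for a CMR slice
mask = (image > 0.5).astype(np.uint8)   # stand-in for a segmentation mask
aug_image, aug_mask = augment(image, mask, rng)
```

Calling `augment` afresh on every training batch exposes the network to a wider range of appearances than the single-domain training set contains.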
Mentor: Victor Campello
Breast mammogram mass synthesis using generative adversarial networks
Breast mass segmentation in full-field digital mammograms plays a significant role in tumour classification and treatment planning. In recent years, deep learning methods have shown great potential in accurately segmenting the masses from mammograms. However, due to the heterogeneous nature of breast tumours, training a state-of-the-art model requires a large number of real patient datasets, which are not always accessible due to limited availability or privacy concerns.
One approach to overcoming the data scarcity problem is to generate synthetic images using Generative Adversarial Networks (GANs). In this project, the participants will develop a GAN model to generate synthetic mammogram masses.
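A GAN trains two networks against each other: a discriminator that scores images as real or synthetic, and a generator that tries to fool it. The sketch below shows only the two loss terms of the standard GAN objective (with the commonly used non-saturating generator loss), computed from discriminator outputs. The networks themselves are omitted, and the function names and example values are assumptions.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-7):
    # d_real, d_fake: discriminator probabilities that inputs are real
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    # Reward high scores on real images and low scores on synthetic ones
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

def generator_loss(d_fake, eps=1e-7):
    # Non-saturating form: the generator wants synthetic images scored as real
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return -np.mean(np.log(d_fake))

# A discriminator that cannot tell real from synthetic outputs 0.5 everywhere
d_real = np.full(4, 0.5)
d_fake = np.full(4, 0.5)
d_loss = discriminator_loss(d_real, d_fake)   # equals 2 * ln(2)
g_loss = generator_loss(d_fake)               # equals ln(2)
```

In a full training loop these two losses would be minimised in alternation, each with respect to its own network's weights.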
Mentor: Kaisar Kushibar
Explainable AI: medical images using SHAP and LIME tools
Explainable AI (XAI) has been suggested as a way to improve the interpretability of AI-based solutions by providing qualitative and quantitative reasons for how AI models make their decisions, thereby increasing trust in AI-based solutions in healthcare.
In this project, participants will train an AI model and attempt to explain/interpret the decisions made by the model.
First, participants will train an AI model to detect pneumonia using X-ray images. Secondly, they will explore two XAI tools, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), to determine which features influence the model prediction.
Finally, participants will explore qualitative and quantitative explanations with links to underlying biological phenotypes in medical images.
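SHAP builds on Shapley values from cooperative game theory, which split a prediction fairly among the input features. The sketch below computes exact Shapley values for a toy three-feature model by enumerating all feature subsets, with absent features replaced by a baseline value. It does not use the SHAP library, which relies on approximations for realistic models and images. The toy model, inputs and baseline are assumptions made for the example.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    n = len(x)

    def value(subset):
        # Features in `subset` keep their actual value, the rest use the baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for s in combinations(others, k):
                # Shapley weight for a coalition of size k
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

def toy_model(z):
    # Hypothetical model: one additive feature and one pairwise interaction
    return 2.0 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved.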
Mentor: Lameck Amugongo
3D from 2D reconstruction using Gaussian Process Morphable Models
In this project, we address the problem of 2D to 3D reconstruction. Given one or several contours of an organ in 2D, the goal is to reconstruct the full 3D shape of the organ that generated this contour. As the solution is not unique, an important goal of this project is to characterize the full space of possible solutions and the uncertainty associated with a given reconstruction.
To solve this problem, we will employ Gaussian process morphable models – a class of linear shape models based on Gaussian processes. These models can be trained using only a few dozen representative training examples and are easy to understand and validate. We will work through a principled Bayesian workflow for shape analysis, which can be applied to rigorously validate the properties of these models. The result is a fully probabilistic reconstruction procedure, which can safely be deployed in practice.
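The idea of a probabilistic reconstruction can be illustrated with a one-dimensional toy problem: a Gaussian process is conditioned on a few observed points, and the posterior gives both a best estimate and an uncertainty at unobserved locations. This is only a sketch of the underlying principle, not the project's shape-modelling code. The kernel choice, its parameters and the data are assumptions.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    # Squared-exponential covariance between two sets of 1D locations
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    # Standard Gaussian process regression equations
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_no = rbf_kernel(x_new, x_obs)
    k_nn = rbf_kernel(x_new, x_new)
    solved = np.linalg.solve(k_oo, k_no.T)   # shape (n_obs, n_new)
    mean = solved.T @ y_obs
    cov = k_nn - k_no @ solved
    return mean, cov

# Observed part of the "shape" (stand-in for a known contour)
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.0, 0.8, 0.9])
# Query one observed location and one far from all observations
x_new = np.array([1.0, 5.0])
mean, cov = gp_posterior(x_obs, y_obs, x_new)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
print(mean, std)
```

The posterior standard deviation is small where data was observed and close to the prior far away from it, which is the kind of uncertainty information a reconstruction from partial contours needs to report.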
Mentor: Marcel Lüthi