Prioritizing Essential Care with AI
Advancing Neonatal Care: The Role of IMPALA and AI in Improving Early Diagnosis and Treatment
Over 4 billion people lack access to essential health services, leading to an estimated 15.8 million preventable deaths each year. Approximately half of these deaths result from inadequate care. Many patients do reach a hospital where affordable and effective treatments are available; however, treatment often arrives too late due to shortages in staff, training, and medical equipment.
There is an urgent need to effectively identify, prioritize, and manage patients in resource-constrained environments. To address this, GOAL3, in collaboration with partners from Malawi and Rwanda, developed the IMPALA system. This system assists clinical health workers in analyzing patient data more effectively, detecting trends in vital signs, and identifying high-risk patients. As a result, health workers can provide better care and manage patients more efficiently, all without increasing their workload.
In the AI against Infant Deaths Challenge, FruitPunch AI collaborated with GOAL3 to address these challenges in neonatal care and to integrate AI solutions into everyday medical practice, with the aim of significantly improving infant survival rates.
The IMPALA system is at the heart of this challenge. IMPALA is designed to empower clinical health workers by enhancing their ability to analyze patient data, detect vital sign trends, and promptly identify high-risk patients.
The monitor uses 4 sensors to gather vital data from a patient:
Besides the vital data, clinical data is collected manually by nurses. These manual inputs contain:
Using both the vital and clinical data, we set out to tackle three problems in this Challenge. To achieve this, the participants were divided into three teams, each with a specific focus:
The risk assessment team aimed to develop a more accurate metric than the current ICLEN score for evaluating an infant's risk of requiring critical intervention. By integrating clinical and vital signs data, the team created a new AI risk score that outperforms the current method.
The ICLEN score, or "Index for Children, Life-threatening Event Notification," is a vital metric used to predict and manage life-threatening events. It consists of the Cardiovascular Score, based on heart rate and blood pressure deviations; the Respiratory Score, which evaluates respiratory rate and oxygen saturation; and the Neurological Score, which considers alertness and neurological status through the AVPU score.
The second team’s focus was predicting health deterioration in infants. AI models were trained to forecast potential critical health events using complex time-series data. This capability allows for predictions of health deterioration, which can trigger timely medical interventions.
The objective of the ballistography team was to incorporate Ballistography (BSG) technology into the IMPALA framework to create a non-invasive and cost-effective method for monitoring respiratory rates. Ballistography is a non-invasive technique that measures the mechanical movements of the body caused by physiological functions such as heartbeats and breathing. This is achieved through sensors placed under a mattress. The technology allows for real-time, continuous monitoring of vital signs, including heart rate and respiratory rate, making it ideal for environments where minimal patient disturbance is essential.
The trained model predicts a value representing the probability of an event occurring within the next 1 to 6 hours. A threshold can then be applied to assess the model's performance in detecting true events.
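The thresholding step described above can be sketched as follows. This is a minimal illustration, not the team's evaluation code: the function names, the example probabilities and labels, and the 0.5 threshold are all assumptions made for the example.

```python
def confusion_counts(probabilities, labels, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = fn = tn = 0
    for prob, label in zip(probabilities, labels):
        predicted_event = prob >= threshold
        if predicted_event and label:
            tp += 1
        elif predicted_event and not label:
            fp += 1
        elif not predicted_event and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical model outputs and true outcomes (1 = event occurred).
example_probs = [0.91, 0.15, 0.62, 0.40, 0.08, 0.77]
example_labels = [1, 0, 0, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(example_probs, example_labels, threshold=0.5)
example_f1 = f1_score(tp, fp, fn)
```

Lowering the threshold catches more true events at the cost of more false alarms, so in practice the threshold would be chosen on validation data rather than fixed at 0.5.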
Multiple modelling approaches were trialled and two main models were selected: a deep learning model and an XGBoost model.
The deep learning approach showed strong overall power in predicting a patient's death, with an F1-score of 0.60 on test data, compared to 0.23 for the XGBoost model. It is therefore recommended that the deep learning model be applied in practice. In cases with limited data or limited computational capacity, the XGBoost model can be used instead.
Risk scores are calculated for each model by multiplying the classification probability by 100. This gives an easily interpretable score out of 100. To better compare the performance of the models with that of the ICLEN score, several plots were created to illustrate the difference between the risk scores.
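As a small sketch of the risk score arithmetic, the conversion from probability to a score out of 100 could look like this. The function name and the input validation are assumptions added for illustration.

```python
def risk_score(probability):
    """Convert a classification probability in [0, 1] to a score out of 100."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return probability * 100.0


# Hypothetical probabilities from two models for the same patient.
example_scores = {
    "deep_learning": risk_score(0.82),
    "xgboost": risk_score(0.55),
}
```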
We used two major approaches to accurately predict health outcomes (deterioration):
The analysis of both forecasting and classification-based approaches has provided valuable insights into predicting clinical deterioration events. For forecasting, while the approach was time-consuming, it showed promise in capturing general trends in vital signs and overall patient health. Although it was less effective at detecting extremes or outliers—such as those that may signal an imminent deterioration—forecasting remains a useful tool for monitoring broader physiological patterns over time. It is especially valuable in scenarios where long-term trends are critical for anticipating patient needs.
For classification-based prediction, the results showed some promise in identifying potential deterioration events, although model performance was moderate, with AUC scores of roughly 0.5 to 0.7 (an AUC of 0.5 corresponds to chance-level ranking). While the classifiers demonstrated some capability in detecting deterioration, refining the definition of deterioration and improving feature engineering could enhance model accuracy. The combination of vitals data and clinical features, including descriptive statistics, was effective in identifying general deterioration patterns, but better handling of data imbalance and further model tuning would be necessary for more robust predictions.
Overall, both approaches show promise, but further optimization is needed to address challenges such as data imbalance, outlier handling, and the refinement of deterioration definitions. With improvements in these areas, both forecasting and classification models have the potential to significantly enhance clinical decision-making and patient care.
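To illustrate what descriptive statistics over vitals data can look like as classifier inputs, here is a minimal sketch. The specific features (mean, standard deviation, minimum, maximum, linear trend), the function name, and the sample heart-rate window are assumptions, not the team's actual feature set.

```python
from statistics import mean, pstdev


def window_features(values):
    """Summarise one window of a vital sign as a small feature dictionary."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to compute a trend")
    average = mean(values)
    # Least-squares slope against the sample index (change per sample).
    index_mean = (n - 1) / 2
    covariance = sum((i - index_mean) * (v - average) for i, v in enumerate(values))
    index_variance = sum((i - index_mean) ** 2 for i in range(n))
    return {
        "mean": average,
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
        "slope": covariance / index_variance,
    }


# Hypothetical heart-rate window (beats per minute) with an upward trend.
example_heart_rate = [140, 142, 145, 149, 154]
example_features = window_features(example_heart_rate)
```

A feature such as the slope captures the direction of a trend that a plain mean would hide, which is one reason windowed summaries are often combined with static clinical features.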
Preprocessing was critical to ensure that the BSG signals were clean and focused on respiratory rate detection. We used two methods for cleaning the data.
For the first method, we applied some basic filtering and segmentation steps. The following steps were implemented:
The graph above shows the distribution of breaths per minute (BPM) for both the RPR 'ground truth' (which may not be very accurate) and the predictions from our BCG-based BSensor model. Both distributions appear to resemble a normal distribution, which is what we expected. However, it is noteworthy that the BSensor exhibits a higher spike around the center.
The results are presented here for each BSensor and Resp file, along with the overall average peak count per minute for each source. As you can see, the overall averages are quite similar. However, given the visualization above, this is not surprising and does not necessarily indicate that the model performs well in every case.
Ultimately, we achieved a mean error of 4.9 BPM, which is slightly below the target we had established, suggesting promising results. With further adjustments, we might be able to improve this even more. However, since we do not have a reliable gold standard, it is difficult to determine the actual performance of our current model with certainty.
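The idea of estimating breaths per minute by counting peaks, and then comparing against a reference, can be sketched as follows. This is a simplified stand-in, not the team's pipeline: the moving-average smoothing, the 120 BPM ceiling used to space peaks, the synthetic sine signal, and all function names are assumptions, and using mean absolute error is only one plausible reading of the reported mean error.

```python
import math


def moving_average(signal, width=5):
    """Smooth the signal with a centred window (truncated at the edges)."""
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def count_breaths(signal, sample_rate_hz, max_bpm=120):
    """Count local maxima above the baseline, spaced at least min_gap apart."""
    smoothed = moving_average(signal)
    baseline = sum(smoothed) / len(smoothed)
    min_gap = int(sample_rate_hz * 60 / max_bpm)
    peaks = []
    for i in range(1, len(smoothed) - 1):
        is_peak = smoothed[i - 1] < smoothed[i] >= smoothed[i + 1]
        if is_peak and smoothed[i] > baseline:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return len(peaks)


def breaths_per_minute(signal, sample_rate_hz):
    """Scale the peak count to a per-minute rate."""
    duration_min = len(signal) / sample_rate_hz / 60
    return count_breaths(signal, sample_rate_hz) / duration_min


def mean_absolute_error(estimates, references):
    """Average absolute difference between estimated and reference rates."""
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


# Synthetic one-minute signal: 0.5 Hz breathing sampled at 10 Hz.
SAMPLE_RATE_HZ = 10
synthetic = [math.sin(2 * math.pi * 0.5 * i / SAMPLE_RATE_HZ) for i in range(600)]
estimated_bpm = breaths_per_minute(synthetic, SAMPLE_RATE_HZ)
```

Real BSG recordings also contain heartbeat and movement components, so a band-pass filter tuned to plausible neonatal breathing rates would likely be needed before peak counting.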
This Challenge has shown how AI can make a big difference in caring for newborns. We've improved how we assess health risks and predict when infants might get seriously ill using the IMPALA system. We've also shown the potential of Ballistography, which helps monitor babies' breathing without disturbing them. We'll keep working to make these tools even better, ensuring they can help save more young lives.
We are very thankful to Achmea and the Achmea Foundation for their generous sponsorship of this AI for Good challenge. Their support has been crucial in advancing our mission to enhance neonatal care through innovative technologies.
Thanks to all the participants and organizers of this Challenge!
Bart Bierling, Niek Versteegde, Job Calis, Lonneke Gerrits-Aanraad, Agnes van Daal, Arnoud Boere, Maaike Blansjaar-Versteeg, Manuella Pot-Willer, Arjan Juurlink, Erwin Kersten, Leony van Kooten, Cedrik Dubois, Cristhian Humberto Amaya, Hemalatha Ramanujam, Honore K. Kayumba, Juan Eduardo Delgado, Kulsoom Abdullah, Raza Sekha, Yuri Shlyakhter, Elaheh Imani, Farukcan Sağlam, Gerson Foks, Icxa Khandelwal, Jesse Wiers, Koosha Tahmasebipour, Olga Sirbu, Dan Mayonde, Ernö Groeneweg, Etienne Galea, Feline Spijkerboer, Lucas Vergeest, Karan Behar, Buster Franken and Dorian Groen.