Background: The use of real-world data has become increasingly common, including in infectious disease (ID) research, particularly since the emergence of the COVID-19 pandemic. While much data useful for research is being collected, these data are generally stored across different sources. Privacy concerns limit the ability to store the data centrally, and thereby the ability to fully leverage the potential power of combined data. Federated learning (FL) has been suggested as a way to overcome privacy issues, because it makes it possible to perform research on data from various sources without those data leaving local servers. In this review, we discuss existing applications of FL in ID research, as well as the most relevant opportunities and challenges of this method.
Methods: References for this review were identified through searches of MEDLINE/PubMed, Google Scholar, Embase and Scopus up to July 2023. We searched for studies using FL in different applications related to ID.
Results: Thirty references were included and divided into four sub-topics: disease screening, prediction of clinical outcomes, infection epidemiology, and vaccine research. Most research was related to COVID-19. In all studies, FL achieved good accuracy when predicting diseases and outcomes, including in comparison with non-federated methods. However, most studies did not use real-world federated data, but rather demonstrated the potential of FL using data that were manually partitioned.
Conclusions: FL is a promising methodology that allows the use of data from several sources, potentially generating stronger and more generalisable results. However, further exploration of possible applications of FL in ID research is needed.
Keywords: AI; Big data; Federated learning; Infection; Machine learning; Vaccine.
© 2024. The Author(s).