A Machine Learning Approach to Parkinson's Disease Blood Transcriptomics

Genes (Basel). 2022 Apr 21;13(5):727. doi: 10.3390/genes13050727.

Abstract

The increased incidence and the significant health burden associated with Parkinson's disease (PD) have stimulated substantial research efforts towards the identification of effective treatments and diagnostic procedures. Despite technological advancements, a cure is still not available, and PD is often diagnosed long after onset, when irreversible damage has already occurred. Blood transcriptomics represents a potentially disruptive technology for the early diagnosis of PD. We used transcriptome data from the PPMI study, a large cohort study with early PD subjects and age-matched healthy controls (HC), to classify PD vs. HC in around 550 samples. Using a nested feature selection procedure based on Random Forests and XGBoost, we reached an AUC of 72% and found 493 candidate genes. We further discussed the importance of the selected genes through a functional analysis based on Gene Ontology (GO) terms and KEGG pathways.
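As a reading aid for the reported metric, the following minimal sketch shows one common way to compute an AUC: as the fraction of (PD, HC) pairs in which the PD sample receives the higher classifier score, with ties counted as one half. Under this interpretation, an AUC of 72% would mean that a randomly chosen PD sample outranks a randomly chosen HC sample about 72% of the time. The function name `auc_from_scores` and the toy labels and scores are hypothetical and used only for illustration; they are not PPMI data, and this is not necessarily how the authors computed their figure.

```python
def auc_from_scores(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 1 (positive, e.g. PD) or 0 (negative, e.g. HC)
    scores: iterable of classifier scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (made-up scores, not study data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The quadratic pairwise loop is chosen for readability; for cohorts of a few hundred samples it is fast enough, while a rank-based formulation would scale better.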

Keywords: Parkinson’s disease; blood transcriptomics; feature selection; inflammation; machine learning; mitochondrial dysfunction; oxidative stress; XGBoost.

MeSH terms

  • Cohort Studies
  • Early Diagnosis
  • Humans
  • Machine Learning
  • Parkinson Disease* / diagnosis
  • Parkinson Disease* / genetics
  • Transcriptome / genetics