A percentage of the population suffers prolonged and persistent post-concussion symptoms (PCS) following average head injuries, or develops severe neurological dysfunction following minor head trauma. Genetic variants that may contribute to individual response to head trauma have been investigated in some studies, but to date none have applied machine learning (ML) methods to genomic data specifically to explore outcomes of head trauma. Whole exome sequencing (WES) was completed for three groups of individuals (N = 60): (a) 16 individuals with severe neurological responses to minor head trauma, (b) 26 individuals with persistent PCS, and (c) 18 individuals with normal recovery from concussion or mild traumatic brain injury (mTBI). Gradient boosted tree algorithms were applied to the data using XGBoost. Using variants with Combined Annotation Dependent Depletion (CADD) scores above 15 in the training set (a randomly sampled 70%), we identified signatures that accurately distinguish the test groups, with an average area under the curve (AUC) of 0.8 (SE = 0.019). Metrics including positive and negative predictive values, as well as kappa, were all within acceptable ranges, supporting the prediction accuracy. This study illustrates how ML methods in combination with WES data have the potential to distinguish severe or prolonged responses to head trauma from healthy recovery.
KEY MESSAGES: Linear association analysis has been inconclusive in concussion genetics. Non-linear methods such as boosted trees can offer better insights in small samples. Strong discrimination trends can be achieved from exome data of cases and controls.
Keywords: Concussion; Genomics; Head trauma; Machine learning; Neurotrauma.
© 2021. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
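The evaluation steps named in the abstract (a CADD filter above 15, a random 70% training split, and AUC, predictive values, and kappa on the held-out samples) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the function names, the variant record layout, the fixed seed, and the restriction to a two-class comparison are all assumptions made for illustration, and the XGBoost model fitting step is deliberately left out. Any classifier that produces a score per test sample could supply the `scores` argument.

```python
import random
from typing import Dict, Iterable, List, Sequence, Tuple

CADD_THRESHOLD = 15.0  # cutoff stated in the abstract


def filter_variants(variants: Iterable[dict], threshold: float = CADD_THRESHOLD) -> List[dict]:
    """Keep variants whose CADD score is strictly above the threshold.

    Each variant is assumed to be a dict with a "cadd" key (hypothetical layout).
    """
    return [v for v in variants if v["cadd"] > threshold]


def split_train_test(sample_ids: Iterable, train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[list, list]:
    """Randomly assign train_fraction of the samples to training, the rest to test."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility (assumption)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (case) or 0 (control).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels: Sequence[int], predictions: Sequence[int]) -> Dict[str, float]:
    """Positive/negative predictive value and Cohen's kappa for binary calls."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    n = len(pairs)
    nan = float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else nan
    npv = tn / (tn + fn) if (tn + fn) else nan
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else nan
    return {"ppv": ppv, "npv": npv, "kappa": kappa}


if __name__ == "__main__":
    # Toy data, invented purely to exercise the functions.
    toy_variants = [{"id": "var_a", "cadd": 22.4}, {"id": "var_b", "cadd": 9.8}]
    print([v["id"] for v in filter_variants(toy_variants)])
    train_ids, test_ids = split_train_test(range(60))
    print(len(train_ids), len(test_ids))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
    print(confusion_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

With only 60 individuals, a single 70/30 split leaves 18 test samples, so one plausible reading of the reported average AUC and its standard error is that the split was repeated and the results averaged; the abstract does not say, and the sketch makes no claim about how many repeats, if any, were used.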