Automated multi-model deep neural network for sleep stage scoring with unfiltered clinical data

Sleep Breath. 2020 Jun;24(2):581-590. doi: 10.1007/s11325-019-02008-w. Epub 2020 Jan 14.

Abstract

Purpose: To develop an automated framework for sleep stage scoring from polysomnography (PSG) using a deep neural network.

Methods: An automated deep neural network was developed using a multi-model integration strategy, with multiple signal channels as input. All data were collected at a single medical center between July 2017 and April 2019. Model performance was evaluated by overall classification accuracy, precision, recall, weighted F1 score, and Cohen's Kappa.
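As an illustration of the evaluation metrics named above, the following is a minimal sketch of how overall accuracy, weighted F1 score, and Cohen's Kappa can be computed from per-epoch labels. It is not the authors' code; the function names, the stage labels (W, N1, N2, N3, REM), and the toy data are assumptions made for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of epochs whose predicted stage equals the reference stage."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        total += n * f1
    return total / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    # Expected agreement if the two raters labelled independently.
    p_e = sum(true_counts[l] * pred_counts[l] for l in labels) / (n * n)
    if p_e == 1.0:  # both raters used a single identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy example (hypothetical labels, not data from the study).
reference = ["W", "N1", "N2", "N2", "N3", "REM"]
predicted = ["W", "N2", "N2", "N2", "N3", "REM"]
print(accuracy(reference, predicted))
print(weighted_f1(reference, predicted))
print(cohen_kappa(reference, predicted))
```

Cohen's Kappa is reported alongside accuracy because sleep stages are unevenly distributed, so a share of raw agreement would be expected by chance alone.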

Results: Two hundred ninety-four sleep studies were included; 122 formed the training dataset, 20 the validation dataset, and 152 the testing dataset. The network achieved human-level annotation performance, with an average accuracy of 0.8181, a weighted F1 score of 0.8150, and a Cohen's Kappa of 0.7276. Top-2 accuracy (the proportion of test samples for which the true label is among the two most probable labels given by the model) was significantly higher than the overall classification accuracy, averaging 0.9602. The number of arousals affected the model's performance.
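The top-2 accuracy defined above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `top_k_accuracy`, the integer encoding of stages (0 = W, 1 = N1, 2 = N2, 3 = N3, 4 = REM), and the probability vectors are all hypothetical.

```python
def top_k_accuracy(y_true, probs, k=2):
    """Fraction of samples whose true label is among the k most probable labels.

    y_true: list of integer class indices.
    probs:  list of per-class probability lists, one per sample.
    """
    hits = 0
    for label, p in zip(y_true, probs):
        # Indices of the k classes with the highest predicted probability.
        top_k = sorted(range(len(p)), key=lambda i: p[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(y_true)


# Toy example with assumed stage encoding 0=W, 1=N1, 2=N2, 3=N3, 4=REM.
true_stages = [0, 1, 2, 4]
model_probs = [
    [0.70, 0.20, 0.05, 0.03, 0.02],  # true W: ranked first
    [0.10, 0.35, 0.45, 0.05, 0.05],  # true N1: ranked second
    [0.05, 0.10, 0.80, 0.03, 0.02],  # true N2: ranked first
    [0.50, 0.30, 0.10, 0.05, 0.05],  # true REM: outside the top two
]
print(top_k_accuracy(true_stages, model_probs, k=1))
print(top_k_accuracy(true_stages, model_probs, k=2))
```

With k = 1 the function reduces to ordinary classification accuracy, so top-2 accuracy can never be lower than it; the gap between the two indicates how often the correct stage was the model's second choice.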

Conclusion: This research provides a robust and reliable model whose inter-rater agreement approaches that of human experts. Determining the most appropriate evaluation parameters for sleep staging is a direction for future research.

Keywords: Deep learning; Obstructive sleep apnea (OSA); Polysomnography (PSG); Sleep staging.

MeSH terms

  • Adult
  • Aged
  • Deep Learning
  • Female
  • Humans
  • Male
  • Middle Aged
  • Neural Networks, Computer*
  • Polysomnography
  • Sleep Apnea, Obstructive / physiopathology*
  • Sleep Stages / physiology*
  • Young Adult