In this work we examine the problems associated with developing machine learning models that achieve robust generalization in common-task, multiple-database scenarios. Focusing on what we refer to as the "database variability problem", we use a specific medical domain (sleep staging in Sleep Medicine) to show that translating a model's estimated local generalization capabilities to independent external databases is non-trivial. We analyze some of the scalability problems that arise when data from multiple databases are used to train a single learning model. We then introduce a novel approach based on an ensemble of local models and show its advantages in terms of inter-database generalization performance and data scalability. Finally, we analyze different model configurations and data preprocessing techniques to evaluate their effects on overall generalization performance. For this purpose we carry out experiments involving several sleep databases, evaluating different machine learning models based on Convolutional Neural Networks.
Study objectives
Developing sleep staging algorithms that generalize across databases is challenging due to increased data variability across datasets. Sharing data between centers is also problematic because of potential restrictions arising from patient privacy protection. In this work, we describe a new deep learning approach for automatic sleep staging and assess its generalization capabilities on a wide range of public sleep staging databases. We also examine the suitability of a novel approach that uses an ensemble of individual local models, and evaluate its impact on the resulting inter-database generalization performance.
Methods
A general deep learning network architecture for automatic sleep staging is presented. Different preprocessing options and architectural variants are tested. The resulting prediction capabilities are evaluated and compared on a heterogeneous collection of six public sleep staging datasets. Validation is carried out under two scenarios: generalization to independent local test data and generalization to external datasets.
Results
Best results were achieved using the CNN_LSTM_5 neural network variant. Average prediction performance on independent local testing sets reached a kappa score of 0.80. When individual local models were used to predict data from external datasets, the average kappa score decreased to 0.54. Using the proposed ensemble-based approach, average kappa performance in the external-dataset prediction scenario increased to 0.62. To our knowledge, this is the largest study to date, by number of datasets, to validate the generalization capabilities of an automatic sleep staging algorithm on external databases.
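The kappa scores above refer to Cohen's kappa, the standard chance-corrected agreement measure between a predicted hypnogram and the reference scoring. As a minimal illustration (the two 8-epoch label sequences below are hypothetical, not taken from the study data), the metric can be computed as:

```python
from collections import Counter

def cohen_kappa(y1, y2):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    n = len(y1)
    # Observed agreement: fraction of epochs scored identically.
    p_o = sum(a == b for a, b in zip(y1, y2)) / n
    # Chance agreement: expected overlap given each scorer's label frequencies.
    c1, c2 = Counter(y1), Counter(y2)
    p_e = sum(c1[lab] * c2[lab] for lab in set(y1) | set(y2)) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical hypnograms: sleep stages W=0, N1=1, N2=2, N3=3, REM=4.
reference = [0, 1, 2, 2, 3, 4, 2, 1]
predicted = [0, 1, 2, 3, 3, 4, 2, 0]
print(round(cohen_kappa(reference, predicted), 3))
```

A kappa of 0.80, as reported for the local testing sets, is commonly read as agreement comparable to that between human scorers, whereas the 0.54 obtained on external datasets illustrates the database variability problem.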
Conclusions
Validation results show good general performance of our method compared with expected levels of human agreement and with state-of-the-art automatic sleep staging methods. The proposed ensemble-based approach enables a flexible and scalable design that allows dynamic integration of local models into the final ensemble, preserves data locality, and at the same time increases the generalization capabilities of the resulting system.
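One simple way to combine such an ensemble of local models, sketched below, is to average the per-class probabilities each local model emits for every epoch and take the arg-max; the abstract does not specify the exact fusion rule used in the study, so this averaging scheme is an illustrative assumption, and the array shapes are hypothetical.

```python
import numpy as np

def ensemble_predict(local_model_probs):
    """Combine predictions from N local models by probability averaging.

    local_model_probs: list of arrays, one per local model, each of shape
    (epochs, classes) holding that model's per-epoch class probabilities.
    Returns the arg-max class per epoch of the averaged distribution.
    """
    avg = np.mean(np.stack(local_model_probs), axis=0)  # (epochs, classes)
    return avg.argmax(axis=1)

# Two hypothetical local models scoring the same two epochs (2 classes).
model_a = np.array([[0.9, 0.1], [0.2, 0.8]])
model_b = np.array([[0.6, 0.4], [0.4, 0.6]])
print(ensemble_predict([model_a, model_b]))  # per-epoch consensus stages
```

Because each local model is trained only on its own database, new centers can contribute a model to the ensemble without ever sharing raw recordings, which is what preserves data locality while improving external generalization.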