Objectives: With advances in data availability and computing capabilities, artificial intelligence and machine learning technologies have evolved rapidly in recent years. Researchers have taken advantage of these developments in healthcare informatics and created reliable tools to predict or classify diseases using machine learning-based algorithms. To correctly quantify the performance of those algorithms, the standard approach is to use cross-validation, where the algorithm is trained on a training set, and its performance is measured on a validation set. Both datasets should be subject-independent to simulate the expected behavior of a clinical study. This study compares two cross-validation strategies, the subject-wise and the record-wise techniques; the subject-wise strategy correctly mimics the process of a clinical study, while the record-wise strategy does not. Methods: We started by creating a dataset of smartphone audio recordings of subjects diagnosed with and without Parkinson’s disease. This dataset was then divided into training and holdout sets using subject-wise and record-wise divisions. The training set was used to measure the performance of two classifiers (support vector machine and random forest) to compare six cross-validation techniques that simulated either the subject-wise process or the record-wise process. The holdout set was used to calculate the true error of the classifiers. Results: The record-wise division and the record-wise cross-validation techniques overestimated the performance of the classifiers and underestimated the classification error. Conclusions: In a diagnostic scenario, the subject-wise technique is the proper way of estimating a model’s performance, and record-wise techniques should be avoided.
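The difference between the two strategies can be illustrated with a minimal sketch. It is not the study's code: the helper names (`record_wise_folds`, `subject_wise_folds`, `leaked_subjects`) and the toy data of 6 subjects with 4 recordings each are assumptions made for illustration. The sketch only checks whether a subject's recordings end up on both sides of a split, which is the leakage that can make record-wise estimates look better than they should.

```python
import random
from collections import defaultdict

def record_wise_folds(subject_ids, k, seed=0):
    """Shuffle record indices and deal them into k folds, ignoring subjects."""
    indices = list(range(len(subject_ids)))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def subject_wise_folds(subject_ids, k, seed=0):
    """Shuffle subjects and deal them into k folds; a subject's records stay together."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    folds = [[] for _ in range(k)]
    for position, subject in enumerate(subjects):
        folds[position % k].extend(by_subject[subject])
    return folds

def leaked_subjects(subject_ids, folds):
    """For each fold used as validation, list subjects that also occur in training."""
    leaks = []
    for fold in folds:
        held_out = set(fold)
        validation = {subject_ids[i] for i in held_out}
        training = {s for i, s in enumerate(subject_ids) if i not in held_out}
        leaks.append(validation & training)
    return leaks

# Hypothetical toy data: 6 subjects, 4 recordings each (24 records in total).
subject_ids = [subject for subject in range(6) for _ in range(4)]

subject_leaks = leaked_subjects(subject_ids, subject_wise_folds(subject_ids, k=4))
record_leaks = leaked_subjects(subject_ids, record_wise_folds(subject_ids, k=4))

print("subject-wise leaks per fold:", [len(s) for s in subject_leaks])
print("record-wise leaks per fold:", [len(s) for s in record_leaks])
```

With these toy sizes every record-wise fold holds 6 records, which cannot be made up of whole 4-record subjects, so each record-wise fold shares at least one subject with its training set, while the subject-wise folds share none.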
The aim of this study is to compare different algorithms designed to denoise an ECG signal, starting with a conventional FIR filter. Three other filtering methods are also used: a Kalman filter, an LMS filter, and the undecimated wavelet transform. These three are treated as adaptive filters that essentially minimize the mean-squared error between the noisy ECG and a reference input, which is either noise that is correlated with the noise in the primary input or a signal that is correlated only with the ECG in the primary input. To illustrate the results, two randomly chosen ECG signals from the PhysioBank database are used.
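The LMS idea described above can be sketched as a simple adaptive noise canceller. This is an illustrative sketch under stated assumptions, not the study's implementation: it uses the first configuration (a reference that is correlated with the noise in the primary input), a sinusoid as a hypothetical stand-in for the ECG, and arbitrarily chosen values for the tap count `n_taps` and step size `mu`.

```python
import math
import random

def lms_noise_canceller(primary, reference, n_taps=4, mu=0.01):
    """Adaptive noise cancelling with LMS.

    The filter learns to predict the noise in `primary` from `reference`;
    the prediction error is returned as the cleaned signal.
    """
    weights = [0.0] * n_taps
    window = [0.0] * n_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        window = [x] + window[:-1]
        noise_estimate = sum(w * v for w, v in zip(weights, window))
        error = d - noise_estimate  # primary minus estimated noise
        # Stochastic gradient step on the squared error.
        weights = [w + 2.0 * mu * error * v for w, v in zip(weights, window)]
        cleaned.append(error)
    return cleaned

def mse(a, b):
    """Mean-squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical toy signals (assumptions, not PhysioBank data).
rng = random.Random(0)
n = 4000
clean = [math.sin(2 * math.pi * 0.01 * i) for i in range(n)]  # ECG stand-in
reference = [rng.gauss(0.0, 1.0) for _ in range(n)]           # noise reference
noise = [0.8 * reference[i] + (0.3 * reference[i - 1] if i > 0 else 0.0)
         for i in range(n)]                                   # noise reaching the primary input
primary = [c + v for c, v in zip(clean, noise)]

cleaned = lms_noise_canceller(primary, reference)

# Compare errors on the second half, after the weights have had time to adapt.
half = n // 2
mse_before = mse(primary[half:], clean[half:])
mse_after = mse(cleaned[half:], clean[half:])
print("MSE before:", mse_before, "MSE after:", mse_after)
```

The step size trades convergence speed against steady-state error: a larger `mu` adapts faster but leaves more residual noise in the output, and too large a value makes the update unstable.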