Background: As more researchers turn to big data for new opportunities in biomedical discovery, machine learning models, the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to their inherent complexity, machine learning methods are prone to misuse. Because of the flexibility in specifying machine learning models, the results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs. Objective: To develop a set of guidelines on the use of machine learning predictive models in clinical settings, so that the models are correctly applied and sufficiently reported and true discoveries can be distinguished from random coincidence. Methods: A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians was interviewed, using an iterative process in accordance with the Delphi method. Results: The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models. Conclusions: A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community.
The data-driven approach has been shown to be powerful in the detection and prediction of adverse drug reactions (ADRs). The review gives researchers and pharmacists a quick overview of the current status of ADR detection and prediction.
Abstract. Time-series classification has attracted increasing interest in recent years, particularly for long time series such as those arising in bioinformatics and finance. Many dimensionality reduction algorithms have been proposed to address the so-called curse of dimensionality. However, choosing the number of features is not a trivial task and has not been well studied. In this paper, we propose a novel blind feature extraction algorithm based on the Haar wavelet transform that determines the feature dimensionality automatically. The algorithm balances the trade-off between lower dimensionality and a lower sum of squared errors between the features and the original time series. Experimental results on several widely used time-series data sets demonstrate the effectiveness of the proposed algorithm.
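The trade-off described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the selection criterion, so the weighted cost in `choose_level`, the `weight` parameter, and all function names are assumptions made for illustration. The sketch takes Haar approximation coefficients at each decomposition level as candidate features, reconstructs the series from those features alone, and keeps the level with the lowest combined cost of relative dimensionality and relative squared error.

```python
import math


def haar_approximation(series, level):
    """Return the Haar approximation coefficients after `level` steps.

    Each step replaces adjacent pairs (a, b) with (a + b) / sqrt(2),
    halving the length. len(series) must be divisible by 2**level.
    """
    coeffs = list(series)
    for _step in range(level):
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs


def reconstruct(coeffs, level):
    """Invert `level` Haar steps with all detail coefficients set to zero."""
    out = list(coeffs)
    for _step in range(level):
        # With a zero detail coefficient, both children equal parent / sqrt(2).
        out = [c / math.sqrt(2) for c in out for _copy in range(2)]
    return out


def sum_squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def choose_level(series, weight=0.5):
    """Pick the decomposition level with the lowest combined cost.

    ASSUMPTION: the cost below (a weighted sum of the fraction of
    coefficients kept and the relative reconstruction error) is a
    hypothetical stand-in for the paper's unstated criterion.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    energy = sum(x * x for x in series) or 1.0
    best = None
    for level in range(int(math.log2(n)) + 1):
        features = haar_approximation(series, level)
        error = sum_squared_error(series, reconstruct(features, level)) / energy
        cost = weight * len(features) / n + (1 - weight) * error
        if best is None or cost < best[0]:
            best = (cost, level, features)
    return best[1], best[2]


if __name__ == "__main__":
    level, features = choose_level([1, 1, 1, 1, 5, 5, 5, 5])
    print(level, features)
```

For the piecewise-constant series in the usage example and the default `weight`, this sketch selects level 2: two features (approximately 2 and 10) reconstruct the series with essentially zero error, while a single feature would be cheaper in dimensionality but much worse in squared error. A different `weight`, or a different cost altogether, would shift that balance.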
Background: MicroRNAs (miRNAs) are a class of small non-coding RNA molecules (20-24 nt), which are believed to participate in repression of gene expression. They play important roles in several biological processes (e.g. cell death and cell growth). Both experimental and computational approaches have been used to determine the function of miRNAs in cellular processes. Most efforts have concentrated on identification of miRNAs and their target genes. However, understanding the regulatory mechanism of miRNAs in the gene regulatory network is also essential to the discovery of functions of miRNAs in complex cellular systems. To understand the regulatory mechanism of miRNAs in complex cellular systems, we need to identify the functional modules involved in complex interactions between miRNAs and their target genes.