Abstract: Most machine learning algorithms, including many online learners, assume that the data distribution to be learned is fixed. In many real-world problems, however, the distribution of the data changes as a function of time. If the learning algorithm is not equipped to track such changes, they can significantly reduce its generalization ability on new or field data. When the stationary data distribution assumption does not hold, the learner must take appropriate actions to ensure that the new/relevant information is learned. On the other hand, data distributions do not necessarily change continuously, which makes it necessary to monitor the distribution and detect when a significant change has occurred. In this work, we propose and analyze a feature-based drift detection method that uses the Hellinger distance to detect gradual or abrupt changes in the distribution.
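As a rough illustration of the idea in this abstract, the sketch below computes a per-feature Hellinger distance between histograms of a baseline batch and a newer batch, then averages over features. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the square-root rule for the number of bins, the 1/sqrt(2) normalization (which keeps the distance in [0, 1]), and the synthetic Gaussian data are all choices made here for illustration.

```python
import math
import random


def histogram(values, lo, hi, bins):
    """Relative bin frequencies of `values` over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1
    return [c / len(values) for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Assumption: normalized by 1/sqrt(2), so the result lies in [0, 1].
    """
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s) / math.sqrt(2)


def feature_drift_score(baseline, batch, bins=None):
    """Average per-feature Hellinger distance between two batches of rows."""
    n_features = len(baseline[0])
    if bins is None:
        # Assumption: square-root rule for the number of histogram bins.
        bins = max(1, int(math.sqrt(len(baseline))))
    total = 0.0
    for k in range(n_features):
        a = [row[k] for row in baseline]
        b = [row[k] for row in batch]
        # Shared bin edges so the two histograms are comparable.
        lo, hi = min(min(a), min(b)), max(max(a), max(b))
        total += hellinger(histogram(a, lo, hi, bins),
                           histogram(b, lo, hi, bins))
    return total / n_features


# Synthetic two-feature data (hypothetical): one batch from the same
# distribution as the baseline, one with a shifted mean.
rng = random.Random(0)


def sample(mean, n=500):
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


baseline = sample(0.0)
same_score = feature_drift_score(baseline, sample(0.0))
shifted_score = feature_drift_score(baseline, sample(3.0))
print(f"same distribution: {same_score:.3f}")
print(f"shifted mean:      {shifted_score:.3f}")
```

The score for the shifted batch should come out noticeably larger than the score for the batch drawn from the baseline distribution. Turning the score into a drift decision needs a threshold, fixed or adaptive, which this sketch deliberately leaves out.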
Recent advances in machine learning, specifically in deep learning with neural networks, have made a profound impact on fields such as natural language processing, image classification, and language modeling; however, the feasibility and potential benefits of these approaches for metagenomic data analysis have been largely under-explored. Deep learning exploits many layers of learning nonlinear feature representations, typically in an unsupervised fashion, and recent results have shown outstanding generalization performance on previously unseen data. Furthermore, some deep learning methods can also represent the structure in a data set. Consequently, deep learning and neural networks may prove to be an appropriate approach for metagenomic data. To determine whether such approaches are indeed appropriate for metagenomics, we experiment with two deep learning methods: i) a deep belief network, and ii) a recursive neural network, the latter of which provides a tree representing the structure of the data. We compare these approaches to the standard multi-layer perceptron, which has been well-established in the machine learning community as a powerful prediction algorithm, though it is largely absent from the metagenomics literature. We find that traditional neural networks can be quite powerful classifiers on metagenomic data compared to baseline methods, such as random forests. On the other hand, while the deep learning approaches did not improve classification accuracy, they do provide the ability to learn hierarchical representations of a data set, which standard classification methods do not allow. Our goal in this effort is not to determine the best algorithm in terms of accuracy, as that depends on the specific application, but rather to highlight the benefits and drawbacks of each of the approaches we discuss and to provide insight into how they can be improved for predictive metagenomic analysis.
Road roughness is a measure of how uncomfortable a ride is, and provides an important indicator of the need for roadway maintenance or repavement, which is closely tied to state and federal budget prioritization. As such, accurate and timely monitoring of deteriorating road conditions, followed by maintenance, is essential to improve the overall ride quality on the road. Various technologies, including vehicle-mounted laser profiling systems, have been developed and adopted for road roughness measurement (e.g., the International Roughness Index, IRI); however, their high cost limits their use. While recent advances in smartphone technologies allow us to use their embedded accelerometers for road roughness monitoring, the complicated process of vehicle calibration hinders the widespread use of the technology in practice. In this work, a deep learning IRI estimation method is proposed with the goal of using anonymous (i.e., calibration-free) vehicles and their responses measured by smartphones as road roughness sensors. A state-of-the-art deep learning algorithm (i.e., a convolutional neural network, CNN) and multimetric vehicle dynamics data (i.e., accelerometer, gyroscope), possibly measured by drivers' smartphones, are employed for this purpose. The CNN architecture and data configuration have been investigated and optimized to achieve the best performance. The efficacy of the proposed method has been numerically validated using real road IRI information (i.e., Speedway, Tucson, AZ), real driving speed profiles, and four different types of vehicle data with associated uncertainties.