In the last 5 years there have been a large number of new time series classification algorithms proposed in the literature. These algorithms have been evaluated on subsets of the 47 datasets in the University of California, Riverside (UCR) time series classification archive. The archive has recently been expanded to 85 datasets, over half of which have been donated by researchers at the University of East Anglia. Aspects of previous evaluations have made comparisons between algorithms difficult. For example, several different programming languages have been used, experiments involved a single train/test split and some used normalised data whilst others did not. The relaunch of the archive provides a timely opportunity to thoroughly evaluate algorithms on a larger number of datasets. We have implemented 18 recently proposed algorithms in a common Java framework and compared them against two standard benchmark classifiers (and each other) by performing 100 resampling experiments on each of the 85 datasets. We use these results to test several hypotheses relating to whether the algorithms are significantly more accurate than the benchmarks and each other. Our results indicate that only nine of these algorithms are significantly more accurate than both benchmarks and that one classifier, the collective of transformation ensembles, is significantly more accurate than all of the others. All of our experiments and results are reproducible: we release all of our code, results and experimental details and we hope these experiments form the basis for more robust testing of new algorithms in the future.
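The evaluation protocol described above (repeated random resampling of each dataset into train/test splits, scoring a classifier on each resample) can be sketched in a few lines. This is an illustrative Python sketch on synthetic data, not the paper's Java framework; the 1-nearest-neighbour Euclidean classifier used here is one of the standard benchmarks in this literature, but the function names and toy data are mine.

```python
import random

def one_nn_euclidean(train, labels, query):
    """Classify `query` by the label of its nearest training series
    under squared Euclidean distance."""
    best, best_d = None, float("inf")
    for series, label in zip(train, labels):
        d = sum((a - b) ** 2 for a, b in zip(series, query))
        if d < best_d:
            best, best_d = label, d
    return best

def resample_accuracy(data, labels, train_frac=0.5, n_resamples=25, seed=0):
    """Mean test accuracy of 1-NN over repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    accs = []
    for _ in range(n_resamples):
        rng.shuffle(idx)
        cut = int(len(idx) * train_frac)
        tr, te = idx[:cut], idx[cut:]
        correct = sum(
            one_nn_euclidean([data[i] for i in tr],
                             [labels[i] for i in tr],
                             data[j]) == labels[j]
            for j in te
        )
        accs.append(correct / len(te))
    return sum(accs) / len(accs)

# Toy problem: flat noisy series (class 0) vs. rising noisy series (class 1).
rng = random.Random(1)
data = [[rng.gauss(0, 0.1) for _ in range(20)] for _ in range(20)] + \
       [[t * 0.1 + rng.gauss(0, 0.1) for t in range(20)] for _ in range(20)]
labels = [0] * 20 + [1] * 20
acc = resample_accuracy(data, labels)
```

The per-resample accuracies, rather than a single split, are what allow the significance tests between algorithms that the abstract describes.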
Recently, two ideas have been explored that lead to more accurate algorithms for time-series classification (TSC). First, it has been shown that the simplest way to gain improvement on TSC problems is to transform the data into an alternative data space where discriminatory features are more easily detected. Second, it was demonstrated that with a single data representation, improved accuracy can be achieved through simple ensemble schemes. We combine these two principles to test the hypothesis that forming a collective of ensembles of classifiers on different data transformations improves the accuracy of time-series classification. The collective contains classifiers constructed in the time, frequency, change, and shapelet transformation domains. For the time domain we use a set of elastic distance measures. For the other domains we use a range of standard classifiers. Through extensive experimentation on 72 datasets, including all of the 46 UCR datasets, we demonstrate that the simple collective formed by including all classifiers in one ensemble is significantly more accurate than any of its components and any other previously published TSC algorithm. We investigate alternative hierarchical collective structures and demonstrate the utility of the approach on a new problem involving classifying Caenorhabditis elegans mutant types.
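The core idea above — train classifiers in several transformed data spaces and combine their predictions with a weighted vote — can be illustrated with a minimal sketch. This is a simplified two-domain toy (raw time domain plus a first-difference "change" domain), not the paper's full collective; the function names and the use of fixed accuracy estimates as vote weights are illustrative assumptions.

```python
def euclidean(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def one_nn(train, labels, query, dist):
    """1-nearest-neighbour classification under a given distance."""
    best, best_d = None, float("inf")
    for series, label in zip(train, labels):
        d = dist(series, query)
        if d < best_d:
            best, best_d = label, d
    return best

def diff(series):
    """'Change' domain representation: first differences of the series."""
    return [b - a for a, b in zip(series, series[1:])]

def collective_predict(train, labels, query, weights):
    """Weighted vote over per-domain classifiers; each domain's vote is
    weighted by an accuracy estimate supplied by the caller."""
    preds = {
        "time": one_nn(train, labels, query, euclidean),
        "change": one_nn([diff(s) for s in train], labels,
                         diff(query), euclidean),
    }
    votes = {}
    for domain, pred in preds.items():
        votes[pred] = votes.get(pred, 0.0) + weights[domain]
    return max(votes, key=votes.get)

# Toy usage: flat series are class 0, rising series are class 1.
train = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 2, 3], [0, 1, 2, 3]]
labels = [0, 0, 1, 1]
pred = collective_predict(train, labels, [0.1, 1.1, 2.1, 3.1],
                          {"time": 0.9, "change": 0.8})
```

In the paper's collective, each component's weight is derived from its estimated accuracy, so stronger domain-specific classifiers dominate the vote.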
Progress in remote sensing and robotic technologies decreases the hardware costs of phenotyping. Here, we first review cost-effective imaging devices and environmental sensors, and present a trade-off between investment and manpower costs. We then discuss the structure of costs in various real-world scenarios. Hand-held low-cost sensors are suitable for quick and infrequent plant diagnostic measurements. In experiments for genetic or agronomic analyses, (i) major costs arise from plant handling and manpower; (ii) the total costs per pot/microplot are similar in robotized platforms or in field experiments with drones, hand-held or robotized ground vehicles; (iii) the cost of vehicles carrying sensors represents only 5-26% of the total costs. These conclusions depend on the context, in particular on labor cost, the quantitative demand for phenotyping and the number of days available for phenotypic measurements due to climatic constraints. Data analysis represents 10-20% of the total cost if pipelines have already been developed. A trade-off exists between the initially high cost of pipeline development and the labor cost of manual operations. Overall, depending on the context and objectives, "cost-effective" phenotyping may involve either low investment ("affordable phenotyping") or initially high investments in sensors, vehicles and pipelines that result in higher quality and lower operational costs.
Highlights
- New technologies considerably reduce the costs of sensors and automated vehicles
- Low investment in sensors, vehicles or pipelines presents trade-offs with labor costs
- Plant/plot handling and labor costs represent the major proportion of costs in phenotyping experiments
- The costs of high-throughput experiments in the field and in automated platforms are similar regardless of vehicles
- The development of software applications (e.g. imaging, phenotypic analyses, models, information systems) is a major part of costs
I Imaging techniques with a range of hardware costs
1.1 Handheld phenotyping technologies
1.2 Aerial imaging for large-scale phenotyping
1.3 Imaging with ground vehicles
1.4 Environmental measurements
II Costs associated with image capture represent a fraction of the overall cost of phenotyping
2.1 A method for calculating costs in field and greenhouse platforms
2.2 A high cost for plant management
2.3 Investing in appropriate environmental characterization results in comparatively low cost for a high return
2.4 Imaging costs: a trade-off between investment and labor costs
2.4.1 The choice of vehicle mostly depends on the demand for microplots per year
2.4.2 The cost of imaging devices is similar to that of vehicles that carry sensors
2.5 Costs of typical experiments
2.5.1 Image analysis: a trade-off between investment in automated workflows and day-to-day labor costs
2.5.2 High costs of data analysis for the identification of traits
2.5.3 Costs associated with data storage...
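The cost-structure comparisons above (e.g. vehicles at 5-26% of total costs, data analysis at 10-20%) amount to computing each category's share of a total experiment budget. A minimal Python sketch of that arithmetic follows; the category labels mirror the breakdown discussed above, but the budget figures are invented purely for illustration.

```python
def cost_shares(costs):
    """Percentage share of each cost category in the total budget."""
    total = sum(costs.values())
    return {k: round(100 * v / total, 1) for k, v in costs.items()}

# Hypothetical per-experiment budget (arbitrary monetary units).
# The numbers below are illustrative assumptions, not the paper's data.
budget = {
    "plant/plot handling and labour": 50,
    "sensors and imaging devices": 15,
    "vehicle carrying the sensors": 12,
    "environmental characterization": 8,
    "data and image analysis": 15,
}
shares = cost_shares(budget)
```

Under these made-up inputs the vehicle's share lands inside the 5-26% range the abstract reports, which is only a sanity check of the arithmetic, not a reproduction of the study's results.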