Background: The massive quantities of genetic data generated by high-throughput sequencing pose challenges to data storage, transmission and analysis. Data compression addresses these problems by reducing storage size and speeding up data transmission. Several options are available for compressing and storing genetic data. However, most of these options either do not provide sufficient compression rates or require a considerable length of time for decompression and loading.

Results: Here, we propose TRCMGene, a lossless genetic data compression method that uses a referential compression scheme. TRCMGene introduces a two-step compression method that builds an index structure using K-means and k-nearest neighbours. Evaluation with several real datasets revealed that the compression factor of TRCMGene ranges from 9 to 21. TRCMGene presents a good balance between compression factor and reading time. On average, the reading time of compressed data is 60% of that of uncompressed data. Thus, TRCMGene not only saves disk space but also reduces file access time and speeds up data loading. These effects collectively improve genetic data storage and transmission in the current hardware environment and render system upgrades unnecessary. TRCMGene, its user manual and demos can be accessed freely from https://github.com/tangyou79/TRCM. The data mentioned in this manuscript can be downloaded from: https://github.com/tangyou79/TRCM/wiki.
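The idea of referential compression mentioned above can be illustrated with a minimal sketch. This is not TRCMGene's actual algorithm: the function names, the integer genotype coding and the use of Hamming distance to pick the nearest reference are all assumptions made for illustration. Each row is stored as the index of its closest reference row plus the positions where it differs.

```python
from typing import List, Tuple

Row = List[int]                # hypothetical genotype coding, e.g. 0/1/2 per marker
Diff = List[Tuple[int, int]]   # (position, value) pairs that differ from the reference


def hamming(a: Row, b: Row) -> int:
    """Number of positions at which two equal-length rows differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(row: Row, references: List[Row]) -> Tuple[int, Diff]:
    """Pick the nearest reference row and keep only the mismatches."""
    ref_id = min(range(len(references)), key=lambda i: hamming(row, references[i]))
    ref = references[ref_id]
    diff = [(pos, val) for pos, (val, r) in enumerate(zip(row, ref)) if val != r]
    return ref_id, diff


def decode(ref_id: int, diff: Diff, references: List[Row]) -> Row:
    """Rebuild the original row losslessly from the reference and the diff."""
    row = list(references[ref_id])
    for pos, val in diff:
        row[pos] = val
    return row


references = [[0, 0, 1, 2, 0, 1], [2, 2, 1, 0, 0, 0]]
row = [0, 0, 1, 2, 1, 1]
ref_id, diff = encode(row, references)
restored = decode(ref_id, diff, references)
```

In this toy case the row differs from its nearest reference at a single position, so only one (position, value) pair needs to be stored. How the reference rows are chosen (in the abstract, via K-means and k-nearest neighbours) is left out of the sketch.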
To evaluate the performance of soybean seed metering devices precisely, a real-time monitoring system based on photoelectric sensors was designed. The system mainly comprised a photoelectric sensor module for collecting seeding signals, a Hall-sensor speed-measurement module, a microcontroller unit (MCU), a light and sound alarm module, a human–machine interface (HMI), and other parts. The miss index, multiples index, flow rate, and application rate were estimated from the seeder speed, the rotation rate of the seed metering disk, the photoelectric sensor signals, and clock signals. These real-time statistics of the seeding process were recorded by a seeding management system. Laboratory results showed that the detection error of seeding quantity was less than 2.0% for both big- and small-diameter soybeans. The miss and multiples indexes estimated by this system differed by 0.4% and 0.5%, respectively, from those of the seeding image monitoring platform (SIMP). In field tests, the miss and multiples indexes could be used to evaluate the performance of the seed metering device, and the photoelectric sensors detected big-diameter seeds more precisely than small ones. This system can support the evaluation of the working performance of seed metering devices and has a positive effect on seeding monitoring technology.
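As a rough illustration of how miss and multiples indexes might be derived from sensor pulses, the sketch below classifies the time gaps between consecutive seed detections. The abstract does not give the definitions used, so the thresholds here (a gap above 1.5 times the theoretical interval counts as a miss, a gap at or below 0.5 times counts as a multiple) are an assumption based on a commonly used convention, and the function name and inputs are hypothetical.

```python
from typing import List, Tuple


def seeding_indexes(timestamps: List[float], theoretical_interval: float) -> Tuple[float, float]:
    """Return (miss index %, multiples index %) from seed-pulse timestamps.

    Assumed thresholds (not taken from the source):
      gap >  1.5 * theoretical_interval -> miss
      gap <= 0.5 * theoretical_interval -> multiple
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not gaps:
        raise ValueError("need at least two seed detections")
    misses = sum(g > 1.5 * theoretical_interval for g in gaps)
    multiples = sum(g <= 0.5 * theoretical_interval for g in gaps)
    total = len(gaps)
    return 100.0 * misses / total, 100.0 * multiples / total


# Hypothetical pulse times in seconds; one double drop and one skipped cell.
pulses = [0.0, 1.0, 2.0, 2.25, 4.25, 5.25]
miss_pct, multiples_pct = seeding_indexes(pulses, theoretical_interval=1.0)
```

In a real system the theoretical interval would follow from the seeder speed and the rotation rate of the metering disk, both of which the described hardware measures.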