Background: Untargeted metabolomics generates a huge amount of data. Software packages for automated data processing are crucial to successfully process these data. A variety of such software packages exist, but the outcome of data processing strongly depends on algorithm parameter settings. If they are not carefully chosen, suboptimal parameter settings can easily lead to biased results. Therefore, parameter settings also require optimization. Several parameter optimization approaches have already been proposed, but a software package for parameter optimization which is free of intricate experimental labeling steps, fast and widely applicable is still missing. Results: We implemented the software package IPO ('Isotopologue Parameter Optimization') which is fast and free of labeling steps, and applicable to data from different kinds of samples, from different methods of liquid chromatography-high resolution mass spectrometry and from different instruments. IPO optimizes XCMS peak picking parameters by using natural, stable 13C isotopic peaks to calculate a peak picking score. Retention time correction is optimized by minimizing relative retention time differences within peak groups. Grouping parameters are optimized by maximizing the number of peak groups that show one peak from each injection of a pooled sample. The different parameter settings are achieved by design of experiments, and the resulting scores are evaluated using response surface models. IPO was tested on three different data sets, each consisting of a training set and test set. IPO resulted in an increase of reliable groups (146% - 361%), a decrease of non-reliable groups (3% - 8%) and a decrease of the retention time deviation to one third. Conclusions: IPO was successfully applied to data derived from liquid chromatography coupled to high resolution mass spectrometry from three studies with different sample types and different chromatographic methods and devices. We were also able to show the potential of IPO to increase the reliability of metabolomics data. The source code is implemented in R, tested on Linux and Windows and it is freely available for download at https://github.com/glibiseller/IPO. The training sets and test sets can be downloaded from https://health.joanneum.at/IPO. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0562-8) contains supplementary material, which is available to authorized users.
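The grouping criterion described in the abstract above (peak groups that show exactly one peak from each injection of a pooled sample) can be illustrated with a minimal Python sketch. It is not IPO's R implementation: the function name `count_reliable_groups` and the representation of a peak group as a list of injection indices are assumptions made here for illustration only.

```python
from collections import Counter


def count_reliable_groups(groups, n_injections):
    """Count peak groups containing exactly one peak from every injection.

    groups: list of peak groups; each group is a list holding the
            injection index of every peak assigned to that group
            (hypothetical representation, not IPO's data structure).
    n_injections: number of injections of the pooled sample.
    """
    reliable = 0
    for injection_ids in groups:
        counts = Counter(injection_ids)
        # A group is "reliable" if every injection appears exactly once.
        covers_all = len(counts) == n_injections
        one_each = all(c == 1 for c in counts.values())
        if covers_all and one_each:
            reliable += 1
    return reliable


# Three injections; only the first group has one peak per injection.
example_groups = [[0, 1, 2], [0, 0, 1, 2], [0, 1]]
n_reliable = count_reliable_groups(example_groups, 3)
```

Under this simplified reading, a grouping parameter setting that yields a higher count would be preferred; how IPO combines reliable and non-reliable groups into its actual score is not stated in the abstract.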
Mycobacterium avium ssp. paratuberculosis (MAP) causes Johne's disease (JD) in ruminants and is shed into the milk of infected cows, which contributes to the controversial discussion about a possible link between MAP and Crohn's disease in humans. The aim of the study was to investigate the risk for the entry of MAP in the food chain via milk from dairy farms with subclinical JD. Therefore, the occurrence of MAP in the milk of a dairy herd with a low prevalence of JD was studied in single and bulk tank milk samples over a period of 23 mo and compared with MAP shedding into feces. Milk, fecal, and blood samples were taken from all cows older than 1.5 yr of age at the beginning and the end of the trial and analyzed for MAP or specific antibodies. In addition, 63 cows (33 MAP infected and 30 MAP noninfected) were selected for monthly sampling. Raw and pasteurized bulk tank milk samples were collected on a monthly basis. The milk samples were tested for MAP by real-time quantitative PCR (qPCR), and the fecal samples were tested for bacterial shedding by qPCR or solid culture. Based on the results of the herd investigations, the prevalence of cows shedding MAP was around 5%; no cases of clinical JD were observed during the study period. The results of the ELISA showed high variation, with 2.1 to 5.1% positive milk samples and 14.9 to 18.8% ELISA-positive blood samples. Monthly milk sampling revealed low levels of MAP shedding into the individual milk samples of both MAP-infected and noninfected cows, with only 13 cows shedding the bacterium into milk during the study period. Mycobacterium avium ssp. paratuberculosis was not detected by qPCR in any raw or pasteurized bulk tank milk sample throughout the study. A significant positive association could be found between MAP shedding into milk and feces. 
From the results of the present study, it can be concluded that MAP is shed via milk by only a small proportion of cows with subclinical JD, for a limited period of time, and is diluted below the detection level of qPCR within the bulk tank milk of these herds. These findings indicate that dairy herds subclinically infected with JD are only a minor source of human exposure to MAP through milk and milk products.
To detect anomalies or interventions in forest monitoring, we propose an approach based on spatial and temporal forecasts of satellite time series data. For each pixel of the satellite image, three types of forecast are provided: spatial, temporal, and combined spatio-temporal. For the spatial forecast, a clustering algorithm groups the time series data based on two features, the normalised difference vegetation index (NDVI) and the short-wave infrared band (SWIR). To estimate the typical temporal trajectory of the NDVI and SWIR during the vegetation period of each spatial cluster, we apply several methods of functional data analysis, including functional principal component analysis, and a novel form of random regression forests with online learning (streaming) capability. The temporal forecast is carried out by means of functional time series analysis and an autoregressive integrated moving average model. The temporal forecasts, which are based on the past of the considered pixel, and the spatial forecasts, which are based on highly correlated pixels within one cluster and their past, are combined by functional data analysis and a variant of random regression forests adapted for online learning. To evaluate the methods, the approaches are applied to a study area in Germany for monitoring forest damage caused by windstorms, and to a study area in Spain for monitoring forest fires.
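As a rough illustration of the NDVI feature used above, the following Python sketch computes NDVI with the conventional formula (NIR - Red) / (NIR + Red) and flags time steps where an observation deviates from a forecast. The residual threshold rule and the function names are assumptions introduced here; the abstract does not specify how forecasts are compared with observations.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        # Undefined for zero total reflectance; return 0.0 by convention here.
        return 0.0
    return (nir - red) / denom


def flag_anomalies(observed, forecast, threshold):
    """Flag time steps whose absolute forecast residual exceeds a threshold.

    This simple rule is a hypothetical stand-in for the anomaly decision;
    it is not the method described in the abstract.
    """
    return [abs(o - f) > threshold for o, f in zip(observed, forecast)]


# Example: one healthy-vegetation observation, one sharp drop.
observed_ndvi = [ndvi(0.5, 0.1), ndvi(0.3, 0.25)]
forecast_ndvi = [0.65, 0.65]
flags = flag_anomalies(observed_ndvi, forecast_ndvi, threshold=0.2)
```

In practice the forecast series would come from the spatial, temporal, or combined models, and the threshold would need to be calibrated per cluster.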
Paratuberculosis (Johne's disease) in ruminants is caused by Mycobacterium avium subsp. paratuberculosis (MAP). Owing to the lack of accurate laboratory tests, diagnosis is challenging in subclinically infected cattle. To evaluate the long-term performance of serum ELISAs for the detection of paratuberculosis in a dairy herd with low MAP prevalence, three investigations of all the cows and the consecutive testing of 33 cows suspected to be infected with MAP and 30 cows classified as MAP free were performed over a period of 22 months. Blood samples were tested by three commercial serum ELISAs; MAP shedding was detected by bacteriological culture and polymerase chain reaction (PCR). The ELISA results varied widely in the herd investigations, with 1.2% to 18.8% positive samples, while between 1.8% and 4.9% of the faecal samples were positive for MAP in the three herd investigations. Over the study period, the proportion of ELISA-positive serum samples varied between 0.0% and 69.7% in MAP-suspicious cows and between 0.0% and 17.6% in MAP-unsuspicious cows, with a poor correlation between ELISAs and faecal shedding. The correlation coefficient of the optical density values of the three ELISAs varied between 0.348 and 0.61. Evidence of cow-specific variation of residuals was found in all linear models. The linear mixed models showed that cow-specific variation contributed substantially to explaining the residual variances. They also showed significant effects of the explanatory ELISA, the group (MAP-suspicious or MAP-unsuspicious) and the time of sampling. It can be concluded that the choice of laboratory test significantly influences the outcome of testing for MAP and that none of the three ELISAs can be recommended without reservation as a single test for the early diagnosis of paratuberculosis in cattle. Test results should always be interpreted with caution to avoid erroneous decisions and the disappointment of those engaged in the abatement of paratuberculosis.
This paper investigates the influence of different statistical network traffic feature sets on the detection of advanced persistent threats. Selecting suitable features for detecting targeted cyber attacks is crucial to achieving high performance and to limiting computational and storage costs. The evaluation was performed on a semi-synthetic dataset, which combined the CICIDS2017 dataset and the Contagio malware dataset. The CICIDS2017 dataset is a benchmark dataset in the intrusion detection field, and the Contagio malware dataset contains real advanced persistent threat (APT) attack traces. Several different combinations of datasets were used to increase variety in the background data and contribute to the quality of the results. For the feature extraction, the CICflowmeter tool was used. For the selection of suitable features, a correlation analysis including an in-depth feature investigation by boxplots is provided. Based on that, several suitable features were assigned to different feature sets. The influence of these feature sets on the detection capabilities was investigated in detail with the local outlier factor method. The focus was especially on which attacks were detected with which feature sets and on how the background data influenced the detection capabilities of the local outlier factor method. Based on the results, we could determine a superior feature set, which detected most of the malicious flows.
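The local outlier factor (LOF) method mentioned above scores each point by comparing its local density with that of its neighbours. The following standard-library Python sketch shows the idea on small feature vectors. It is a simplified illustration, not the paper's pipeline: it uses exactly k nearest neighbours with ties broken by index, Euclidean distance, and assumes the points are distinct. The function name and the toy data are hypothetical.

```python
import math


def local_outlier_factor(points, k):
    """Return the LOF score of every point (values near 1 mean inlier)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise Euclidean distances (math.dist needs Python 3.8+).
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []  # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        mean_reach = sum(reach) / k
        # Guard against division by zero when points coincide;
        # scores may then be inf or nan.
        lrd.append(1.0 / mean_reach if mean_reach > 0 else math.inf)

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / k / lrd[i] for i in range(n)]


# Toy data: four clustered "flows" and one far-away point.
flows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = local_outlier_factor(flows, k=2)
most_anomalous = max(range(len(flows)), key=lambda i: scores[i])
```

With real flow features, each tuple would hold the values of the chosen feature set for one flow, and the features would typically need scaling before distances are computed.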