Background: In China, dengue remains an important public health issue, with affected areas expanding and incidence increasing in recent years. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue.
Methodology/Principal findings: Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011–2014 in Guangdong were gathered. A dengue search index was constructed and combined with the climate factors to develop the predictive models. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The SVR model, selected by cross-validation, accurately forecasted the epidemics during the last 12 weeks and the peak of the large 2014 outbreak. Moreover, the SVR model consistently had the smallest prediction error rates for tracking the dynamics of dengue and forecasting outbreaks in other areas of China.
Conclusion and significance: The proposed SVR model achieved superior performance in comparison with the other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.
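The evaluation measures named in this abstract (RMSE, R-squared and residual autocorrelation) can be illustrated with a minimal sketch. The weekly counts below are hypothetical and not taken from the study. R-squared is computed here as one minus the ratio of residual to total sum of squares, which is an assumption, since the abstract does not state which definition was used.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at a given non-negative lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((r - mean) ** 2 for r in residuals)
    num = sum((residuals[t] - mean) * (residuals[t + lag] - mean)
              for t in range(n - lag))
    return num / denom

# Hypothetical weekly case counts and model predictions (illustration only).
observed = [3, 5, 8, 13, 21]
predicted = [2, 6, 7, 14, 20]
residuals = [o - p for o, p in zip(observed, predicted)]

print("RMSE:", rmse(observed, predicted))
print("R-squared:", round(r_squared(observed, predicted), 4))
print("Lag-1 residual autocorrelation:", round(autocorrelation(residuals, 1), 4))
```

Residual autocorrelations close to zero at all lags would suggest that a model has captured the temporal structure of the series, which is the kind of validity check the abstract describes.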
Cannabis is one of the most important industrial crops distributed worldwide. However, the phylogeographic structure and domestication of this crop remain poorly understood. In this study, sequence variations of five chloroplast DNA (cpDNA) regions were investigated to address these questions. Among the 645 individuals sampled from 52 Cannabis accessions (25 wild populations and 27 domesticated populations or cultivars), three haplogroups (Haplogroups H, M and L) were identified, and these lineages exhibited a distinct high-middle-low latitudinal gradient in their distribution. This pattern can most likely be explained as a consequence of climatic heterogeneity and geographical isolation. Therefore, we examined the correlations between genetic distances and geographical distances, and tested whether climatic factors are correlated with the cpDNA haplogroup frequencies of populations. An “isolation-by-distance” pattern was detected in the phylogeographic structure, and day length was found to be the most important factor (among 20 BioClim factors) influencing population structure. Considering the distinctive phylogeographic structures and the absence of reproductive isolation among members of these lineages, we recommend that Cannabis be recognized as a monotypic genus typified by Cannabis sativa L., containing three subspecies: subsp. sativa, subsp. indica, and subsp. ruderalis. Within each haplogroup, which occupies a relatively independent distribution region, the wild and domesticated populations shared the most common haplotypes, indicating that the domesticated crop has multiregional origins. In contrast to the prevalent Central-Asia-Origin hypothesis for C. sativa, molecular evidence reveals for the first time that the low-latitude haplogroup (Haplogroup L) is the earliest diverging lineage, implying that Cannabis probably originated in a low-latitude region.
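The correlation between genetic and geographical distances described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the population labels, coordinates and genetic distances are hypothetical, great-circle (haversine) distance stands in for whatever geographic measure the authors used, and a plain Pearson correlation is shown rather than a permutation-based significance test.

```python
import math
from itertools import combinations

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth of radius 6371 km."""
    radius = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical populations: (latitude, longitude) in degrees.
coords = {"A": (20.0, 100.0), "B": (35.0, 100.0), "C": (50.0, 100.0)}
# Hypothetical pairwise genetic distances (illustration only).
genetic = {("A", "B"): 0.10, ("A", "C"): 0.20, ("B", "C"): 0.10}

pairs = list(combinations(sorted(coords), 2))
geo_dist = [haversine_km(*coords[p], *coords[q]) for p, q in pairs]
gen_dist = [genetic[pair] for pair in pairs]

r = pearson(geo_dist, gen_dist)
print("Correlation between geographic and genetic distance:", round(r, 4))
```

A positive correlation is what an isolation-by-distance pattern would predict. Because pairwise distances are not independent observations, significance is usually assessed with a permutation procedure rather than a standard correlation test.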
Background: HPV has been found repeatedly in esophageal carcinoma tissues. However, reported detection rates of HPV DNA in these tumors have varied markedly. Differences in detection methods, sample types, and geographic regions of sample origin have been suggested as potential causes of this discrepancy.
Methods: HPV L1 DNA and HPV genotypes were evaluated in 435 esophageal carcinoma specimens collected from four geographic regions with different ethnicities: Anyang in north China, Shantou in south China, Xinjiang in west China, and the United States. The HPV L1 fragment was detected using SPF1/GP6+ primers. HPV genotyping was performed using genotype-specific PCR.
Results: Two hundred and forty-four of 435 samples (56.1%) tested positive for HPV L1. No significant differences in detection rate were observed among the three areas of China or between China and the US. HPV6, 16, 18, 26, 45, 56, 57, and 58 were identified in L1-positive samples. HPV16 and HPV57 were the most common types in all regions, followed by HPV26 and HPV18.
Conclusions: HPV infection is common in esophageal carcinoma, independent of region and ethnic group of origin. Findings in this study raise the possibility that HPV is involved in esophageal carcinogenesis. Further investigation with a larger sample size over broader geographic areas may be warranted.