Tong Zhou scite author profile

et al. 2022

Objective: To develop and validate an index to quantify the multimorbidity burden in Chinese middle-aged and older community-dwelling individuals. Methods: We included 20,035 individuals aged 45 and older from the China Health and Retirement Longitudinal Study (CHARLS) and 19,297 individuals aged 65 and older from the Chinese Longitudinal Healthy Longevity Survey (CLHLS). Health outcomes of physical functioning (PF), basic and instrumental activities of daily living (ADL and IADL) and mortality were obtained. Based on self-reported disease status, we calculated five commonly used Western multimorbidity indexes for CHARLS baseline participants. The one that predicted the health outcomes the best was selected and then modified through a linear mixed model using the repeated individual data in CHARLS. The performance of the modified index was internally and externally evaluated with CHARLS and CLHLS data. Results: The multimorbidity-weighted index (MWI) performed the best among the five indexes. In the modified Chinese multimorbidity-weighted index (CMWI), the weights of the diseases varied greatly (range 0.2–5.1). The top three diseases with the highest impact were stroke, memory-related diseases and cancer, corresponding to weights of 5.1, 4.3 and 3.4, respectively. Compared with the MWI, the CMWI showed better model fits for PF and IADL with larger R² and smaller Akaike information criterion, and comparable prediction performances for ADL, IADL and mortality (e.g. the same predictive accuracy of 0.80 for ADL disability). Conclusion: The CMWI is an adequate index to quantify the multimorbidity burden for Chinese middle-aged and older community-dwelling individuals. It can be directly computed via disease status examined in regular community health check-ups to facilitate health management.
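To illustrate how a weighted multimorbidity index of this kind could be computed from disease status, here is a minimal Python sketch. It assumes the index is the sum of the weights of the conditions a person reports, and it uses only the three weights quoted in the abstract. The dictionary keys and the function name `weighted_index` are hypothetical, and the remaining CMWI weights are not given in the abstract, so they are left out.

```python
# Weights for the three highest-impact conditions quoted in the abstract.
# The other CMWI weights are not listed there, so they are not included.
CMWI_WEIGHTS = {
    "stroke": 5.1,
    "memory_related_disease": 4.3,
    "cancer": 3.4,
}


def weighted_index(conditions, weights=CMWI_WEIGHTS):
    """Sum the weights of the reported conditions (assumed additive scoring)."""
    unknown = [c for c in conditions if c not in weights]
    if unknown:
        raise KeyError(f"no weight available for: {unknown}")
    # Each condition counts once, even if it is reported more than once.
    return round(sum(weights[c] for c in set(conditions)), 1)


print(weighted_index(["stroke", "cancer"]))
```

Under these assumptions, a person reporting stroke and cancer would score 5.1 + 3.4, and a person reporting none of the listed conditions would score 0.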

Performance and environmental implication assessments of green bio-composite from rice straw and bamboo

Pang

Journal of Cleaner Production

Cao

et al. 2022

An ensemble approach to predict binding hotspots in protein–RNA interactions based on SMOTE data balancing and Random Grouping feature selection strategies

Rong

Yang

et al. 2022

Motivation: The identification of binding hotspots in protein-RNA interactions is crucial for understanding their potential recognition mechanisms and for drug design. Experimental methods are limited because they are usually time-consuming and labor-intensive. Thus, an effective and efficient theoretical method is urgently needed. Results: Here we present SREPRHot, a method to predict hotspots, defined as the residues whose mutation to alanine generates a binding free energy change ≥ 2.0 kcal/mol, while others use a cutoff of 1.0 kcal/mol to obtain balanced datasets. To deal with the dataset imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is used to generate synthetic minority samples and balance the dataset. Besides conventional features, we use two types of new features, the residue interface propensity previously developed by us and topological features obtained using node-weighted networks, and propose an effective Random Grouping feature selection strategy combined with a two-step method to determine an optimal feature set. Finally, a stacking ensemble classifier is adopted to build our model. The results show that SREPRHot achieves good performance, with sensitivity (SEN), Matthews correlation coefficient (MCC) and area under the curve (AUC) of 0.900, 0.557 and 0.829, respectively, on the independent testing dataset. The comparison study indicates that SREPRHot shows promising performance. Availability and implementation: The source code is available at https://github.com/ChunhuaLiLab/SREPRHot. Supplementary information: Supplementary data are available at Bioinformatics online.
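The hotspot definition and the SMOTE balancing step described above can be sketched in a few lines of Python. This is a simplified illustration, not the SREPRHot code: the function names, the choice of `k`, and the toy feature vectors are assumptions, and a real pipeline would more likely rely on an established library implementation.

```python
import math
import random


def is_hotspot(ddg, cutoff=2.0):
    """Label a residue as a hotspot if its alanine-mutation ddG meets the cutoff."""
    return ddg >= cutoff


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy two-dimensional feature vectors for the minority (hotspot) class.
hotspots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_samples = smote(hotspots, n_new=5)
print(len(new_samples))
```

Each synthetic point lies on a line segment between two existing minority samples, which is why oversampling of this kind fills in the minority region rather than duplicating points.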

Identification of volatile components from oviposition and non-oviposition plants of Gasterophilus pecorum (Diptera: Gasterophilidae)

Zhang

et al. 2020

Sci Rep

Oviposition by Gasterophilus pecorum on shoot tips of Stipa caucasica is a key determinant of its severe infection of the reintroduced Przewalski’s horse (Equus przewalskii). Volatiles in shoots of grasses on which Przewalski’s horse feeds, including S. caucasica at preoviposition, oviposition, and postoviposition stages of G. pecorum, S. caucasica, Stipa orientalis, and Ceratoides latens at the oviposition stage, and S. caucasica in various growth periods, were collected by dynamic headspace adsorption and analyzed by automatic thermal desorption gas chromatography-mass spectrometry. Among five volatiles with highest relative contents under three sets of conditions, caprolactam and 3-hexen-1-ol,(Z)- were common to all samples. Caprolactam was highest in C. latens at oviposition stage of G. pecorum and lowest in S. caucasica at postoviposition stage, and that of 3-hexen-1-ol,(Z)- was lowest in C. latens and highest in S. caucasica at its oviposition stage. Particularly, in S. caucasica during the three oviposition phenological stages of G. pecorum, 3-hexen-1-ol,acetate,(Z)-, 2(5H)-furanone,5-ethyl-, and 3-hexen-1-ol,acetate,(E)- were unique, respectively, to the preoviposition, oviposition, and postoviposition stages; in three plant species during the oviposition stage of G. pecorum, 3-hexen-1-ol,acetate,(Z)-, 3-hexenal, and 1-hexanol were unique to S. orientalis, acetic acid, hexanal, and 2(5H)-furanone,5-ethyl- to S. caucasica, and 1,3,6-octatriene,3,7-dimethyl-, cis-3-hexenyl isovalerate, and acetic acid hexyl ester to C. latens; in S. caucasica, 2-undecanone,6,10-dimethyl- was unique to the early growth period, acetic acid and 2(5H)-furanone,5-ethyl- to the flourishing growth period, and 3-hexen-1-ol,acetate,(Z)- and 1,3,6-octatriene,3,7-dimethyl- to the late growth period. Furthermore, substances specific to S. orientalis and C. latens were also present in S. caucasica, except at oviposition stage. Our findings will facilitate studies on G. pecorum’s adaptation to the arid desert steppe and its future control.

emPDBA: protein-DNA binding affinity prediction by combining features from binding partners and interface learned with ensemble regression model

Yang

Gong

et al. 2023

Protein–deoxyribonucleic acid (DNA) interactions are important in a variety of biological processes. Accurately predicting protein-DNA binding affinity has been one of the most attractive and challenging issues in computational biology. However, the existing approaches still have much room for improvement. In this work, we propose an ensemble model for Protein-DNA Binding Affinity prediction (emPDBA), which combines six base models with one meta-model. The complexes are classified into four types based on the DNA structure (double-stranded or other forms) and the percentage of interface residues. For each type, emPDBA is trained with sequence-based, structure-based and energy features from the binding partners and complex structures. Feature selection by the sequential forward selection method reveals considerable differences in the key factors contributing to intermolecular binding affinity. The complex classification is therefore beneficial for extracting the features that matter for binding affinity prediction. A comparison with peer methods on the independent testing dataset shows that emPDBA outperforms the state-of-the-art methods, with a Pearson correlation coefficient of 0.53 and a mean absolute error of 1.11 kcal/mol. The comprehensive results demonstrate that our method performs well for protein-DNA binding affinity prediction. Availability and implementation: The source code is available at https://github.com/ChunhuaLiLab/emPDBA/.
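The two evaluation metrics quoted above, the Pearson correlation coefficient and the mean absolute error, can be computed as in the following Python sketch. The function names and the toy affinity values are assumptions for illustration only and are not taken from the emPDBA repository.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def mean_absolute_error(x, y):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Toy measured and predicted binding affinities in kcal/mol (made-up values).
measured = [-9.0, -10.5, -8.2, -11.0]
predicted = [-8.5, -10.0, -9.0, -10.2]
print(pearson_r(measured, predicted), mean_absolute_error(measured, predicted))
```

The correlation measures how well predictions track the ranking and trend of the measured affinities, while the mean absolute error reports the typical size of the deviation in the same units as the affinity.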