Ming-Tsung Hsu scite author profile

The traditional biological assay is very time-consuming, so the ability to quickly screen large numbers of compounds against a specific biological target is appealing. To speed up the biological evaluation of compounds, high-throughput screening is widely used in biomedicine, bioinformatics, and drug discovery. The research presented in this study focuses on the use of support vector machines (a machine learning method), various classes of molecular descriptors, and different sampling techniques to overcome overfitting when classifying compounds for cytotoxicity with respect to the Jurkat cell line. The cell cytotoxicity data set is imbalanced (a few active compounds and very many inactive compounds), and the performance of predictive modeling methods is adversely affected in these situations. Imbalanced data sets are commonly overfit with respect to the dominant classified end point; in this study the models routinely overfit toward inactive (noncytotoxic) compounds when the imbalance was substantial. Support vector machine (SVM) models were used to probe the proficiency of different classes of molecular descriptors and oversampling ratios. The SVM models were constructed from 4D-FPs, MOE (1D, 2D, and 2½D), noNP+MOE, and CATS2D trial descriptor pools and compared to the predictive abilities of CATS2D-based random forest models. Compared to previous results in the literature, the SVM models built from oversampled data sets exhibited better predictive abilities for the training and external test sets.
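As a rough illustration of what oversampling an imbalanced data set means, the sketch below duplicates randomly chosen minority-class rows until a chosen minority-to-majority ratio is reached. It is a minimal example under stated assumptions, not the procedure used in the study: the abstract does not say which oversampling scheme was applied, and the function name `random_oversample`, its parameters, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(features, labels, minority_label, ratio=1.0, seed=0):
    """Duplicate random minority rows until their count reaches
    round(ratio * majority count). Hypothetical helper, not from the paper."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    if not minority:
        raise ValueError("no minority-class samples to oversample")
    majority_count = len(labels) - len(minority)
    target = int(round(ratio * majority_count))
    extra = max(0, target - len(minority))
    # Sample with replacement from the existing minority rows.
    picks = [rng.choice(minority) for _ in range(extra)]
    new_features = list(features) + [features[i] for i in picks]
    new_labels = list(labels) + [labels[i] for i in picks]
    return new_features, new_labels


# Toy data (made up): 2 "active" (1) versus 8 "inactive" (0) compounds.
X = [[float(i)] for i in range(10)]
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
X_bal, y_bal = random_oversample(X, y, minority_label=1, ratio=1.0)
print(Counter(y_bal))
```

In practice, oversampling of this kind is applied to the training set only, so that duplicated rows cannot leak into the external test set.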

Human Breathomics Database

Kuo, Tan, Wang, et al. 2020

Breathomics is a special branch of metabolomics that quantifies volatile organic compounds (VOCs) from collected exhaled breath samples. Understanding how breath molecules are related to diseases, mechanisms and pathways identified from experimental analytical measurements is challenging due to the lack of an organized resource describing breath molecules, related references and biomedical information embedded in the literature. To provide breath VOCs, related references and biomedical information, we aim to organize a database composed of manually curated information and automatically extracted biomedical information. First, VOC-related disease information, including known Medical Subject Headings (MeSH) terms, was manually curated from 207 publications linked to 99 VOCs. Then an automated text-mining pipeline was used to collect 2766 publications and extract biomedical information from breath research. Finally, the manually curated information and the automatically extracted biomedical information were combined to form a breath molecule database, the Human Breathomics Database (HBDB). The HBDB includes references, VOCs and diseases associated with human breathomics. Most of these VOCs were detected in human breath samples or exhaled breath condensate samples. So far, the database contains a total of 913 VOCs related to human exhaled breath research reported in 2766 publications. The HBDB is the most comprehensive database of VOCs in human exhaled breath to date. It is a useful and organized resource for researchers and clinicians to identify and further investigate potential biomarkers from the breath of patients. Database URL: https://hbdb.cmdm.tw

Structure, Mechanistic Action, and Essential Residues of a GH-64 Enzyme, Laminaripentaose-producing β-1,3-Glucanase

Liu, Hsu, et al. 2009. Journal of Biological Chemistry

Laminaripentaose-producing β-1,3-glucanase (LPHase), a member of glycoside hydrolase family 64, cleaves a long-chain polysaccharide β-1,3-glucan into specific pentasaccharide oligomers. The crystal structure of LPHase from Streptomyces matensis DIC-108 was solved to 1.62 Å resolution using multiple-wavelength anomalous dispersion methods. The LPHase structure reveals a novel crescent-like fold; it consists of a barrel domain and a mixed (α/β) domain, forming a wide-open groove between the two domains. The liganded crystal structure was also solved to 1.

Exosomal Proteins and Lipids as Potential Biomarkers for Lung Cancer Diagnosis, Prognosis, and Treatment

Hsu, Wang, Tseng. 2022. Cancers

Exosomes participate in cell–cell communication by transferring molecular components between cells. Previous studies have shown that exosomal molecules derived from cancer cells and liquid biopsies can serve as biomarkers for cancer diagnosis and prognosis. The exploration of the molecules transferred by lung cancer-derived exosomes can advance the understanding of exosome-mediated signaling pathways and mechanisms. However, the molecular characterization and functional indications of exosomal proteins and lipids have not been comprehensively organized. This review thoroughly collected data concerning exosomal proteins and lipids from various lung cancer samples, including cancer cell lines and cancer patients. As potential diagnostic and prognostic biomarkers, exosomal proteins and lipids are available for clinical use in lung cancer. Potential therapeutic targets are mentioned for the future development of lung cancer therapy. Molecular functions implying their possible roles in exosome-mediated signaling are also discussed. Finally, we emphasized the importance and value of lung cancer stem cell-derived exosomes in lung cancer therapy. In summary, this review presents a comprehensive description of the protein and lipid composition and function of lung cancer-derived exosomes for lung cancer diagnosis, prognosis, and treatment.

Oversampling to Overcome Overfitting: Exploring the Relationship between Data Set Composition, Molecular Descriptors, and Predictive Modeling Methods
