Shang Xue scite author profile

Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives

In the field of cancer genomics, the broad availability of genetic information offered by next-generation sequencing technologies and rapid growth in biomedical publication has led to the advent of the big-data era. Integration of artificial intelligence (AI) approaches such as machine learning, deep learning, and natural language processing (NLP) to tackle the challenges of scalability and high dimensionality of data and to transform big data into clinically actionable knowledge is expanding and becoming the foundation of precision medicine. In this paper, we review the current status and future directions of AI application in cancer genomics within the context of workflows to integrate genomic analysis for precision cancer care. The existing solutions of AI and their limitations in cancer genetic testing and diagnostics such as variant calling and interpretation are critically analyzed. Publicly available tools or algorithms for key NLP technologies in the literature mining for evidence-based clinical recommendations are reviewed and compared. In addition, the present paper highlights the challenges to AI adoption in digital healthcare with regard to data requirements, algorithmic transparency, reproducibility, and real-world assessment, and discusses the importance of preparing patients and physicians for modern digitized healthcare. We believe that AI will remain the main driver to healthcare transformation toward precision medicine, yet the unprecedented challenges posed should be addressed to ensure safety and beneficial impact to healthcare.

Genetic Architecture of Domestication-Related Traits in Maize

Xue, Bradbury, Casstevens, et al. 2016

Strong directional selection occurred during the domestication of maize from its wild ancestor teosinte, reducing its genetic diversity, particularly at genes controlling domestication-related traits. Nevertheless, variability for some domestication-related traits is maintained in maize. The genetic basis of this could be sequence variation at the same key genes controlling maize-teosinte differentiation (due to lack of fixation or arising as new mutations after domestication), distinct loci with large effects, or polygenic background variation. Previous studies permit annotation of maize genome regions associated with the major differences between maize and teosinte or that exhibit population genetic signals of selection during either domestication or postdomestication improvement. Genome-wide association studies and genetic variance partitioning analyses were performed in two diverse maize inbred line panels to compare the phenotypic effects and variances of sequence polymorphisms in regions involved in domestication and improvement to the rest of the genome. Additive polygenic models explained most of the genotypic variation for domestication-related traits; no large-effect loci were detected for any trait. Most trait variance was associated with background genomic regions lacking previous evidence for involvement in domestication. Improvement sweep regions were associated with more trait variation than expected based on the proportion of the genome they represent. Selection during domestication eliminated large-effect genetic variants that would revert maize toward a teosinte type. Small-effect polygenic variants (enriched in the improvement sweep regions of the genome) are responsible for most of the standing variation for domestication-related traits in maize.

KEYWORDS quantitative trait loci; nested association mapping; genome-wide association study; variance components; Zea mays

The domestication of all major crop plants occurred in a relatively short period in human history, starting 10,000 years ago (Harlan 1992). During the domestication process, seeds of preferred forms were selected and saved to plant subsequent generations. Some alleles favored under domestication may have been neutral or even deleterious for the survival of wild plant species; for example, seed shattering promotes seed dispersal in wild grasses, but alleles for nondisarticulating seed structures were strongly selected for under domestication (Galinat 1983). Consequently, rare alleles favorable for growth and development under agricultural conditions or for traits desired by humans increased in frequency, often reaching fixation and reducing genetic variation very near causal sequence sites (Wang et al. 1999). In addition, domestication was often accompanied by severe genetic bottlenecks from the use of small founder populations. The reduction in effective population sizes also resulted in reduced genetic diversity genome-wide. Population genetics methods to model the strength and duration of bottlenecks provide a means to ...

Contrast-enhanced sonography of thyroid nodules

Jiang, Huang, Zhao, et al. 2014, J. Clin. Ultrasound

Comparison of one-stage and two-stage genome-wide association studies

Xue, Ogut, Miller, et al. 2017, Preprint

Linear mixed models are widely used in humans, animals, and plants to conduct genome-wide association studies (GWAS). A characteristic of experimental designs for plants is that experimental units are typically multiple-plant plots of families or lines that are replicated across environments. This structure can present computational challenges to conducting a genome scan on raw (plot-level) data. Two-stage methods have been proposed to reduce the complexity and increase the computational speed of whole-genome scans. The first stage of the analysis fits raw data to a model including environment and line effects, but no individual marker effects. The second stage involves the whole genome scan of marker tests using summary values for each line as the dependent variable. Missing data and unbalanced experimental designs can result in biased estimates of marker association effects from two-stage analyses. In this study, we developed a weighted two-stage analysis to reduce bias and improve power of GWAS while maintaining the computational efficiency of two-stage analyses. Simulation based on real marker data of a diverse panel of maize inbred lines was used to compare power and false discovery rate of the new weighted two-stage method to single-stage and other two-stage analyses and to compare different two-stage models. In the case of severely unbalanced data, only the weighted two-stage GWAS has power and false discovery rate similar to the one-stage analysis. The weighted GWAS method has been implemented in the open-source software TASSEL.
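The two-stage idea described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the method implemented in TASSEL: stage 1 is reduced to a plain per-line average (the paper's first stage also models environment effects), and the stage 2 weight for each line is assumed to be its number of plots, which is proportional to the inverse variance of a simple mean. The function names `line_means` and `weighted_marker_effect` are hypothetical.

```python
from statistics import fmean


def line_means(plots):
    """Stage 1 (simplified): summarize plot-level values per line.

    plots is an iterable of (line_id, phenotype) pairs.
    Returns {line_id: (mean, n_plots)}.
    """
    by_line = {}
    for line_id, value in plots:
        by_line.setdefault(line_id, []).append(value)
    return {line_id: (fmean(vals), len(vals)) for line_id, vals in by_line.items()}


def weighted_marker_effect(means, genotypes):
    """Stage 2: weighted least-squares slope of line mean on marker genotype.

    Assumption: weight = number of plots behind each line mean, so lines
    with more observations count more in the marker test.
    """
    lines = sorted(means)
    w = [means[l][1] for l in lines]
    y = [means[l][0] for l in lines]
    x = [genotypes[l] for l in lines]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Unbalanced toy data: line A has four plots, lines B and C have one each.
plots = [("A", 10.0)] * 4 + [("B", 13.0), ("C", 14.0)]
genotypes = {"A": 0, "B": 1, "C": 2}  # allele counts at one marker
effect = weighted_marker_effect(line_means(plots), genotypes)
```

With equal weights the same line means would give a different slope, which is the kind of discrepancy that weighting is meant to address when the design is unbalanced.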

Predictive article recommendation using natural language processing and machine learning to support evidence updates in domain-specific knowledge graphs

Sharma, Willis, Huettner, et al. 2020

Objectives Describe an augmented intelligence approach to facilitate the update of evidence for associations in knowledge graphs. Methods New publications are filtered through multiple machine learning study classifiers, and filtered publications are combined with articles already included as evidence in the knowledge graph. The corpus is then subjected to named entity recognition, semantic dictionary mapping, term vector space modeling, pairwise similarity, and focal entity match to identify highly related publications. Subject matter experts review recommended articles to assess inclusion in the knowledge graph; discrepancies are resolved by consensus. Results Study classifiers achieved F-scores from 0.88 to 0.94, and similarity thresholds for each study type were determined by experimentation. Our approach reduces human literature review load by 99%, and over the past 12 months, 41% of recommendations were accepted to update the knowledge graph. Conclusion Integrated search and recommendation exploiting current evidence in a knowledge graph is useful for reducing human cognition load.
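The term vector space modeling and pairwise similarity steps mentioned in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the study classifiers, named entity recognition, dictionary mapping, and focal entity match are omitted, tokenization is a plain whitespace split, and the TF-IDF weighting, the `recommend` function, and the threshold value are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) with a smoothed IDF term."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(evidence, candidates, threshold):
    """Return (candidate, score) pairs whose best match to any evidence
    article meets the threshold, highest score first."""
    vectors = tfidf_vectors(evidence + candidates)
    ev, cand = vectors[: len(evidence)], vectors[len(evidence):]
    scored = [(text, max(cosine(v, e) for e in ev)) for text, v in zip(candidates, cand)]
    return sorted((p for p in scored if p[1] >= threshold), key=lambda p: -p[1])


evidence = ["flt3 mutation acute myeloid leukemia therapy"]
candidates = [
    "flt3 inhibitor therapy in acute myeloid leukemia",
    "maize domestication teosinte selection",
]
hits = recommend(evidence, candidates, threshold=0.2)  # threshold is illustrative
```

In this toy run only the leukemia article shares terms with the evidence set, so it is the only recommendation; in practice the abstract notes that thresholds were set per study type by experimentation.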

Discriminatory value of carotid artery elasticity changes for the evaluation of thyroid dysfunction in patients with hashimoto's thyroiditis

Feng, Zhao, Jiang, et al. 2016, J of Clinical Ultrasound

Clinical insights into hematologic malignancies and comparative analysis of molecular signatures of acute myeloid leukemia in different ethnicities using an artificial intelligence offering

Snowdon, Weeraratne, Huang, et al. 2021

Next generation sequencing generates copious amounts of genomics data, causing manual interpretation to be laborious and non-scalable while remaining subjective (even for highly trained specialists). We evaluated the performance of the artificial intelligence-based offering Watson for Genomics (WfG), a variant interpretation platform, in hematologic malignancies for the first time. Next generation sequencing was performed for patients treated for various hematological malignancies at Hallym University Sacred Heart Hospital, South Korea, between December 2017 and August 2020 using a 54-gene panel. Both WfG and expert manual curation were used to evaluate the performance of WfG. Acute myeloid leukemia (AML) molecular profiles were compared between Koreans and other ethnic groups using a publicly available dataset. Seventy-seven patients were analyzed (AML: 45, myeloproliferative neoplasms: 12, multiple myeloma: 7, myelodysplastic syndromes: 6, and others: 7). The concordance between the manual and WfG interpretations of 35 variants in 11 random patients was 94%. Among all patients, WfG identified 39 (51%) with at least 1 clinically actionable therapeutic alteration (i.e., a variant targeted by a United States Food and Drug Administration [US FDA]-approved drug, off-label drug, or clinical trial). Moreover, 46% of these patients (18/39) had genes that were targeted by a US FDA-approved therapy. WfG identified diagnostic or prognostic insights in 65% of the patients with no targetable alterations. In those with AML, FLT3-internal tandem duplications or tyrosine kinase domain mutations were less frequent among Koreans than among Caucasians (6.7% vs 30.2%, P < .001) or Hispanics (6.7% vs 28.3%, P = .005), suggesting ethnic differences. Variant interpretation using WfG correlated well with manually curated expert opinions. WfG provided therapeutic insights (including variant-specific drugs and clinical trials that cannot easily be provided by expert manual curation), as well as diagnostic and/or prognostic information.

Genomic analysis of myeloproliferative neoplasm (MPN) patients from a single institution in South Korea to reveal novel pathogenic mutations and perturbed pathways.

Weeraratne, Huang, Brotman, et al. 2020, JCO

e19533 Background: Therapeutic, prognostic, and diagnostic insights gained from next generation sequencing (NGS) are a key premise of genomics-informed cancer care in hematological diseases. Particularly in BCR-ABL negative myeloproliferative neoplasms (MPN), insights gained from NGS are integral for risk stratification and prognostication. In this study, MPN patients of South Korean descent were sequenced, interpreted, and compared with a published validation cohort to identify variations in mutational profiles specific to demographics. Methods: 31 South Korean MPN patients including 12 essential thrombocythemia, 6 polycythemia vera, 6 primary myelofibrosis, and 7 chronic myelogenous leukemia were sequenced in 2018 and 2019 using the 54-gene Illumina TruSight Myeloid Panel at Hallym University College of Medicine. Orthogonal testing for CALR mutations was done by Sanger sequencing. Watson for Genomics (WfG), an artificial intelligence offering, was used for variant interpretation and annotation. A cohort of 151 MPN patients previously published in the New England Journal of Medicine (NEJM) was used for comparison (PMID:24325359). Results: The table shows identified actionable mutations. Conclusions: Two novel pathogenic mutations in CALR (c.1162delG and c.1100_1145del) were identified in Korean MPN patients. NOTCH1 pathogenic mutations were exclusive to, and TP53 mutations were significantly enriched in, the Korean cohort, suggesting that these pathways may play a role in MPN. TP53 mutations in MPN are clinically significant as they have been associated with increased risk for leukemic transformation. Of note, MPL mutations were not detected in the Korean cohort. In conclusion, race and ethnicity may contribute to some mutational signatures in cancer. [Table: see text]


