Philip O'Neil scite author profile

Philip O'Neil

3 Publications

84 Citation Statements Received

68 Citation Statements Given


Publications


Applications of Machine Learning and High‐Dimensional Visualization in Cancer Detection, Diagnosis, and Management

McCarthy¹, Marx², Hoffman³ et al. 2004

Annals of the New York Academy of Sciences

112


Recent technical advances in combinatorial chemistry, genomics, and proteomics have made available large databases of biological and chemical information that have the potential to dramatically improve our understanding of cancer biology at the molecular level. Such an understanding of cancer biology could have a substantial impact on how we detect, diagnose, and manage cancer cases in the clinical setting. One of the biggest challenges facing clinical oncologists is how to extract clinically useful knowledge from the overwhelming amount of raw molecular data that are currently available. In this paper, we discuss how the exploratory data analysis techniques of machine learning and high-dimensional visualization can be applied to extract clinically useful knowledge from a heterogeneous assortment of molecular data. After an introductory overview of machine learning and visualization techniques, we describe two proprietary algorithms (PURS and RadViz) that we have found to be useful in the exploratory analysis of large biological data sets. We next illustrate, by way of three examples, the applicability of these techniques to cancer detection, diagnosis, and management using three very different types of molecular data. We first discuss the use of our exploratory analysis techniques on proteomic mass spectroscopy data for the detection of ovarian cancer. Next, we discuss the diagnostic use of these techniques on gene expression data to differentiate between squamous and adenocarcinoma of the lung. Finally, we illustrate the use of such techniques in selecting from a database of chemical compounds those most effective in managing patients with melanoma versus leukemia.
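The abstract names RadViz without describing how it lays out points. As a rough illustration only, the sketch below implements the textbook radial-coordinate projection commonly published under that name: each feature gets an anchor on the unit circle, and each record is placed at the average of the anchors weighted by its normalized feature values. This is an assumption about the general technique, not the authors' proprietary implementation; the function name `radviz` and the toy data are hypothetical.

```python
import math


def radviz(rows):
    """Project rows of numeric features to 2-D with the textbook RadViz layout.

    Assumption: this is the generic published projection, not the
    proprietary tool described in the abstract.
    """
    n_features = len(rows[0])
    # Min-max normalise each feature column to the range [0, 1].
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # One anchor per feature, evenly spaced around the unit circle.
    anchors = [
        (math.cos(2 * math.pi * j / n_features),
         math.sin(2 * math.pi * j / n_features))
        for j in range(n_features)
    ]
    points = []
    for row in rows:
        weights = [
            (v - l) / (h - l) if h > l else 0.0
            for v, l, h in zip(row, lo, hi)
        ]
        total = sum(weights)
        if total == 0:
            # A record with all-zero weights sits at the centre.
            points.append((0.0, 0.0))
            continue
        x = sum(w * ax for w, (ax, _) in zip(weights, anchors)) / total
        y = sum(w * ay for w, (_, ay) in zip(weights, anchors)) / total
        points.append((x, y))
    return points


# Toy data: three records dominated by one feature each, plus a balanced one.
toy_rows = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
toy_points = radviz(toy_rows)
```

In this layout a record dominated by a single feature is pulled toward that feature's anchor, while a record with balanced values lands near the centre, which is why well-chosen features can make classes separate visually.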


Data Mining the NCI Cancer Cell Line Compound GI₅₀ Values: Identifying Quinone Subtypes Effective Against Melanoma and Leukemia Cell Classes

Marx¹, O'Neil², Hoffman³ et al. 2003

J. Chem. Inf. Comput. Sci.


Using data mining techniques, we have studied a subset (1400) of compounds from the large public National Cancer Institute (NCI) compounds data repository. We first carried out a functional class identity assignment for the 60 NCI cancer testing cell lines via hierarchical clustering of gene expression data. The 60 cell lines, which comprise nine clinical tissue types, were placed into six classes: melanoma, leukemia, renal, lung, colorectal, and a sixth class of mixed tissue cell lines not found in any of the other five classes. We then carried out supervised machine learning, using the GI₅₀ values tested on a panel of 60 NCI cancer cell lines. For separate 3-class and 2-class problem clustering, we successfully carried out clear cell line class separation at high stringency, p < 0.01 (Bonferroni corrected t-statistic), using feature reduction clustering algorithms embedded in RadViz, an integrated high dimensional analytic and visualization tool. We started with the 1400 compound GI₅₀ values as input and selected only those compounds, or features, significant in carrying out the classification. With this approach, we identified two small sets of compounds that were most effective in carrying out complete class separation of the melanoma versus non-melanoma classes and of the leukemia versus non-leukemia classes. To validate these results, we showed that these two compound sets' GI₅₀ values were highly accurate classifiers using five standard analytical algorithms. One compound set was most effective against the melanoma class cell lines (14 compounds), and the other set was most effective against the leukemia class cell lines (30 compounds). The two compound classes were both significantly enriched in two different types of substituted p-quinones. Of the 14 compounds in the melanoma set, 11 were internal substituted p-quinones, and of the 30 compounds in the leukemia set, 6 were external substituted p-quinones. Attempts to subclassify melanoma or leukemia cell lines based upon their clinical cancer subtype met with limited success. For example, using GI₅₀ values for the 30 compounds we identified as effective against all leukemia cell lines, we could separate acute lymphoblastic leukemia (ALL) origin cell lines from non-ALL leukemia origin cell lines without significant overlap from non-leukemia cell lines. Based upon clustering using GI₅₀ values for the 60 cancer cell lines laid out by the RadViz algorithm, these two compound subsets did not overlap with clusters containing any of the NCI's 92 compounds of known mechanism of action, a few of which are quinones. Given their structural patterns, the two p-quinone subtypes we identified would clearly be expected to possess different redox potentials/substrate specificities for enzymatic reduction in vivo. These two p-quinone subtypes represent valuable information that may be used in the elucidation of pharmacophores for the design of compounds to treat these two cancer tissue types...
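The abstract reports feature selection at p < 0.01 with a Bonferroni correction but does not spell out the arithmetic. A minimal sketch of the correction step follows, assuming the per-compound p-values have already been obtained from a two-sample t-test; the function name `bonferroni_select` and the example p-values are hypothetical and are not taken from the paper.

```python
def bonferroni_select(p_values, alpha=0.01):
    """Return indices of features that pass a Bonferroni-corrected threshold.

    Assumption: p_values holds one precomputed p-value per feature
    (for example, one per compound), and every feature counts as one test.
    """
    n_tests = len(p_values)
    # Bonferroni: divide the family-wise alpha by the number of tests.
    threshold = alpha / n_tests
    return [i for i, p in enumerate(p_values) if p < threshold]


# Made-up p-values for four hypothetical compounds.
example_p = [0.000001, 0.02, 0.004, 0.0000005]
selected = bonferroni_select(example_p)  # per-test threshold is 0.01 / 4
```

If all 1400 compounds were treated as separate tests, the per-compound threshold under this scheme would be 0.01 / 1400, roughly 7.1e-6. Whether the authors counted tests this way is not stated in the abstract.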


Data Mining the NCI Cancer Cell Line Compound GI₅₀ Values: Identifying Quinone Subtypes Effective Against Melanoma and Leukemia Cell Classes.

Marx¹, O'Neil², Hoffman³ et al. 2003

ChemInform

