Identifying Smokers with a Medical Extraction System

The Clinical Language Understanding group at Nuance Communications has developed a medical information extraction system that combines a rule-based extraction engine with machine learning algorithms to identify and categorize references to patient smoking in clinical reports. The extraction engine identifies smoking references; documents that contain no smoking references are classified as UNKNOWN. For the remaining documents, the extraction engine uses linguistic analysis to associate features such as status and time with smoking mentions. Machine learning is then used to classify the documents based on these features. This approach achieves overall accuracy in the 90% range on all data sets used. Classification using both engine-generated and word-based features outperforms classification using only word-based features on all data sets, although the difference narrows as the data set size increases. These techniques could be applied to identify other risk factors, such as drug and alcohol use or a family history of a disease.
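The two-stage flow described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Nuance system: the regular expressions, the label names other than UNKNOWN, the training sentences, and the small naive Bayes classifier are all hypothetical stand-ins for the rule-based engine and the machine learning step, and only engine-style features (no word-based features) are used.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical patterns; the real engine's rules are not given in the abstract.
MENTION = re.compile(r"\b(smok\w*|tobacco|cigarettes?)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|denies|never|non)\b", re.IGNORECASE)
PAST = re.compile(r"\b(quit|former|stopped|ago)\b", re.IGNORECASE)


def extract_features(text):
    """Stage 1: find smoking mentions and attach status/time features."""
    features = []
    for sentence in re.split(r"[.\n]+", text):
        if not MENTION.search(sentence):
            continue
        status = "negated" if NEGATION.search(sentence) else "affirmed"
        time = "past" if PAST.search(sentence) else "present"
        features += [f"status={status}", f"time={time}"]
    return features


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (stand-in classifier)."""

    def fit(self, feature_lists, labels):
        self.label_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(feature_lists, labels):
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())

        def score(label):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.label_counts[label] / total)
            for feat in feats:
                log_prob += math.log((counts[feat] + 1) / denom)
            return log_prob

        return max(self.label_counts, key=score)


def classify(text, model):
    """Documents without any smoking mention are UNKNOWN; others go to stage 2."""
    feats = extract_features(text)
    if not feats:
        return "UNKNOWN"
    return model.predict(feats)


# Hypothetical training examples and labels, for illustration only.
TRAIN = [
    ("Patient smokes one pack per day.", "CURRENT"),
    ("She continues to use tobacco.", "CURRENT"),
    ("Quit smoking ten years ago.", "PAST"),
    ("Former smoker, stopped in 1995.", "PAST"),
    ("Denies tobacco use.", "NON"),
    ("He has never smoked.", "NON"),
]
model = NaiveBayes().fit(
    [extract_features(text) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)

print(classify("No fever or cough.", model))
print(classify("The patient quit smoking in 2001.", model))
```

Keeping the UNKNOWN decision in the rule-based stage means the classifier only ever sees documents that contain at least one smoking mention, which mirrors the split the abstract describes.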


Event-building through role-filling and anaphora resolution

Whittemore, Macpherson, Carlson. 1991.


In this study we map out a way to build event representations incrementally, using information which may be widely distributed across a discourse. An enhanced Discourse Representation (Kamp, 1981) provides the vehicle both for carrying open event roles through the discourse until they can be instantiated by NPs, and for resolving the reference of these otherwise problematic NPs by binding them to the event roles.


Distilling information from text: The EDS TemplateFiller System

Shuldberg, Macpherson, Humphrey, et al. 1993. J. Am. Soc. Inf. Sci.


A system is described which digests large volumes of text, filtering out irrelevant articles and distilling the remainder into templates that represent information from the articles in simple slot/filler pairs. The system is highly modular in that it consists of a series of programs, each of which contributes information to the text to help in the final analysis of determining which strings constitute valid values for the slots in the template. This modular design has the dual advantage of allowing relatively easy debugging and of permitting many of the component programs to participate in other projects. The system is customized to specific domains, taking advantage of simple string matching techniques to improve the effectiveness of more complex sentence-level semantic processes. The extension to new domains has been facilitated by dividing system data files into generic vs. specific categories; domain extension requires the creation of only the domain-specific files.
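The modular design in this abstract might look something like the sketch below: a relevance filter, a series of small stages that each add annotations to the text, and a final step that turns annotations into slot/filler pairs. Everything concrete here is an assumption made for illustration, including the corporate-acquisition domain, the keyword list, the regular expressions, and the slot names; the abstract does not specify any of them.

```python
import re

# Generic data: intended to be reusable across domains (hypothetical).
GENERIC = {
    "money": re.compile(r"\$\d+(?:\.\d+)?(?:\s*(?:million|billion))?"),
}

# Domain-specific data: the only part a new domain would replace (hypothetical).
DOMAIN = {
    "keywords": {"acquire", "acquired", "acquisition", "merger"},
    "company": re.compile(r"\b[A-Z][a-z]+ (?:Corp|Inc|Ltd)\b"),
}


def is_relevant(text, domain):
    """Filter step: keep only articles that contain a domain keyword."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & domain["keywords"])


def tag_money(doc, generic, domain):
    """Stage: annotate monetary amounts using generic patterns."""
    doc["annotations"]["money"] = generic["money"].findall(doc["text"])
    return doc


def tag_companies(doc, generic, domain):
    """Stage: annotate company names using domain-specific patterns."""
    doc["annotations"]["company"] = domain["company"].findall(doc["text"])
    return doc


# Each stage is independent, so stages can be debugged or reused separately.
PIPELINE = [tag_money, tag_companies]


def fill_template(text, generic=GENERIC, domain=DOMAIN):
    """Return slot/filler pairs for a relevant article, or None if filtered out."""
    if not is_relevant(text, domain):
        return None
    doc = {"text": text, "annotations": {}}
    for stage in PIPELINE:
        doc = stage(doc, generic, domain)
    ann = doc["annotations"]
    return {
        "companies": ann["company"],
        "amount": ann["money"][0] if ann["money"] else None,
    }


print(fill_template("Acme Corp acquired Widget Inc for $20 million."))
print(fill_template("The weather was sunny today."))
```

Splitting the data into `GENERIC` and `DOMAIN` dictionaries reflects the generic vs. specific file division the abstract mentions: moving to a new domain would, in this sketch, mean supplying a different `DOMAIN` and leaving the rest unchanged.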


Redefining the “level” of the “word”

Macpherson. 1992.


Melissa Macpherson

Identifying Smokers with a Medical Extraction System
