Clinical sources of information are markedly increasing in both volume and variety. A significant portion of the valuable data resides in the unstructured or semi-structured clinical text of documents stored in disparate repositories or embedded in HL7 messages. Clinical documents such as discharge summaries, prescriptions, lab reports, and free-form physician notes are filled with abbreviations, acronyms, misspellings, and ungrammatical phrases. Synoptic reporting methods impose structure on such records, but they are restrictive for health care practitioners who wish to express critical and comprehensive patient information in electronic medical records. Furthermore, they have been superseded by systems that use natural language processing (NLP) to extract clinical concepts from free-form text. To address the growing need for efficient NLP solutions that can handle the volume and variety of clinical text, we have developed an optimized rules-based clinical concept extractor called TRACE (Tactical Rules-based AQL Clinical Extractor) using the Annotation Query Language (AQL). We present the experience we have gained applying text mining tools to this challenging domain, and we compare our solution with cTAKES (clinical Text Analysis and Knowledge Extraction System), an open-source clinical text miner, on a set of prescription documents. We also describe how efficient and scalable clinical text mining techniques will improve several of our company's offerings.