One of the tools that can aid researchers and clinicians in coping with the surfeit of biomedical information is text mining. In this chapter, we explore how text mining is used to perform biomedical knowledge extraction. By describing its main phases, we show how text mining can be used to obtain relevant information from vast online databases of health science literature and patients' electronic health records. In so doing, we describe the workings of the four phases of biomedical knowledge extraction using text mining (text gathering, text preprocessing, text analysis, and presentation) that are entailed in retrieving the sought information with a high accuracy rate. The chapter also includes an in-depth analysis of the differences between clinical text found in electronic health records and biomedical text found in online journals, books, and conference papers, as well as a presentation of various text mining tools that have been developed in both university and commercial settings.
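The four phases named above can be sketched as a small pipeline. This is only an illustrative outline, not the chapter's method: the function names, the two sample sentences, the stop-word list, and the choice of document frequency as the analysis step are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "in", "and", "a", "with", "was", "is", "to", "for"}


def gather_texts():
    """Phase 1 (text gathering): stand-in for querying a literature
    database or an electronic health record store."""
    return [
        "Aspirin reduces the risk of myocardial infarction in adults.",
        "Patient reports chest pain; aspirin was administered.",
    ]


def preprocess(text):
    """Phase 2 (text preprocessing): lowercase, tokenize, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(token_lists):
    """Phase 3 (text analysis): count how many documents contain each term."""
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    return doc_freq


def present(doc_freq, top_n=3):
    """Phase 4 (presentation): format the most widespread terms as lines."""
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{term}: {count} document(s)" for term, count in ranked[:top_n]]


def run_pipeline():
    """Chain the four phases in order."""
    docs = gather_texts()
    token_lists = [preprocess(d) for d in docs]
    return present(analyze(token_lists))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

In a real system each phase would be far more involved (for example, named-entity recognition in the analysis phase), but the hand-off of data from one phase to the next would look broadly similar.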
Forensic Speaker Recognition
There has long been a desire to be able to identify a person on the basis of his or her voice. For many years, judges, lawyers, detectives, and law enforcement agencies have wanted to use forensic voice authentication to investigate a suspect or to confirm a judgment of guilt or innocence [3] [35]. Challenges, realities, and cautions regarding the use of speaker recognition applied to forensic-quality samples are presented. Identifying a voice using forensic-quality samples is generally a challenging task for automatic, semiautomatic, and human-based methods. The speech samples being compared may be recorded in different situations; e.g., one sample could be yelling over the telephone, whereas the other might be a whisper in an interview room. A speaker could be disguising his or her voice, ill, or under the influence of drugs, alcohol, or stress in one or more of the samples. The speech samples will most likely contain noise, may be very short, and may not contain enough relevant speech material for comparative purposes. Each of these variables, in addition to the known variability of speech in general, makes reliable discrimination of speakers a complicated and daunting task. Although the scientific basis of authentication of a person by using his or her voice has been questioned by researchers (e.g., by scientists in 1970 [4], British academic phoneticians in 1983 [5], and the French speech communication community from 1990 to today [6]), there is a perception among the
Abstract: The ambiguities, repetitions, and ellipses commonly found in natural language dialog continue to hinder speech (and text) analytic mining programs that glean business intelligence data from consumer help-line calls, or extract important medical diagnostic information from doctor-patient interviews or consumer-generated health-related blogs. This poses an even greater problem when such mining programs attempt to extract critical emotional data from natural language dialog. At present, speech (and text) analytic programs that mine natural language dialog for signs of distress, frustration, anger, or other human emotions are still largely ineffective, because conventional speech systems that are limited to a set of key words and phrases cannot process speech as it actually occurs; if a speaker or blogger fails to use the word(s) found in the speech application's vocabulary, the program yields a poor statistical word match (or no match). This paper shows how Sequence Package Analysis is informed by a set of algorithms, representing some of the more complex semantic aspects of communication in addition to syntax, that can interpret less-than-perfect natural speech, enhancing intelligent mining of recordings of doctor-patient interviews, customer care help-line calls, and consumer-generated health-related blogs.
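The keyword limitation described above can be illustrated with a toy matcher. This sketch does not implement Sequence Package Analysis; it only shows, under assumed inputs, how a fixed vocabulary yields no match when the speaker expresses frustration without using any listed word. The vocabulary, the scoring function, and both sample utterances are hypothetical.

```python
import re

# Hypothetical keyword vocabulary for a "frustration" detector.
FRUSTRATION_KEYWORDS = {"frustrated", "angry", "upset", "annoyed"}


def keyword_match_score(utterance, vocabulary=FRUSTRATION_KEYWORDS):
    """Return the fraction of tokens in the utterance found in the vocabulary."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in vocabulary)
    return hits / len(tokens)


# The speaker names the emotion outright, so the matcher finds it.
explicit = "I am really angry and frustrated"

# The frustration is implied by what is said, not by any listed keyword.
implicit = "This is the third time I have called about the same bill"

if __name__ == "__main__":
    print(keyword_match_score(explicit))
    print(keyword_match_score(implicit))
```

The second utterance scores zero even though a human listener would likely hear it as a complaint, which is the kind of gap the abstract attributes to systems limited to key words and phrases.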