Purpose The South London and Maudsley National Health Service (NHS) Foundation Trust Biomedical Research Centre (SLaM BRC) Case Register and its Clinical Record Interactive Search (CRIS) application were developed in 2008, generating a research repository of real-time, anonymised, structured and open-text data derived from the electronic health record system used by SLaM, a large mental healthcare provider in southeast London. In this paper, we update this register's descriptive data, and describe the substantial expansion and extension of the data resource since its original development. Participants Descriptive data were generated from the SLaM BRC Case Register on 31 December 2014. Currently, there are over 250 000 patient records accessed through CRIS. Findings to date Since 2008, the most significant developments in the SLaM BRC Case Register have been the introduction of natural language processing to extract structured data from open-text fields, linkages to external sources of data, and the addition of a parallel relational database (Structured Query Language) output. Natural language processing applications to date have brought in new and hitherto inaccessible data on cognitive function, education, social care receipt, smoking, diagnostic statements and pharmacotherapy. In addition, through external data linkages, large volumes of supplementary information have been accessed on mortality, hospital attendances and cancer registrations. Future plans Coupled with robust data security and governance structures, electronic health records provide potentially transformative information on mental disorders and outcomes in routine clinical care. The SLaM BRC Case Register continues to grow as a database, with approximately 20 000 new cases added each year, in addition to extension of follow-up for existing cases. 
Data linkages and natural language processing present important opportunities to enhance this type of research resource further, achieving both volume and depth of data. However, research projects still need to be carefully tailored, so that they take into account the nature and quality of the source information.
Background Electronic health records (EHRs) provide enormous potential for health research but also present data governance challenges. Ensuring de-identification is a prerequisite for use of EHR data without prior consent. The South London and Maudsley NHS Trust (SLaM), one of the largest secondary mental healthcare providers in Europe, has developed, from its EHRs, a de-identified psychiatric case register, the Clinical Record Interactive Search (CRIS), for secondary research. Methods We describe development, implementation and evaluation of a bespoke de-identification algorithm used to create the register. It is designed to create dictionaries using patient identifiers (PIs) entered into dedicated source fields and then identify, match and mask them (with ZZZZZ) when they appear in medical texts. We deemed this approach would be effective, given high coverage of PIs in the dedicated fields and the effectiveness of the masking combined with elements of a security model. We conducted two separate performance tests: (i) to test performance of the algorithm in masking individual true PIs entered in dedicated fields and then found in text (using 500 patient notes) and (ii) to compare the performance of the CRIS pattern matching algorithm with a machine learning algorithm, the MITRE Identification Scrubber Toolkit (MIST), using 70 patient notes (50 notes to train, 20 notes to test on). We also report any instances of potential breaches, defined by occurrences of 3 or more true or apparent PIs in the same patient’s notes (and in an additional set of longitudinal notes for 50 patients); and we consider the possibility of inferring information despite de-identification. Results True PIs were masked with 98.8% precision and 97.6% recall. As anticipated, potential PIs did appear, owing to misspellings entered within the EHRs. We found one potential breach.
In a separate performance test, with a different set of notes, CRIS yielded 100% precision and 88.5% recall, while MIST yielded 95.1% and 78.1%, respectively. We discuss how we overcome the realistic, albeit low-probability, possibility of potential breaches through implementation of the security model. Conclusion CRIS is a de-identified psychiatric database sourced from EHRs, which protects patient anonymity and maximises data available for research. CRIS demonstrates the advantage of combining an effective de-identification algorithm with a carefully designed security model. The paper advances much-needed discussion of EHR de-identification, particularly in relation to criteria to assess de-identification, and considering the contexts of de-identified research databases when assessing the risk of breaches of confidential patient information.
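The dictionary-and-mask approach described in this abstract can be illustrated with a minimal sketch. This is not the CRIS implementation: the function names, field names and matching rules (whole-word, case-insensitive) are assumptions made for illustration, and the patient details are invented.

```python
import re

MASK = "ZZZZZ"  # the masking token named in the abstract


def build_dictionary(identifier_fields):
    """Collect non-empty identifier strings from dedicated fields.

    `identifier_fields` is a hypothetical mapping such as
    {"forename": "...", "surname": "...", "postcode": "..."}.
    """
    terms = {
        value.strip()
        for value in identifier_fields.values()
        if value and value.strip()
    }
    # Longest first, so a longer identifier wins over a shorter overlapping one.
    return sorted(terms, key=len, reverse=True)


def mask_identifiers(text, dictionary):
    """Replace whole-word, case-insensitive dictionary matches with MASK."""
    if not dictionary:
        return text
    alternatives = "|".join(re.escape(term) for term in dictionary)
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)
    return pattern.sub(MASK, text)


# Invented example record and note.
fields = {"forename": "Alice", "surname": "Example", "postcode": "AB1 2CD"}
note = "Alice Example attended clinic. Mrs example lives at AB1 2CD. Alise was seen again."
masked = mask_identifiers(note, build_dictionary(fields))
print(masked)
# ZZZZZ ZZZZZ attended clinic. Mrs ZZZZZ lives at ZZZZZ. Alise was seen again.
```

The misspelt "Alise" is left unmasked because it never appears in the dedicated fields, which is the kind of residual identifier the abstract attributes to misspellings in the EHRs.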
Objectives We sought to use natural language processing to develop a suite of language models to capture key symptoms of severe mental illness (SMI) from clinical text, to facilitate the secondary use of mental healthcare data in research. Design Development and validation of information extraction applications for ascertaining symptoms of SMI in routine mental health records using the Clinical Record Interactive Search (CRIS) data resource; description of their distribution in a corpus of discharge summaries. Setting Electronic records from a large mental healthcare provider serving a geographic catchment of 1.2 million residents in four boroughs of south London, UK. Participants The distribution of derived symptoms was described in 23 128 discharge summaries from 7962 patients who had received an SMI diagnosis, and 13 496 discharge summaries from 7575 patients who had received a non-SMI diagnosis. Outcome measures Fifty SMI symptoms were identified by a team of psychiatrists for extraction based on salience and linguistic consistency in records, broadly categorised under positive, negative, disorganisation, manic and catatonic subgroups. Text models for each symptom were generated using the TextHunter tool and the CRIS database. Results We extracted data for 46 symptoms with a median F1 score of 0.88. Four symptom models performed poorly and were excluded. From the corpus of discharge summaries, it was possible to extract symptomatology in 87% of patients with SMI and 60% of patients with non-SMI diagnosis. Conclusions This work demonstrates the possibility of automatically extracting a broad range of SMI symptoms from English text discharge summaries for patients with an SMI diagnosis. Descriptive data also indicated that most symptoms cut across diagnoses, rather than being restricted to particular groups.
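For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall, and a median over per-symptom models summarises a suite of them. The sketch below shows the arithmetic only; the symptom names and precision/recall pairs are hypothetical placeholders, not values from the study.

```python
from statistics import median


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (precision, recall) pairs for three illustrative symptom models.
model_scores = {
    "symptom_a": (0.90, 0.86),
    "symptom_b": (0.95, 0.80),
    "symptom_c": (0.70, 0.60),
}

f1_by_model = {name: f1_score(p, r) for name, (p, r) in model_scores.items()}
median_f1 = median(f1_by_model.values())

for name, score in f1_by_model.items():
    print(f"{name}: F1 = {score:.2f}")
print(f"median F1 = {median_f1:.2f}")
```

Because the harmonic mean is pulled towards the lower of the two inputs, a model with high precision but weak recall still receives a modest F1.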
Objectives Mood instability is a clinically important phenomenon but has received relatively little research attention. The objective of this study was to assess the impact of mood instability on clinical outcomes in a large sample of people receiving secondary mental healthcare. Design Observational study using an anonymised electronic health record case register. Setting South London and Maudsley NHS Trust (SLaM), a large provider of inpatient and community mental healthcare in the UK. Participants 27 704 adults presenting to SLaM between April 2006 and March 2013 with a psychotic, affective or personality disorder. Exposure The presence of mood instability within 1 month of presentation, identified using natural language processing (NLP). Main outcome measures The number of days spent in hospital, frequency of hospital admission, compulsory hospital admission and prescription of antipsychotics or non-antipsychotic mood stabilisers over a 5-year follow-up period. Results Mood instability was documented in 12.1% of people presenting to mental healthcare services. It was most frequently documented in people with bipolar disorder (22.6%), but was common in people with personality disorder (17.8%) and schizophrenia (15.5%). It was associated with a greater number of days spent in hospital (β coefficient 18.5, 95% CI 12.1 to 24.8), greater frequency of hospitalisation (incidence rate ratio 1.95, 1.75 to 2.17), greater likelihood of compulsory admission (OR 2.73, 2.34 to 3.19) and an increased likelihood of prescription of antipsychotics (2.03, 1.75 to 2.35) or non-antipsychotic mood stabilisers (2.07, 1.77 to 2.41). Conclusions Mood instability occurs in a wide range of mental disorders and is not limited to affective disorders. It is generally associated with relatively poor clinical outcomes. These findings suggest that clinicians should screen for mood instability across all common mental health disorders.
The data also suggest that targeted interventions for mood instability may be useful in patients who do not have a formal affective disorder.
Clozapine can cause severe adverse effects, yet it is associated with reduced mortality risk. We test the hypothesis that this association is due to increased clinical monitoring and investigate risk of premature mortality from natural causes. We identified 14 754 individuals (879 deaths) with serious mental illness (SMI) including schizophrenia, schizoaffective and bipolar disorders aged ≥ 15 years in a large specialist mental healthcare case register linked to national mortality tracing. In this cohort study we modeled the effect of clozapine on mortality over a 5-year period (2007–2011) using Cox regression. Individuals prescribed clozapine had more severe psychopathology and poorer functional status. Many of the exposures associated with clozapine use were themselves risk factors for increased mortality. However, we identified a strong association between being prescribed clozapine and lower mortality which persisted after controlling for a broad range of potential confounders including clinical monitoring and markers of disease severity (adjusted hazard ratio 0.4; 95% CI 0.2–0.7; p = .001). This association remained after restricting the sample to those with a diagnosis of schizophrenia or those taking antipsychotics and after using propensity scores to reduce the impact of confounding by indication. Among individuals with SMI, those prescribed clozapine had a reduced risk of mortality due to both natural and unnatural causes. We found no evidence to indicate that lower mortality associated with clozapine in SMI was due to increased clinical monitoring or confounding factors. This is the first study to report an association between clozapine and reduced risk of mortality from natural causes.
Objective Unlocking the data contained within both structured and unstructured components of electronic health records (EHRs) has the potential to provide a step change in data available for secondary research use, generation of actionable medical insights, hospital management, and trial recruitment. To achieve this, we implemented SemEHR, an open source semantic search and analytics tool for EHRs. Methods SemEHR implements a generic information extraction (IE) and retrieval infrastructure by identifying contextualized mentions of a wide range of biomedical concepts within EHRs. Natural language processing annotations are further assembled at the patient level and extended with EHR-specific knowledge to generate a timeline for each patient. The semantic data are serviced via ontology-based search and analytics interfaces. Results SemEHR has been deployed at a number of UK hospitals, including the Clinical Record Interactive Search, an anonymized replica of the EHR of the UK South London and Maudsley National Health Service Foundation Trust, one of Europe’s largest providers of mental health services. In 2 Clinical Record Interactive Search–based studies, SemEHR achieved 93% (hepatitis C) and 99% (HIV) F-measure results in identifying true positive patients. At King’s College Hospital in London, as part of the CogStack program (github.com/cogstack), SemEHR is being used to recruit patients into the UK Department of Health 100 000 Genomes Project (genomicsengland.co.uk). The validation study suggests that the tool can validate previously recruited cases and is very fast at searching phenotypes; time for recruitment criteria checking was reduced from days to minutes.
Validated on open intensive care EHR data, Medical Information Mart for Intensive Care III, the vital signs extracted by SemEHR can achieve around 97% accuracy. Conclusion Results from the multiple case studies demonstrate SemEHR’s efficiency: weeks or months of work can be done within hours or minutes in some cases. SemEHR provides a more comprehensive view of patients, bringing in more and unexpected insight compared to study-oriented bespoke IE systems. SemEHR is open source, available at https://github.com/CogStack/SemEHR.
Objective To investigate whether cannabis use is associated with increased risk of relapse, as indexed by number of hospital admissions, and whether antipsychotic treatment failure, as indexed by number of unique antipsychotics prescribed, may mediate this effect in a large data set of patients with first episode psychosis (FEP). Design Observational study with exploratory mediation analysis. Setting Anonymised electronic mental health record data from the South London and Maudsley NHS Foundation Trust. Participants 2026 people presenting to early intervention services with FEP. Exposure Cannabis use at presentation, identified using natural language processing. Main outcome measures Admission to psychiatric hospital and clozapine prescription up to 5 years following presentation. Mediator Number of unique antipsychotics prescribed. Results Cannabis use was present in 46.3% of the sample at first presentation and was particularly common in patients who were 16–25, male and single. It was associated with increased frequency of hospital admission (incidence rate ratio 1.50, 95% CI 1.25 to 1.80), increased likelihood of compulsory admission (OR 1.55, 1.16 to 2.08) and greater number of days spent in hospital (β coefficient 35.1 days, 12.1 to 58.1). The number of unique antipsychotics prescribed mediated increased frequency of hospital admission (natural indirect effect (NIE) 1.09, 95% CI 1.01 to 1.18; total effect (TE) 1.50, 1.21 to 1.87), increased likelihood of compulsory admission (NIE 1.27, 1.03 to 1.58; TE 1.76, 0.81 to 3.84) and greater number of days spent in hospital (NIE 17.9, 2.4 to 33.4; TE 34.8, 11.6 to 58.1). Conclusions Cannabis use in patients with FEP was associated with an increased likelihood of hospital admission. This was linked to the prescription of several different antipsychotic drugs, indicating clinical judgement of antipsychotic treatment failure.
Together, this suggests that cannabis use might be associated with worse clinical outcomes in psychosis by contributing towards failure of antipsychotic treatment.
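As a rough reading aid for the mediation results, natural effects are conventionally decomposed so that the total effect equals the natural direct effect multiplied by the natural indirect effect on a ratio scale (rate ratios, odds ratios), and equals their sum on a difference scale (days in hospital). The sketch below applies that convention to the point estimates quoted in the abstract. The implied direct effects are back-of-envelope derivations, not figures reported by the authors, and the function names are illustrative.

```python
def implied_direct_effect_ratio(total_effect, indirect_effect):
    """Ratio scale: total = direct * indirect, so direct = total / indirect."""
    return total_effect / indirect_effect


def implied_direct_effect_difference(total_effect, indirect_effect):
    """Difference scale: total = direct + indirect, so direct = total - indirect."""
    return total_effect - indirect_effect


# Point estimates quoted in the abstract (total effect, natural indirect effect).
admission_frequency = implied_direct_effect_ratio(1.50, 1.09)
compulsory_admission = implied_direct_effect_ratio(1.76, 1.27)
days_in_hospital = implied_direct_effect_difference(34.8, 17.9)

print(f"admission frequency, implied direct rate ratio: {admission_frequency:.2f}")
print(f"compulsory admission, implied direct odds ratio: {compulsory_admission:.2f}")
print(f"days in hospital, implied direct effect: {days_in_hospital:.1f} days")
```

These derived values carry no confidence intervals and inherit the uncertainty of the reported estimates, so they are useful only for gauging roughly how much of each association runs through the mediator.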
Background Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined as the 3R principle (right data, right place, right time) to address deficiencies in organisational data flow while retaining the strict information governance policies that apply within the UK National Health Service (NHS). Here, we describe our work on creating and deploying a low cost structured and unstructured information retrieval and extraction architecture within King’s College Hospital, the management of governance concerns and the associated use cases and cost saving opportunities that such components present. Results To date, our CogStack architecture has processed over 300 million lines of clinical data, making it available for internal service improvement projects at King’s College London. On generated data designed to simulate real world clinical text, our de-identification algorithm achieved up to 94% precision and up to 96% recall. Conclusion We describe a toolkit which we feel is of huge value to the UK (and beyond) healthcare community. It is the only open source, easily deployable solution designed for the UK healthcare environment, in a landscape populated by expensive proprietary systems. Solutions such as these provide a crucial foundation for the genomic revolution in medicine.