Introduction Health systems are challenged by care underutilization, overutilization, disparities, and related harms. One problem is a multiyear latency between discovery of new best practice knowledge and its widespread adoption. Decreasing this latency requires new capabilities to better manage and more rapidly share biomedical knowledge in computable forms. Knowledge objects package machine‐executable knowledge resources in a way that easily enables knowledge as a service. To help improve knowledge management and accelerate knowledge sharing, the Knowledge Object Reference Ontology (KORO) defines what knowledge objects are in a formal way. Methods Development of KORO began with identification of terms for classes of entities and for properties. Next, we established a taxonomical hierarchy of classes for knowledge objects and their parts. Development continued by relating these parts via formally defined properties. We evaluated the logical consistency of KORO and used it to answer several competency questions about parthood. We also applied it to guide knowledge object implementation. Results As a realist ontology, KORO defines what knowledge objects are and provides details about the parts they have and the roles they play. KORO provides sufficient logic to answer several basic but important questions about knowledge objects competently. KORO directly supports creators of knowledge objects by providing a formal model for these objects. Conclusion KORO provides a formal, logically consistent ontology about knowledge objects and their parts. It exists to help make computable biomedical knowledge findable, accessible, interoperable, and reusable. KORO is currently being used to further develop and improve computable knowledge infrastructure for learning health systems.
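A competency question about parthood, such as which parts a knowledge object has directly or indirectly, amounts to following has-part relations transitively. The Python sketch below illustrates that idea only. The class names (`KnowledgeObject`, `Payload`, and so on) and the `HAS_PART` table are hypothetical placeholders, not terms taken from KORO, and a real evaluation would run a reasoner over the ontology itself.

```python
# Hypothetical direct has-part assertions; names are illustrative, not KORO terms.
HAS_PART = {
    "KnowledgeObject": ["Metadata", "Payload", "ServiceDescription"],
    "Payload": ["Function"],
    "ServiceDescription": ["Endpoint"],
}


def all_parts(whole, has_part=HAS_PART):
    """Return every direct or indirect part of `whole` (transitive closure)."""
    found = []
    stack = list(has_part.get(whole, []))
    while stack:
        part = stack.pop()
        if part not in found:
            found.append(part)
            # Follow the parts of this part as well.
            stack.extend(has_part.get(part, []))
    return sorted(found)


def is_part_of(part, whole, has_part=HAS_PART):
    """Competency question: is `part` a (possibly indirect) part of `whole`?"""
    return part in all_parts(whole, has_part)


if __name__ == "__main__":
    print(all_parts("KnowledgeObject"))
    print(is_part_of("Function", "KnowledgeObject"))
```

Treating parthood as transitive is itself an assumption of this sketch; an ontology may define several parthood properties with different logical characteristics.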
Introduction To advance the goals of the Mobilizing Computable Biomedical Knowledge (MCBK) Movement, we are exploring the use of FAIR Digital Objects (FDOs) (De Smedt et al. 2020, Williams et al. 2021). First, we are beginning to clarify the full range of metadata for FDOs that carry bit sequences expressing knowledge in machine-readable or executable formats. We view knowledge through an empirical lens as the reliable, valid, and valued results of analytic or deliberative data analysis. Computability of knowledge refers to the degree to which knowledge is formally represented for use by computing machines. Second, we are determining how to apply linked data principles to FDO metadata records (Bizer et al. 2008). Linked data are structured data with openly defined and uniquely identified concepts. We are developing linked metadata that conform to the Resource Description Framework (RDF), where domains of interest are represented using a pattern of subject-predicate-object “triples.” RDF triples give rise to machine-actionable FDO metadata records that can be visualized as directed graphs. In keeping with the FAIR Digital Object Framework (FDOF), we value linked metadata as a general method of bringing consistency to FDO metadata records, so that artificial agents can act on them in predictable ways. Five other benefits of linked metadata are that they are divisible, aggregable, extensible, and queryable (using SPARQL), and that they support logical inferencing. With a focus specifically on FDOs that carry computable knowledge artifacts at their core, here we present our recent metadata work completed between 2019 and mid-2022. Metadata Scope for FDOs Carrying Computable Knowledge This section summarizes previously published work to specify and scope FDO metadata. This work was completed by members of our team and the larger MCBK Movement.
Through many dialogs over a period of more than a year, thirteen high-level categories of metadata for FDOs carrying computable knowledge were described (Alper et al. 2021). These categories are listed in Table 1 below. For detailed explanations and examples of each metadata category, see our full publication. Next, we briefly discuss six categories marked with an asterisk (*) in Table 1. These six categories are somewhat specific to FDOs that contain computable knowledge. For Knowledge Domain metadata, a large and growing number of biomedical vocabularies or schemas exist. For clinical terms, the Systematized Nomenclature of Medicine (SNOMED) includes more than 350K RDF classes and 200 properties. Many bioscience vocabularies spanning a wide range of terms from human biology also exist. Purpose metadata are critical for FDOs that convey computable knowledge about the prevention, diagnosis, treatment, amelioration, and monitoring of disease. Interestingly, we have yet to find vocabularies for representing clinically oriented FDO purposes as linked metadata. We anticipate needing FDO-to-FDO Relation metadata. Going beyond citations that relate knowledge to its antecedents, FDOs containing computable biomedical knowledge may relate sequentially (diagnostic knowledge preceding treatment knowledge), dependently (stratification depends on measurement), or comparatively (multiple models estimate the same factor). More work is needed to formalize these relations. For Technical metadata about FDOs carrying computable knowledge, we emphasize existing vocabularies, including software ontologies like the Function Ontology. Moreover, for certain FDO operations, webservices are a way of leveraging the decentralized web. As Technical FDO metadata, we can describe FDO-backed webservices semantically by building on the work of the OpenAPI and AsyncAPI initiatives. Finally, we need FDO metadata about two different kinds of evidence.
First, there are Evidential Basis metadata that describe features and details about how the computable knowledge contained in FDOs was generated. Second, there are Evidence from Use metadata that describe the effects of applying the computable knowledge contained in FDOs to simulated or real cases. Linked Metadata for Actual FDOs Carrying Computable Knowledge This section shares new work. Since 2016, we have built and tested several hundred compound Digital Objects (DOs) carrying executable biomedical knowledge in the form of pure functions (e.g., math functions for estimating a health risk) (Beck et al. 2022). Our particular DOs, called Knowledge Objects (KOs), conform to a common design pattern we created (Fig. 1). We have demonstrated how these DOs can be rapidly implemented in several technical environments to enable RESTful webservice requests and responses to and from pure functions of interest in biomedicine. In a move towards having a specific type of FDO for carrying computable knowledge, we have started the process of developing linked metadata records for FDOs using a prototype metadata schema. An example of an early FDO linked data record appears in Example 1.
{
  "@context": {
    "dcterms": "http://purl.org/dc/terms/",
    "koio": "http://kgrid.org/koio/",
    "fno": "https://w3id.org/function/ontology/"
  },
  "@id": "https://library.kgrid.org/#/object/99999%2Ffk4jh3tk9s%2Fv1.0%2Fv1.0",
  "@type": "koio:KnowledgeObject",
  "dcterms:title": "Tammemagi, 6 year Lung Cancer Risk Prediction Model for Screening",
  "dcterms:identifier": "ark:/99999/fk4jh3tk9s",
  "dcterms:hasVersion": "v1.0",
  "dcterms:created": "2016-04-15",
  "dcterms:description": "A 10-factor patient-level logistic regression model for estimating the risk of a future lung cancer diagnosis for a person",
  "dcterms:creator": ["https://kgrid.org/", "https://medicine.umich.edu/dept/learning-health-sciences"],
  "dcterms:source": ["https://www.nejm.org/doi/pdf/10.1056/NEJMoa1211776"],
  "dcterms:publisher": "https://medicine.umich.edu/dept/learning-health-sciences",
  "dcterms:rights": "All rights reserved.",
  "dcterms:rightsHolder": "Department of Learning Health Sciences, University of Michigan Medical School, 1111 E Catherine Street, Ann Arbor, MI, 48109",
  "dcterms:license": "NOT licensed for use outside the Department of Learning Health Sciences",
  "dcterms:valid": "2016-04-15/2016-04-16",
  "dcterms:hasPart": ["getSixyearprobability.js", "deployment.yaml", "service.yaml", "metadata.jsonld"],
  "koio:hasPayload": {
    "@id": "getSixyearprobability.js",
    "@type": "fno:function",
    "dcterms:title": "getSixyearprobability",
    "dcterms:language": "Javascript",
    "fno:solves": "Maps patient features to lung cancer risk scores",
    "fno:expects": ["age", "ethnicity", "bmi", "cigsPerDay", "edLevel", "hxLungCancer", "hxLungCancerFam", "hxNonLungCancerDz", "yrsQuit", "yrsSmoker"],
    "fno:returns": ["Lung Cancer Risk Score"]
  }
}
Example 1. An FDO linked metadata record in JSON-LD format. (Cut and paste into the JSON-LD Playground to visualize.) The KO described in the linked metadata record above is available here for inspection.
As Example 1 shows, our initial prototype linked metadata record for KOs relies on three vocabularies: Dublin Core Terms, the Function Ontology, and our own Knowledge Object Implementation Ontology (KOIO). As its FDO identifier, the KO uses an Archival Resource Key (ARK). ARKs are attractive because they support a suffix passthrough mechanism for consistently identifying the common parts of a KO, such as Deployment and Service Descriptions. The linked metadata record in Example 1 has been successfully loaded into several RDF systems, including the JSON-LD Playground and an instance of the Blue Brain Nexus knowledge graph system. We have used SPARQL queries to extract and filter elements from this linked metadata record. Conclusion For FDOs containing computable knowledge to have high degrees of FAIRness, extensive metadata records are required. Some metadata content specified to date is specific to this type of FDO and payload. It is possible to represent FDO metadata as linked metadata, making the metadata richer semantically and potentially easier to manage with artificial agents and machines. In biomedicine especially, more work is needed to identify more vocabularies for use as controlled terminologies to arrive at suitably comprehensive linked metadata for this important new type of FDO.
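To make the subject-predicate-object pattern concrete, the following Python sketch expands the prefixed property names of an abbreviated copy of the Example 1 record into full IRIs and flattens them into triples. It uses only the standard library and is not a conforming JSON-LD processor. The helper names `expand`, `to_triples`, and `match` are hypothetical, and the final filter is a toy stand-in for a SPARQL query, not SPARQL itself.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Abbreviated copy of the Example 1 record (only a few of its fields).
RECORD = json.loads("""
{
  "@context": {
    "dcterms": "http://purl.org/dc/terms/",
    "koio": "http://kgrid.org/koio/",
    "fno": "https://w3id.org/function/ontology/"
  },
  "@id": "https://library.kgrid.org/#/object/99999%2Ffk4jh3tk9s%2Fv1.0%2Fv1.0",
  "@type": "koio:KnowledgeObject",
  "dcterms:hasVersion": "v1.0",
  "dcterms:created": "2016-04-15",
  "dcterms:hasPart": ["getSixyearprobability.js", "deployment.yaml"]
}
""")


def expand(term, context):
    """Expand a compact IRI such as 'dcterms:created' using the @context prefixes."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def to_triples(record):
    """Flatten top-level properties into (subject, predicate, object) tuples."""
    context = record["@context"]
    subject = record["@id"]
    triples = [(subject, RDF_TYPE, expand(record["@type"], context))]
    for key, value in record.items():
        if key.startswith("@"):
            continue  # JSON-LD keywords were handled above
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((subject, expand(key, context), item))
    return triples


def match(triples, predicate):
    """Return the objects of all triples with the given predicate (toy filter)."""
    return [obj for (_, pred, obj) in triples if pred == predicate]


if __name__ == "__main__":
    for triple in to_triples(RECORD):
        print(triple)
    print(match(to_triples(RECORD), "http://purl.org/dc/terms/hasPart"))
```

A full JSON-LD library would additionally handle nested nodes such as the payload, typed literals, and relative IRIs, all of which this sketch ignores.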
Introduction Learning health systems are challenged to combine computable biomedical knowledge (CBK) models. Using common technical capabilities of the World Wide Web (WWW), digital objects called Knowledge Objects, and a new pattern of activating CBK models brought forth here, we aim to show that it is possible to compose CBK models in more highly standardized and potentially easier, more useful ways. Methods Using previously specified compound digital objects called Knowledge Objects, CBK models are packaged with metadata, API descriptions, and runtime requirements. Using open‐source runtimes and a tool we developed called the KGrid Activator, CBK models can be instantiated inside runtimes and made accessible via RESTful APIs. The KGrid Activator then serves as a gateway and provides a means to interconnect CBK model outputs and inputs, thereby establishing a CBK model composition method. Results To demonstrate our model composition method, we developed a complex composite CBK model from 42 CBK submodels. The resulting model, called CM‐IPP, is used to compute life‐gain estimates for individuals based on their personal characteristics. Our result is an externalized, highly modularized CM‐IPP implementation that can be distributed and made runnable in any common server environment. Discussion CBK model composition using compound digital objects and distributed computing technologies is feasible. Our method of model composition might be usefully extended to bring about large ecosystems of distinct CBK models that can be fitted and re‐fitted in various ways to form new composites. Remaining challenges related to the design of composite models include identifying appropriate model boundaries and organizing submodels to separate computational concerns while optimizing reuse potential. Conclusion Learning health systems need methods for combining CBK models from a variety of sources to create more complex and useful composite models.
It is feasible to leverage Knowledge Objects and common API methods in combination to compose CBK models into complex composite models.
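The composition idea, connecting one submodel's outputs to another's inputs through a gateway, can be sketched with plain Python functions. Everything below is hypothetical: the submodel names, their made-up formulas, and the `compose` helper are placeholders for illustration and do not reproduce CM‐IPP or the KGrid Activator API, where each submodel would instead be reached through a RESTful endpoint.

```python
def baseline_risk(inputs):
    """Hypothetical submodel: map age to a made-up baseline risk."""
    return {"risk": min(1.0, inputs["age"] / 200)}


def treated_risk(inputs):
    """Hypothetical submodel: apply a made-up relative risk reduction."""
    return {"risk_after": inputs["risk"] * (1 - inputs["reduction"])}


def compose(steps):
    """Build a composite model from an ordered list of submodels.

    Each submodel reads from and writes to a shared dict of named values,
    so one submodel's outputs become available as later submodels' inputs.
    """
    def composite(inputs):
        values = dict(inputs)  # copy so the caller's dict is not mutated
        for step in steps:
            values.update(step(values))
        return values
    return composite


pipeline = compose([baseline_risk, treated_risk])
result = pipeline({"age": 60, "reduction": 0.5})

if __name__ == "__main__":
    print(result)
```

Keeping each submodel a pure function of named inputs is what makes this kind of rewiring possible; the same submodels could be reordered or reused in a different composite without modification.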
Over the past 4 years, the authors have participated as members of the Mobilizing Computable Biomedical Knowledge Technical Infrastructure working group and focused on conceptualizing the infrastructure required to use computable biomedical knowledge. Here, we summarize our thoughts and lay the foundation for future work in the development of CBK infrastructure, including: explaining the difference between computable knowledge and data, and contextualizing the conversation with the Learning Health Systems and the FAIR principles. Specifically, we provide three guiding principles to advance the development of CBK infrastructure: (a) Promote interoperable systems for data and knowledge to be findable, accessible, interoperable, and reusable. (b) Enable stable, trustworthy knowledge representations that are human and machine readable. (c) Computable knowledge resources should, when possible, be open. Standards supporting computable knowledge infrastructures must be open.
Introduction We present current work to develop and define a class of digital objects that facilitates patient cohort identification for clinical studies, such that these objects are Findable, Accessible, Interoperable, and Reusable (FAIR) (Wilkinson et al. 2016). Developing this class of FAIR Digital Objects (FDOs) builds on the work of several years to develop the Knowledge Grid (https://kgrid.org/), which facilitates the development, description and implementation of biomedical knowledge packaged in machine-readable and machine-executable formats (Flynn et al. 2018). Additionally, this work aligns with the goals of the Mobilizing Computable Biomedical Knowledge (MCBK) community (https://mobilizecbk.med.umich.edu/) (Mobilizing Computable Biomedical Knowledge 2018). In this abstract, we describe our work to develop an FDO carrying a computable phenotype. Defining computable phenotypes In biomedical informatics, 'phenotyping' describes a data-driven approach to identifying a group of individuals sharing observable characteristics of interest, generally related to a disease or condition, and a 'computable phenotype' (CP) is a machine-processable expression of a phenotypic pattern of these characteristics (Hripcsak and Albers 2018). For the purposes of this work, we are interested in CPs derived from data contained in electronic health record (EHR) systems. This includes both structured data, e.g. codes for diseases, diagnoses, procedures, or laboratory tests, and unstructured data, e.g. free text including patient histories, clinical observations, discharge summaries, and reports. Thus, we define computable phenotype FDOs (CP-FDOs) as a class of FDO that packages an executable EHR-derived CP together with documentation needed to implement and use it effectively for creating cohorts of individuals with similar observable characteristics from EHR data sets.
Importance of portable and FAIR CPs There is tremendous excitement for using real-world EHR data to discover important findings about human health and well-being. However, for discovery to happen, researchers need mechanisms like CPs to identify study cohorts for analysis. Beginning in the early 2010s, a growing literature explores various methods for the secondary use of EHR data for patient phenotyping to arrive at consistent study cohorts (Shivade et al. 2014, Banda et al. 2018). The heterogeneous nature of EHR data has inspired a wide variety of phenotyping methods, from those which rely solely on documented codes linked to terms in existing vocabularies to those which combine such codes with other concepts extracted from free text using natural language processing. Our current focus is on packaging CPs inside FDOs for classifying patients as having or not having a phenotype of interest. This can be done within an individual health system, or at scale across a clinical data research network. Using CPs for cohort identification can reduce the time and expense of traditional data set building and clinical trial recruitment, and expand the potential scope of a study population (Boland et al. 2013). Creating and validating CPs requires time, resources, and both clinical and technical expertise. One estimate is that it can take 6-10 months to develop and validate a CP (Shang et al. 2019). And, as there is no standard data model within EHRs in the United States, many CPs are designed for performance at a single site, rather than for portability, which is understood as the ability to implement a phenotype at a different site with similar performance (Shang et al. 2019). While portability is increasingly recognized as an important element of phenotyping, and there have been recent efforts to develop more portable CPs, many of these processes still require significant technical expertise at the implementation site to adapt the phenotype for use on local data.
There may also be significant advantages to making CPs FAIR. These include transparency in cohort selection, and better generalizability of results. FAIR CPs may also increase the potential for robust comparisons of data from related studies, leading to better evidence synthesis to improve delivery of care and ultimately human health. Defining a new class of FDOs to hold and convey CPs We believe that packaging validated CPs inside digital objects may alleviate many of the pressures mentioned above, and contribute to making both the processes and products of clinical research more FAIR. To this end, our current work focuses on packaging a validated CP inside a machine-processable FDO. The phenotype of interest identifies pediatric and adult patients with a rare disease (Oliverio et al. 2021), and has several features which make it ideal for transformation to an executable FDO. First, the phenotype utilizes standards to define the clinical characteristics of interest, and is based on a common data model; these features increase the potential for both interoperability and reuse. Additionally, because the phenotype has been validated across three sites, its portability has already been demonstrated. Finally, the full computable phenotype has been shared as a series of SQL queries, including scripts for patient identification, deriving statistics, and validation, which have been annotated with instructions for implementation at other sites. The goals of this work are: (a) to develop CPs as executable DOs, leveraging previous work to develop executable Knowledge Objects (KOs) (Flynn et al. 2018); and (b) to advance our understanding of how to define computable phenotypes as a class of FDO, including what is needed to meet the requirements of binding, abstraction, and encapsulation (Wittenburg et al. 2019). Conclusion Computable phenotypes, packaged as FDOs, may increase the potential both for the portability of a phenotype and the reusability of data resulting from its implementation. Providing CPs as executable FDOs may also reduce barriers to portability and local implementation. In this presentation, we describe our work to develop an FDO computable phenotype from an existing validated phenotype. Lessons learned from this process will increase our understanding of both the technical requirements, and how to address necessary components of abstraction, binding, and encapsulation so that these can function as FAIR Digital Objects.
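A code-based computable phenotype of the kind described above boils down to a rule that classifies each patient as having or not having the phenotype. The sketch below is a deliberately simplified Python illustration. The record layout, the placeholder codes `DX-001` and `DX-002`, and the two-encounter rule are assumptions made for this example and are not taken from the validated phenotype, which was shared as SQL queries.

```python
# Placeholder diagnosis codes; a real phenotype would use a standard vocabulary.
PHENOTYPE_CODES = {"DX-001", "DX-002"}


def has_phenotype(encounters, min_encounters=2):
    """Classify one patient: True if enough encounters carry a qualifying code."""
    qualifying = sum(1 for enc in encounters if PHENOTYPE_CODES & set(enc["codes"]))
    return qualifying >= min_encounters


def build_cohort(patients):
    """Return the sorted ids of patients who meet the phenotype rule."""
    return sorted(pid for pid, encounters in patients.items() if has_phenotype(encounters))


# Hypothetical structured EHR extract: patient id -> list of coded encounters.
patients = {
    "p1": [{"codes": ["DX-001"]}, {"codes": ["DX-002", "DX-900"]}],
    "p2": [{"codes": ["DX-001"]}],
    "p3": [{"codes": ["DX-900"]}],
}
cohort = build_cohort(patients)

if __name__ == "__main__":
    print(cohort)
```

Packaging such a rule as an executable object means the code set and thresholds travel with the function, which is one way the binding and encapsulation requirements mentioned above might be approached.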