2012 · DOI: 10.1136/amiajnl-2012-000862

Grid Binary LOgistic REgression (GLORE): building shared models without sharing data

Abstract: Objective: The classification of complex or rare patterns in clinical and genomic data requires the availability of a large, labeled patient set. While methods that operate on large, centralized data sources have been extensively used, little attention has been paid to understanding whether models such as binary logistic regression (LR) can be developed in a distributed manner, allowing researchers to share models without necessarily sharing patient data. Material and methods: Instead of bringing data to a central …

Cited by 155 publications (165 citation statements) · References 43 publications
“…This is possible as the central 'master' sends possible prediction models rather than fetching the data from remote nodes. Only statistical indexes totally unrelated to specific patients are exchanged between nodes and their master [47,53,54]. Although this protocol does not require an intervention and the data is fully de-identified, internal review board ethics approval is recommended before implementing a local node.…”
Section: Privacy Protection Of Patients
confidence: 99%
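The exchange described above — a master broadcasting candidate model coefficients and each node returning only aggregate statistics — can be sketched as a distributed Newton-Raphson fit of a logistic regression, in the spirit of GLORE. This is a minimal illustration, not the authors' implementation; the function names and the synthetic two-site data are hypothetical.

```python
import numpy as np

def local_statistics(beta, X, y):
    """At one site: compute the gradient and Hessian contributions of the
    logistic log-likelihood. Only these aggregates leave the site; no
    patient-level rows are ever transferred."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))       # predicted probabilities
    grad = X.T @ (y - p)                       # score contribution
    W = p * (1.0 - p)                          # IRLS weights
    hess = X.T @ (X * W[:, None])              # information contribution
    return grad, hess

def master_round(beta, site_data):
    """At the master: pool the per-site aggregates and take one
    Newton-Raphson step on the summed statistics."""
    grads, hessians = zip(*(local_statistics(beta, X, y) for X, y in site_data))
    return beta + np.linalg.solve(sum(hessians), sum(grads))

# Toy demonstration: two sites with synthetic data from a known model.
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0])
sites = []
for _ in range(2):
    X = rng.normal(size=(200, 2))
    y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)
    sites.append((X, y))

beta = np.zeros(2)
for _ in range(10):           # a few rounds of master/node exchange
    beta = master_round(beta, sites)
```

Because the pooled gradient and Hessian equal those computed on the combined data, the fitted coefficients match a centralized fit exactly, which is the key property such protocols rely on.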
“…In this setting, the amount of data per transfer diminishes; however, the data transfer frequency increases. A thorough explanation of how distributed machine learning algorithms work is given by Boyd et al [5] and Wu et al [34]. From the work of Boyd et al, we reused the MapReduce concept, developed by Dean and Ghemawat [8], to implement the distributed machine learning concept.…”
Section: Distributed Machine Learning
confidence: 99%
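The MapReduce pattern referenced above — map each data partition to a local statistic, reduce by summation — can be illustrated with a small gradient-descent sketch. The linear-regression objective, the worker layout, and all names here are illustrative assumptions, not taken from the cited works.

```python
from functools import reduce
import numpy as np

def map_gradient(beta, partition):
    """Map step: one worker computes the squared-loss gradient on its
    own partition of the data."""
    X, y = partition
    return X.T @ (X @ beta - y)

def reduce_gradients(g1, g2):
    """Reduce step: summing per-worker gradients yields exactly the
    gradient over the pooled data."""
    return g1 + g2

# Synthetic data split across three hypothetical workers.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + 0.01 * rng.normal(size=300)
partitions = [(X[i::3], y[i::3]) for i in range(3)]

beta = np.zeros(3)
for _ in range(100):
    grad = reduce(reduce_gradients, (map_gradient(beta, p) for p in partitions))
    beta -= 0.5 * grad / len(y)   # averaged-gradient descent step
```

Only the small gradient vectors travel between workers and the coordinator each round, which is the trade-off the quoted passage notes: smaller transfers, but more of them.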
“…The VM import model can also enable distributed computation, with each party installing the same VM and contributing results of its local computation to a coordinating center. For example, we have shown that it is possible to create an accurate predictive model by exporting the computation to different centers and aggregating results only, without any individual patient data ever being transferred (21).…”
Section: " "
confidence: 99%