2017
DOI: 10.1016/j.jbiotec.2017.07.028

KNIME for reproducible cross-domain analysis of life science data

Abstract: Experiments in the life sciences often involve tools from a variety of domains such as mass spectrometry, next generation sequencing, or image processing. Passing the data between those tools often involves complex scripts for controlling data flow, data transformation, and statistical analysis. Such scripts are not only prone to be platform dependent, they also tend to grow as the experiment progresses and are seldom well documented, a fact that hinders the reproducibility of the experiment. Workflow system…

Cited by 160 publications (106 citation statements)
References 37 publications
“… The use of well documented and amenable workflow management platforms like KNIME facilitate the construction of consistent, reproducible, and transferable protocols. The workflows can be transferred between, for example, workstations, users, and sites, and can be re‐run: i) as is, for example, when large data transfer is not feasible, or when new database versions are released; ii) with different configurations of the nodes, for example, changing ligand activity cut‐offs (Figure ), input ligands (Figures , , ), protein targets (Figure ); iii) with additional/modified nodes to obtain complementary information, for example, including annotations from other databases, further analyzing results, or performing machine learning on the obtained data.…”
Section: Discussion (mentioning)
confidence: 99%
“…There is a need for eScience technologies to process the large volumes of rapidly generated, heterogeneous protein–ligand interaction data into computational models that enable the design of efficacious and safe medicines. The ChEMBL database (version 23), for example, contains over 14 million data entries on 11 500 protein targets, of which 4600 human, covering 1.7 million unique compounds.…”
Section: Introduction (mentioning)
confidence: 99%
“…First, preliminary labels were automatically generated using either classical image processing techniques, or existing, non-trained nuclear segmentation deep learning models. In particular, since nuclei of MCF10A cells displayed high contrast and good separation, ground truth labels (instance segmentation masks) for these cells were generated using a simple nuclear segmentation pipeline in KNIME (19,20), which included thresholding, gaps filling, connected components, and manual separation of a few adjacent nuclei. A set of Python scripts was used to convert these preliminary nuclear labels images into a format compatible with the interactive, web-based image editing tool Supervisely (21).…”
Section: Ground Truth Labels Generation (mentioning)
confidence: 99%
“…Moreover, a RESTful service has been implemented, so that Selenzyme can accept multiple queries from any other web-based application. As an example of the application of the REST service, a KNIME node (O'Hagan and Kell, 2015; Fillbrunn et al, 2017) is available so that the reaction query can be generated from chemoinformatics operations within a workflow (examples available at http://www.myexperiment.org/packs/734, see supplementary information). Similarly, the resulting output tables containing the sequences can easily be processed downstream in the workflow or sent to other sequence analysis services such as Galaxy.…”
Section: Web Server and Restful Service (mentioning)
confidence: 99%