2022
DOI: 10.1038/s41467-022-31007-x

HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values

Abstract: Dataset integration is common practice to overcome limitations in statistically underpowered omics datasets. Proteome datasets display high technical variability and frequent missing values. Sophisticated strategies for batch effect reduction are lacking or rely on error-prone data imputation. Here we introduce HarmonizR, a data harmonization tool with appropriate missing value handling. The method exploits the structure of available data and matrix dissection for minimal data loss, without data imputation. Th…
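The matrix dissection idea mentioned in the abstract can be illustrated with a small sketch. This is not HarmonizR's implementation or API: the function name `harmonize_sketch`, the `min_per_batch` threshold, the per-batch mean centering used as the adjustment step, and the handling of proteins seen in fewer than two batches are all assumptions made for illustration. The sketch groups proteins by the set of batches in which they have enough observed values, then adjusts each group on its own, so no value is imputed.

```python
import math
from collections import defaultdict


def harmonize_sketch(matrix, batches, min_per_batch=2):
    """Illustrative only; names and rules here are hypothetical.

    matrix:  dict mapping protein -> list of values (math.nan = missing),
             one value per sample.
    batches: list of batch labels, one per sample.
    """
    batch_ids = sorted(set(batches))

    def observed(values, batch):
        # Non-missing values of one protein within one batch.
        return [v for v, b in zip(values, batches)
                if b == batch and not math.isnan(v)]

    # Step 1 (dissection): proteins sharing the same set of usable batches
    # form one sub-matrix that contains no batch-wide gaps.
    sub_matrices = defaultdict(list)
    for protein, values in matrix.items():
        usable = tuple(b for b in batch_ids
                       if len(observed(values, b)) >= min_per_batch)
        sub_matrices[usable].append(protein)

    # Step 2: adjust each sub-matrix independently. Mean centering is a
    # simple stand-in for a real batch-effect method; it works per protein,
    # whereas methods that pool information across proteins would need the
    # gap-free sub-matrix as a whole.
    adjusted = {}
    for usable, proteins in sub_matrices.items():
        for protein in proteins:
            values = matrix[protein]
            if len(usable) < 2:
                # Assumption: nothing to harmonize against, keep as-is.
                adjusted[protein] = list(values)
                continue
            batch_means = {}
            for b in usable:
                obs = observed(values, b)
                batch_means[b] = sum(obs) / len(obs)
            grand_mean = sum(batch_means.values()) / len(batch_means)
            # Assumption: values in batches below the threshold become NaN.
            adjusted[protein] = [
                v - batch_means[b] + grand_mean if b in batch_means else math.nan
                for v, b in zip(values, batches)
            ]
    return adjusted


nan = math.nan
batches = ["A", "A", "B", "B", "C", "C"]
matrix = {
    "P1": [10, 12, 20, 22, 30, 32],    # observed in all three batches
    "P2": [5, 7, nan, nan, 9, 11],     # missing entirely in batch B
    "P3": [1, 3, nan, nan, nan, nan],  # observed in one batch only
}
result = harmonize_sketch(matrix, batches)
```

In this toy input, P2 is adjusted using batches A and C only and keeps its missing values in batch B, rather than being dropped or imputed.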

Cited by 24 publications (31 citation statements)
References 40 publications
“…Missing value tolerant data harmonization [27] adjusted sample specific mean and CV values across studies (Supplementary Figure 3A) and enabled a clear separation of MB groups. (Figure 1E-F).…”
Section: Results: Integration Of In-house Proteome Data On Ffpe Mb An... (mentioning)
confidence: 99%
“…To further increase the cohort size, we integrated protein abundances with reanalyzed LC-MS/MS data of FF-MB from public repositories [18; 19; 17] (Figure 1D-F). Missing value tolerant data harmonization [27] adjusted sample specific mean and CV values across studies (Supplementary Figure 3A) and enabled a clear separation of MB groups. (Figure 1E-F).…”
Section: Results (mentioning)
confidence: 99%
“…Precursor, and protein identifications false discovery rate (FDR) threshold was set to 1%, while the threshold for peptide was 0.5%. The data was normalized in Spectronaut based only on proteins identified in all samples and then further processed for batch effect removal by HarmonizR (41). We utilized HarmonizR for batch effect correction is that it allows the retention of the proteins that otherwise would be dropped due to containing missing values (Supplementary Fig. S5).…”
Section: Library (mentioning)
confidence: 99%