2018
DOI: 10.1093/ije/dyy113

Numero: a statistical framework to define multivariable subgroups in complex population-based datasets

Abstract: Large-scale epidemiological and population data provide opportunities to identify subgroups of people who are at risk of disease or exposed to adverse environments. Clustering algorithms are popular data-driven tools to identify these subgroups; however, relying exclusively on algorithms may not produce the best results if the dataset does not have a clustered structure. For this reason, we propose a framework (the R-library Numero) that combines the self-organizing map algorithm, permutation analysis for stat…

Cited by 22 publications (40 citation statements)
References 18 publications
“…Groop et al 3 and suggested some correspondence between the independently specified data-driven clusters even though the data used were fundamentally different, i.e., genetic loci 2 versus clinical and biomarker data at the time of diabetes diagnosis. 3 The examples of SOM analyses demonstrated by Mäkinen and co-workers, 5,6,8 as well as the recent other clustering applications in diabetes; 2,3 suggest alluring potential for data-driven subgroup analyses in epidemiology and medicine. These types of approaches are likely to inform on the fundamental genetic and metabolic variation defining the complexity of polygenic diseases.…”
Section: Discussed Their Findings With Respect To Those By
mentioning
confidence: 99%
“…In this Issue Mäkinen and co-workers present a Software Application Profile for an opensource R library, titled Numero, which would be a versatile and powerful tool for the abovementioned types of subgrouping needs. 8 Numero provides a three-step framework that combines the self-organizing map (SOM) algorithm, permutation analyses for statistical evidence and an expert-driven final subgrouping decision. 5,8 Numero can handle both continuous and categorical variables in situations where there is no intrinsic clustering in the data and it creates data-driven statistically validated two-dimensional visualisations without explicit boundaries for the subgroups.…”
mentioning
confidence: 99%
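
The permutation step of the three-step framework described in the statement above can be illustrated with a minimal sketch. It is not Numero's API (Numero is an R library) and not necessarily the authors' method: it assumes that each sample has already been assigned to a map unit by a SOM, and it uses the variance of per-unit means as an illustrative test statistic. The names `unit_means` and `permutation_p_value` and the toy data are hypothetical.

```python
import random
from statistics import mean, pvariance


def unit_means(values, units):
    """Average one variable within each map unit (hypothetical helper)."""
    groups = {}
    for value, unit in zip(values, units):
        groups.setdefault(unit, []).append(value)
    return [mean(group) for group in groups.values()]


def permutation_p_value(values, units, n_perm=1000, seed=0):
    """Empirical P-value for between-unit variation of one variable.

    Assumption: the statistic (population variance of the unit means)
    is an illustrative choice, not necessarily the one Numero uses.
    """
    rng = random.Random(seed)
    observed = pvariance(unit_means(values, units))
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        # Shuffling breaks any link between a value and its map position.
        rng.shuffle(shuffled)
        if pvariance(unit_means(shuffled, units)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 4 map units with 10 samples each (assumed SOM output).
rng = random.Random(1)
units = [u for u in range(4) for _ in range(10)]
structured = [u + rng.gauss(0, 0.5) for u in units]  # varies across the map
noise = [rng.gauss(0, 1) for _ in units]             # no map structure

p_structured = permutation_p_value(structured, units)
p_noise = permutation_p_value(noise, units)
print(p_structured, p_noise)
```

A small P-value suggests that a variable varies across the map more than chance would explain. Under the framework as described, deciding where to draw subgroup boundaries is still left to the expert.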
“…In the study by Pearce et al 21 , the SOM was used to analyse air quality based on historical time series. The work by Gao et al 22 uses a variant of the SOM to carry out exploratory analyses in which the algorithm creates subgroups of populations with similar characteristics, using the medical records of kidney disease patients as the database. The algorithm generates a two-dimensional map representing five chosen dimensions and displays the concentration intensity of the selected variables, as well as the eight-year mortality probabilities, according to the profiles identified by the SOM.…”
Section: Unsupervised Learning
unclassified
“…Gao et al 22 Olson et al 18 Lee et al 19 Table Based on the results obtained from the search carried out in the Biblioteca Virtual em Saúde and from the other studies cited in this essay, a list of purposes, applications and illustrative studies was compiled (Table 2).…”
Section: Random Forests
unclassified