Beyond Tandem Analysis: Joint Dimension Reduction and Clustering in R

Angelos Markos, Alfonso Iodice D’Enza, Michel van de Velden

doi:10.18637/jss.v091.i10

Cited by 48 publications

(47 citation statements)

References 20 publications

(42 reference statements)

“…Hierarchical clustering was performed with the ComplexHeatmap (Gu et al, 2016) R package (clustering distance = Manhattan, clustering method = Ward.D2). Cluster correspondence analysis (van de Velden et al, 2017) of the 45 categorical variables (combinations of histone marks and decision-tree labels) across the 8,030 selected genes was performed with the R package clustrd (Markos et al, 2019). To select the optimal number of clusters and dimensions, we first run the function tuneclus() with the following parameters: nclusrange = 3:10, ndimrange = 2:9, method = "clusCA", nstart = 100, seed = 1234.…”

Section: Clustering Analysis (mentioning)

confidence: 99%

Dynamics of gene expression and chromatin marking during cell state transition

Borsari, Abad, Klein, et al. 2020. Preprint.

Summary: We have monitored the transcriptomic and epigenomic status of cells at twelve time-points during the transdifferentiation of human pre-B cells into macrophages. Using this data, we have investigated some fundamental questions regarding the role of chromatin in gene expression. We have found that, over time, genes are characterized by a limited number of chromatin states (combinations of histone modifications), and that, consistently, chromatin changes over genes tend to occur in a coordinated manner. We have observed strong association between these changes and gene expression only at the time of initial gene activation. Activation is preceded by H3K4me1 and H3K4me2, and followed in a precise order by most other histone modifications. Further changes in gene expression, comparable or even stronger than those at initial activation, occur without associated changes in histone modifications. The data generated here constitutes, thus, a unique resource to investigate transcriptomic and epigenomic dynamics during a differentiation process.

“…Our primary sample consisted of 1210 adults—557 women and 652 men, ranging in age from 18 to 80 ( M = 30.90, SD = 11.99, Mdn = 27)—who completed the HSQ using a 5-point rating scale. Some of the responses ( n = 810) come from data collected by open psychometrics and shared in the R package clustrd [ 19 ]; we collected the remaining responses ( n = 400) using the Prolific.co survey panel. This 5-point dataset is the primary sample reported throughout the manuscript.…”

Section: Methods (mentioning)

confidence: 99%

Time to Renovate the Humor Styles Questionnaire? An Item Response Theory Analysis of the HSQ

Silvia, Rodriguez-Boerwinkle. 2020. Behavioral Sciences.

The Humor Styles Questionnaire (HSQ) is one of the most popular self-report scales in humor research. The present research conducted a forward-looking psychometric analysis grounded in Rasch and item response theory models, which have not been applied to the HSQ thus far. Regarding strengths, the analyses found very good evidence for reliability and dimensionality and essentially zero gender-based differential item functioning, indicating no gender bias in the items. Regarding opportunities for future development, the analyses suggested that (1) the seven-point rating scale performs poorly relative to a five-point scale; (2) the affiliative subscale is far too easy to endorse and much easier than the other subscales; (3) the four subscales show problematic variation in their readability and proportion of reverse-scored items; and (4) a handful of items with poor discrimination and high local dependence are easy targets for scale revision. Taken together, the findings suggest that the HSQ, as it nears the two-decade mark, has many strengths but would benefit from light remodeling.

“…For tandem clustering, we use the R package "clustrd" on CRAN (https://CRAN.R-project.org/package=clustrd) (Markos et al 2019) in which RKM and FKM are implemented. As a baseline, simple LBG k-means (Linde et al 1980) and k-means with initialization procedure 12 (KM_I12) (Steinley and Brusco 2007) were used.…”

Section: Algorithms Combining DR With Clustering (mentioning)

confidence: 99%

Using Projection-Based Clustering to Find Distance- and Density-Based Clusters in High-Dimensional Data

Thrun, Ultsch. 2020. J Classif.

For high-dimensional datasets in which clusters are formed by both distance and density structures (DDS), many clustering algorithms fail to identify these clusters correctly. This is demonstrated for 32 clustering algorithms using a suite of datasets which deliberately pose complex DDS challenges for clustering. In order to improve the structure finding and clustering in high-dimensional DDS datasets, projection-based clustering (PBC) is introduced. The coexistence of projection and clustering allows to explore DDS through a topographic map. This enables to estimate, first, if any cluster tendency exists and, second, the estimation of the number of clusters. A comparison showed that PBC is always able to find the correct cluster structure, while the performance of the best of the 32 clustering algorithms varies depending on the dataset.