2014
DOI: 10.48550/arxiv.1411.4911
Preprint

Multivariate Analysis of Mixed Data: The R Package PCAmixdata

Abstract: Mixed data arise when observations are described by a mixture of numerical and categorical variables. The R package PCAmixdata extends standard multivariate analysis methods to incorporate this type of data. The key methods included in the package are principal component analysis for mixed data (PCAmix), varimax-like orthogonal rotation for PCAmix, and multiple factor analysis for mixed multi-table data. This paper gives a synthetic presentation of the three algorithms with details to help the user …

Cited by 25 publications (25 citation statements)
References 9 publications
“…To account for the fact that we have both noncategorical and categorical SES indicators, we used a method that is often called factorial analysis of mixed data, which is essentially a generalization of PCA that can handle such mixed data (45,46). This method combines ordinary PCA for noncategorical data with multiple correspondence analysis for categorical data and is implemented in the R package PCAmix (47).…”
Section: SES Measures (mentioning)
confidence: 99%
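
The combination described in the statement above (ordinary PCA on the numerical variables, a multiple-correspondence-style treatment of the categorical ones) can be illustrated with a minimal sketch. This is not the PCAmixdata implementation, which is an R package; it is a plain NumPy illustration under stated assumptions: numerical columns are z-scored, each category indicator is divided by the square root of its proportion and then centered, and a single SVD is taken of the combined matrix. The function name `famd_sketch` and the toy data are hypothetical.

```python
import numpy as np


def famd_sketch(numeric, categorical, n_components=2):
    """Illustrative mixed-data PCA (assumed FAMD-style weighting, not PCAmixdata's code)."""
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical, dtype=str)
    n = numeric.shape[0]

    # Numerical block: center and scale to unit variance, as in ordinary PCA.
    z = (numeric - numeric.mean(axis=0)) / numeric.std(axis=0)
    blocks = [z]

    # Categorical block: one indicator column per level, scaled by
    # 1 / sqrt(level proportion) and centered (assumed MCA-style weighting).
    for column in categorical.T:
        levels = np.unique(column)
        indicator = (column[:, None] == levels[None, :]).astype(float)
        proportions = indicator.mean(axis=0)
        scaled = indicator / np.sqrt(proportions)
        blocks.append(scaled - scaled.mean(axis=0))

    # One SVD of the combined matrix yields the principal components.
    x = np.hstack(blocks)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    eigenvalues = s ** 2 / n
    scores = u[:, :n_components] * s[:n_components]
    return scores, eigenvalues


# Toy data (hypothetical): two numerical variables, one 3-level categorical variable.
numeric = [[1.0, 10.0], [2.0, 8.0], [3.0, 9.0], [4.0, 5.0], [5.0, 6.0], [6.0, 2.0]]
categorical = [["a"], ["a"], ["b"], ["b"], ["c"], ["c"]]
scores, eigenvalues = famd_sketch(numeric, categorical)
```

Under this weighting, each numerical variable contributes 1 to the total inertia and each categorical variable contributes its number of levels minus 1, so the eigenvalues in the toy example sum to 2 + (3 - 1) = 4.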
“…Considering the potential bias on datasets, phylogenetic reconstruction of the Cactaceae tree was performed using three datasets with distinct amounts of missing data (MD; 40%, 60%, and 80%; supplementary material). We compared the phylogenetic tree topologies with the symmetric Robinson-Foulds (RF) pairwise distance (Robinson and Foulds, 1981) in the R package phytools (Revel, 2012) and performed a Principal Coordinate Analysis (PCoA) in the R package PCAmixdata (Chavent et al, 2017). We also conducted experimental pilots using the three MD datasets and observed a similar diversity pattern, except for the 40% MD dataset, which was an outlier in PCA, showed shorter branches, and displayed the highest level of PD.…”
Section: 2 Molecular Data and Phylogenetic Analysis (mentioning)
confidence: 99%
“…However, this method can be sensitive to outliers and ignores important multivariate data features. An alternative approach is to apply a dimensionality reduction technique on mixed data, such as unimodal Variational Autoencoder (Simidjievski et al, 2019), Factor Analysis of Mixed Data (FAMD) (Pagès, 2021), or PCA for mixed data (PCAmix) (Chavent et al, 2014). An alternative regime is to operate in segmented datasets using methods such as subspace clustering and multi-view clustering.…”
Section: Mixed and Multimodal Data (mentioning)
confidence: 99%