High-Dimensional Bayesian Clustering with Variable Selection: The R Package bclust

Nia, Vahid Partovi; Davison, A. C.

doi:10.18637/jss.v047.i05

Cited by 34 publications

(9 citation statements)

References 29 publications


“…To determine the subtypes of LIHC, we performed a Bayesian clustering method with a spike-and-slab hierarchical model, which was suitable for clustering high-dimensional data using the function "bclust" in R package "e1071" [20].…”

Section: Subtypes of LIHC (mentioning)

confidence: 99%

Identification of Diagnostic Biomarkers and Subtypes of Liver Hepatocellular Carcinoma by Multi-Omics Data Analysis

Ouyang, Fan, Ling, et al. 2020. Genes


As liver hepatocellular carcinoma (LIHC) has high morbidity and mortality rates, improving the clinical diagnosis and treatment of LIHC is an important issue. The advent of the era of precision medicine provides us with new opportunities to cure cancers, including the accumulation of multi-omics data of cancers. Here, we proposed an integration method that involved the Fisher ratio, Spearman correlation coefficient, classified information index, and an ensemble of decision trees (DTs) for biomarker identification based on an unbalanced dataset of LIHC. Then, we obtained 34 differentially expressed genes (DEGs). The ability of the 34 DEGs to discriminate tumor samples from normal samples was evaluated by classification, and a high area under the curve (AUC) was achieved in our studied dataset and in two external validation datasets (AUC = 0.997, 0.973, and 0.949, respectively). Additionally, we also found three subtypes of LIHC, and revealed different biological mechanisms behind the three subtypes. Mutation enrichment analysis showed that subtype 3 had many enriched mutations, including tumor protein p53 (TP53) mutations. Overall, our study suggested that the 34 DEGs could serve as diagnostic biomarkers, and the three subtypes could help with precise treatment for LIHC.


“…We applied Bayesian agglomerative clustering to find groupings in the infants based on their metabolomics profile. This approach is highly suitable for low-sample-size-high-dimensional data (41) where it is commonly difficult to provide reasonable statistical models (42,43). In contrast to the hierarchical cluster approach where the user has to decide and calculate other metrics such as the silhouette width in order to decide for the optimal grouping, the optimal grouping is returned by the Bayesian clustering procedure.…”

Section: Statistical Considerations (mentioning)

confidence: 99%

“…The reason why we chose Bayesian clustering is that it is useful for high-dimensional continuous data (41). This is in contrast to distance-based hierarchical clustering techniques which may fail in high-dimensional settings (42,43).…”

Section: Statistical Considerations (mentioning)

confidence: 99%

Are All Breast‐fed Infants Equal? Clustering Metabolomics Data to Identify Predictive Risk Clusters for Childhood Obesity

Kirchberg, Grote, Gruszfeld, et al. 2019. J. pediatr. gastroenterol. nutr.


Objectives: Fetal and early life represent a period of developmental plasticity during which metabolic pathways are modified by environmental and nutritional cues. Little is known on the pathways underlying this multifactorial complex. We explored whether 6 months old breast-fed infants could be clustered into metabolically similar groups and that those metabotypes could be used to predict later obesity risk. Methods: Plasma samples were obtained from 183 breast-fed infants aged 6 months participating in the European multicenter Childhood Obesity Project study. We measured amino acids along with polar lipid concentrations (acylcarnitines, lysophosphatidylcholines, phosphatidylcholines, sphingomyelins). We determined the metabotypes using a Bayesian agglomerative clustering method and investigated the properties of these clusters with respect to clinical, programming, and metabolic factors up to 6 years of age. Results: We identified 20 metabolite clusters comprising 1 to 39 children. Phosphatidylcholines predominantly influenced the clustering process. In the largest clusters (n ≥ 14), large differences existed for birth length (unadjusted P < 0.0001) and length and weight at 6 months (unadjusted P < 0.0001 and P = 0.012, respectively). Infants tended to cluster together by country (unadjusted P < 0.001). The body mass index (BMI) z score at 6 years of age tended to differ (unadjusted P = 0.07). Conclusions: Our exploratory study provided evidence that breast-fed infants are not metabolically homogeneous and that variation in metabolic profiles among infants may provide insight into later development and health. This work highlights the potential of metabotypes for identifying inter-individual differences that may form the basis for developing personalized early preventive strategies.


“…Examples of SEM approaches are introduced and discussed by (Berry, Carlin, Lee, and Müller 2010, chapter 2), Thall, Wathen, Bekele, Champlin, Baker, and Benjamin (2003), and Berry, Broglio, Groshen, and Berry (2013, with providing additional background on these specific SEM implementations. SEM approaches are also implemented in packages by Nia and Davison (2012) and Savage, Cooke, Darkins, and Xu (2018) and have been extended to more specialized applications in fMRI studies (Stocco 2014), modeling clearance rates of parasites in biological organisms (Sharifi-Malvajerdi, Zhu, Fogarty, Fay, Fairhurst, Flegg, Stepniewska, and Small 2019), modeling genomic bifurcations (Campbell and Yau 2017), modeling ChIP-seq data through hidden Ising models (Mo 2018), modeling genome-wide nucleosome positioning with high-throughput short-read data (Samb, Khadraoui, Belleau, Deschênes, Lakhal-Chaieb, and Droit 2015), and modeling cross-study analysis of differential gene expression (Scharpf, Tjelmeland, Parmigiani, and Nobel 2009).…”

Section: The Single-source Exchangeability Model (mentioning)

confidence: 99%

Analyzing Basket Trials under Multisource Exchangeability Assumptions

Kane, Chen, Kaizer, et al. 2019. Preprint


Basket designs are prospective clinical trials that are devised with the hypothesis that the presence of selected molecular features determine a patient's subsequent response to a particular "targeted" treatment strategy. Basket trials are designed to enroll multiple clinical subpopulations to which it is assumed that the therapy in question offers beneficial efficacy in the presence of the targeted molecular profile. The treatment, however, may not offer acceptable efficacy to all subpopulations enrolled. Moreover, for rare disease settings, such as oncology wherein these trials have become popular, marginal measures of statistical evidence are difficult to interpret for sparsely enrolled subpopulations. Consequently, basket trials pose challenges to the traditional paradigm for trial design, which assumes inter-patient exchangeability. The R-package basket facilitates the analysis of basket trials by implementing multi-source exchangeability models. By evaluating all possible pairwise exchangeability relationships, this hierarchical modeling framework facilitates Bayesian posterior shrinkage among a collection of discrete and pre-specified subpopulations. Analysis functions are provided to implement posterior inference of the response rates and all possible exchangeability relationships between subpopulations. In addition, the package can identify "poolable" subsets of and report their response characteristics. The functionality of the package is demonstrated using data from an oncology study with subpopulations defined by tumor histology.
