The progression of breast cancer involves cancer cells invading extracellular matrices. To investigate this progression, 3D cell cultures are widely used along with different types of matrices. Currently, the matrices are often characterized using parallel-plate rheometry, which measures matrix viscoelasticity, i.e., liquid-like viscous and stiffness-related elastic characteristics. This characterization reveals averaged information and sample-to-sample variation, yet it neglects the internal heterogeneity within matrices that cancer cells experience in 3D culture. Techniques using optical tweezers and magnetic microrheometry have measured heterogeneity in viscoelasticity in 3D culture. However, there is a lack of probabilistic heterogeneity quantification and of cell-size-relevant, microscale viscoelasticity measurements at breast-tumor tissue stiffness, up to ≃10 kPa in Young’s modulus. Here, we have advanced methods for this purpose that use a magnetic microrheometer, which applies forces to magnetic spheres within matrices and detects the spheres' displacements. We present probabilistic heterogeneity quantification using microscale viscoelasticity measurements in 3D culture matrices at breast-tumor-relevant stiffness levels. Bayesian multilevel modeling was employed to distinguish heterogeneity in viscoelasticity from the effects of experimental design and measurement errors. We report on the heterogeneity of breast-tumor-relevant agarose, GrowDex, GrowDex–collagen and fibrin matrices. The degree of heterogeneity differs between stiffness and phase angle (i.e., a measure of the ratio between viscous and elastic characteristics). Concerning stiffness, agarose and GrowDex show the lowest and highest heterogeneity, respectively. Concerning phase angle, fibrin and GrowDex–collagen present the lowest and highest heterogeneity, respectively.
While this heterogeneity information involves softer matrices, probed by ≃30 μm magnetic spheres, we employ larger ≃100 μm spheres to increase magnetic forces and acquire a sufficient displacement signal-to-noise ratio in stiffer matrices. Thus, we show pointwise microscale viscoelasticity measurements within agarose matrices up to Young’s moduli of 10 kPa. These results establish methods that combine magnetic microrheometry and Bayesian multilevel modeling for enhanced heterogeneity analysis within 3D culture matrices.
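The phase angle referred to in this abstract can be illustrated with a minimal sketch. It assumes the common rheological convention tan δ = G''/G', where G' is the elastic (storage) modulus and G'' the viscous (loss) modulus. The function name and all numeric values are hypothetical and are not taken from the reported measurements.

```python
import math


def phase_angle_deg(storage_modulus_pa: float, loss_modulus_pa: float) -> float:
    """Phase angle in degrees, assuming tan(delta) = G'' / G'."""
    # atan2 also handles a purely viscous sample (G' = 0) without dividing by zero.
    return math.degrees(math.atan2(loss_modulus_pa, storage_modulus_pa))


# Hypothetical moduli in Pa, chosen for illustration only.
mostly_elastic = phase_angle_deg(1000.0, 100.0)
balanced = phase_angle_deg(500.0, 500.0)

print(f"mostly elastic: {mostly_elastic:.1f} deg, balanced: {balanced:.1f} deg")
```

Under this convention, an angle near 0° indicates a predominantly elastic response and an angle near 90° a predominantly viscous one.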
High-throughput sequencing (HTS) technologies have enabled rapid sequencing of genomes and large-scale genome analytics with massive data sets. Traditionally, genetic variation analyses have been based on the human reference genome assembled from a relatively small human population. However, genetic variation could be discovered more comprehensively by using a collection of genomes, i.e., a pan-genome, as a reference. Pan-genomic references can be assembled from larger populations or from a specific population under study. Moreover, exploiting pan-genomic references with current bioinformatics tools requires efficient compression and indexing methods. To leverage the accumulating genomic data, the power of distributed and parallel computing has to be harnessed for new genome analysis pipelines. We propose a scalable distributed pipeline, PanGenSpark, for compressing and indexing pan-genomes and assembling a reference genome from the pan-genomic index. We experimentally show the scalability of PanGenSpark with human pan-genomes in a distributed Spark cluster comprising 448 cores distributed across 26 computing nodes. Assembling a consensus genome from a pan-genome of 50 human individuals took 215 minutes, and from one of 500 human individuals 1468 minutes. The index of a 1.41 TB pan-genome was compressed to 164.5 GB in our experiments.
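The scaling figures quoted in this abstract can be put side by side with a short back-of-the-envelope calculation. The sketch below only rearranges the numbers stated above. The variable names are illustrative, and the size ratio assumes decimal units (1 TB = 1000 GB), which the abstract does not specify.

```python
# Assembly times quoted in the abstract: individuals -> minutes.
runs = {50: 215, 500: 1468}

small_n, large_n = 50, 500
data_growth = large_n / small_n              # factor by which the input grows
time_growth = runs[large_n] / runs[small_n]  # factor by which the runtime grows
minutes_per_genome = {n: minutes / n for n, minutes in runs.items()}

# Assumption: decimal units (1 TB = 1000 GB); binary units shift the ratio slightly.
pan_genome_gb = 1.41 * 1000
index_gb = 164.5
size_ratio = pan_genome_gb / index_gb

print(f"input x{data_growth:.0f}, runtime x{time_growth:.2f}")
print(f"pan-genome to index size ratio: {size_ratio:.2f}:1")
```

On these figures the runtime grows more slowly than the number of individuals, which is consistent with the scalability claim, although two data points do not by themselves establish a scaling law.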
Computational pan-genomics utilizes information from multiple individual genomes in large-scale comparative analysis. Genetic variation between cases and controls, ethnic groups, or species can be discovered thoroughly using pan-genomes of such subpopulations. Whole-genome sequencing (WGS) data volumes are growing rapidly, making genomic data compression and indexing methods very important. Although space-efficient compression and indexing methods for repetitive sequences exist, the deployed compression methods are often sequential and computationally time-consuming, and they do not provide efficient sequence alignment performance on vast collections of genomes such as pan-genomes. To perform rapid analytics on ever-growing genomics data, data compression and indexing methods have to exploit distributed and parallel computing more efficiently. Instead of strict genome data compression methods, we focus on the efficient construction of a compressed index for pan-genomes. A compressed hybrid index enables fast sequence alignment to several genomes at once while shrinking the index size significantly compared to traditional indexes. We propose a scalable distributed compressed hybrid-indexing method for large genomic data sets that enables pan-genome-based sequence search and read alignment. We show the scalability of our tool, DHPGIndex, by executing experiments in a distributed Apache Spark-based computing cluster comprising 448 cores distributed over 26 nodes. The experiments were performed with both human and bacterial genomes. DHPGIndex built a BLAST index for a human pan-genome of n = 250 with an 870:1 compression ratio (CR) in 342 minutes and a Bowtie2 index with 157:1 CR in 397 minutes. For a human pan-genome of n = 1,000, the BLAST index was built in 1520 minutes with 532:1 CR and the Bowtie2 index in 1938 minutes with 76:1 CR. Bowtie2 aligned 14.6 GB of paired-end reads to the compressed (n = 1,000) index in 31.7 minutes on a single node.
Compressing the GenBank database (n = 13,375,031; 488 GB) into a BLAST index resulted in a CR of 62:1 in 575 minutes. BLASTing 189,864 CRISPR-Cas9 gRNA target sequences (23 MB in total) against the compressed index of the human pan-genome (n = 1,000) finished in 45 minutes on a single node. 30 MB of mixed bacterial sequences (n = 599) were blasted against the compressed index of the 488 GB GenBank database (n = 13,375,031) in 26 minutes on 25 nodes. 78 MB of mixed sequences (n = 4,167) were blasted against the compressed index of an 18 GB E. coli sequence database (n = 745,409) in 5.4 minutes on a single node.
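To make the reported compression ratios and timings easier to compare, the following minimal sketch derives a few implied quantities from the numbers above. It assumes that CR means original size divided by compressed size, which the abstract does not define explicitly. The helper name and the derived values are illustrative rather than results reported by the authors.

```python
def compressed_size_gb(original_gb: float, compression_ratio: float) -> float:
    """Implied index size, assuming CR = original size / compressed size."""
    return original_gb / compression_ratio


# GenBank database: 488 GB compressed at 62:1 (figures from the abstract).
genbank_index_gb = compressed_size_gb(488.0, 62.0)

# Index build times in minutes for n = 250 and n = 1,000 human genomes.
build_minutes = {"BLAST": (342, 1520), "Bowtie2": (397, 1938)}
time_growth = {
    tool: large / small for tool, (small, large) in build_minutes.items()
}

# Bowtie2 alignment: 14.6 GB of paired-end reads in 31.7 minutes on one node.
align_gb_per_min = 14.6 / 31.7

print(f"implied GenBank index size: {genbank_index_gb:.2f} GB")
for tool, factor in time_growth.items():
    print(f"{tool}: 4x more genomes took {factor:.2f}x longer to index")
print(f"alignment throughput: {align_gb_per_min:.2f} GB/min")
```

Read this way, quadrupling the number of genomes increased the build time by somewhat more than a factor of four for both index types.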