Flow cytometric analysis allows rapid single cell interrogation of surface and intracellular determinants by measuring fluorescence intensity of fluorophore-conjugated reagents. The availability of new platforms, allowing detection of increasing numbers of cell surface markers, has challenged the traditional technique of identifying cell populations by manual gating and resulted in a growing need for the development of automated, high-dimensional analytical methods. We present a direct multivariate finite mixture modeling approach, using skew and heavy-tailed distributions, to address the complexities of flow cytometric analysis and to deal with high-dimensional cytometric data without the need for projection or transformation. We demonstrate its ability to detect rare populations, to model robustly in the presence of outliers and skew, and to perform the critical task of matching cell populations across samples that enables downstream analysis. This advance will facilitate the application of flow cytometry to new, complex biological and clinical problems.

finite mixture model | flow cytometry | multivariate skew distribution

Flow cytometry transformed clinical immunology and hematology over 2 decades ago by allowing the rapid interrogation of cell surface determinants and, more recently, by enabling the analysis of intracellular events using fluorophore-conjugated antibodies or markers. Although flow cytometry initially allowed the investigation of only a single fluorophore, recent advances allow close to 20 parallel channels for monitoring different determinants (1-4).
These advances have now surpassed our ability to interpret manually the resulting high-dimensional data and have led to growing interest and recent activity in the development of new computational tools and approaches (5-8).

The difficulty in data analysis arises from the traditional technique of identifying discrete cell populations by manual gating, which is a labor-intensive process and varies by user experience. The initial computational packages for flow cytometric analyses focused largely on different preprocessing tasks such as data acquisition, normalization, and live cell gating. Besides visualization and transformation of flow cytometric data, useful tools such as FlowJo (www.flowjo.com) and the packages in BioConductor (www.bioconductor.org) (such as prada, flowCore, flowViz, flowUtils, and rflowcyt) allow some form of software-assisted gating and extraction of populations of interest. The operator subjectively demarcates a cell population while moving through successive 2- or 3-dimensional projections of the data. This process limits the reproducibility of data processing. A more fundamental problem is that this lower dimensional visualization hinders the identification of higher-dimensional features. Furthermore, current methods extract only a limited number of sample parameters, such as the mean fluorescence intensity of a cell population, which can lead to loss of critical information in defining the properties of a cell population....
Tissue factor is a membrane-bound procoagulant protein that activates the extrinsic pathway of blood coagulation in the presence of factor VII and calcium. λ phage containing the tissue factor gene were isolated from a human placental cDNA library. The amino acid sequence deduced from the nucleotide sequence of the cDNAs indicates that tissue factor is synthesized as a higher molecular weight precursor with a leader sequence of 32 amino acids, while the mature protein is a single polypeptide chain composed of 263 residues. The derived primary structure of tissue factor has been confirmed by comparison to protein and peptide sequence data. The sequence of the mature protein suggests that there are three distinct domains: extracellular, residues 1-219; hydrophobic, residues 220-242; and cytoplasmic, residues 243-263. Three potential N-linked carbohydrate attachment sites occur in the extracellular domain. The amino acid sequence of tissue factor shows no significant homology with the vitamin K-dependent serine proteases, coagulation cofactors, or any other protein in the National Biomedical Research Foundation sequence data bank (Washington, DC).

Blood coagulation can be initiated by a complex of tissue factor (TF), a membrane-bound glycoprotein, and factor VII, a plasma coagulation factor (for reviews, see refs. 1 and 2). The physiological significance of this extrinsic pathway can be judged by the severe bleeding frequently observed in individuals who are markedly deficient in factor VII (3, 4). In contrast, individuals who have deficiencies or abnormalities in proteins that are involved in the early steps of the intrinsic pathway of coagulation (i.e., high molecular weight kininogen, prekallikrein, and factor XII) are asymptomatic (5). The TF-factor VII complex activates factor IX, a component of the intrinsic pathway, as well as factor X (6).
Thus, it is reasonable that the association of TF and factor VII may be the crucial event triggering the initiation of clotting in vivo.

The cDNAs for all of the proteins involved in TF-initiated coagulation, with the exception of TF itself, have already been cloned and sequenced (7-13). The TF apoprotein has been purified from both bovine and human sources (14-16). Approximately 50-70% of the amino acid sequence of both species has now been determined, and this has permitted us to select suitable amino acid sequences to serve as a basis for constructing oligonucleotide probes. This in turn has enabled the isolation and characterization of two human placental TF cDNA clones that contain the entire coding region of the mature protein. The nucleotide sequence of these clones, together with amino acid sequence data, has allowed us to formulate a primary structure for the human TF apoprotein.

MATERIALS AND METHODS

TF Purification and Sequencing. A monoclonal antibody (17) prepared against human TF that had been purified by using factor VII affinity columns (15) was used for immunoaffinity isolation of TF. Briefly, TF was extracted from human brain or placental tissue acet...
This paper presents a robust mixture modeling framework using multivariate skew t distributions, an extension of the multivariate Student's t family with additional shape parameters to regulate skewness. The proposed model results in a very complicated likelihood. Two variants of Monte Carlo EM algorithms are developed to carry out maximum likelihood estimation of mixture parameters. In addition, we offer a general information-based method for obtaining the asymptotic covariance matrix of maximum likelihood estimates. Some practical issues, including the selection of starting values and the stopping criterion, are also discussed. The proposed methodology is applied to a subset of the Australian Institute of Sport data for illustration.
A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings. Statistical mixture modeling based on normal, Student's t and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.
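The nesting of special cases mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a univariate skew t density in an Azzalini-type parameterization, (2/sigma) t(z; nu) T(lam z sqrt((nu+1)/(nu+z^2)); nu+1) with z = (y - xi)/sigma; the abstract itself does not state the density, so this form, the function names (`t_pdf`, `t_cdf`, `skew_t_pdf`, `mixture_pdf`) and the parameter names (`xi`, `sigma`, `lam`, `nu`) are illustrative choices. It only evaluates densities and does not implement the EM-type fitting algorithms.

```python
import math


def t_pdf(x, nu):
    """Student's t density with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, n=200):
    """Student's t CDF by composite Simpson integration from 0 to x.

    n must be even. This is a simple stdlib-only approximation,
    adequate for moderate |x|; it is not a production-quality CDF.
    """
    if x == 0:
        return 0.5
    h = x / n  # signed step, so negative x gives values below 0.5
    s = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return min(1.0, max(0.0, 0.5 + s * h / 3))


def skew_t_pdf(y, xi, sigma, lam, nu):
    """Assumed Azzalini-type univariate skew t density.

    xi: location, sigma: scale, lam: skewness, nu: degrees of freedom.
    """
    z = (y - xi) / sigma
    w = lam * z * math.sqrt((nu + 1) / (nu + z * z))
    return 2.0 / sigma * t_pdf(z, nu) * t_cdf(w, nu + 1)


def mixture_pdf(y, weights, components):
    """Finite mixture density: sum of weight_i * skew_t_pdf(y, *component_i)."""
    return sum(w * skew_t_pdf(y, *c) for w, c in zip(weights, components))


# Example: a two-component mixture, one right-skewed and one symmetric.
density = mixture_pdf(1.0, [0.7, 0.3], [(0.0, 1.0, 3.0, 4.0), (5.0, 2.0, 0.0, 10.0)])
```

Under this parameterization, setting `lam = 0` recovers the Student's t density, a very large `nu` approximates the skew normal, and doing both approximates the normal, which is the sense in which those mixtures are special cases.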
We report the crystal structure of an NH2-terminal 388-residue fragment of T4 DNA polymerase (protein N388) refined at 2.2 Å resolution. This fragment contains both the 3'-5' exonuclease active site and part of the autologous mRNA binding site (J. D. Karam, personal communication). The structure of a complex between the apoprotein N388 and a substrate, p(dT)3, has been refined at 2.5 Å resolution to a crystallographic R-factor of 18.7%. Two divalent metal ion cofactors, Zn(II) and Mn(II), have been located in crystals of protein N388 which had been soaked in solutions containing Zn(II), Mn(II), or both. The structure of the 3'-5' exonuclease domain of protein N388 closely resembles the corresponding region in the Klenow fragment despite minimal sequence identity. The side chains of four carboxylate residues that serve as ligands for the two metal ions required for catalysis are located in geometrically equivalent positions in both proteins with a rms deviation of 0.87 Å. There are two main differences between the 3'-5' exonuclease active site regions of the two proteins: (I) the OH of Tyr-497 in the Klenow fragment interacts with the scissile phosphate in the active site whereas the OH of the equivalent tyrosine (Tyr-320) in protein N388 points away from the active center; (II) different residues form the binding pocket for the 3'-terminal bases of the substrate. In the protein N388 complex the 3'-terminal base of p(dT)3 is rotated approximately 60 degrees relative to the position that the corresponding base occupies in the p(dT)3 complex with the Klenow fragment. Finally, a separate domain (residues 1-96) of protein N388 may be involved in mRNA binding that results in translational regulation of T4 DNA polymerase (Pavlov & Karam, 1994).