Knowledge of spatial chromosomal organization is critical for the study of transcriptional regulation and other nuclear processes in the cell. Recently, chromosome conformation capture (3C) based technologies, such as Hi-C and TCC, have been developed to provide a genome-wide, three-dimensional (3D) view of chromatin organization. Appropriate methods for analyzing these data and fully characterizing the 3D chromosomal structure and its structural variations are still under development. Here we describe a novel Bayesian probabilistic approach, denoted as “Bayesian 3D constructor for Hi-C data” (BACH), to infer the consensus 3D chromosomal structure. In addition, we describe a variant algorithm, BACH-MIX, to study the structural variations of chromatin in a cell population. Applying BACH and BACH-MIX to a high-resolution Hi-C dataset generated from mouse embryonic stem cells, we found that most local genomic regions exhibit homogeneous 3D chromosomal structures. We further constructed a model for the spatial arrangement of chromatin, which reveals structural properties associated with euchromatic and heterochromatic regions in the genome. We observed strong associations between structural properties and several genomic and epigenetic features of the chromosome. Using BACH-MIX, we further found that the structural variations of chromatin are correlated with these genomic and epigenetic features. Our results demonstrate that BACH and BACH-MIX have the potential to provide new insights into the chromosomal architecture of mammalian cells.
Skyline query processing has been investigated extensively in recent years, mostly for a single query reference point. An example of a single-source skyline query is to find hotels that are cheap and close to the beach (an absolute query) or close to a user-given location (a relative query). A multi-source skyline query considers several query points at the same time (e.g., to find hotels that are cheap and close to the University, the Botanic Garden, and Chinatown). In this paper, we consider the problem of efficient multi-source skyline query processing in road networks. It is not only the first effort to consider multi-source skyline queries in road networks but also the first effort to process relative skyline queries where the network distance between two locations needs to be computed on the fly. Three different query processing algorithms are proposed and evaluated in this paper. The Lower Bound Constraint algorithm (LBC) is proven to be instance-optimal. Extensive experiments using large real road network datasets demonstrate that LBC is four times more efficient than a straightforward algorithm.
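To make the notion of a multi-source skyline concrete, the following is a minimal sketch of the dominance test behind it. It is not the LBC algorithm or any of the paper's three algorithms: it is a naive quadratic check, and it uses straight-line (Euclidean) distance as a stand-in for network distance. The hotel names, prices, coordinates, and the function names `dominates` and `multi_source_skyline` are all hypothetical.

```python
from math import dist


def dominates(a, b):
    """True if vector a is no worse than b everywhere and strictly better somewhere.

    Smaller values are better in every dimension (price, distances).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def multi_source_skyline(hotels, sources):
    """Return the names of hotels not dominated by any other hotel.

    hotels:  dict mapping name -> (price, (x, y))
    sources: list of (x, y) query points
    Assumption: Euclidean distance replaces the road-network distance.
    """
    # One attribute vector per hotel: price, then distance to each query point.
    vectors = {
        name: (price, *(dist(loc, s) for s in sources))
        for name, (price, loc) in hotels.items()
    }
    return sorted(
        name
        for name, v in vectors.items()
        if not any(dominates(w, v) for other, w in vectors.items() if other != name)
    )


# Toy data (hypothetical): two query points and four hotels on a plane.
sources = [(0, 0), (10, 0)]
hotels = {
    "A": (100, (1, 0)),  # close to the first query point
    "B": (80, (9, 0)),   # cheap and close to the second query point
    "C": (120, (5, 0)),  # balanced distance to both
    "D": (130, (5, 1)),  # pricier and farther than C on every dimension
}

result = multi_source_skyline(hotels, sources)
print(result)
```

In this toy instance hotel D drops out because C is cheaper and at least as close to both query points, while A, B, and C each win on some dimension and so remain in the skyline. In a road network each distance would require a shortest-path computation, which is the cost the abstract's algorithms aim to limit.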
We show that telco big data can make churn prediction much easier from the perspective of the 3Vs: Volume, Variety, and Velocity. Experimental results confirm that prediction performance is significantly improved by using a large volume of training data, a large variety of features from both business support systems (BSS) and operations support systems (OSS), and a high velocity of processing newly arriving data. We have deployed this churn prediction system at one of the biggest mobile operators in China. From millions of active customers, this system can provide a list of prepaid customers who are most likely to churn in the next month, achieving 0.96 precision for the top 50000 predicted churners in the list. Automatically matching retention campaigns with the targeted potential churners significantly boosts their recharge rates, leading to substantial business value.
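The reported metric, precision for the top-ranked predicted churners, can be illustrated with a short sketch. This shows only how precision at a cutoff k is commonly computed from ranked scores; the function name `precision_at_k` and the toy scores and labels are hypothetical and are not taken from the deployed system.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true churners among the k customers with the highest scores.

    scores: predicted churn scores, higher means more likely to churn
    labels: 1 if the customer actually churned, else 0
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank customers by descending score and keep the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


# Toy data (hypothetical): five customers with scores and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]

print(precision_at_k(scores, labels, 2))
print(precision_at_k(scores, labels, 3))
```

Under this definition, a precision of 0.96 at k = 50000 would mean that about 48000 of the 50000 listed customers did churn.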