We consider the mining of hidden block structures from time-varying data using evolutionary co-clustering. Existing methods are based on the spectral learning framework and thus lack a probabilistic interpretation. To overcome this limitation, we develop a probabilistic model for evolutionary co-clustering. The proposed model assumes that the observed data are generated via a two-step process that depends on the historic co-clusters, thereby capturing temporal smoothness in a probabilistically principled manner. We develop an EM algorithm to perform maximum likelihood parameter estimation. An appealing feature of the proposed probabilistic model is that it leads naturally to soft co-clustering assignments. To the best of our knowledge, our work represents the first attempt to perform evolutionary soft co-clustering. We evaluate the proposed method on both synthetic and real data sets. Experimental results show that our method consistently outperforms prior approaches based on spectral methods.
Introduction

Many real-world processes are dynamically changing over time. As a consequence, the observed data generated by these processes also evolve smoothly. For example, in literature mining, the author-conference co-occurrence matrix evolves dynamically over time, since authors may shift their research interests smoothly. In computational biology, the expression data matrices are evolving, since gene expression controls are deployed sequentially in many biological processes. Temporal data mining aims at discovering knowledge from time-varying data and is now receiving increasing attention in many domains, including graph and network analysis [18,2,26] [1,4,6,19], and matrix factorization [28]. Since the data are evolving smoothly over time, the patterns embedded into the data are also expected to change smoothly. Therefore, one of the key challenges in temporal data mining is how to incorporate temporal smoothness into the patterns identified from temporally adjacent time points.

Co-clustering, also known as bi-clustering, aims at