Frequent subgraph mining has been extensively studied on certain graph data. However, uncertainty is inherent in graph data in practice, and there has been very little work on mining uncertain graph data. This paper investigates frequent subgraph mining on uncertain graphs under probabilistic semantics. Specifically, a measure called ϕ-frequent probability is introduced to evaluate the degree of recurrence of subgraphs. Given a set of uncertain graphs and two numbers 0 < ϕ, τ < 1, the goal is to quickly find all subgraphs with ϕ-frequent probability at least τ. Because the problem is NP-hard, an approximate mining algorithm is proposed. Let 0 < δ < 1 be a parameter. The algorithm is guaranteed to find any frequent subgraph S with probability at least ((1 − δ)/2)^s, where s is the number of edges of S. In addition, the paper discusses in detail how to set δ to guarantee the overall approximation quality of the algorithm. Extensive experiments on real uncertain graph data verify that the algorithm is efficient and that the mining results are of very high quality.
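To make the mining criterion concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes that ϕ-frequent probability means the probability that a subgraph S occurs in at least a ϕ fraction of the graphs, an interpretation the abstract does not spell out. It also assumes that the probability of S occurring in each uncertain graph is already known and that the graphs are independent. The function name `frequent_probability` and the input `occurrence_probs` are hypothetical and introduced only for illustration.

```python
import math


def frequent_probability(occurrence_probs, phi):
    """Probability that S occurs in at least ceil(phi * n) of n graphs.

    Assumptions (not taken from the paper): occurrence_probs[i] is the
    probability that S occurs in the i-th uncertain graph, and the
    graphs are mutually independent.
    """
    n = len(occurrence_probs)
    need = math.ceil(phi * n)  # minimum number of graphs that must contain S

    # dist[k] = probability that exactly k of the graphs seen so far contain S
    dist = [1.0]
    for p in occurrence_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)   # S is absent from this graph
            nxt[k + 1] += q * p       # S is present in this graph
        dist = nxt

    return sum(dist[need:])


def is_frequent(occurrence_probs, phi, tau):
    """Check the criterion: phi-frequent probability is at least tau."""
    return frequent_probability(occurrence_probs, phi) >= tau
```

Under these assumptions, `is_frequent([0.9, 0.8, 0.3], phi=0.5, tau=0.7)` asks whether S appears in at least two of the three graphs with probability at least 0.7. Obtaining the per-graph occurrence probabilities is itself the hard part of the problem and is deliberately left outside this sketch.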