Most traditional text clustering methods use a "bag of words" (BOW) representation built from term-frequency statistics over a set of documents. BOW, however, ignores important information about the semantic relationships between key terms. To overcome this problem, several methods have been proposed to enrich the text representation with external resources such as WordNet. However, many of these approaches suffer from limitations: 1) WordNet has limited coverage and lacks an effective word-sense disambiguation ability; 2) most text representation enrichment strategies, which append hypernyms and synonyms to document terms or replace the terms with them, are overly simple. In this paper, to overcome these deficiencies, we first propose a way to build a concept thesaurus based on the semantic relations (synonym, hypernym, and associative relation) extracted from Wikipedia. Then, we develop a unified framework that leverages these semantic relations to enhance the traditional content similarity measure for text clustering. The experimental results on the Reuters and OHSUMED datasets show that, with the help of the Wikipedia thesaurus, the clustering performance of our method improves on that of previous methods. In addition, when the weights for hypernym, synonym, and associative concepts are tuned with a small amount of user-provided labeled data, the clustering performance can be further improved.
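As a rough illustration of how weighted semantic relations could extend a content similarity measure, the sketch below adds weighted cosine similarities over synonym, hypernym, and associative concept vectors to the plain term-vector cosine. The linear combination, the function names `cosine` and `enriched_similarity`, the weight values, and the toy concept mappings are all assumptions made for illustration; the abstract does not specify the actual form of the framework.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def enriched_similarity(doc_a, doc_b, weights):
    """Term cosine plus a weighted cosine for each concept relation.

    Assumed form (hypothetical, not taken from the paper):
        sim = cos(terms) + sum_r weights[r] * cos(concepts_r)
    """
    sim = cosine(doc_a["terms"], doc_b["terms"])
    for relation, weight in weights.items():
        sim += weight * cosine(doc_a[relation], doc_b[relation])
    return sim


# Toy documents with made-up concept mappings (illustrative only).
d1 = {
    "terms": Counter({"puma": 1, "habitat": 1}),
    "synonym": Counter({"Cougar": 1}),
    "hypernym": Counter({"Felidae": 1}),
    "associative": Counter(),
}
d2 = {
    "terms": Counter({"cougar": 1, "range": 1}),
    "synonym": Counter({"Cougar": 1}),
    "hypernym": Counter({"Felidae": 1}),
    "associative": Counter(),
}

# Hypothetical weights; the abstract says such weights can be tuned on labeled data.
weights = {"synonym": 0.5, "hypernym": 0.3, "associative": 0.2}

bow_only = cosine(d1["terms"], d2["terms"])
enriched = enriched_similarity(d1, d2, weights)
print(bow_only, enriched)
```

The two toy documents share no surface terms, so the BOW cosine is zero, while the shared synonym and hypernym concepts give a positive enriched score.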
Many real networks share three generic properties: they are scale-free, display a small-world effect, and show a power-law strength-degree correlation. In this paper, we propose a class of deterministically growing networks, called Sierpinski networks, which are induced by the famous Sierpinski fractals and constructed in a simple iterative way. We derive analytical expressions for the degree distribution, strength distribution, clustering coefficient, and strength-degree correlation, which agree well with the characteristics of various real-life networks. Moreover, we show that the introduced Sierpinski networks are maximal planar graphs.
With the help of recursion relations derived from the self-similar structure, we obtain the exact solution of the average path length, d_t, for Apollonian networks. In contrast to the well-known numerical result d_t ∝ (ln N_t)^{3/4} [Phys. Rev. Lett. 94, 018702 (2005)], our rigorous solution shows that the average path length grows logarithmically as d_t ∝ ln N_t in the infinite limit of network size N_t. The extensive numerical calculations completely agree with our closed-form solution.

PACS numbers: 89.75.Hc, 89.75.Da, 02.10.Ox

One of the most important properties of complex networks is the average path length (APL), which is the mean length of the shortest paths between all pairs of vertices (nodes) [1]. Most real networks have been shown to be small-world or ultra small-world networks [2,3,4,5]; that is, their APL d scales logarithmically or double-logarithmically with the network size N. It has been established that APL is relevant in many fields regarding real-life networks. In the design or interpretation of routes in architectural design, signal integrity in communication networks, and the propagation of diseases or beliefs in social networks or of technology in industrial networks, APL is a natural network statistic to compute and interpret. It is strongly believed that many processes such as routing, searching, and spreading become more efficient when APL is smaller. So far, much attention has been paid to the question of APL [8,9,10,11,12,13].

Recently, on the basis of the well-known Apollonian packing [14], Andrade et al. introduced Apollonian networks [15], which were proposed simultaneously by Doye and Massen [16]. Apollonian networks belong to the class of deterministically growing networks, which have drawn much attention from the scientific community and have turned out to be a useful tool [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31].
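The APL definition above (the mean shortest-path length over all pairs of vertices) can be computed numerically for any small connected, unweighted graph with one breadth-first search per node. The sketch below is a generic illustration, not the analytical method of the paper; the function name `average_path_length` and the adjacency-dictionary input format are assumptions.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all pairs of a connected, unweighted graph.

    `adj` maps each node to an iterable of its neighbours (undirected graph).
    """
    nodes = list(adj)
    total = 0
    for source in nodes:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    n = len(nodes)
    # `total` counts every unordered pair twice, so divide by n * (n - 1).
    return total / (n * (n - 1))


# Path graph 0 - 1 - 2 - 3: pair distances sum to 10 over 6 pairs.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(average_path_length(path_graph))
```

This brute-force approach costs one BFS per node, so it is practical only for modest network sizes; it is the kind of numerical check against which a closed-form APL can be compared.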
Many topological properties of Apollonian networks, such as degree distribution, clustering coefficient, and correlations, have been determined analytically [15,16], and the effects of Apollonian networks on several dynamical models have been intensively studied, including the Ising model and a magnetic model [15,32,33,34]. Despite the importance and usefulness of the APL, there has been no analytical calculation of the APL of Apollonian networks.

In this report, we derive an exact formula for the average path length characterizing the Apollonian networks. The analytic method is based on the recursive construction and self-similar structure of Apollonian networks. Our rigorous result shows that APL grows logarithmically with the number of nodes. The obtained analytical solution corrects the previous numerical result in [15], where the authors claimed that the APL of Apollonian networks scales sub-logarithmically with network size. Our analytical technique could provide a paradigm for computing the APL of deterministic networks.

* Electronic address: sgzhou@fudan.edu.cn

The Apollonian network, denoted as A_t (t ≥ 0) after t generations, is constructed as follows [15]: Fo...
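Since the construction text is cut off here, the following sketch relies on an assumption: the commonly used iterative rule in which A_0 is a triangle and, at each generation, a new node is placed in every triangle created in the previous generation and linked to its three vertices. Under that assumed rule the node count is N_t = (3^t + 5)/2. The function names `apollonian` and `average_path_length` are hypothetical, and the brute-force BFS is only a numerical check for small t, not the recursive analytical method described in the text.

```python
import math
from collections import deque


def apollonian(t):
    """Adjacency sets of A_t under the assumed rule (triangle seed, node per new triangle)."""
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    active = [(0, 1, 2)]  # triangles that have not yet received a node
    for _ in range(t):
        next_active = []
        for a, b, c in active:
            new = len(adj)  # next unused node label
            adj[new] = {a, b, c}
            for v in (a, b, c):
                adj[v].add(new)
            # The new node splits the triangle into three new active triangles.
            next_active += [(a, b, new), (a, c, new), (b, c, new)]
        active = next_active
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all node pairs, via BFS from every node."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    n = len(adj)
    return total / (n * (n - 1))


# Tabulate N_t, the APL, and the ratio APL / ln N_t for a few small generations.
for t in range(1, 7):
    graph = apollonian(t)
    n = len(graph)
    apl = average_path_length(graph)
    print(t, n, round(apl, 4), round(apl / math.log(n), 4))
```

Small generations cannot settle an asymptotic scaling question on their own, which is why an exact recursive solution is needed; the table is useful mainly for checking a closed-form expression at small t.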