Clustering Web services into functionally similar clusters is a very efficient approach to service discovery. A principal issue for clustering is computing the semantic similarity between services. Current approaches use similarity-distance measurement methods such as keyword-based, information-retrieval-based, or ontology-based methods. These approaches have problems that include difficulty discovering semantic characteristics, loss of semantic information, and a shortage of high-quality ontologies. In this paper, we present a method that first adopts ontology learning to generate ontologies via the hidden semantic patterns existing within complex terms. If calculating similarity using the generated ontology fails, it then applies an information-retrieval-based method. Another important issue is identifying the most suitable cluster representative. This paper proposes an approach to identifying the cluster center by combining service similarity with term frequency–inverse document frequency values of service names. Experimental results show that our term-similarity approach outperforms comparable existing approaches. They also demonstrate the positive effects of our cluster-center identification approach.
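The cluster-center idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the weighted sum controlled by `alpha`, the way TF-IDF is computed over service names, and the token-overlap (Jaccard) similarity that stands in for the ontology-based similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(name):
    """Split a service name into lowercase tokens."""
    return name.lower().split()


def jaccard(a, b):
    """Toy stand-in for the paper's service similarity (assumption)."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def name_tfidf(cluster_names, all_names):
    """Mean TF-IDF of each member's name tokens.

    Assumption: TF is counted over the cluster's pooled names and IDF over
    all service names; cluster_names must be a subset of all_names.
    """
    pooled = Counter(t for n in cluster_names for t in tokenize(n))
    total = sum(pooled.values())
    df = Counter(t for n in all_names for t in set(tokenize(n)))
    scores = {}
    for name in cluster_names:
        vals = [(pooled[t] / total) * math.log(len(all_names) / df[t])
                for t in set(tokenize(name))]
        scores[name] = sum(vals) / len(vals) if vals else 0.0
    return scores


def cluster_center(cluster_names, all_names, sim=jaccard, alpha=0.5):
    """Pick the member with the best mix of average similarity and TF-IDF.

    alpha weights similarity against the normalized TF-IDF score
    (a hypothetical combination rule, not taken from the paper).
    """
    tfidf = name_tfidf(cluster_names, all_names)
    top = max(tfidf.values()) or 1.0  # avoid dividing by zero
    best, best_score = None, -1.0
    for name in cluster_names:
        others = [o for o in cluster_names if o != name]
        avg_sim = (sum(sim(name, o) for o in others) / len(others)
                   if others else 0.0)
        score = alpha * avg_sim + (1 - alpha) * tfidf[name] / top
        if score > best_score:
            best, best_score = name, score
    return best
```

For example, calling `cluster_center(cluster, all_names)` on a small hotel-related cluster returns the member whose name is both close to the other members and made of terms that are prominent in the cluster. Any other similarity function with the same two-argument signature can be passed as `sim`.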
SUMMARY: With the expansion of the Internet, the number of available Web services has increased. Web service clustering to identify functionally similar clusters has become a major approach to the efficient discovery of suitable Web services. In this study, we propose a Web service clustering approach that uses novel ontology learning and a similarity calculation method based on the specificity of an ontology in a domain with respect to information theory. Instead of using traditional methods, we generate the ontology using a novel method that considers the specificity and similarity of terms. The specificity of a term describes the amount of domain-specific information contained in that term. Although general terms contain little domain-specific information, specific terms may contain much more domain-related information. The generated ontology is used in the similarity calculations. New logic-based filters are introduced for the similarity-calculation procedure. If similarity calculations using the specified filters fail, then information-retrieval-based methods are applied to the similarity calculations. Finally, an agglomerative clustering algorithm, based on the calculated similarity values, is used for the clustering. We achieved highly efficient and accurate results with this clustering approach, as measured by improved average precision, recall, F-measure, purity and entropy values. According to the results, specificity of terms plays a major role when classifying domain information. Our novel ontology-based clustering approach outperforms comparable existing approaches that do not consider the specificity of terms.
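The final clustering and evaluation steps named in this summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not specify the linkage criterion, so average linkage is assumed, the similarity matrix is taken as already computed, and entropy is the size-weighted, unnormalized variant. Function names are hypothetical.

```python
import math
from collections import Counter


def agglomerative(sim, k):
    """Merge the two most similar clusters until k clusters remain.

    sim is a symmetric similarity matrix (list of lists).
    Average linkage is an assumption; the source does not name one.
    """
    clusters = [[i] for i in range(len(sim))]

    def linkage(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > max(k, 1):
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def purity(clusters, labels):
    """Fraction of items that belong to their cluster's majority class."""
    n = sum(len(c) for c in clusters)
    return sum(max(Counter(labels[i] for i in c).values())
               for c in clusters) / n


def entropy(clusters, labels):
    """Size-weighted average of per-cluster class entropy (lower is better)."""
    n = sum(len(c) for c in clusters)
    total = 0.0
    for c in clusters:
        counts = Counter(labels[i] for i in c)
        h = -sum((v / len(c)) * math.log2(v / len(c))
                 for v in counts.values())
        total += (len(c) / n) * h
    return total
```

Given a similarity matrix for the services and ground-truth domain labels, `agglomerative(sim, k)` returns lists of item indices, which `purity` and `entropy` then score. The quadratic search for the best pair keeps the sketch short; it is not meant for large service collections.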
Social media have become very popular in the last few decades. Users rely on social network sites such as Twitter, Facebook, YouTube, and LinkedIn for both information and entertainment. Social media analytics with data mining technology is an analysis approach centered on extracting trends, patterns, and rules from the pool of social media data, helping people and organizations make better choices across many disciplines. Traditional media analytics techniques appear obsolete and inadequate for this immense array of unstructured social media data, which is characterized by three key problems: size, noise, and dynamism, with processing predominantly shifting from batch to streaming. The objective of this study is to investigate the data mining techniques applied to social media between 2010 and 2020. The work is a systematic review, using content analysis, of studies in the field of social media analytics published in principal databases. In total, 125 articles were reviewed. Content analysis was based on each study's approach, tools, language, dataset, country, year, and nature of the experiment. The review found that 22 data mining techniques were employed with social media data, the most frequently used being Artificial Neural Networks (ANN), Bayesian Networks (BN), Support Vector Machines (SVM), K-means clustering, and the neuro-fuzzy logic approach. The study aims to help analysts and educators capture the research trends and problems associated with the social media analytics process and to inform future research initiatives.