In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different sets of core words to describe the same topic, they may be wrongly assigned to different clusters because they share no core words, even though the words they use are probably synonyms or semantically related in other forms. The most common way to solve this problem is to enrich the document representation with background knowledge from an ontology. This approach has two major issues: (1) the coverage of the ontology is limited, even for WordNet or MeSH, and (2) using ontology terms as replacement or additional features may cause information loss or introduce noise. In this paper, we present a novel text clustering method that addresses both issues by enriching the document representation with Wikipedia concept and category information. We develop two approaches, exact match and relatedness match, to map text documents to Wikipedia concepts and, further, to Wikipedia categories. The documents are then clustered based on a similarity metric that combines document content, concept, and category information. Experimental results with the proposed clustering framework on three datasets (20-newsgroups, TDT2, and LA Times) show that clustering performance improves significantly when the document representation is enriched with Wikipedia concepts and categories.
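The combined similarity metric described above can be sketched as a weighted sum of cosine similarities over three sparse vectors per document. This is a minimal illustration, not the paper's exact formulation: the weights `alpha`, `beta`, and `gamma` are hypothetical placeholders, as the abstract does not state how the three components are combined.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two sparse vectors represented as dicts.
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def combined_similarity(d1, d2, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted combination of word-level, Wikipedia-concept, and
    Wikipedia-category similarity. The weights are illustrative only."""
    return (alpha * cosine(d1["words"], d2["words"])
            + beta * cosine(d1["concepts"], d2["concepts"])
            + gamma * cosine(d1["categories"], d2["categories"]))
```

Any clustering algorithm that consumes a pairwise similarity (e.g. agglomerative clustering) can then operate on `combined_similarity` instead of plain bag-of-words cosine.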
Assessing network vulnerability before potentially disruptive events such as natural disasters or malicious attacks is vital for network planning and risk management. It enables us to identify and safeguard against the most destructive scenarios, in which overall network connectivity falls dramatically. Existing vulnerability assessments mainly focus on the inhomogeneous properties of graph elements, such as node degree; however, these measures and the corresponding heuristic solutions provide neither an accurate evaluation over general network topologies nor performance guarantees for large-scale networks. To this end, we investigate a measure called pairwise connectivity and formulate the vulnerability assessment problem as a new graph-theoretic optimization problem called β-disruptor, which aims to discover the set of critical nodes or edges whose removal results in the maximum decline in global pairwise connectivity. Our results consist of an NP-completeness and inapproximability proof for this problem, an O(log n log log n) pseudo-approximation algorithm for detecting the set of critical nodes, and an O(log^1.5 n) pseudo-approximation algorithm for detecting the set of critical edges. In addition, we devise an efficient heuristic algorithm and validate the performance of our model and algorithms through extensive simulations.
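The pairwise connectivity measure underlying β-disruptor counts the number of connected vertex pairs, i.e. the sum of C(|C|, 2) over the connected components C of the graph. A minimal sketch of evaluating it before and after a node removal (the representation as an adjacency dict is an assumption for illustration):

```python
from collections import deque

def pairwise_connectivity(adj, removed=frozenset()):
    """Number of connected vertex pairs in the graph with `removed`
    nodes deleted: sum over components C of |C| * (|C| - 1) / 2.
    `adj` maps each node to the set of its neighbors."""
    seen = set(removed)
    total = 0
    for start in adj:
        if start in seen:
            continue
        # BFS to measure the size of one connected component.
        size, queue = 0, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        total += size * (size - 1) // 2
    return total
```

The decline caused by removing a candidate set S is then simply `pairwise_connectivity(adj) - pairwise_connectivity(adj, removed=S)`, which is the objective the critical-node variant maximizes.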
Minimum-latency beaconing schedule (MLBS) in synchronous multihop wireless networks seeks a beaconing schedule with the shortest latency. This problem is NP-hard even when the interference radius is equal to the transmission radius. All prior work assumes that the interference radius equals the transmission radius, and the best-known approximation ratio for MLBS under this special interference model is 7. In this paper, we present a new approximation algorithm called strip coloring for MLBS under the general protocol interference model. Its approximation ratio is at most 5 when the interference radius equals the transmission radius, and is between 3 and 6 in general.
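The connection between beaconing schedules and graph coloring can be illustrated with a simple first-fit coloring of an interference conflict graph: each color class is a time slot in which all its nodes may beacon together, and the latency is the number of slots used. This is only a hedged sketch of the general idea; it is not the paper's strip-coloring algorithm, which exploits the geometric structure of the network to achieve the stated ratio bounds.

```python
def greedy_slot_assignment(conflict):
    """First-fit coloring of a conflict graph {node: set(conflicting nodes)}.
    Nodes sharing a slot number may beacon simultaneously; the latency of
    the schedule is max(slot) + 1."""
    slot = {}
    for u in sorted(conflict):          # deterministic order stands in for node IDs
        used = {slot[v] for v in conflict[u] if v in slot}
        s = 0
        while s in used:                # smallest slot free of conflicts
            s += 1
        slot[u] = s
    return slot
```

Under the protocol interference model, two nodes conflict when one lies within the other's interference radius, which is how the `conflict` graph would be built from node positions.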
In this paper, we study the problem of distributed virtual backbone construction in sensor networks, where the coverage areas of nodes are disks with different radii. We model this problem as the construction of a minimum connected dominating set (MCDS) in geometric k-disk graphs. We derive the size relationship between any maximal independent set (MIS) and an MCDS in geometric k-disk graphs, and apply it to analyze the performance of the two distributed connected dominating set (CDS) algorithms we propose in this paper. These algorithms have bounded performance ratios and low communication overhead. To the best of our knowledge, the results reported in this paper represent the state of the art.
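CDS constructions of this kind typically start from a maximal independent set, whose size relative to the MCDS drives the performance ratio. A minimal centralized sketch of the MIS phase (the sequential greedy order is an assumption standing in for the distributed election the paper's algorithms would perform):

```python
def greedy_mis(adj):
    """Greedy maximal independent set on a graph {node: set(neighbors)}.
    Each selected node blocks its neighbors, so no two selected nodes
    are adjacent, and every unselected node has a selected neighbor
    (i.e. the MIS is also a dominating set)."""
    mis, blocked = set(), set()
    for u in sorted(adj):       # deterministic order stands in for node IDs
        if u not in blocked:
            mis.add(u)
            blocked.add(u)
            blocked |= adj[u]   # neighbors can no longer join the MIS
    return mis
```

A second phase would then add connector nodes to join the MIS into a connected dominating set; the MIS-vs-MCDS size bound derived in the paper is what turns this two-phase construction into a bounded performance ratio.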
The goal of the next-generation Web is to build virtual communities in which software agents and people can cooperate by sharing knowledge. To achieve this goal, the emerging Semantic Web community has proposed ontologies to express knowledge in a machine-understandable way. The process of building and maintaining ontologies, known as ontology engineering, presents unique challenges: the lack of trustworthy, authoritative knowledge sources and the absence of a centralized repository for locating ontologies to be reused. In this paper, we propose a Semantic Web portal called OntoKhoj that is designed to simplify the ontology engineering process. OntoKhoj is built on algorithms for searching, aggregating, ranking, and classifying ontologies on the Semantic Web. It 1) allows agents and ontology engineers to retrieve trustworthy, authoritative knowledge, and 2) expedites ontology engineering through extensive reuse of ontologies. We have implemented the OntoKhoj portal and validated our system on real ontological data from the Semantic Web.
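One natural way to rank ontologies by authoritativeness is a PageRank-style computation over the graph of cross-ontology references. The abstract mentions ranking but not its exact algorithm, so the sketch below is a hedged illustration of the general link-analysis approach, not OntoKhoj's implementation.

```python
def rank_ontologies(links, damping=0.85, iters=50):
    """PageRank-style ranking over {ontology: [ontologies it references]}.
    An ontology referenced by many highly ranked ontologies is treated
    as more authoritative."""
    nodes = set(links) | {v for vs in links.values() for v in vs}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        new = {u: (1 - damping) / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:
                # Dangling ontology: spread its rank uniformly.
                for v in nodes:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank
```

The resulting scores can order search results so that widely reused ontologies surface first, supporting the reuse goal described above.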