Sungmin Yi scite author profile

Structured entities are commonly abstracted from sources such as XML, RDF, or hidden-web databases. Direct retrieval of heterogeneous structured entities is in high demand in data lakes, e.g., given a JSON object, finding the XML entities that denote the same real-world object. Existing approaches to evaluating structured entity similarity place too much emphasis on structural inconsistency. Indeed, entities from heterogeneous sources can have very distinct structures, owing to differing conventions for representing information. We argue that retrieval should be more tolerant of structural differences and focus more on the contents of the entities. In this paper, we first identify the unique challenge of parent-child (containment) relationships among structured entities, which unfortunately prevent the retrieval of the proper entities (parents or children are returned instead). To solve the problem, a novel hierarchy smooth function is proposed to combine the term scores in different nodes of a structured entity. Entities sharing the same structure, namely an entity family, are employed to learn the coefficient used in aggregating the scores, and thus to distinguish and prune the parent or child entities. Remarkably, the proposed method can work with both bag-of-words (BOW) and word embedding models, which have been successful in retrieving unstructured documents, for querying structured entities. Extensive experiments on real datasets demonstrate that our proposal is effective and efficient.


Moving view field nearest neighbor queries

Kim, Shim, Heo, et al. (2019), Data & Knowledge Engineering

On processing view field nearest neighbor queries on the R*-tree

Jung, Park, et al. (2011)

Reverse View Field Nearest Neighbor queries

Shim, Chung (2017), Information Sciences

SSFile: A novel column-store for efficient data analysis in Hadoop-based distributed systems

Son, Ryu, et al. (2015), Information Sciences

Load Balancing for Real-Time, Location-Based Event Processing on Cloud Systems

Ryu, Chung (2013)

For large-scale, real-time processing, cloud systems are widely used due to their high scalability and availability. In this paper, we propose workload distribution methods for location-based event processing on cloud systems. We define a measure of the workload and focus on its balanced distribution, because in cloud systems the workload distribution strongly affects system performance. For the balanced distribution of workload, we propose four methods: (1) round-robin data distribution, (2) round-robin query distribution, (3) data/query distribution via space partitioning, and (4) skew-aware distribution. The round-robin data distribution method focuses on a balanced distribution of event data, whereas queries are replicated in all cluster nodes. In the round-robin query distribution method, queries are evenly distributed, whereas the event data are replicated. The data/query distribution via space partitioning distributes event data and queries based on their spatial attribute values. Lastly, the skew-aware distribution method considers the non-uniformity of event data and queries. With extensive experiments, we evaluate the performance of our proposed methods.




Sungmin Yi

View field nearest neighbor: A novel type of spatial queries

Nearest close friend search in geo-social networks

Effective and efficient retrieval of structured entities

Moving view field nearest neighbor queries

On processing view field nearest neighbor queries on the R*-tree

Reverse View Field Nearest Neighbor queries

SSFile: A novel column-store for efficient data analysis in Hadoop-based distributed systems

Load Balancing for Real-Time, Location-Based Event Processing on Cloud Systems
