A single type of data store can hardly fulfill every end-user requirement in the NoSQL world. Therefore, polyglot systems use different types of NoSQL data stores in combination. However, the heterogeneity of the data storage models makes managing the metadata a complex task in such systems, and only a handful of studies have addressed this problem. In this paper, we propose a hypergraph-based approach for representing the catalog of metadata in a polyglot system. Taking an existing common programming interface to NoSQL systems, we extend and formalize it as hypergraphs for managing metadata. Then, we define design constraints and query transformation rules for three representative data store types. Furthermore, we propose a simple query rewriting algorithm that uses the catalog itself for these data store types and provide a prototype implementation. Finally, we show the feasibility of our approach on a use case of an existing polyglot system.
Abstract. Twitter is a social network that provides a powerful source of data. The analysis of those data poses many challenges, among which stands out the opportunity to find the reputation of a product, of a person, or of any other entity of interest. Several tools for sentiment analysis have been built to calculate the general opinion of an entity using a static analysis of the sentiments expressed in tweets. However, entities are not static; they collaborate with other entities and get involved in events. A simple aggregation of sentiments is therefore not sufficient to represent this dynamism. In this paper, we present a new approach that identifies the reputation of an entity on the basis of the set of events it is involved in, providing a transparent and self-explanatory way of interpreting reputation. To perform this analysis, we define a new sampling method based on tweet weighting to retrieve relevant information. In our experiments, we show that 90% of the reputation of an entity originates from the events it is involved in, especially in the case of entities that represent public figures.
Twitter is a social network that provides a powerful source of data. The analysis of those data poses many challenges, among which stands out the opportunity to find the reputation of a product, a person, or any other entity of interest. Several approaches for sentiment analysis have been proposed in the literature to assess the general opinion expressed in tweets on an entity. Nevertheless, these methods aggregate sentiment scores retrieved from tweets, which gives only a static view of the overall reputation of an entity. The reputation of an entity is not static; entities collaborate with each other, and they get involved in different events over time. A simple aggregation of sentiment scores is therefore not sufficient to represent this dynamism. In this paper, we present a new approach to determine the reputation of an entity on the basis of the set of events in which it is involved. To achieve this, we propose a new sampling method driven by a tweet weighting measure to give a better-quality summary of the target entity. We introduce the concept of Frequent Named Entities to determine the events involving the target entity. Our evaluation, carried out on different entities, shows that 90% of the reputation of an entity originates from the events it is involved in, and that the breakdown into events allows the reputation to be interpreted in a transparent and self-explanatory way.
In this work, we propose HerM (Heterogeneous Distributed Model), a NoSQL data modeling approach that supports the use of multiple heterogeneous NoSQL systems in a distributed environment. We define the conceptual elements necessary for data modeling, and we identify optimized data distribution patterns. We implemented a flexible framework in which we deployed our proposed modeling strategies, and we evaluated our approach against the native data distribution methodology provided by the NoSQL database MongoDB.