The term big data occurs more frequently now than ever before. A large number of fields and subjects, ranging from everyday life to traditional research fields (e.g., geography and transportation, biology and chemistry, medicine and rehabilitation), involve big data problems. The popularization of various types of networks has diversified the types, issues, and solutions for big data more than ever before. In this paper, we review recent research on data types, storage models, privacy, data security, analysis methods, and applications related to network big data. Finally, we summarize the challenges and development of big data in order to predict current and future trends.
The goal of multi-winner elections is to choose a fixed-size committee based on voters' preferences. An important concern in this setting is representation: large groups of voters with cohesive preferences should be adequately represented by the election winners. Recently, Aziz et al. proposed two axioms that aim to capture this idea: justified representation (JR) and its strengthening, extended justified representation (EJR). In this paper, we extend the work of Aziz et al. in several directions. First, we answer an open question of Aziz et al. by showing that Reweighted Approval Voting satisfies JR for k = 3, 4, 5, but fails it for k >= 6. Second, we observe that EJR is incompatible with the perfect representation criterion, which is important for many applications of multi-winner voting, and propose a relaxation of EJR, which we call proportional justified representation (PJR). PJR is more demanding than JR, but, unlike EJR, it is compatible with perfect representation, and a committee that provides PJR can be computed in polynomial time if the committee size divides the number of voters. Moreover, just like EJR, PJR can be used to characterize the classic PAV rule within the class of weighted PAV rules. On the other hand, we show that EJR provides stronger guarantees with respect to average voter satisfaction than PJR does.
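As a concrete illustration (not taken from the paper itself), JR admits a simple polynomial-time check: a size-k committee W fails JR exactly when some candidate c is approved by at least n/k voters, none of whom approves any member of W. The minimal Python sketch below implements that check; the data layout (approval ballots as sets of candidate labels) is an assumption for illustration only, and checking the stronger EJR and PJR axioms is computationally harder in general.

```python
def provides_jr(approvals, committee, k):
    """Return True if the size-k committee satisfies justified representation.

    approvals: list of sets, one approval ballot per voter.
    committee: set of winning candidates, assumed to have size k.
    JR fails iff some candidate is approved by >= n/k voters who are all
    left without any approved candidate in the committee.
    """
    n = len(approvals)
    unrepresented = [ballot for ballot in approvals if not (ballot & committee)]
    candidates = set().union(*approvals) if approvals else set()
    for c in candidates:
        supporters = sum(1 for ballot in unrepresented if c in ballot)
        if supporters >= n / k:
            return False
    return True

# Tiny example: 4 voters, committee size 2 (so a cohesive group needs >= 2 voters).
approvals = [{"a"}, {"a"}, {"b"}, {"c"}]
print(provides_jr(approvals, {"b", "c"}, k=2))  # False: two 'a'-voters are unrepresented
print(provides_jr(approvals, {"a", "b"}, k=2))  # True
```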
With the eruption of online social networks such as Twitter and Facebook, a series of new APIs has appeared to allow access to the data that these new sources of information accumulate. One of the most popular online social networks is the microblogging site Twitter. Its APIs allow many machines to access the torrent of Twitter data simultaneously, listening to tweets and accessing other useful information such as user profiles. A number of tools have appeared for processing Twitter data with different algorithms and for different purposes. In this paper, T-Hoarder is described: a framework that enables tweet crawling and data filtering, and that is also able to display summarized and analytical information about Twitter activity with respect to a certain topic or event on a web page. This information is updated on a daily basis. The tool has been validated with real use cases, which allow a series of analyses to be made of the performance one may expect from this type of infrastructure.
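To make the filtering-and-summarization idea concrete, the sketch below groups keyword-matching tweets by day, the kind of daily topic summary the abstract describes. The dictionary layout and field names are assumptions for illustration and are not T-Hoarder's actual data model or the Twitter API.

```python
from collections import Counter
from datetime import datetime

def daily_topic_activity(tweets, keywords):
    """Count, per day, the tweets whose text mentions any of the keywords.

    tweets: iterable of dicts with 'text' (str) and 'created_at' (datetime);
    this layout is a stand-in for whatever format the crawler stores.
    """
    counts = Counter()
    for tw in tweets:
        text = tw["text"].lower()
        if any(kw.lower() in text for kw in keywords):
            counts[tw["created_at"].date()] += 1
    return dict(counts)

# Example: two on-topic tweets on different days, one off-topic tweet.
tweets = [
    {"text": "Big data at the conference", "created_at": datetime(2015, 3, 1, 10)},
    {"text": "Lunch break",                "created_at": datetime(2015, 3, 1, 13)},
    {"text": "More big data talks",        "created_at": datetime(2015, 3, 2, 9)},
]
print(daily_topic_activity(tweets, ["big data"]))  # one matching tweet on each of the two days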
SUMMARY: The service-based approach has been successfully applied to distributed environments, modelling them as pieces of functionality that exchange information by means of messages in order to achieve a common goal. The advantages of this approach can also be applied to distributed real-time systems, increasing their flexibility and allowing the creation of brand new applications from existing services in the system. If this is an online process, then time-bounded composition algorithms are needed so as not to jeopardize the performance of the whole system. Different composition algorithms are studied and proposed, two of them optimal and another two based on heuristics. This paper presents an analytical solution that selects, depending on the structure of the application and on the load of the whole system, the most suitable composition algorithm to be executed in order to obtain a composed application in bounded time.
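A minimal sketch of the selection idea is shown below: estimate the cost of an exhaustive optimal composition versus a heuristic one under the current load, and run the best algorithm that still fits the time budget. The cost models here are placeholders chosen for illustration, not the analytical solution proposed in the paper.

```python
def pick_composition_algorithm(n_services, load, time_budget_ms):
    """Hypothetical selector between an optimal and a heuristic composer.

    n_services: number of candidate services in the composition.
    load: current system load factor (0.0 = idle, 1.0 = fully loaded).
    Placeholder cost models: exponential for the exhaustive optimal search,
    roughly linear for the heuristic, both inflated by the current load.
    """
    optimal_cost_ms = (2 ** n_services) * 0.01 * (1 + load)
    heuristic_cost_ms = n_services * 0.5 * (1 + load)
    if optimal_cost_ms <= time_budget_ms:
        return "optimal"
    if heuristic_cost_ms <= time_budget_ms:
        return "heuristic"
    return "reject"  # no composition can be produced within the bound

print(pick_composition_algorithm(n_services=8, load=0.5, time_budget_ms=10))   # optimal
print(pick_composition_algorithm(n_services=20, load=0.5, time_budget_ms=20))  # heuristic
```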
Abstract: Current trends in industrial systems opt for the use of different big-data engines as a means to process huge amounts of data that cannot be processed with an ordinary infrastructure. The number of issues an industrial infrastructure has to face is large and includes challenges such as the definition of efficient architecture setups for different applications and the definition of specific models for industrial analytics. In this context, the article explores the development of a medium-size big-data engine (i.e., an implementation) able to improve map-reduce computing performance by splitting the analytic workload into different segments that may be processed by the engine in parallel using a hierarchical model. This type of facility reduces end-to-end computation time: the segments are processed in parallel and their results are then merged with the information produced by the other segments. As empirical results reveal, this setup increases the performance of current clusters and remarkably improves I/O operations.
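The split/parallel-process/merge pattern the abstract describes can be sketched in a few lines of Python: segments are handed to worker processes, partial results are computed concurrently, and a final step merges them. The word-count analytic and the multiprocessing pool are stand-ins for illustration, not the engine described in the article.

```python
from multiprocessing import Pool

def process_segment(segment):
    """Placeholder analytic applied to one data segment (here: word count)."""
    counts = {}
    for word in segment.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def merge(partials):
    """Merge per-segment partial results into one final result."""
    total = {}
    for partial in partials:
        for word, c in partial.items():
            total[word] = total.get(word, 0) + c
    return total

if __name__ == "__main__":
    segments = ["big data engines process data", "data segments run in parallel"]
    with Pool() as pool:                      # segments processed concurrently
        partials = pool.map(process_segment, segments)
    print(merge(partials))
```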
Abstract: Current infrastructures for developing big-data applications are able to process, via big-data analytics, huge amounts of data, using clusters of machines that collaborate to perform parallel computations. However, current infrastructures were not designed to meet the requirements of time-critical applications; they are focused on general-purpose applications rather than time-critical ones. Addressing this issue from the perspective of the real-time systems community, this paper considers time-critical big data. It deals with the definition of a time-critical big-data system from the point of view of requirements, analyzing the specific characteristics of some popular big-data applications. This analysis is complemented by the challenges stemming from the infrastructures that support the applications, and the paper proposes an architecture and offers initial performance patterns that connect application costs with infrastructure performance.
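One simple way to connect application cost with infrastructure performance is a first-order completion-time estimate checked against a deadline, as sketched below. The formula (fixed startup overhead plus data volume over aggregate cluster throughput) and all parameter names are illustrative assumptions, not the performance patterns defined in the paper.

```python
def meets_deadline(data_mb, nodes, per_node_throughput_mb_s,
                   startup_overhead_s, deadline_s):
    """Hypothetical first-order cost model for a time-critical big-data job.

    Estimated completion time = framework startup overhead + data volume
    divided by the aggregate throughput of the cluster.
    Returns (fits_deadline, estimated_seconds).
    """
    estimated_s = startup_overhead_s + data_mb / (nodes * per_node_throughput_mb_s)
    return estimated_s <= deadline_s, estimated_s

# e.g. 10 GB scanned by 8 nodes at 100 MB/s each, 5 s startup, 30 s deadline
ok, t = meets_deadline(10_000, 8, 100, 5, 30)
print(ok, round(t, 1))  # True 17.5
```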