Fact checking has captured the attention of the media and the public alike; it has also recently received strong attention from the computer science community, in particular from data and knowledge management, natural language processing and information retrieval; we refer to these collectively as "content management". In this paper, we identify the fact checking tasks which can be performed with the help of content management technologies and survey recent research in this area, before laying out some perspectives for the future. We hope our work will provide interested researchers, journalists and fact checkers with an entry point into the existing literature, and help develop a roadmap for future research and development work.
In large-scale distributed information systems, where participants are autonomous and have particular interests in some queries, query allocation is a challenge. Much work in this context has focused on distributing queries among providers in a way that maximizes overall performance (typically throughput and response time). However, preserving the participants' interests is also important. In this paper, we make the following contributions. First, we provide a model to define the participants' perception of the system regarding their interests and propose measures to evaluate the quality of query allocation methods. Then, we propose a framework for query allocation called Satisfaction-based Query Load Balancing (SQLB, for short), which dynamically trades consumers' interests for providers' interests based on their satisfaction. Finally, we compare SQLB, through experimentation, with two important baseline query allocation methods, namely Capacity-based and Mariposa-like. The results demonstrate that SQLB yields high efficiency while satisfying the participants' interests and significantly outperforms the baseline methods.
Subgroup discovery (SD) is the task of discovering interpretable patterns in the data that stand out w.r.t. some property of interest. Discovering patterns that accurately discriminate a class from the others is one of the most common SD tasks. Standard approaches in the literature are based on local pattern discovery, which is known to produce an overwhelmingly large number of redundant patterns. To solve this issue, pattern set mining has been proposed: instead of evaluating the quality of patterns separately, one should consider the quality of a pattern set as a whole. The goal is to provide a small pattern set that is diverse and discriminates the target class well. In this work, we introduce a novel formulation of the task of diverse subgroup set discovery where both discriminative power and diversity of the subgroup set are incorporated in the same quality measure. We propose an efficient, parameter-free algorithm dubbed FSSD, based on a greedy scheme. FSSD uses several optimization strategies that enable it to provide a high-quality pattern set in a short amount of time.
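To make the idea of scoring a pattern set as a whole more concrete, here is a minimal Python sketch of a greedy selection loop. It is not the FSSD algorithm itself, whose quality measure and optimization strategies are not given in the abstract. As an assumption for illustration, a set of subgroups is scored by the weighted relative accuracy (WRAcc) of the union of their covers, so that a candidate is only added when it contributes new, mostly positive records. The names `wracc`, `greedy_subgroup_set`, and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def wracc(cover: Set[int], positives: Set[int], n: int) -> float:
    """Weighted relative accuracy of a cover over a dataset of n records."""
    if not cover:
        return 0.0
    p_cover = len(cover) / n                          # share of records covered
    p_pos = len(positives) / n                        # share of positives overall
    p_pos_in_cover = len(cover & positives) / len(cover)
    return p_cover * (p_pos_in_cover - p_pos)


def greedy_subgroup_set(
    candidates: Dict[str, Set[int]], positives: Set[int], n: int
) -> List[str]:
    """Greedily add the candidate that most improves the score of the union.

    Assumption: set quality = WRAcc of the union of selected covers.
    Stops as soon as no candidate improves the score, so no size
    parameter is needed.
    """
    selected: List[str] = []
    union: Set[int] = set()
    best = 0.0
    remaining = dict(candidates)
    while remaining:
        chosen, chosen_score = None, best
        for name in sorted(remaining):                # sorted for determinism
            score = wracc(union | remaining[name], positives, n)
            if score > chosen_score + 1e-12:
                chosen, chosen_score = name, score
        if chosen is None:                            # no candidate helps
            break
        selected.append(chosen)
        union |= remaining.pop(chosen)
        best = chosen_score
    return selected


# Toy data: 10 records, records 0..4 belong to the target class.
toy_candidates = {
    "A": {0, 1, 2},
    "B": {0, 1, 2, 5},   # redundant with A, plus one negative record
    "C": {3, 4},
    "D": {5, 6, 7},      # covers negatives only
}
toy_selection = greedy_subgroup_set(toy_candidates, positives={0, 1, 2, 3, 4}, n=10)
```

On the toy data, the redundant subgroup `B` and the non-discriminative subgroup `D` are never selected, because adding either one lowers the score of the union.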
When users need to perform a digital activity, they evaluate available systems according to their functionality, ease of use, QoS, and/or economic aspects. Recently, trust has become another key factor in such evaluation. Two main issues arise in the trust management research community. First, how to define trust in an entity, given that the entity can be a person, a digital resource, or a physical resource. Second, how to evaluate the trust in a system as a whole for a particular activity. Defining and evaluating trust in systems is an open problem because there is no consensus on which approach to use. In this work, we propose an approach applicable to any kind of system. The distinctive feature of our proposal is that, besides taking into account the trust in the different entities the user depends on to perform an activity, it takes into consideration the architecture of the system to determine its trust level. Our goal is to enable users to make a personal comparison between different systems for the same application needs and to choose the one that satisfies their expectations. This paper introduces our approach, which is based on probability theory, and presents ongoing results.
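As a rough illustration of how architecture can influence a probability-based trust value, the Python sketch below combines per-entity trust values in two ways. The abstract does not give the actual model, so everything here is an assumption: trust in each entity is treated as an independent probability, entities that are all required are combined in series (product), and interchangeable entities are combined in parallel (at least one is trustworthy). The function names and the numeric values are hypothetical.

```python
from math import prod
from typing import Iterable


def trust_series(trusts: Iterable[float]) -> float:
    """Trust in a chain where every entity is required (assumes independence)."""
    return prod(trusts)


def trust_parallel(trusts: Iterable[float]) -> float:
    """Trust in a group where any one entity suffices (assumes independence)."""
    return 1.0 - prod(1.0 - t for t in trusts)


# Hypothetical system A: the user depends on a provider and a single server.
system_a = trust_series([0.9, 0.8])

# Hypothetical system B: same provider, but two interchangeable servers.
system_b = trust_series([0.9, trust_parallel([0.8, 0.8])])
```

With these made-up values, system A evaluates to about 0.72 and system B to about 0.864, so the same entities yield different system-level trust depending on how they are arranged.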
In the last decade, stream processing has become a very active research domain, motivated by the growing number of stream-based applications. These applications make use of continuous queries, which are processed by a stream processing engine (SPE) to generate timely results from ephemeral input data. Variations in input data streams, in terms of both volume and distribution of values, have a large impact on computational resource requirements. Dynamic and Automatic Balanced Scaling for Storm (DABS-Storm) is an original solution for dynamically adapting continuous query processing as input stream properties evolve, while keeping the system stable. DABS-Storm handles fluctuations in both data volume and the distribution of values within data streams, adjusting resource usage to best meet processing needs. To achieve this goal, the holistic DABS-Storm approach combines a proactive auto-parallelization algorithm with a latency-aware load balancing strategy.