Prior work on using retrievability measures in the evaluation of information retrieval (IR) systems has laid the foundations for investigating the relationship between retrieval effectiveness and retrieval bias. While various factors influencing bias have been examined, no work has examined the impact of using bigrams within the index on retrieval bias. Intuitively, how documents are represented, and what terms they contain, will influence whether they are retrievable or not. In this paper, we investigate how the bias of a system changes depending on whether documents are represented using unigrams, bigrams, or both. Our analysis of three different retrieval models on three TREC collections shows that a bigram-only representation results in lower bias than a unigram-only representation, but at the expense of retrieval effectiveness. However, combining both representations reduces the overall bias and also increases effectiveness. These findings suggest that, when configuring and indexing a collection, the bag-of-words approach (unigrams) should be augmented with bigrams to create better and fairer retrieval systems.
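The three document representations compared in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenisation, lowercasing, and an underscore-joined bigram term; the function names and these choices are hypothetical and are not taken from the paper, whose actual indexing pipeline is not described here.

```python
from typing import List


def unigrams(text: str) -> List[str]:
    """Bag-of-words terms: lowercase, whitespace-separated tokens (an assumption)."""
    return text.lower().split()


def bigrams(text: str) -> List[str]:
    """Adjacent token pairs, joined with '_' so each pair is a single index term."""
    tokens = unigrams(text)
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def combined(text: str) -> List[str]:
    """Unigram terms followed by bigram terms for the same document."""
    return unigrams(text) + bigrams(text)


doc = "fair retrieval systems"
print(unigrams(doc))   # unigram-only representation
print(bigrams(doc))    # bigram-only representation
print(combined(doc))   # both representations together
```

Under these assumptions, a document with n tokens yields n unigram terms and n - 1 bigram terms, so the combined representation roughly doubles the number of terms through which a document could be matched.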
The First Early Career Researchers Roundtable for Information Access Research Workshop, held in conjunction with the Seventh ACM Conference on Human Information Interaction and Retrieval (CHIIR) 2022, looked into the future of research, collaborations, and self-development to ask the following questions. Where are the opportunities for researchers in a (post-)pandemic environment, especially for Early Career Researchers (ECRs)? What do we need to do to get there? Which practical implementations can the broader CHIIR community support? The workshop started with an invited talk. Instead of conventional paper presentations, the attendees discussed the lessons learned from working in a pandemic. This report, co-authored by the workshop's organisers and its participants, summarises the discussion. It aims to provide the broader CHIIR community with feedback on the workshop and to foster ideas raised by ECRs to support ECRs. Two primary outcomes are that (i) ECRs are often enthusiastic about taking on roles within a community, but their efforts need formal validation and recognition, and (ii) the role of a conference needs to be re-evaluated to optimise the benefits of attending the event. Date: 14 March 2022. Website: https://sites.google.com/view/ecrs4ir/home.