Hamed Zamani scite author profile

Neural Ranking Models with Weak Supervision

Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data). To this aim, we use the output of an unsupervised ranking model, such as BM25, as a weak supervision signal. We further train a set of simple yet effective ranking models based on feed-forward neural networks. We study their effectiveness under various learning scenarios (point-wise and pair-wise models) and using different input representations (i.e., from encoding query-document pairs into dense/sparse vectors to using word embedding representation). We train our networks using tens of millions of training instances and evaluate them on two standard collections: a homogeneous news collection (Robust) and a heterogeneous large-scale web collection (ClueWeb). Our experiments indicate that employing proper objective functions and letting the networks learn the input representation based on weakly supervised data leads to impressive performance, with over 13% and 35% MAP improvements over the BM25 model on the Robust and the ClueWeb collections. Our findings also suggest that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
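
The weak-supervision idea in the abstract above can be illustrated with a minimal sketch: score documents with BM25, then turn score differences into pairwise training labels. Everything here is an assumption made for illustration and not the authors' implementation: the function names (`bm25_scores`, `weak_pairs`, `hinge_loss`), the BM25 variant and its `k1`/`b` defaults, and the choice of a generic pairwise hinge loss, since the abstract does not name the objective function.

```python
import math
from collections import Counter
from itertools import combinations


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with a basic BM25 variant.

    Hypothetical helper: parameter defaults are common choices, not taken from the paper.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in set(query):
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores


def weak_pairs(query, docs):
    """Build (i, j, label) triples; label is +1 if BM25 prefers doc i, else -1."""
    scores = bm25_scores(query, docs)
    pairs = []
    for i, j in combinations(range(len(docs)), 2):
        if scores[i] == scores[j]:
            continue  # ties carry no preference signal
        pairs.append((i, j, 1 if scores[i] > scores[j] else -1))
    return pairs


def hinge_loss(score_i, score_j, label, margin=1.0):
    """Generic pairwise hinge loss on model scores (assumed objective)."""
    return max(0.0, margin - label * (score_i - score_j))


docs = [
    ["neural", "ranking", "model"],
    ["cooking", "recipes"],
    ["ranking", "ranking", "web"],
]
pairs = weak_pairs(["ranking"], docs)
```

In a setup like this, a neural ranker would be trained to reproduce the preferences in `pairs` by minimizing `hinge_loss` over its own scores; no human relevance judgments are involved at any step.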

Asking Clarifying Questions in Open-Domain Information-Seeking Conversations

Aliannejadi

et al. 2019

Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s). In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets. Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.

Current challenges and visions in music recommender systems research

Schedl

Zamani

Chen

et al. 2018

Int J Multimed Info Retr

222

153

Music recommender systems (MRS) have experienced a boom in recent years, thanks to the emergence and success of online streaming services, which nowadays make available almost all music in the world at the user's fingertip. While today's MRS considerably help users to find interesting music in these huge catalogs, MRS research is still facing substantial challenges. In particular, when it comes to building, incorporating, and evaluating recommendation strategies that integrate information beyond simple user-item interactions or content-based descriptors, but dig deep into the very essence of listener needs, preferences, and intentions, MRS research becomes a big endeavor and related publications quite sparse. The purpose of this trends and survey article is twofold. We first identify and shed light on what we believe are the most pressing challenges MRS research is facing, from both academic and industry perspectives. We review the state of the art towards solving these challenges and discuss its limitations. Second, we detail possible future directions and visions we contemplate for the further evolution of the field. The article should therefore serve two purposes: giving the interested reader an overview of current challenges in MRS research and providing guidance for young researchers by identifying interesting, yet under-researched, directions in the field.

Generating Clarifying Questions for Information Retrieval

et al. 2020

A Deep Look into neural ranking models for information retrieval

Guo

Fan

Pang

et al. 2020

Information Processing & Management

210

141

Ranking models lie at the heart of research on information retrieval (IR). During the past decades, different techniques have been proposed for constructing ranking models, from traditional heuristic methods, probabilistic methods, to modern machine learning methods. Recently, with the advance of deep learning technology, we have witnessed a growing body of work in applying shallow or deep neural networks to the ranking problem in IR, referred to as neural ranking models in this paper. The power of neural ranking models lies in the ability to learn from the raw text inputs for the ranking problem to avoid many limitations of hand-crafted features. Neural networks have sufficient capacity to model complicated tasks, which is needed to handle the complexity of relevance estimation in ranking. Since there have been a large variety of neural ranking models proposed, we believe it is the right time to summarize the current status, learn from existing methodologies, and gain some insights for future development. In contrast to existing reviews, in this survey, we will take a deep look into the neural ranking models from different dimensions to analyze their underlying assumptions, major design principles, and learning strategies. We compare these models through benchmark tasks to obtain a comprehensive empirical understanding of the existing techniques. We will also discuss what is missing in the current literature and what are the promising and desired future directions.

Embedding-based Query Language Models

Zamani

Croft

2016

Asymptotically Efficient Target Localization From Bistatic Range Measurements in Distributed MIMO Radars

Amiri

Behnia

Zamani

2017

IEEE Signal Process. Lett.

Relevance-based Word Embedding

Zamani

Croft

2017

121

Learning a high-dimensional dense representation for vocabulary terms, also known as a word embedding, has recently attracted much attention in natural language processing and information retrieval tasks. The embedding vectors are typically learned based on term proximity in a large corpus. This means that the objective in well-known word embedding algorithms, e.g., word2vec, is to accurately predict adjacent word(s) for a given word or context. However, this objective is not necessarily equivalent to the goal of many information retrieval (IR) tasks. The primary objective in various IR tasks is to capture relevance instead of term proximity, syntactic, or even semantic similarity. This is the motivation for developing unsupervised relevance-based word embedding models that learn word representations based on query-document relevance information. In this paper, we propose two learning models with different objective functions; one learns a relevance distribution over the vocabulary set for each query, and the other classifies each term as belonging to the relevant or non-relevant class for each query. To train our models, we used over six million unique queries and the top ranked documents retrieved in response to each query, which are assumed to be relevant to the query. We extrinsically evaluate our learned word representation models using two IR tasks: query expansion and query classification. Both query expansion experiments on four TREC collections and query classification experiments on the KDD Cup 2005 dataset suggest that the relevance-based word embedding models significantly outperform state-of-the-art proximity-based embedding models, such as word2vec and GloVe.

Keywords: word representation, neural network, embedding vector, query expansion, query classification
