Identifying and labeling search tasks via query-based Hawkes processes

Li, Liangda; Deng, Hongbo; Dong, Anlei; Chang, Yi; Zha, Hongyuan

doi:10.1145/2623330.2623679

Cited by 51 publications

(33 citation statements)

References 34 publications

“…Researchers usually estimate all $M^2$ influence parameters of a Hawkes process (e.g., [38,51]). However, in our setting, $M > 10^6$, so there are $O(10^{12})$ influence parameters.…”

Section: Language Change As a Self-exciting Point Process (mentioning)

confidence: 99%

The Social Dynamics of Language Change in Online Networks

Goel

Soni

Goyal

et al. 2016

Lecture Notes in Computer Science

Language change is a complex social phenomenon, revealing pathways of communication and sociocultural influence. But, while language change has long been a topic of study in sociolinguistics, traditional linguistic research methods rely on circumstantial evidence, estimating the direction of change from differences between older and younger speakers. In this paper, we use a data set of several million Twitter users to track language changes in progress. First, we show that language change can be viewed as a form of social influence: we observe complex contagion for phonetic spellings and "netspeak" abbreviations (e.g., lol), but not for older dialect markers from spoken language. Next, we test whether specific types of social network connections are more influential than others, using a parametric Hawkes process model. We find that tie strength plays an important role: densely embedded social ties are significantly better conduits of linguistic influence. Geographic locality appears to play a more limited role: we find relatively little evidence to support the hypothesis that individuals are more influenced by geographically local social ties, even in their usage of geographical dialect markers.
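As a rough illustration of why the citation statement above is concerned about the number of influence parameters, the sketch below evaluates the conditional intensity of an M-dimensional Hawkes process. The exponential decay kernel, the function names, and the toy numbers are assumptions made for illustration; they are not taken from either paper.

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of an M-dimensional Hawkes process at time t.

    events: list of (time, dimension) pairs observed so far
    mu:     list of M baseline rates
    alpha:  M x M nested list; alpha[i][j] is the influence of an event
            in dimension j on the rate of dimension i
    beta:   decay rate of the exponential kernel (assumed, not from the source)
    """
    lam = list(mu)
    for s, j in events:
        if s < t:  # only past events excite the process
            decay = math.exp(-beta * (t - s))
            for i in range(len(mu)):
                lam[i] += alpha[i][j] * decay
    return lam


def num_influence_parameters(M):
    """A full influence matrix has one entry per ordered pair of dimensions."""
    return M * M


# Toy example with M = 2 dimensions and one past event in dimension 0.
mu = [0.1, 0.2]
alpha = [[0.5, 0.0],
         [0.3, 0.1]]
rates = hawkes_intensity(1.0, [(0.0, 0)], mu, alpha, beta=1.0)
```

With M = 10^6 dimensions, `num_influence_parameters` returns 10^12, which is the scale the citing authors describe as impractical to estimate in full.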

“…• QC-HTC/QC-WCC [20]: This series of methods viewed search task identification as the problem of best approximating the manually annotated tasks, and proposed both clustering and heuristic algorithms to solve the problem. • LDA-Hawkes [17]: a probabilistic method for identifying and labeling search tasks that model query temporal patterns using a special class of point process called Hawkes processes, and combine topic model with Hawkes processes for simultaneously identifying and labeling search tasks. • LDA Time-Window(TW): This model assumes queries belong to the same search task only if they lie in a fixed or flexible time window, and uses LDA to cluster queries into topics based on the query co-occurrences within the same time window.…”

Section: Search Task Identification (mentioning)

confidence: 99%

Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach

Mehrotra

Yılmaz

2017

Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval

A significant amount of search queries originate from some real world information need or tasks [13]. In order to improve the search experience of the end users, it is important to have accurate representations of tasks. As a result, significant amount of research has been devoted to extracting proper representations of tasks in order to enable search systems to help users complete their tasks, as well as providing the end user with better query suggestions [9], for better recommendations [41], for satisfaction prediction [36] and for improved personalization in terms of tasks [24,38]. Most existing task extraction methodologies focus on representing tasks as flat structures. However, tasks often tend to have multiple subtasks associated with them and a more naturalistic representation of tasks would be in terms of a hierarchy, where each task can be composed of multiple (sub)tasks. To this end, we propose an efficient Bayesian nonparametric model for extracting hierarchies of such tasks & subtasks. We evaluate our method based on real world query log data both through quantitative and crowdsourced experiments and highlight the importance of considering task/subtask hierarchies. Keywords: search tasks; bayesian non-parametrics; hierarchical model.
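The fixed time-window assumption mentioned in the citation statement above (queries share a task only if they fall in the same window) can be sketched as follows. This covers only the windowing step, not the LDA clustering; the rule that a window opens at the first query of each group, and the name `group_by_time_window`, are assumptions for illustration rather than the baseline's actual definition.

```python
def group_by_time_window(timestamps, window):
    """Group query timestamps into candidate tasks using a fixed window.

    A query joins the current group if it occurs no later than `window`
    time units after the group's first query; otherwise it starts a new
    group. (Assumed windowing rule, not taken from the cited papers.)
    """
    groups = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][0] <= window:
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups


# Five queries (timestamps in seconds) and a 30-second window.
tasks = group_by_time_window([0, 10, 25, 100, 105], window=30)
```

Here the first three queries fall into one group and the last two into another. A flexible window would replace the constant `window` with a value that depends on the user or query.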

“…Li et al [16] also consider in-session tasks. They use query words, query co-occurrence, and the temporal sequence of queries as their main signals.…”

Section: In-session Tasks (mentioning)

confidence: 99%

“…In previous work [23,16], researchers often had human raters completely annotate search histories for a small number of users, and used that as training data. There are two reasons why this was not an option for us.…”

Section: Can We Annotate the Complete User History? (mentioning)

confidence: 99%

User Modeling for a Personal Assistant

Guha

Gupta

Raghunathan

et al. 2015

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining

We present a user modeling system that serves as the foundation of a personal assistant. The system ingests web search history for signed-in users, and identifies coherent contexts that correspond to tasks, interests, and habits. Unlike past work which focused on either in-session tasks or tasks over a few days, we look at several months of history in order to identify not just short-term tasks, but also long-term interests and habits. The features we use for identifying coherent contexts yield substantially higher precision and recall than past work. We also present an algorithm for identifying contexts that is 8 to 30 times faster than previous algorithms. The user modeling system has been deployed in production. It runs over hundreds of millions of users, and updates the models with a 10-minute latency. The contexts identified by the system serve as the foundation for generating recommendations in Google Now.
