2019
DOI: 10.1162/tacl_a_00287

Weakly Supervised Domain Detection

Abstract: In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments which are domain-heavy, i.e., sentences or phrases which are representative of and provide evidence for a given domain, could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning (MIL). The model is hierarchically organized an…
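As a quick illustration of the encoder-detector pattern the abstract describes, here is a minimal numpy sketch (random vectors stand in for a learned sentence encoder, and mean pooling for the detector; the paper's actual model is a hierarchical neural network bootstrapped with MIL):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_sentences(sentences, dim=8):
    """Stand-in encoder: one random vector per sentence (a real model
    would use a learned hierarchical encoder)."""
    return rng.normal(size=(len(sentences), dim))

def detect(sentence_vecs, w):
    """Score each sentence (instance) for a domain, then aggregate the
    scores into a document (bag) score, so training needs only
    document-level domain labels."""
    scores = 1.0 / (1.0 + np.exp(-(sentence_vecs @ w)))  # per-sentence
    return scores, scores.mean()                          # per-document

sentences = ["The patient was prescribed antibiotics.",
             "The weather was pleasant."]
vecs = encode_sentences(sentences)
w = rng.normal(size=vecs.shape[1])   # hypothetical domain weight vector
instance_scores, bag_score = detect(vecs, w)
print(instance_scores, bag_score)
```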

Cited by 5 publications (6 citation statements). References 31 publications (42 reference statements).
“…MIL is a machine learning framework where labels are associated with groups of instances (i.e., bags), while instance labels are unobserved (Keeler and Rumelhart, 1991). The goal is then to infer labels for bags (Dietterich et al., 1997; Maron and Ratan, 1998) or jointly for instances and bags (Zhou et al., 2009; Wei et al., 2014; Kotzias et al., 2015; Xu and Lapata, 2019; Angelidis and Lapata, 2018a). Our MIL model is an example of the latter variant.…”
Section: Controller Induction Model
confidence: 99%
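The bag/instance setup in this excerpt is easy to picture in code. A minimal sketch, with toy probabilities and no learning, of the classic MIL assumption of Dietterich et al. (1997): a bag is positive iff at least one of its (hidden-label) instances is.

```python
import numpy as np

# Toy bags of instance probabilities; instance labels are never observed,
# only bag labels are available at training time.
bags = {
    "doc1": np.array([0.9, 0.1, 0.2]),
    "doc2": np.array([0.05, 0.10]),
}

def infer_bag_label(instance_probs, threshold=0.5):
    """Bag label inferred from instance predictions: positive iff any
    instance crosses the threshold. The 'joint' variant mentioned in
    the excerpt also trains on the instance scores themselves."""
    return int(bool((instance_probs > threshold).any()))

for name, probs in bags.items():
    print(name, infer_bag_label(probs))
```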
“…We use max pooling since we want to isolate the most pertinent aspects for a given sentence; standard pooling methods such as mean and attention pooling (Angelidis and Lapata, 2018a; Xu and Lapata, 2019) assume that all instances of a bag contribute to its label. In Figure 1 (right) we illustrate our pooling mechanism and empirically show in experiments (see Section 5.1) that it is superior to alternatives.…”
Section: Multiple Instance Pooling
confidence: 99%
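A tiny numpy comparison of the pooling choices contrasted in the excerpt (the score matrix is hypothetical): mean pooling lets every instance contribute to the bag label, while max pooling keeps only the most pertinent instance per aspect.

```python
import numpy as np

# Hypothetical instance-level aspect scores for one bag:
# rows = instances, columns = aspects.
scores = np.array([[0.1, 0.8, 0.0],
                   [0.2, 0.1, 0.9],
                   [0.0, 0.2, 0.1]])

mean_pooled = scores.mean(axis=0)  # every instance contributes
max_pooled = scores.max(axis=0)    # only the strongest instance survives
print(mean_pooled)  # [0.1  0.3666...  0.3333...]
print(max_pooled)   # [0.2  0.8  0.9]
```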
“…Under this framework, all sentences within a document cluster, together with their query relevance, are jointly considered in estimating centrality. A variety of approaches have been proposed to enhance the way relevance and centrality are estimated, ranging from incorporating topic-sensitive information (Wan, 2008; Badrinath et al., 2011; Xu and Lapata, 2019), predictions about information certainty (Wan and Zhang, 2014), and manifold-ranking algorithms (Wan et al., 2007; Wan and Xiao, 2009; Wan, 2009), to Wikipedia-based query expansion (Nastase, 2008). More recently, the salience of text units has been estimated within a sparse-coding framework by additionally taking into account reader comments (associated with news reports).…”
Section: Related Work
confidence: 99%
“…where φ ∈ (0, 1) controls the extent to which query-specific information influences sentence selection for the summarization task; and q̄ is a distributional evidence vector which we obtain after normalizing the evidence scores q ∈ ℝ^{1×|V|} from the previous module (q̄ = q / ∑_{v=1}^{|V|} q_v). Summary Generation: In order to decide which sentences to include in the summary, a node's centrality is measured using a graph-based ranking algorithm (Erkan and Radev, 2004; Xu and Lapata, 2019) … of a sentence. In the proposed algorithm, e* jointly expresses the importance of a sentence in the document and its semantic relation to the query, as modulated by the evidence estimator and controlled by φ.…”
Section: Centrality Estimator
confidence: 99%
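The excerpt describes a query-biased graph-ranking step in the style of LexRank (Erkan and Radev, 2004). Below is a minimal power-iteration sketch of that idea; the exact update rule of the cited paper is not shown in the excerpt, so the interpolation with φ and all inputs here are assumptions.

```python
import numpy as np

def query_biased_centrality(sim, relevance, phi=0.5, iters=50):
    """Power iteration over a sentence-similarity graph, biased by
    query relevance (e.g., the normalized evidence vector q̄ restricted
    to the candidate sentences). Larger phi means more query influence,
    matching the role φ plays in the excerpt."""
    P = sim / sim.sum(axis=1, keepdims=True)  # row-stochastic transitions
    r = relevance / relevance.sum()           # normalized query bias
    e = np.full(sim.shape[0], 1.0 / sim.shape[0])
    for _ in range(iters):
        e = phi * r + (1 - phi) * (P.T @ e)   # LexRank-style update
    return e

sim = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.5],
                [0.1, 0.5, 1.0]])
relevance = np.array([0.7, 0.2, 0.1])
print(query_biased_centrality(sim, relevance))
```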
“…We assume that each synopsis and review is a bag of instances (i.e., sentences in our task), where labels are assigned at the bag level. In such cases, a prediction is made for the bag by either learning to aggregate the instance-level predictions (Keeler and Rumelhart, 1992; Dietterich et al., 1997; Maron and Ratan, 1998) or jointly learning the labels for instances and the bag (Zhou et al., 2009; Wei et al., 2014; Kotzias et al., 2015; Angelidis and Lapata, 2018; Xu and Lapata, 2019). In our setting, we choose the latter; i.e., we aggregate P(Y_P) for each sentence with the combined representation of X_PS and X_R to compute P(Y_P | X).…”
Section: Learning the Predefined Tagset
confidence: 99%
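A rough numpy sketch of the aggregation step this excerpt describes, i.e., combining per-sentence tag distributions P(Y_P) with a representation built from X_PS and X_R to obtain P(Y_P | X). The attention-style weighting and all tensors here are illustrative, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

sent_probs = rng.dirichlet(np.ones(4), size=3)  # P(Y_P): 3 sentences, 4 tags
x_ps = rng.normal(size=(3, 8))                  # synopsis sentence vectors (X_PS)
x_r = rng.normal(size=8)                        # pooled review vector (X_R)

# Attention-style weights from the combined representation.
logits = x_ps @ x_r
weights = np.exp(logits - logits.max())
weights /= weights.sum()

bag_probs = weights @ sent_probs                # P(Y_P | X), sums to 1
print(bag_probs, bag_probs.sum())
```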