2019
DOI: 10.1162/tacl_a_00287

Weakly Supervised Domain Detection

Abstract: In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments which are domain-heavy, i.e., sentences or phrases which are representative of and provide evidence for a given domain, could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning (MIL). The model is hierarchically organized an…
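As a quick illustration of the encoder-detector pattern the abstract describes, here is a minimal numpy sketch (random vectors stand in for a learned sentence encoder, and mean pooling for the detector; the paper's actual model is a hierarchical neural network bootstrapped with MIL):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_sentences(sentences, dim=8):
    """Stand-in encoder: one random vector per sentence (a real model
    would use a learned hierarchical encoder)."""
    return rng.normal(size=(len(sentences), dim))

def detect(sentence_vecs, w):
    """Score each sentence (instance) for a domain, then aggregate the
    scores into a document (bag) score, so training needs only
    document-level domain labels."""
    scores = 1.0 / (1.0 + np.exp(-(sentence_vecs @ w)))  # per-sentence
    return scores, scores.mean()                          # per-document

sentences = ["The patient was prescribed antibiotics.",
             "The weather was pleasant."]
vecs = encode_sentences(sentences)
w = rng.normal(size=vecs.shape[1])   # hypothetical domain weight vector
instance_scores, bag_score = detect(vecs, w)
print(instance_scores, bag_score)
```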

Cited by 5 publications (6 citation statements). References 31 publications (42 reference statements).
“…MIL is a machine learning framework where labels are associated with groups of instances (i.e., bags), while instance labels are unobserved (Keeler and Rumelhart, 1991). The goal is then to infer labels for bags (Dietterich et al., 1997; Maron and Ratan, 1998) or jointly for instances and bags (Zhou et al., 2009; Wei et al., 2014; Kotzias et al., 2015; Xu and Lapata, 2019; Angelidis and Lapata, 2018a). Our MIL model is an example of the latter variant.…”
Section: Controller Induction Model
confidence: 99%
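The bag/instance setup in this excerpt is easy to picture in code. A minimal sketch, with toy probabilities and no learning, of the classic MIL assumption of Dietterich et al. (1997): a bag is positive iff at least one of its (hidden-label) instances is.

```python
import numpy as np

# Toy bags of instance probabilities; instance labels are never observed,
# only bag labels are available at training time.
bags = {
    "doc1": np.array([0.9, 0.1, 0.2]),
    "doc2": np.array([0.05, 0.10]),
}

def infer_bag_label(instance_probs, threshold=0.5):
    """Bag label inferred from instance predictions: positive iff any
    instance crosses the threshold. The 'joint' variant mentioned in
    the excerpt also trains on the instance scores themselves."""
    return int(bool((instance_probs > threshold).any()))

for name, probs in bags.items():
    print(name, infer_bag_label(probs))
```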
“…We use max pooling since we want to isolate the most pertinent aspects for a given sentence; standard pooling methods such as mean and attention pooling (Angelidis and Lapata, 2018a; Xu and Lapata, 2019) assume that all instances of a bag contribute to its label. In Figure 1 (right) we illustrate our pooling mechanism and empirically show in experiments (see Section 5.1) that it is superior to alternatives.…”
Section: Multiple Instance Pooling
confidence: 99%
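A tiny numpy comparison of the pooling choices contrasted in the excerpt (the score matrix is hypothetical): mean pooling lets every instance contribute to the bag label, while max pooling keeps only the most pertinent instance per aspect.

```python
import numpy as np

# Hypothetical instance-level aspect scores for one bag:
# rows = instances, columns = aspects.
scores = np.array([[0.1, 0.8, 0.0],
                   [0.2, 0.1, 0.9],
                   [0.0, 0.2, 0.1]])

mean_pooled = scores.mean(axis=0)  # every instance contributes
max_pooled = scores.max(axis=0)    # only the strongest instance survives
print(mean_pooled)  # [0.1  0.3666...  0.3333...]
print(max_pooled)   # [0.2  0.8  0.9]
```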
“…Under this framework, all sentences within a document cluster, together with their query relevance, are jointly considered in estimating centrality. A variety of approaches have been proposed to enhance the way relevance and centrality are estimated, ranging from incorporating topic-sensitive information (Wan, 2008; Badrinath et al., 2011; Xu and Lapata, 2019), predictions about information certainty (Wan and Zhang, 2014), and manifold-ranking algorithms (Wan et al., 2007; Wan and Xiao, 2009; Wan, 2009), to Wikipedia-based query expansion (Nastase, 2008). More recently, the salience of text units has been estimated within a sparse-coding framework by additionally taking into account reader comments (associated with news reports).…”
Section: Related Work
confidence: 99%
“…where φ ∈ (0, 1) controls the extent to which query-specific information influences sentence selection for the summarization task; and q̄ is a distributional evidence vector which we obtain after normalizing the evidence scores q ∈ ℝ^{1×|V|} from the previous module (q̄ = q / ∑_{v=1}^{|V|} q_v). Summary Generation: In order to decide which sentences to include in the summary, a node's centrality is measured using a graph-based ranking algorithm (Erkan and Radev, 2004; Xu and Lapata, 2019) … of a sentence. In the proposed algorithm, e* jointly expresses the importance of a sentence in the document and its semantic relation to the query, as modulated by the evidence estimator and controlled by φ.…”
Section: Centrality Estimator
confidence: 99%
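The excerpt describes a query-biased graph-ranking step in the style of LexRank (Erkan and Radev, 2004). Below is a minimal power-iteration sketch of that idea; the exact update rule of the cited paper is not shown in the excerpt, so the interpolation with φ and all inputs here are assumptions.

```python
import numpy as np

def query_biased_centrality(sim, relevance, phi=0.5, iters=50):
    """Power iteration over a sentence-similarity graph, biased by
    query relevance (e.g., the normalized evidence vector q̄ restricted
    to the candidate sentences). Larger phi means more query influence,
    matching the role φ plays in the excerpt."""
    P = sim / sim.sum(axis=1, keepdims=True)  # row-stochastic transitions
    r = relevance / relevance.sum()           # normalized query bias
    e = np.full(sim.shape[0], 1.0 / sim.shape[0])
    for _ in range(iters):
        e = phi * r + (1 - phi) * (P.T @ e)   # LexRank-style update
    return e

sim = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.5],
                [0.1, 0.5, 1.0]])
relevance = np.array([0.7, 0.2, 0.1])
print(query_biased_centrality(sim, relevance))
```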
“…We assume that each synopsis and review is a bag of instances (i.e., sentences in our task), where labels are assigned at the bag level. In such cases, a prediction is made for the bag by either learning to aggregate the instance-level predictions (Keeler and Rumelhart, 1992; Dietterich et al., 1997; Maron and Ratan, 1998) or jointly learning the labels for instances and the bag (Zhou et al., 2009; Wei et al., 2014; Kotzias et al., 2015; Angelidis and Lapata, 2018; Xu and Lapata, 2019). In our setting, we choose the latter; i.e., we aggregate P(Y_P) for each sentence with the combined representation of X_PS and X_R to compute P(Y_P | X).…”
Section: Learning the Predefined Tagset
confidence: 99%
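A rough numpy sketch of the aggregation step this excerpt describes, i.e., combining per-sentence tag distributions P(Y_P) with a representation built from X_PS and X_R to obtain P(Y_P | X). The attention-style weighting and all tensors here are illustrative, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

sent_probs = rng.dirichlet(np.ones(4), size=3)  # P(Y_P): 3 sentences, 4 tags
x_ps = rng.normal(size=(3, 8))                  # synopsis sentence vectors (X_PS)
x_r = rng.normal(size=8)                        # pooled review vector (X_R)

# Attention-style weights from the combined representation.
logits = x_ps @ x_r
weights = np.exp(logits - logits.max())
weights /= weights.sum()

bag_probs = weights @ sent_probs                # P(Y_P | X), sums to 1
print(bag_probs, bag_probs.sum())
```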