Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1229
Visual Supervision in Bootstrapped Information Extraction

Abstract: We challenge a common assumption in active learning, that a list-based interface populated by informative samples provides for efficient and effective data annotation. We show how a 2D scatterplot populated with diverse and representative samples can yield improved models given the same time budget. We consider this for bootstrapping-based information extraction, in particular named entity classification, where human and machine jointly label data. To enable effective data annotation in a scatterplot, we have …

Cited by 9 publications (7 citation statements)
References 25 publications (25 reference statements)
“…In Lison et al. (2020), the weak training data is created by broadly collecting available labeling rules from multiple sources, which demonstrates the importance of being able to automatically find new heuristics missed by human effort. To find new heuristic rules on the basis of a relatively limited number of manually designed rules, previous studies have tried bootstrapping relying on co-occurrence, context, and pattern features (Thelen and Riloff, 2002; Riloff et al., 2003; Yangarber, 2003; Shen et al., 2017; Tao et al., 2015; Berger et al., 2018; Yan et al., 2019).…”
Section: Related Work
confidence: 99%
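The bootstrapping loop shared by the pattern-based systems cited above can be sketched as follows. This is a minimal illustration only: the toy corpus, seed set, and the crude "drop the last token" pattern heuristic are invented for the example and do not come from any of the cited papers.

```python
# Sketch of co-occurrence/pattern-based bootstrapping for entity set
# expansion. Corpus, seeds, and pattern heuristic are toy assumptions.

corpus = [
    ("Paris",  "is the capital of France"),
    ("Berlin", "is the capital of Germany"),
    ("Madrid", "is the capital of Spain"),
    ("Oak",    "is a species of tree"),
]

def generalize(context):
    # Crude pattern: drop the final (entity-specific) token.
    return " ".join(context.split()[:-1])

def bootstrap(seeds, corpus, rounds=2):
    entities = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # 1. Induce patterns from contexts of currently known entities.
        patterns |= {generalize(c) for e, c in corpus if e in entities}
        # 2. Harvest new entities whose contexts match an induced pattern.
        entities |= {e for e, c in corpus if generalize(c) in patterns}
    return entities

print(sorted(bootstrap({"Paris"}, corpus)))  # ['Berlin', 'Madrid', 'Paris']
```

The alternation between pattern induction and entity harvesting is exactly what lets a handful of manually designed seeds grow into a larger rule set, and also what makes the loop vulnerable to drift once a bad pattern slips in.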
“…However, those heuristic constraints are usually inflexible because they require expert effort. In contrast, recent studies focus on learning distance metrics to determine boundaries using weak supervision (Berger et al., 2018; Zupon et al., 2019; Yan et al., 2020a). For example, Yan et al. (2020a) propose an end-to-end bootstrapping network learned by multi-view learning, and extend it with self-supervised and supervised pre-training (Yan et al., 2020b).…”
Section: Related Work
confidence: 99%
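The "distance metric to determine boundaries" idea can be sketched as follows: a candidate joins the entity set when its embedding lies within a threshold of the seed centroid. The 2-D vectors and the threshold value are toy assumptions for illustration; real systems learn the metric rather than fixing it.

```python
# Sketch of a distance-metric expansion boundary: accept a candidate
# when its vector is close to the centroid of the seed entities.
# The vectors and threshold below are invented for illustration.

import math

vectors = {
    "Paris":  (1.0, 0.9),
    "Berlin": (0.9, 1.0),
    "Oak":    (-1.0, 0.2),
}

def centroid(names):
    pts = [vectors[n] for n in names]
    return tuple(sum(coord) / len(pts) for coord in zip(*pts))

def expand(seeds, threshold=0.5):
    c = centroid(seeds)
    # Keep every candidate within `threshold` of the seed centroid.
    return {n for n, v in vectors.items() if math.dist(v, c) <= threshold}

print(sorted(expand({"Paris"})))  # ['Berlin', 'Paris']
```

Replacing a hand-set threshold with a learned metric is precisely what distinguishes these approaches from the fixed heuristic constraints criticized in the quoted passage.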
“…Unfortunately, these heuristic metrics depend heavily on the selected seeds, making the boundary biased and unreliable (Curran et al., 2007; McIntosh and Curran, 2009). Although some studies extend them with extra constraints (Carlson et al., 2010) or manual participation (Berger et al., 2018), the requirement of expert knowledge makes them ad hoc and inflexible. Some studies try to learn the distance metrics (Zupon et al., 2019; Yan et al., 2020a), but they still suffer from weak supervision.…”
Section: Introduction
confidence: 99%
“…The pipelined methods (Riloff and Jones, 1999; Collins and Singer, 1999) mainly leverage direct co-occurrence information, which easily leads to the semantic-drift problem (Curran et al., 2007). To resolve this problem, many pipelined methods have been proposed, e.g., mutual-exclusion bootstrapping (Curran et al., 2007; McIntosh and Curran, 2008, 2009; Gupta et al., 2018), bootstrapping using negative seeds (Yangarber et al., 2002; Shi et al., 2014), lexical and statistical features (Liao and Grishman, 2010; Gupta and Manning, 2014), word embeddings (Batista et al., 2015; Gupta and Manning, 2015; Zupon et al., 2019), active learning (Berger et al., 2018), and lookahead search (Yan et al., 2019). Recently, Yan et al. (2020) propose an end-to-end bootstrapping model and show its advantages in information leveraging and flexibility.…”
Section: Related Work
confidence: 99%
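The mutual-exclusion idea mentioned above for curbing semantic drift can be sketched as follows: several categories bootstrap in parallel, and a candidate claimed by more than one category is rejected. The candidate scores and category names are invented for illustration and are not taken from any cited system.

```python
# Sketch of mutual-exclusion bootstrapping: parallel categories compete
# for candidates, and ambiguous candidates are discarded to limit drift.
# Candidates and co-occurrence scores are toy assumptions.

candidates = {
    # candidate -> {category: co-occurrence score with that category's seeds}
    "Paris":  {"CITY": 5, "PERSON": 0},
    "Jordan": {"CITY": 2, "PERSON": 3},  # ambiguous: place and given name
    "Alice":  {"CITY": 0, "PERSON": 4},
}

def assign(candidates):
    accepted = {"CITY": set(), "PERSON": set()}
    for cand, scores in candidates.items():
        claimed = [cat for cat, s in scores.items() if s > 0]
        if len(claimed) == 1:  # mutual exclusion: exactly one claimant
            accepted[claimed[0]].add(cand)
        # Candidates like "Jordan", claimed by both categories, are
        # dropped rather than risking semantic drift in either set.
    return accepted

print(assign(candidates))
```

Discarding contested candidates trades some recall for precision, which is why later work cited here moves toward learned boundaries instead of hard exclusion rules.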