Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.575
Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition

Abstract: Interpretable rationales for model predictions play a critical role in practical applications. In this study, we develop models with an interpretable inference process for structured prediction. Specifically, we present a method of instance-based learning that learns similarities between spans. At inference time, each span is assigned a class label based on its similar spans in the training set, making it easy to understand how much each training instance contributes to the predictions. Through empirical …
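The inference process the abstract describes can be illustrated with a minimal sketch: given a vector representation for a query span, retrieve its most similar spans in the training set and vote on their labels, so the retrieved neighbors double as the rationale. The function name, the cosine-similarity choice, and the majority vote are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def span_label_knn(query_span: np.ndarray,
                   train_spans: np.ndarray,
                   train_labels: list,
                   k: int = 3):
    """Label a query span by majority vote over its k most similar
    training spans (cosine similarity). Hypothetical sketch, not the
    authors' implementation."""
    # Normalize so that dot products equal cosine similarities.
    q = query_span / np.linalg.norm(query_span)
    t = train_spans / np.linalg.norm(train_spans, axis=1, keepdims=True)
    sims = t @ q
    # The indices of the nearest training spans serve as the rationale:
    # they show which training instances drove the prediction.
    nearest = np.argsort(-sims)[:k]
    votes = [train_labels[i] for i in nearest]
    label = max(set(votes), key=votes.count)
    return label, nearest.tolist()
```

Because the prediction is a vote over concrete training instances, inspecting `nearest` answers "which examples made the model say PER here?" directly, which is the interpretability property the abstract emphasizes.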

Cited by 24 publications (23 citation statements)
References 35 publications
“…Although this is not the first work that formulates NER as a span prediction problem (Jiang et al., 2020; Ouchi et al., 2020; Li et al., 2020; Mengge et al., 2020), we contribute by (1) exploring how different design choices influence the performance of SPANNER and (2) interpreting complementary strengths between SEQLAB and SPANNER with different design choices. In what follows, we first detail span prediction-based NER systems with the vanilla configuration and the proposed advanced featurization.…”
Section: Span Prediction for NE Recognition
confidence: 99%
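The span-prediction formulation quoted above starts from candidate-span enumeration: every contiguous token span up to some maximum width is a candidate to be classified. A minimal sketch, with an assumed `max_len` cutoff and exclusive end indices (both conventions vary across the cited systems):

```python
def enumerate_spans(tokens, max_len=4):
    """Return all candidate (start, end) spans over `tokens`, up to
    max_len tokens long; `end` is exclusive. Illustrative sketch of the
    enumeration step common to span-prediction NER systems."""
    return [(i, j)
            for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]
```

Each enumerated span is then scored against the entity label set (plus a non-entity class), which is where the design choices discussed in the citation statement come into play.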
“…NER as Different Tasks Although NER is commonly formulated as a sequence labeling task (Chiu and Nichols, 2015; Huang et al., 2015; Ma and Hovy, 2016; Lample et al., 2016; Akbik et al., 2018; Peters et al., 2018; Devlin et al., 2018; Xia et al., 2019; Akbik et al., 2019; Luo et al., 2020; Lin et al., 2020), other frameworks have recently been explored and have shown impressive results. For example, (Jiang et al., 2020; Ouchi et al., 2020) shift NER from a token-level tagging to a span-level prediction task, while (Li et al., 2020; Mengge et al., 2020) conceptualize it as a reading comprehension task. In this work we aim to interpret the complementarity between sequence labeling and span prediction.…”
Section: Related Work
confidence: 99%
“…Regarding sentence encoders, recurrent neural nets (Huang et al., 2015; Chiu and Nichols, 2015; Lample et al., 2016; Lin et al., 2020) and convolutional neural nets (Strubell et al., 2017; Yang et al., 2018; Fu et al., 2020a) were widely used, while transformers were also studied to obtain sentential representations (Yan et al., 2019; Yu et al., 2020). Some recent works consider NER as a span classification task (Jiang et al., 2019; Mengge et al., 2020; Ouchi et al., 2020), unlike most works that view it as a sequence labeling task. To capture morphological information, some previous works introduced character- or subword-aware encoders with unsupervised pre-trained knowledge (Peters et al., 2018; Akbik et al., 2018; Devlin et al., 2018; Akbik et al., 2019; Yang et al., 2019; Lan et al., 2019).…”
Section: Related Work
confidence: 99%
“…Compared to image recognition, there are far fewer studies on deep metric learning in natural language processing (NLP). As a few exceptions, Wiseman and Stratos (2019) and Ouchi et al. (2020) developed neural models that have an instance-based inference process for sequence labeling tasks. They reported that their models achieve high explainability without sacrificing prediction accuracy.…”
Section: Introduction
confidence: 99%