2021
DOI: 10.48550/arxiv.2103.07259
Preprint

Explaining and Improving BERT Performance on Lexical Semantic Change Detection

Abstract: Type- and token-based embedding architectures are still competing in lexical semantic change detection. The recent success of type-based models in SemEval-2020 Task 1 has raised the question why the success of token-based models on a variety of other NLP tasks does not translate to our field. We investigate the influence of a range of variables on clusterings of BERT vectors and show that its low performance is largely due to orthographic information on the target word, which is encoded even in the higher layers …
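The abstract concerns clusterings of contextualised (token-based) BERT vectors for lexical semantic change detection. As a rough illustration of that general setup, the following is a minimal sketch, assuming the Hugging Face transformers and scikit-learn libraries; the model name, layer choice, target-matching logic and clustering parameters are illustrative assumptions, not the authors' pipeline.

```python
# Minimal illustrative sketch (not the authors' code): extract contextualised
# BERT vectors for a target word across usage examples and cluster them, the
# general token-based setup the abstract investigates. Model name, layer
# choice, target matching and k-means settings are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.cluster import KMeans

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def target_vector(sentence, target, layer=-1):
    """Hidden state of the first subtoken of `target` at the chosen layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]        # (seq_len, dim)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(target_ids) + 1):          # locate target subtoken span
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i]
    raise ValueError(f"{target!r} not found in: {sentence}")

usages = [
    "the plane took off despite the heavy storm",
    "our plane was delayed for almost two hours",
    "a plane is a flat surface extending infinitely in two dimensions",
    "two points always lie on a common plane",
]
vectors = torch.stack([target_vector(s, "plane") for s in usages])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(vectors.numpy())
print(labels)  # cluster assignments; shifts in cluster frequencies across corpora signal change
```

In a setup like this the surface form of the target word itself enters the encoder, which is one route by which the orthographic information discussed in the abstract can dominate the resulting clusters.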

Cited by 1 publication (3 citation statements)
References 17 publications
“…Due to their contextual resolution, transformer models such as BERT are able to attend to more features of the text, including orthographic information (Laicher et al. 2021) and, as we show in this paper, syntactic information. Furthermore, such models also encode positional information in the focal word embeddings and in those of the contextual sequence.…”
Section: A1 Challenges in the Application of Deep Language Models (mentioning)
confidence: 90%
“…Since our training and inference tasks are identical, we do not risk a gap between training and analysis (Rogers, Kovaleva, and Rumshisky 2020), and the accuracy of our approach depends directly on the model's native training performance. On the other hand, any technique that increases the performance of lexical substitutions (e.g., Amrami and Goldberg 2019; Laicher et al. 2021; Schick and Schütze 2019) applies to our model as well.…”
Section: A1 Challenges in the Application of Deep Language Models (mentioning)
confidence: 99%