Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences
Yunlong Fan, Bin Li, Yikemaiti Sataer, et al.
Abstract: Most natural-language-processing (NLP) tasks, such as semantic parsing, syntactic parsing, machine translation, and text summarization, suffer performance degradation when encountering long complex sentences. Previous works addressed the issue with the intuition of decomposing complex sentences and linking simple ones, e.g., rhetorical-structure-theory (RST)-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), and simple sentence decomposition (SSD). However, these works are not appl…
“…In addition to RST, Fan et al. [24] also propose a novel clausal feature, HCA, which represents a complex sentence as a tree consisting of clause nodes and inter-clause relation edges. The HCA framework is based on English grammar [21], where clauses are elementary grammar units that center around a verb, and inter-clause relations can be classified into two categories:…”
Section: Hierarchical Clause Annotation
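To make the HCA structure concrete, below is a minimal Python sketch of such a tree: clause nodes linked by labeled inter-clause relation edges. The node fields, the `add_child` helper, and the relation labels ("Condition", "Reason") are illustrative assumptions, not the released HCA corpus schema or tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ClauseNode:
    text: str                       # the clause's token span as plain text
    relation: Optional[str] = None  # inter-clause relation to the parent (hypothetical labels)
    children: List["ClauseNode"] = field(default_factory=list)

    def add_child(self, child: "ClauseNode", relation: str) -> "ClauseNode":
        child.relation = relation
        self.children.append(child)
        return child

# Example: "If it rains, we stay home because the roads flood."
root = ClauseNode("we stay home")
root.add_child(ClauseNode("If it rains"), relation="Condition")
root.add_child(ClauseNode("because the roads flood"), relation="Reason")

for c in root.children:
    print(f"({c.relation}) {c.text}")
```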
“…where the parameter matrix $W^O \in \mathbb{R}^{hN \times d}$, and $h$ is the number of attention heads, mapped to the 16 inter-clause relations in HCA [24].…”
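As a hedged illustration of the quoted formula, the sketch below concatenates $h = 16$ per-head outputs of dimension $N$ and projects them back to the model dimension $d$ with a single parameter matrix playing the role of $W^O \in \mathbb{R}^{hN \times d}$. All shapes and variable names are assumptions for illustration, not the paper's released implementation.

```python
import torch
import torch.nn as nn

h, N, d = 16, 64, 1024          # heads (one per inter-clause relation), per-head dim, model dim (illustrative)
seq_len, batch = 30, 2

head_outputs = torch.randn(batch, seq_len, h, N)   # stand-in for per-head attention results
W_O = nn.Linear(h * N, d, bias=False)              # plays the role of W^O (nn.Linear stores the transpose)

concat = head_outputs.reshape(batch, seq_len, h * N)  # concatenate the h heads
out = W_O(concat)                                     # (batch, seq_len, d)
print(out.shape)
```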
“…For the HCA tree of each sentence, we use the manually annotated HCA trees for AMR 2.0 provided by [24], and auto-annotated HCA trees for the remaining datasets, all generated by the HCA Segmenter and HCA Parser proposed in [24]. The training, development, and test sets in both datasets are randomly split, and we therefore take them as ID datasets, as in previous works [15,17,18,20,45].…”
Section: Datasets
“…Rhetorical structure theory (RST) [23] provides a general way to describe the coherence relations among clauses and some phrases, i.e., elementary discourse units, and postulates a hierarchical discourse structure called a discourse tree. Besides RST, a novel clausal feature, hierarchical clause annotation (HCA) [24], also captures a tree structure of a complex sentence, where the leaves are segmented clauses and the edges are the inter-clause relations.…”
“…Due to the better parsing performance of the clausal structure [24], we select and integrate the HCA trees of complex sentences to mitigate LDDs in AMR parsing. Specifically, we propose two HCA-based approaches, HCA-based self-attention (HCA-SA) and HCA-based curriculum learning (HCA-CL), to integrate the HCA trees as clausal features into the popular AMR parsing codebase, SPRING [15].…”
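A minimal sketch of the HCA-SA idea, under the assumption that it can be read as a clause-aware attention mask: tokens attend within their own clause, and clauses connected by an inter-clause relation edge in the HCA tree remain mutually visible. The function name and mask convention below are hypothetical; the actual integration into SPRING may differ.

```python
import torch

def hca_attention_mask(clause_ids, related_pairs):
    """clause_ids: (seq_len,) clause index per token.
    related_pairs: set of (i, j) clause-index pairs linked by an
    inter-clause relation edge (treated as symmetric)."""
    same = clause_ids.unsqueeze(0) == clause_ids.unsqueeze(1)  # same-clause visibility
    mask = same.clone()
    for i, j in related_pairs:                                 # add cross-clause visibility
        a = clause_ids == i
        b = clause_ids == j
        mask |= a.unsqueeze(1) & b.unsqueeze(0)
        mask |= b.unsqueeze(1) & a.unsqueeze(0)
    return mask                                                # True = attention allowed

# "If it rains , we stay home" -> clause 0: tokens 0-3, clause 1: tokens 4-6
clause_ids = torch.tensor([0, 0, 0, 0, 1, 1, 1])
print(hca_attention_mask(clause_ids, {(0, 1)}).int())
```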
Most natural language processing (NLP) tasks operationalize an input sentence as a sequence with token-level embeddings and features, despite its clausal structure. Taking abstract meaning representation (AMR) parsing as an example, recent parsers are empowered by transformers and pre-trained language models, but long-distance dependencies (LDDs) introduced by long sequences remain an open problem. We argue that LDDs stem not from sequence length itself but essentially from the internal clause hierarchy. Typically, non-verb words in a clause cannot depend on words outside of it, and verbs from different but related clauses have much longer dependencies than those in the same clause. With this intuition, we introduce a type of clausal feature, hierarchical clause annotation (HCA), into AMR parsing and propose two HCA-based approaches, HCA-based self-attention (HCA-SA) and HCA-based curriculum learning (HCA-CL), to integrate HCA trees of complex sentences for addressing LDDs. We conduct extensive experiments on two in-distribution (ID) AMR datasets (AMR 2.0 and AMR 3.0) and three out-of-distribution (OOD) ones (TLP, New3, and Bio). Experimental results show that our HCA-based approaches achieve significant and explainable improvements over the baseline model (0.7 Smatch score on both ID datasets; 2.3, 0.7, and 2.6 on the three OOD datasets, respectively) and outperform the state-of-the-art (SOTA) model by 0.7 Smatch score on the OOD dataset Bio when encountering sentences with complex clausal structures, which introduce most LDD cases.
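The HCA-CL component mentioned above can be pictured as an easy-to-hard training schedule. The sketch below orders training examples by the number of clauses in their HCA trees and gradually admits harder ones; the staging and shuffling policy are assumptions, not necessarily the paper's exact curriculum.

```python
import random

def curriculum_batches(examples, clause_counts, batch_size=32, stages=3):
    """Yield batches from easy (few clauses) to hard (many clauses)."""
    order = sorted(range(len(examples)), key=lambda i: clause_counts[i])
    stage_size = (len(order) + stages - 1) // stages
    seen = []
    for s in range(stages):
        seen.extend(order[s * stage_size:(s + 1) * stage_size])  # admit the next difficulty band
        random.shuffle(seen)                                     # re-sample all data seen so far
        for k in range(0, len(seen), batch_size):
            yield [examples[i] for i in seen[k:k + batch_size]]
```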