Enhanced distance-aware self-attention and multi-level match for sentence semantic matching

Deng, Yao; Li, Xianfeng; Zhang, Mengyan; Lu, Xin; Sun, Xiaodong

doi:10.1016/j.neucom.2022.05.103

Cited by 9 publications

(4 citation statements)

References 16 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Wan et al put forward a model to match two sentences with multiple positional sentence representations, by the aggregation over interactions between different positional sentence representations and the employment of K-Max pooling, the semantic relationship between sentences can be well captured by its model design [30]. In order for more effectual extraction of sentence interactive features, in many recent studies on sentence matching, the cross-attention mechanism has been largely applied, intended for a more accurate sentence alignment, which is usually implemented by stacking multiple of them so that more in-depth interaction information can be possibly exploited [15], [31]- [33]. For instance, Chen et al utilized two Bi-LSTM (Bi-directional Long Short-Term Memory) in its model with the first to encode sentence semantic information, and the second to aggregate semantic information and sentence alignment information extracted by attention mechanism, so that more effective sentence representation vectors can be obtained for final semantic matching [3].…”

Section: B Matching-aggregation Based Methodsmentioning

confidence: 99%

“…Secondly, in recent years, due to wide application of at-tention mechanism in sentence matching models, it's not uncommon to find studies trying to address this issue by stacking multiple cross-attention layers so that implicit interactive features can be more profoundly extracted [15], [16]. However, for this interaction-layer-stacking strategy, there may exist some drawbacks.…”

Section: B Research Challengesmentioning

confidence: 99%

“…Once the interactive features retrieved at a low layer contain major errors, the error information may remain long in the network and be propagated all the way to the high layers, causing the alignments information at top layer to be less distinctive for final sentence matching. On the other hand, because the interactive representation of one sentence is obtained by dynamically attending to the information of another one, the interactive representation obtained at intermediate layers may vary a lot between layers, making it hard to really capture stable and effective semantic alignments information between sentences [9], [15]. Thirdly, given the advantages of multi-grained semantic information which has shown great potential in improving sentence matching accuracy, many researches try to utilize CNN (Convolutional Neural Network) to encode sentences at multiple granularities.…”

Section: B Research Challengesmentioning

confidence: 99%

“…To ensure that all the convoluted representations of each word can be fit to be fed into the identical subsequent layer, we impose the constraint that all convoluted representations live in the same space and have the same dimensionality by applying the same number of filters to each convolutional network. In addition, although the above encoding process using CNN network can already encode the contextual information to a certain extent, but the contextual features captured by CNN are mostly local semantics within a relatively small word span in sentence, for the complex long-distance dependency information, CNN is still unable to capture it, hence, in our work, in order to well capture such global contextual information, we follow the method in [15] to enhance the contextual representation of each word by a distance-aware self-attention mechanism, the equations are simplified as follows:…”

Section: B Sentence Contextual Encodingmentioning

confidence: 99%

See 3 more Smart Citations

Semantic Sentence Matching Based on Multiple Parallelly Organized Interaction Layers at Various Granularity Combinations With Two-Stage Aggregation Strategy

Wang,

2023

IEEE Access

View full text Add to dashboard Cite

Semantic sentence matching plays an essential role in resolving many problems in natural language processing (NLP) field, it has gained increasing research focus and shown great improvements in recent years. However, most currently existing researches are for English sentence matching, research on Chinese semantic matching are relatively less. Moreover, due to the rather complicated contextual expressions and grammatical structure of Chinese language, many existing models are still unable to quite effectively capture interaction information between sentences. Thus, in this work, we attempt to propose a novel deep model to better address Chinese semantic sentence matching. Specifically, the convolutional neural networks with various kernel sizes are first employed for the multi-granular contextual encoding of sentences, combined with multiple different cross-sentence alignment mechanisms, the semantic interactions can be more clearly and profoundly performed at various granularity combinations between sentences. Additionally, rather than serially stacking multiple interaction layers, we organize multiple interaction layers in a parallel manner, and by further introduction of attention pooling, the semantically aligned sentence attentive vectors would be adaptively aggregated from both perspectives of alignment mechanisms and granularity combinations, thus more stable and effective sentence interactive features can be extracted while attempting to alleviate potential sentence alignment error propagation issue existed in hierarchically stacked interaction structure. Finally, extensive experiments are conducted to evaluate the performance of our model, the experimental results demonstrate that our proposed approach outperforms many state-of-the-art models on sentence matching and is capable of gaining a more accurate understanding of semantic relationships between Chinese sentences.

show abstract