A New Evaluation Method: Evaluation Data and Metrics for Chinese Grammatical Error Correction

Lin, Nankai; Fu, Yingwen; Lin, Xiaotian; Yang, Ziyu; Jiang, Shengyi

doi:10.21203/rs.3.rs-2299197/v1

Cited by 4 publications

(3 citation statements)

References 8 publications

(11 reference statements)


“…Wickstrøm et al [42] proposed a contrastive learning framework that enabled transfer learning clinical time series by exploiting a data augmentation scheme in which new samples were generated by mixing two data samples with a mixing component. For the task of Chinese spell-checking, Lin et al [43] proposed reverse contrastive learning which explicitly forced the model to minimize the distance in language representation space between similar sample pairs. In the context of MDD, we can anchor the transcription in order to generate the dissimilarity/similarity.…”

Section: Previous Methods for MDD (mentioning)

confidence: 99%

End-to-End Mispronunciation Detection and Diagnosis Using Transfer Learning

et al. 2023


As an indispensable module of computer-aided pronunciation training (CAPT) systems, mispronunciation detection and diagnosis (MDD) techniques have attracted a lot of attention from academia and industry over the past decade. To train robust MDD models, this technique requires massive human-annotated speech recordings which are usually expensive and even hard to acquire. In this study, we propose to use transfer learning to tackle the problem of data scarcity from two aspects. First, from audio modality, we explore the use of the pretrained model wav2vec2.0 for MDD tasks by learning robust general acoustic representation. Second, from text modality, we explore transferring prior texts into MDD by learning associations between acoustic and textual modalities. We propose textual modulation gates that assign more importance to the relevant text information while suppressing irrelevant text information. Moreover, given the transcriptions, we propose an extra contrastive loss to reduce the difference of learning objectives between the phoneme recognition and MDD tasks. Conducting experiments on the L2-Arctic dataset showed that our wav2vec2.0 based models outperformed the conventional methods. The proposed textual modulation gate and contrastive loss further improved the F1-score by more than 2.88% and our best model achieved an F1-score of 61.75%.


“…The synthetic data was used to train the seq2seq Transformer model and scored as first on the GEC track and second on the GEC+Fluency track (where, which shared task, the name, and date are required). Lin et al (2023) introduced a synthetic training data based on confusion sets for the Philippines Tagalog GEC system and gained competitive performance on the Tagalog corpus. In Arabic GEC, Solyman et al (2021) introduces an unsupervised method to construct a large confusion sets-based synthetic data, which was used to train the SCUT-AGEC model.…”

Section: Related Work (mentioning)

confidence: 99%

Dynamic decoding and dual synthetic data for automatic correction of grammar in low-resource scenario

Musyafa, Gao, Solyman et al. 2024

PeerJ Computer Science


Grammar error correction systems are pivotal in the field of natural language processing (NLP), with a primary focus on identifying and correcting the grammatical integrity of written text. This is crucial for both language learning and formal communication. Recently, neural machine translation (NMT) has emerged as a promising approach in high demand. However, this approach faces significant challenges, particularly the scarcity of training data and the complexity of grammar error correction (GEC), especially for low-resource languages such as Indonesian. To address these challenges, we propose InSpelPoS, a confusion method that combines two synthetic data generation methods: the Inverted Spellchecker and Patterns+POS. Furthermore, we introduce an adapted seq2seq framework equipped with a dynamic decoding method and state-of-the-art Transformer-based neural language models to enhance the accuracy and efficiency of GEC. The dynamic decoding method is capable of navigating the complexities of GEC and correcting a wide range of errors, including contextual and grammatical errors. The proposed model leverages the contextual information of words and sentences to generate a corrected output. To assess the effectiveness of our proposed framework, we conducted experiments using synthetic data and compared its performance with existing GEC systems. The results demonstrate a significant improvement in the accuracy of Indonesian GEC compared to existing methods.


“…The length, width, surface, amenities, accessibility and surroundings of a greenway can affect the perception of access [55][56][57]. Factors such as sky openness, green visibility, visual complexity, skyline complexity and plant color richness can affect the perception of trail vision [58,59]. Essential green space features also include the presence of amenities [60] and aesthetic qualities [61].…”

Section: Linking Escape Motivation Trail Quality Perception and Resto... (mentioning)

confidence: 99%

Evaluation and Optimization of Restorative Environmental Perception of Treetop Trails: The Case of the Mountains-to-Sea Trail, Xiamen, China

et al. 2023

Land


A treetop trail is an elevated linear green open space that plays a key role in forming a scientifically rational urban space and meeting the growing leisure needs of the people. Taking the Mountains-to-Sea Trail in Xiamen, China as a case, and through 426 questionnaires, this study explores the dimensions of the perceived restorative environment components of greenway recreationists and impacts on behavioral intentions. The demographic factors lead us to the following three conclusions. First, from an age perspective, restorative environmental perceptions are strongest among those aged 60 and above and weakest among those aged 18–30. Second, in terms of place of permanent residence, local visitors have stronger restorative environmental perceptions than other city users. Third, in relation to the number of accompanying travelers, individuals who embark on solo journeys experience the most robust perception, while that diminishes as the count reaches three or more companions. A structural equation model (SEM) is used to present the quantitative relationship among avoidance motivation, treetop trail environmental quality, restorative environmental perception, place attachment, and loyalty. The results showed that users’ escape motivation has a direct and indirect positive correlation with restorative environmental perceptions, and environmental perceptions have a significant positive correlation with restorative environmental perceptions. Furthermore, their place attachment to the restorative nature of the treetop trails positively affected their loyalty. This study provides essential factors to consider when constructing treetop trails in high-density cities.
