Kun Sun scite author profile

Wang

2023

Psychon Bull Rev

Predictions about upcoming content play an important role in language comprehension and processing. Semantic similarity has been used as a metric to predict how words are processed in context in language comprehension and processing tasks. This study proposes a novel, dynamic approach for computing contextual semantic similarity, evaluates the extent to which the semantic similarity measures computed using this approach can predict fixation durations in reading tasks recorded in a corpus of eye-tracking data, and compares the performance of these measures to that of semantic similarity measures computed using the cosine and Euclidean methods. Our results reveal that the semantic similarity measures generated by our approach significantly predict fixation durations in reading and outperform those generated by the two existing approaches. The findings of this study contribute to a better understanding of how humans process words in context and make predictions in language comprehension and processing. The effective and interpretable approach to computing contextual semantic similarity proposed in this study can also facilitate further explorations of other experimental data on language comprehension and processing.
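For readers unfamiliar with the two baseline measures named in the abstract, the following is a minimal sketch of cosine similarity and Euclidean distance between two word vectors. It does not reproduce the study's own dynamic approach, which the abstract does not specify. The function names and the toy three-dimensional vectors are assumptions made for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy vectors (hypothetical; real word embeddings have hundreds of dimensions).
context_vector = [1.0, 0.0, 1.0]
target_vector = [1.0, 1.0, 0.0]

similarity = cosine_similarity(context_vector, target_vector)
distance = euclidean_distance(context_vector, target_vector)
```

Cosine similarity depends only on the direction of the vectors, whereas Euclidean distance also reflects their magnitudes, so the two baselines can rank the same word pairs differently.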

Bodo Winter, Sensory linguistics: Language, perception and metaphor

Rong

2020

Approaching The Double‐Nominal Construction In Mandarin Chinese Through The Semantic‐Cognitive Interaction

2018

Studia Linguistica

The double‐nominal construction (DNC), also called the ‘topic construction’, is common in Chinese and other East Asian languages. It is characterized by two initial NPs that appear before the predicate verb. The construction has mostly been analyzed from a purely syntactic angle. On that view, the topic (the initial nominal phrase, abbreviated as NP1) needs to establish some syntactic connection with the comment (the rest of the construction), but numerous counterexamples show that this does not always hold. The construction is complex enough that other factors have to be taken into account. This paper addresses the transposition of the two initial NPs in various DNCs, an area that does not appear to have been sufficiently explored. Compared with other languages, the transposition of the two initial NPs in the DNC is unique to Chinese. Transposition behaves quite differently in each type of DNC, so a finer classification of DNCs is needed. To propose a reasonable classification, we need to clarify the relationship between NP1 and the rest of the construction. Meanwhile, to tackle the transposition of the two NPs in a special type of DNC called the dangling topic construction, we propose a more precise interpretation of the relationship between topic and comment in this construction, using the event‐based model and event integration. This study shows how, depending on the syntactic‐semantic behavior of NP1, Chinese DNCs can be classified into three types. Finally, based on these three types, a semantic‐cognitive interaction helps to explain and resolve the problem of NP transposition for each type. This study therefore provides a unified and more developed account of the Chinese DNC. Consequently, a semantic‐cognitive approach is likely to shed more light on the notion of topic construction and help us understand how native speakers of Chinese comprehend its structure and construct its meaning.

The evolutionary pattern of language in scientific writings: A case study of Philosophical Transactions of Royal Society (1665–1869)

2020

Scientific writings, an essential part of human culture, have evolved over centuries into their current form. Knowing how scientific writings evolved is particularly helpful in understanding how trends in scientific culture developed. It also allows us to better understand how scientific culture was interwoven with human culture generally. The availability of massive digitized texts and the progress in computational technologies today provide us with a convenient and credible way to discern the evolutionary patterns in scientific writings by examining diachronic linguistic changes. The linguistic changes in scientific writings reflect the genre shifts that took place with historical changes in science and scientific writings. This study investigates a general evolutionary linguistic pattern in scientific writings. It does so by combining two computational methods: relative entropy, and word-embedding-based concreteness and imageability. It thus creates a novel quantitative methodology and applies it to the examination of diachronic changes in the Philosophical Transactions of Royal Society (PTRS, 1665–1869). The data from the two computational approaches align well and support the argument that this journal followed the evolutionary trend of increasing professionalization and specialization. But they also show that language use in this journal was greatly influenced by historical events and other socio-cultural factors. This study, as a “culturomic” approach, demonstrates that the linguistic evolutionary patterns in scientific discourse have been interrupted by external factors even though this scientific discourse would likely have cumulatively developed into a professional and specialized genre. The approaches proposed by this study can make a substantial contribution to full-text analysis in scientometrics.

Quantitative Aspects of PDTB-Style Discourse Relations across Languages

Journal of Quantitative Linguistics

Zhang

2018

Using the Relative Entropy of Linguistic Complexity to Assess L2 Language Proficiency Development

Wang

2021

Entropy

This study applies relative entropy to a naturalistic large-scale corpus to calculate the differences among L2 (second language) learners at different levels. We chose lemmas, tokens, POS-trigrams, and conjunctions to represent lexicon and grammar, and used relative entropy to detect patterns of language proficiency development among different L2 groups. The results show that information distribution discrimination regarding lexical and grammatical differences continues to increase from L2 learners at a lower level to those at a higher level. This result is consistent with the assumption that, in the course of second language acquisition, L2 learners develop towards a more complex and diverse use of language. For comparison, this study also applies time-series statistical methods to the data on L2 differences yielded by traditional frequency-based methods on the same L2 corpus. The results from the traditional methods, however, rarely show regularity. Compared with the algorithms in traditional approaches, relative entropy performs much better in detecting L2 proficiency development. In this sense, we have developed an effective and practical algorithm for stably detecting and predicting developments in L2 learners’ language proficiency.
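As an illustration of the measure this abstract relies on, the sketch below computes relative entropy (Kullback-Leibler divergence) between the unigram distributions of two token samples. It is a generic instance of the measure, not the study's pipeline. The additive smoothing, the `alpha` parameter, the function name, and the toy token lists are all assumptions introduced here, since the abstract does not say how its distributions were estimated.

```python
import math
from collections import Counter


def relative_entropy(tokens_p, tokens_q, alpha=1.0):
    """KL divergence D(P || Q) in bits between two unigram distributions.

    Additive (Laplace) smoothing with `alpha` is an assumption of this
    sketch; it keeps every probability non-zero over the shared vocabulary.
    """
    counts_p, counts_q = Counter(tokens_p), Counter(tokens_q)
    vocab = set(counts_p) | set(counts_q)
    total_p = sum(counts_p.values()) + alpha * len(vocab)
    total_q = sum(counts_q.values()) + alpha * len(vocab)
    divergence = 0.0
    for item in vocab:
        p = (counts_p[item] + alpha) / total_p
        q = (counts_q[item] + alpha) / total_q
        divergence += p * math.log2(p / q)
    return divergence


# Hypothetical lemma samples standing in for two learner groups.
lower_level = ["i", "go", "school", "i", "like", "school"]
higher_level = ["i", "attend", "university", "because", "i", "enjoy", "research"]

divergence_low_to_high = relative_entropy(lower_level, higher_level)
divergence_high_to_low = relative_entropy(higher_level, lower_level)
```

Relative entropy is asymmetric, so the two directions generally give different values. The same function can be applied to any of the unit types the abstract lists (tokens, POS-trigrams, conjunctions) by passing sequences of those units instead of lemmas.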

Hyphenation as a compounding technique in English

Baayen

2021

Language Sciences

The Integration Functions of Topic Chains in Chinese Discourse

2019

ALA

The topic chain, one of the essential organizational devices in Chinese discourse, is marked by the use of many co-referential zero forms. Although the topic chain has been recognized as playing an important role in organizing discourse, few attempts have been made to explore how it forms an integrated and meaningful unit and facilitates discourse organization; these are called the “integration functions” of the topic chain in this paper. This study, based on a comprehensive review of topic chain studies, re-examines the core characteristics of the topic chain. The integration functions of the topic chain are then analysed at the internal and external levels. Internally, the topic chain can manage its different clauses to form a cohesive, meaningful and unified unit. At this level, the paper demonstrates why so much information within a topic chain is assembled into such a compact structure. At the discourse level, one topic chain can associate with other topic chains or non-chain constructions to establish textual coherence. Making use of zero anaphora, co-reference, cognitive orders and other non-morphosyntactic devices, the topic chain can combine different discourse units to construct Chinese discourse. The study provides a systematic and well-developed account of the integration functions of the topic chain, which is significant for a deeper understanding of the nature of the topic chain and of how discourse coherence is established in Chinese.