Predictions about upcoming content play an important role in language comprehension and processing. Semantic similarity has been used as a metric to predict how words are processed in context in language comprehension and processing tasks. This study proposes a novel, dynamic approach for computing contextual semantic similarity, evaluates the extent to which the semantic similarity measures computed using this approach can predict fixation durations in reading tasks recorded in a corpus of eye-tracking data, and compares the performance of these measures to that of semantic similarity measures computed using the cosine and Euclidean methods. Our results reveal that the semantic similarity measures generated by our approach are significantly predictive of fixation durations in reading and outperform those generated by the two existing approaches. The findings of this study contribute to a better understanding of how humans process words in context and make predictions in language comprehension and processing. The effective and interpretable approach to computing contextual semantic similarity proposed in this study can also facilitate further explorations of other experimental data on language comprehension and processing.
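The two baseline measures named in the abstract above, cosine and Euclidean, can be sketched as follows. This is a minimal illustration only: the abstract does not describe the proposed dynamic approach, so it is not reproduced here. The function names, the toy vectors, and the choice to represent the context as the mean of the preceding word vectors are all assumptions made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def context_vector(vectors):
    """Mean of the context word vectors (an assumed context representation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical 3-dimensional embeddings for two context words and a target word.
context = context_vector([[0.2, 0.1, 0.7], [0.4, 0.3, 0.5]])
target = [0.3, 0.2, 0.6]
similarity = cosine_similarity(context, target)
distance = euclidean_distance(context, target)
```

Note that cosine is a similarity (higher means closer) while Euclidean is a distance (lower means closer), so the two need opposite signs or a transformation before they are compared as predictors.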
The double‐nominal construction (DNC), also called the ‘topic construction’, is common in Chinese and other East Asian languages. It is characterized by two initial NPs that appear before the predicate verb. The construction has mostly been analyzed from a purely syntactic angle. On that view, the topic (the initial nominal phrase, abbreviated as NP1) must establish some syntactic connection with the comment (the rest of the construction), but numerous counterexamples show that this does not always hold. The construction is so complex that other factors have to be taken into account. This paper addresses the problem of the transposition of the two initial NPs in various DNCs, an area that does not appear to have been sufficiently explored. Compared with other languages, the transposition of the two initial NPs in DNCs is unique to Chinese. The transposition behaves quite differently in each type of DNC, so a finer classification of DNCs is needed. To propose a reasonable classification, we need to clarify the relationship between NP1 and the rest of the construction. To tackle the problem of the two NPs’ transposition in a special type of DNC called the dangling topic construction, we propose a more precise interpretation of the relationship between topic and comment in this construction using the event‐based model and event integration. This study shows that, depending on the syntactic‐semantic behavior of NP1, Chinese DNCs can be classified into three types. Based on these three types, a semantic‐cognitive interaction helps to explain and resolve the problem of the NPs’ transposition for each type. This study therefore provides a unified and more developed account of Chinese DNCs. Consequently, a semantic‐cognitive approach is likely to shed more light on the notion of topic construction and to help explain how native speakers of Chinese comprehend its structure and construct its meaning.
Scientific writings, as an essential part of human culture, have evolved over centuries into their current form. Knowing how scientific writings evolved is particularly helpful in understanding how trends in scientific culture developed. It also allows us to better understand how scientific culture was interwoven with human culture generally. The availability of massive digitized texts and the progress in computational technologies today provide a convenient and credible way to discern the evolutionary patterns in scientific writings by examining diachronic linguistic changes. The linguistic changes in scientific writings reflect the genre shifts that took place with historical changes in science and scientific writings. This study investigates a general evolutionary linguistic pattern in scientific writings. It does so by combining two credible computational methods: relative entropy, and word-embedding concreteness and imageability. It thus creates a novel quantitative methodology and applies it to the examination of diachronic changes in the Philosophical Transactions of the Royal Society (PTRS, 1665–1869). The data from the two computational approaches map well onto each other and support the argument that this journal followed an evolutionary trend of increasing professionalization and specialization. They also show that language use in this journal was greatly influenced by historical events and other socio-cultural factors. This study, as a “culturomic” approach, demonstrates that the linguistic evolutionary patterns in scientific discourse have been interrupted by external factors, even though this scientific discourse would likely have developed cumulatively into a professional and specialized genre. The approaches proposed by this study can contribute to full-text analysis in scientometrics.
This study applies relative entropy to a naturalistic large-scale corpus to quantify the differences among L2 (second language) learners at different proficiency levels. We chose lemmas, tokens, POS trigrams, and conjunctions to represent lexicon and grammar, and used relative entropy to detect patterns of language proficiency development across L2 groups. The results show that the divergence in information distribution, for both lexical and grammatical features, continues to increase from L2 learners at a lower level to those at a higher level. This result is consistent with the assumption that, in the course of second language acquisition, L2 learners develop towards a more complex and diverse use of language. For comparison, this study also applies time-series statistical methods to the data on L2 differences that traditional frequency-based methods yield for the same L2 corpus. The results from the traditional methods, however, rarely show regularity. Compared with the algorithms in traditional approaches, relative entropy performs much better in detecting L2 proficiency development. In this sense, we have developed an effective and practical algorithm for stably detecting and predicting developments in L2 learners’ language proficiency.
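Relative entropy (Kullback-Leibler divergence), the measure this abstract relies on, can be sketched for two samples of linguistic units such as lemmas or POS trigrams. This is a minimal illustration under stated assumptions, not the study's pipeline: the function name, the additive smoothing constant `alpha` (used so that units missing from one sample do not produce a division by zero), and the toy learner samples are all hypothetical.

```python
import math
from collections import Counter


def relative_entropy(sample_p, sample_q, alpha=1.0):
    """D(P || Q) in bits between the unit distributions of two samples.

    Additive smoothing with `alpha` is an assumption of this sketch; it
    gives every unit in the shared vocabulary a non-zero probability.
    """
    counts_p = Counter(sample_p)
    counts_q = Counter(sample_q)
    vocab = set(counts_p) | set(counts_q)
    total_p = len(sample_p) + alpha * len(vocab)
    total_q = len(sample_q) + alpha * len(vocab)
    divergence = 0.0
    for unit in vocab:
        p = (counts_p[unit] + alpha) / total_p
        q = (counts_q[unit] + alpha) / total_q
        divergence += p * math.log2(p / q)
    return divergence


# Hypothetical lemma samples from a lower-level and a higher-level learner group.
lower_level = ["go", "go", "like", "like", "like", "good"]
higher_level = ["go", "enjoy", "prefer", "like", "excellent", "good"]
divergence_bits = relative_entropy(higher_level, lower_level)
```

Relative entropy is asymmetric, so `relative_entropy(a, b)` generally differs from `relative_entropy(b, a)`; which group serves as the reference distribution is a design choice that should be fixed before comparing levels.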
The topic chain, one of the essential organizational devices in Chinese discourse, is marked by the use of many co-referential zero forms. Although the topic chain has been recognized as playing an important role in organizing discourse, few attempts have been made to explore how it forms an integrated and meaningful unit and facilitates discourse organization; these are called the “integration functions” of the topic chain in this paper. This study, based on a comprehensive review of topic chain studies, re-examines the core characteristics of the topic chain. The integration functions of the topic chain are then analysed at the internal and external levels. Internally, the topic chain can manage its different clauses to form a cohesive, meaningful and unified unit. At this level, the paper demonstrates why so much information within a topic chain is assembled into such a compact structure. At the discourse level, one topic chain can associate with other topic chains or non-chain constructions to establish textual coherence. Making use of zero anaphora, co-reference, cognitive orders and other non-morphosyntactic devices, the topic chain can combine different discourse units to construct Chinese discourse. The study provides a systematic and well-developed account of the integration functions of the topic chain, which is significant for a deeper understanding of the nature of the topic chain and of how discourse coherence is established in Chinese.