Text readability is the problem of determining whether a text is suitable for a certain group of readers, and thus building a model to assess the readability of text yields great significance across the disciplines of science, publishing, and education. While text readability has attracted attention since the late nineteenth century for English and other popular languages, it remains relatively underexplored in Vietnamese. Previous studies on this topic in Vietnamese have only focused on the examination of shallow word-level features using surface statistics such as frequency and ratio. Hence, features at higher levels like sentence structure and meaning are still untapped. In this study, we propose the most comprehensive analysis of Vietnamese text readability to date, targeting features at all linguistic levels, ranging from the lexical and phrasal elements to syntactic and semantic factors. This work pioneers the investigation on the effects of multi-level linguistic features on text readability in the Vietnamese language.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.