2021
DOI: 10.2196/26310

Cancer Communication and User Engagement on Chinese Social Media: Content Analysis and Topic Modeling Study

Abstract: Background: Cancer ranks among the most serious public health challenges worldwide. In China, the world’s most populous country, about one-quarter of the population consists of people with cancer. Social media has become an important platform that the Chinese public uses to express opinions. Objective: We investigated cancer-related discussions on the Chinese social media platform Weibo (Sina Corporation) to identify cancer topics that generate the highest …

Citations: cited by 16 publications (10 citation statements)
References: 29 publications (31 reference statements)
“…Before the analysis, we followed the standard preprocessing procedures designed in previous studies [ 37 , 38 ] to clean the data using Python 3.0 (Python Software Foundation) and to perform word part-of-speech tagging and text processing using the Python library spaCy [ 39 , 40 ]. Through data cleaning, we converted the words in the reviews into lowercase words; removed stop words, punctuation, numbers, and nonword characters; and stemmed the remaining text [ 41 ]. To generate more interpretable topics of high quality, we restricted the parts of speech of words to “noun” (NOUN), “verb” (VERB), “adjective” (ADJ), or “proper noun” (PROPN).…”
Section: Methods | Citation type: mentioning
confidence: 99%
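
The cleaning steps named in this citation statement (lowercasing; removing stop words, punctuation, numbers, and nonword characters; stemming; keeping only NOUN, VERB, ADJ, and PROPN tokens) can be sketched roughly as below. This is a minimal standard-library illustration, not the citing authors' code. It assumes the text has already been tagged into (word, part-of-speech) pairs, for example from spaCy's `token.text` and `token.pos_`. The function name `clean_tokens`, the tiny `STOP_WORDS` set, and the pluggable `stem` callable (identity by default) are all hypothetical choices made for the example.

```python
import re

# Tiny illustrative stop word list (an assumption; real lists are far longer).
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "for"}

# Parts of speech the citation statement says were kept.
ALLOWED_POS = {"NOUN", "VERB", "ADJ", "PROPN"}

# Letters only: rejects numbers, punctuation, and other nonword characters.
WORD_RE = re.compile(r"^[a-z]+$")


def clean_tokens(tagged_tokens, stem=lambda word: word):
    """Filter (word, pos) pairs and return the cleaned, lowercased tokens.

    `stem` is a hypothetical hook for whichever stemmer is preferred;
    by default tokens are returned unstemmed.
    """
    cleaned = []
    for word, pos in tagged_tokens:
        word = word.lower()
        if pos not in ALLOWED_POS:
            continue  # drop determiners, numerals, punctuation tags, etc.
        if word in STOP_WORDS:
            continue  # drop stop words
        if not WORD_RE.match(word):
            continue  # drop anything that is not purely alphabetic
        cleaned.append(stem(word))
    return cleaned


# Hand-tagged example input (illustrative only).
tagged = [
    ("The", "DET"), ("Hospital", "NOUN"), ("offered", "VERB"),
    ("2", "NUM"), ("new", "ADJ"), ("treatments", "NOUN"),
    ("!", "PUNCT"), ("Weibo", "PROPN"),
]
print(clean_tokens(tagged))
# ['hospital', 'offered', 'new', 'treatments', 'weibo']
```

Passing the tagger output in as plain pairs keeps the filter independent of any particular NLP library, so the same function works whether the tags come from spaCy or another tagger.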
“…The statistical methods of unsupervised TM algorithms (which do not need prior labeling or annotations of the documents) were designed to analyze the words (terms) of the original texts to identify the themes (topics) running through a corpus [ 42 , 43 ]. These algorithms allow users to organize and summarize numerous documents that cannot be annotated manually [ 41 ], thereby revealing the hidden topics in the documents [ 43 ]. We adopted the LDA TM technique, which assumes that texts are generated from a mixture of topics [ 44 ].…”
Section: Methods | Citation type: mentioning
confidence: 99%
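
To make the "mixture of topics" assumption concrete, the sketch below fits LDA to a toy corpus with collapsed Gibbs sampling, using only the Python standard library. The citation statement does not say which inference method or software the citing authors used, so the sampler, the function names `lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, the iteration count, and the example documents are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] counts the
    tokens in document d assigned to topic k, and topic_word[k][v] counts the
    assignments of vocabulary word v to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


# Toy corpus of already-cleaned tokens (illustrative only).
docs = [
    ["cancer", "treatment", "hospital", "treatment"],
    ["cancer", "screening", "hospital"],
    ["diet", "exercise", "prevention"],
    ["exercise", "diet", "prevention", "diet"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word, vocab)):
    print(f"topic {k}: {words}")
```

Each row of `doc_topic` is the document's mixture over topics in count form; adding `alpha` to each count and normalizing the row gives the estimated topic proportions. In practice a maintained library implementation would normally be preferred over a hand-written sampler.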