Background
Using patient-reported surveys and individual interviews conducted by health care providers, we sought to identify the factors significantly associated with improvement in distress and fatigue among cancer survivors. This secondary analysis applied text analysis with machine learning techniques to single-institution data from the Korean Cancer Survivorship Center Pilot Project.
Methods
Surveys and in-depth interviews from 322 cancer survivors were analyzed to identify their needs and concerns. Of the survey keywords (EQ-VAS, distress, fatigue, pain, insomnia, anxiety, and depression), we focused on distress and fatigue. The interview transcripts were analyzed using Korean-language text analysis with machine learning techniques, based on the keywords used in the survey. Words were represented as vectors, and similarity scores were calculated from their distance to the keywords and their frequency in the text. Each keyword and its ten highest-ranked words by similarity were then used to draw a network map.
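The keyword-to-word ranking step described above can be sketched as follows. This is a minimal illustration only: it assumes cosine similarity as the distance measure, which the study does not specify, and it omits the frequency component. The vectors in `toy_vectors` are made-up values, and the function names `cosine_similarity`, `top_k_similar`, and `build_edges` are hypothetical, not taken from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(keyword, vectors, k=10):
    """Return the k words most similar to `keyword`, highest similarity first."""
    target = vectors[keyword]
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word != keyword
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_edges(keywords, vectors, k=10):
    """Build (keyword, word, similarity) edges that a network map could be drawn from."""
    edges = []
    for keyword in keywords:
        for word, score in top_k_similar(keyword, vectors, k):
            edges.append((keyword, word, score))
    return edges


# Made-up 2-dimensional vectors, for illustration only (not study data).
toy_vectors = {
    "distress": [1.0, 0.0],
    "fatigue": [0.0, 1.0],
    "family": [0.9, 0.1],
    "sleep": [0.1, 0.9],
    "work": [0.5, 0.5],
}

edges = build_edges(["distress", "fatigue"], toy_vectors, k=2)
for keyword, word, score in edges:
    print(f"{keyword} -> {word}: {score:.3f}")
```

With real data, the vectors would come from a word-embedding model trained on the tokenized interview transcripts, and `k` would be set to 10 to match the ten words selected per keyword.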
Results
Most participants were otherwise healthy women younger than 50 years with breast cancer who had completed treatment less than 6 months earlier. At the 1-month follow-up survey, 56.5% and 58.4% of patients showed improvement in distress and fatigue scores, respectively. Improvement in distress was significantly associated with dyspepsia (p = 0.006) and with initial scores of distress, fatigue, anxiety, and depression (p < 0.001, < 0.001, 0.043, and 0.013, respectively). Improvement in fatigue was significantly associated with economic state (p = 0.021), need for rehabilitation (p = 0.035), initial fatigue score (p < 0.001), any intervention (p = 0.017), and participation in the family care program (p = 0.022). In the text analysis, Stress and Fatigue were placed at the center of the keyword network map, and words were intricately connected. In the regression analysis combining survey scores with quantitative variables from the text analysis, participation in family care programs and mention of family-related words were associated with fatigue improvement (p = 0.033).
Conclusion
In the survey, common symptoms and practical issues were related to distress and fatigue. The text analysis, however, showed that specific issues such as family problems, and the relationships among them, were more complicated. Although further research is needed to explore hidden problems in cancer patients, this study is meaningful in that it used a personalized approach such as interviews.