Background: Breast cancer is one of the most common diseases in women worldwide. Many studies have been conducted to predict survival indicators; however, most of these analyses were performed using basic statistical methods. As an alternative, this study used machine learning techniques to build models for detecting and visualising significant prognostic indicators of breast cancer survival rate.
Methods: A large hospital-based breast cancer dataset retrieved from the University Malaya Medical Centre, Kuala Lumpur, Malaysia (n = 8066), with diagnosis information between 1993 and 2016, was used in this study. The dataset contained 23 predictor variables and one dependent variable, which referred to the survival status of the patients (alive or dead). To determine the significant prognostic factors of breast cancer survival rate, prediction models were built using decision tree, random forest, neural networks, extreme boost, logistic regression, and support vector machine. Next, the dataset was clustered based on the receptor status of breast cancer patients, identified via immunohistochemistry, to perform advanced modelling using random forest. Subsequently, the important variables were ranked via variable selection methods in random forest. Finally, decision trees were built and validation was performed using survival analysis.
Results: In terms of both model accuracy and calibration measure, all algorithms produced close outcomes, with the lowest obtained from decision tree (accuracy = 79.8%) and the highest from random forest (accuracy = 82.7%). The important variables identified in this study were cancer stage classification, tumour size, number of total axillary lymph nodes removed, number of positive lymph nodes, types of primary treatment, and methods of diagnosis.
Conclusion: Interestingly, the various machine learning algorithms used in this study yielded similar accuracy; hence, these methods could be used as alternative predictive tools in breast cancer survival studies, particularly in the Asian region. The important prognostic factors influencing the survival rate of breast cancer identified in this study, which were validated by survival curves, are useful and could be translated into decision support tools in the medical domain.
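The abstract states that the prognostic factors were validated with survival curves but does not name the estimator. As an illustration only, the following minimal sketch assumes a Kaplan-Meier estimator, a common choice for such curves; the function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with >= 1 death.

    durations: follow-up time per patient
    events:    1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both drop out after time t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort: follow-up in months, 1 = died, 0 = censored.
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t={time:>2}  S(t)={prob:.3f}")
```

Computing one such curve per level of a candidate variable (for example, per cancer stage) and comparing the curves is one way a ranked variable could be checked against observed survival.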
The reliable classification of benign and malignant lesions in breast ultrasound images can provide an effective and relatively low-cost method for the early diagnosis of breast cancer. The accuracy of the diagnosis is, however, highly dependent on the quality of the ultrasound systems and the experience of the users (radiologists). The use of deep convolutional neural network approaches has provided solutions for the efficient analysis of breast ultrasound images. In this study, we propose a new framework for the classification of breast cancer lesions with an attention module in a modified VGG16 architecture. The adopted attention mechanism enhances the feature discrimination between the background and targeted lesions in ultrasound. We also propose a new ensembled loss function, which is a combination of binary cross-entropy and the logarithm of the hyperbolic cosine loss, to reduce the discrepancy between classified lesions and their labels. This combined loss function optimizes the network more quickly. The proposed model outperformed other modified VGG16 architectures, with an accuracy of 93%, and its results are competitive with those of other state-of-the-art frameworks for the classification of breast cancer lesions. Our experimental results show that the choice of loss function is highly important and plays a key role in breast lesion classification tasks. Additionally, adding an attention block improved the performance of the model.
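To make the combined loss concrete, here is a minimal sketch in plain Python. The abstract does not say how the two terms are combined, so the weighted sum with `alpha` and `beta` (both defaulting to 1) is an assumption, as are the function names and the clipping constant `eps`.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy for one sample; y_pred is a probability in [0, 1]."""
    p = min(max(y_pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))


def log_cosh(y_true, y_pred):
    """Logarithm of the hyperbolic cosine of the prediction error."""
    return math.log(math.cosh(y_pred - y_true))


def combined_loss(y_true, y_pred, alpha=1.0, beta=1.0):
    """Mean BCE plus mean log-cosh over a batch.

    The weights alpha and beta are hypothetical; the source only says the
    loss is "a combination" of the two terms.
    """
    n = len(y_true)
    bce = sum(binary_cross_entropy(t, p) for t, p in zip(y_true, y_pred)) / n
    lc = sum(log_cosh(t, p) for t, p in zip(y_true, y_pred)) / n
    return alpha * bce + beta * lc


# Hypothetical labels (1 = malignant, 0 = benign) and predicted probabilities.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.9, 0.1]
uncertain = [0.6, 0.4, 0.6, 0.4]
print(combined_loss(labels, confident))
print(combined_loss(labels, uncertain))
```

Both terms shrink as predictions approach their labels, so the confident batch yields the smaller loss. The log-cosh term behaves like a squared error for small residuals and like an absolute error for large ones.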
Pathology reports represent a primary source of information for cancer registries. University Malaya Medical Centre (UMMC) is a tertiary hospital responsible for training pathologists; thus, narrative reporting is important. However, unstructured free-text reports make information extraction tedious for clinical audits and data analysis-related research. This study aims to develop an automated natural language processing (NLP) algorithm that summarizes existing narrative breast pathology reports from UMMC into a narrower, structured synoptic pathology report with a checklist-style template, to ease the creation of pathology reports. The rule-based NLP algorithm was developed in the R programming language using 593 pathology specimens from 174 patients provided by the Department of Pathology, UMMC. The pathologist provided specific keywords for the data elements to define the semantic rules of the NLP algorithm. The system was evaluated by calculating precision, recall, and F1-score. The proposed NLP algorithm achieved a micro-F1 score of 99.50% and a macro-F1 score of 98.97% on 178 specimens with 25 data elements. This result aligns with clinicians’ needs and could improve communication between pathologists and clinicians. The study is significant because structured data is easily minable and could generate important insights.
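The difference between the micro-F1 and macro-F1 scores reported above can be illustrated with a short sketch. The data element names and counts below are hypothetical and are not results from the study; the code only shows how the two averages are usually computed from per-element true positives, false positives, and false negatives.

```python
def f1_from_counts(tp, fp, fn):
    """F1-score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_macro_f1(counts):
    """counts maps each data element to a (tp, fp, fn) tuple."""
    # Micro: pool the counts across all elements, then compute one F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1_from_counts(tp, fp, fn)
    # Macro: compute F1 per element, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Hypothetical extraction counts for two data elements.
counts = {
    "tumour_size": (90, 0, 10),  # frequent element, extracted well
    "grade": (5, 5, 0),          # rare element, extracted poorly
}
micro, macro = micro_macro_f1(counts)
print(f"micro-F1 = {micro:.4f}, macro-F1 = {macro:.4f}")
```

Because the macro average weights every data element equally, a rarely occurring element that is extracted poorly pulls macro-F1 below micro-F1, which is consistent with the direction of the gap reported in the abstract.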
Many efforts are underway around the world to improve public awareness of preventive measures and to disseminate appropriate information about COVID-19 in order to curb the spread of the disease. This study aims to determine the level of awareness of COVID-19 control and prevention among the population in Malaysia. A cross-sectional study was conducted among 355 participants between March 30 and May 21, 2020. A questionnaire consisting of five main themes, (1) socio-demographics, (2) awareness, (3) knowledge, (4) attitudes, and (5) practices towards preventing and controlling COVID-19, was distributed online using Google Forms. The overall Knowledge, Attitude and Practice (KAP) scores were analyzed based on Bloom’s cut-off point of 80%. The results of this study show that Malaysians’ awareness strongly influences their knowledge, attitudes, and practices in preventing and controlling the spread of COVID-19. Although the results are reasonably good, further awareness efforts are recommended to continuously raise the awareness level and to remove negative stigma and attitudes, which should in turn produce better practices to prevent the spread of the virus, so that Malaysia is able to contain COVID-19. Virtual awareness programs should be conducted to provide the public with the most up-to-date information on infection control procedures and how to maintain a hygienic environment, as well as to encourage people to practise social distancing and avoid social gatherings.
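As a small illustration of how an 80% cut-off can be applied to questionnaire scores, the sketch below classifies a raw score against its maximum. The function name, the labels, and the example scores are hypothetical; the abstract states only the 80% threshold, not the scoring scheme or any additional bands.

```python
def meets_cutoff(score, max_score, cutoff=0.80):
    """Return True when the score reaches the cut-off fraction of the maximum."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return score / max_score >= cutoff


# Hypothetical respondents: (knowledge score, maximum possible score).
respondents = [(12, 15), (11, 15), (15, 15)]
labels = ["at or above cut-off" if meets_cutoff(s, m) else "below cut-off"
          for s, m in respondents]
print(labels)
```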
Automated artificial intelligence (AI) systems enable the integration of different types of data from various sources for clinical decision-making. The aim of this study is to propose a pipeline for developing a fully automated, clinician-friendly, AI-enabled database platform for breast cancer survival prediction. A case study of a breast cancer survival cohort from the University Malaya Medical Centre was used to develop and evaluate the pipeline. A relational database and a fully automated system were developed by integrating the database with analytical modules (machine learning, automated scoring for quality of life, and interactive visualization). The developed pipeline, iSurvive, has helped to enhance data management and to visualize important prognostic variables and survival rates. The embedded automated scoring module demonstrated the quality of life of patients, whereas the interactive visualizations could be used by clinicians to facilitate communication with patients. The proposed pipeline is a one-stop center to manage data, to automate analytics using machine learning, to automate scoring, and to produce explainable interactive visuals that enhance clinician-patient communication throughout the survivorship period, with the goal of modifying behaviours that relate to prognosis. The proposed pipeline can be adapted to any disease and is not limited to breast cancer.