Highlights
Radiographic chest images can be used to detect COVID-19 more accurately and to assess disease severity. Among the different imaging modalities, chest X-ray radiography offers the advantages of low cost, low radiation dose, wide accessibility, and ease of operation in general or community hospitals.
This study aims to develop and test a new deep learning model of chest X-ray images to detect COVID-19 induced pneumonia. For this purpose, we assembled a relatively large chest X-ray image dataset of 8,474 cases, divided into three groups: COVID-19 infected pneumonia, other community-acquired non-COVID-19 pneumonia, and normal (non-pneumonia) cases.
After applying a preprocessing algorithm to detect and remove the diaphragm regions depicted on the images, a histogram equalization algorithm and a bilateral filter are applied to the original images to generate two sets of filtered images. The original image plus these two filtered images are then used as inputs to the three channels of the CNN deep learning model, which increases the information available for the model to learn from.
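The three-channel input construction described above can be sketched as follows. This is a minimal illustrative sketch, assuming 8-bit grayscale images; the function names are our own, the bilateral filter is a naive reference implementation (production code would use an optimized library routine), and the filter parameters are assumed, not taken from the paper.

```python
import numpy as np

def hist_equalize(img):
    """Histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    lut = (cdf * 255).astype(np.uint8)                 # intensity lookup table
    return lut[img]

def bilateral_filter(img, radius=2, sigma_s=2.0, sigma_r=25.0):
    """Naive bilateral filter: edge-preserving smoothing that weights
    neighbours by both spatial distance and intensity difference."""
    img = img.astype(np.float64)
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    acc = np.zeros_like(img)
    norm = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = pad[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            spatial = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
            range_w = np.exp(-((shifted - img) ** 2) / (2 * sigma_r ** 2))
            weight = spatial * range_w
            acc += weight * shifted
            norm += weight
    return (acc / norm).astype(np.uint8)

def three_channel_input(img):
    """Stack original, equalized, and bilateral-filtered images as the
    three input channels of the CNN."""
    return np.stack([img, hist_equalize(img), bilateral_filter(img)], axis=-1)
```

For an H x W diaphragm-removed image, `three_channel_input(img)` yields an (H, W, 3) array that can be fed to a CNN expecting RGB-shaped input.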
To take full advantage of pre-optimized CNN models, this study uses a transfer learning method to build a new model to detect and classify COVID-19 infected pneumonia. A VGG16 based CNN model, originally trained on ImageNet, was fine-tuned using the chest X-ray images in this study.
To reduce bias in training and testing the CNN model, the dataset is randomly divided into three subsets, namely training, validation, and testing, while preserving the same frequency of cases from each of the three groups (COVID-19 infected pneumonia, other community-acquired non-COVID-19 pneumonia, and normal) in every subset.
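A class-frequency-preserving (stratified) split like the one described can be sketched in a few lines. The split fractions below are assumptions for illustration (the paper only states that 2,544 of 8,474 cases, roughly 30%, were used for testing), and the function name is our own.

```python
import numpy as np

def stratified_split(labels, fractions=(0.5, 0.2, 0.3), seed=0):
    """Split case indices into train/validation/test subsets while
    preserving each class's frequency in every subset."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    splits = ([], [], [])
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)   # all cases of this class
        rng.shuffle(idx)                      # randomize within the class
        n_train = int(round(fractions[0] * len(idx)))
        n_val = int(round(fractions[1] * len(idx)))
        splits[0].extend(idx[:n_train])
        splits[1].extend(idx[n_train:n_train + n_val])
        splits[2].extend(idx[n_train + n_val:])
    return tuple(np.array(s) for s in splits)
```

Because the split is done per class, each subset inherits the same class proportions as the full dataset, which is the property the study relies on to compare accuracies fairly across subsets.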
Tested on a subset of 2,544 cases, the CNN model yields 94.5% accuracy in classifying cases into the three groups and 98.1% accuracy in detecting COVID-19 infected pneumonia cases, both significantly higher than those of a model trained directly on the original images without the two image preprocessing steps (diaphragm removal and generation of the two filtered images).
In order to automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, this study investigates the advantages of a machine learning approach embedded with a locality preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. A dataset of negative mammograms acquired from 500 women was assembled and divided into two age-matched classes: 250 high-risk cases, in which cancer was detected in the next subsequent mammography screening, and 250 low-risk cases, which remained negative. First, a computer-aided image processing scheme was applied to segment fibro-glandular tissue depicted on the mammograms and compute 44 initial features related to the bilateral asymmetry of mammographic tissue density distribution between the left and right breasts. Next, a multi-feature fusion based machine learning classifier was built to predict the risk of cancer detection in the next mammography screening. A leave-one-case-out (LOCO) cross-validation method was applied to train and test the classifier embedded with the LPP algorithm, which generated a new operational vector with 4 features using a maximal variance approach in each LOCO iteration. Results showed a 9.7% increase in risk prediction accuracy when using this LPP-embedded machine learning approach. An increasing trend of adjusted odds ratios was also detected, with odds ratios rising from 1.0 to 11.2. This study demonstrated that applying the LPP algorithm effectively reduced feature dimensionality and yielded higher and potentially more robust performance in predicting short-term breast cancer risk.
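A standard LPP projection, the dimensionality-reduction technique named above, can be sketched as follows. This is a generic textbook-style implementation under our own assumptions (heat-kernel k-NN graph, median-distance kernel width, a small regularizer on the generalized eigenproblem), not the paper's exact feature regeneration scheme; in the study the projection would be re-fitted inside each LOCO iteration on the training cases only.

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=4, k=5, t=None):
    """Locality preserving projection.
    X: (n_samples, n_features). Returns projection matrix A of shape
    (n_features, n_components); reduced features are X @ A."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    if t is None:
        t = np.median(d2[d2 > 0])  # heat-kernel width heuristic (assumption)
    # k-nearest-neighbour adjacency with heat-kernel weights
    W = np.zeros((n, n))
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]  # skip self at column 0
    for i in range(n):
        W[i, nn[i]] = np.exp(-d2[i, nn[i]] / t)
    W = np.maximum(W, W.T)        # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                     # graph Laplacian
    # generalized eigenproblem X^T L X a = lam X^T D X a; keep smallest lam
    A_mat = X.T @ L @ X
    B_mat = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])  # regularized for stability
    _, vecs = eigh(A_mat, B_mat)  # eigenvalues returned in ascending order
    return vecs[:, :n_components]
```

Projecting the 44 asymmetry features through such a matrix yields the 4-feature operational vector the abstract describes, with nearby cases in the original feature space kept nearby after projection.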
This study aims to investigate the feasibility of identifying a new quantitative imaging marker based on false-positives generated by a computer-aided detection (CAD) scheme to help predict short-term breast cancer risk. An image dataset including four-view mammograms acquired from 1,044 women was retrospectively assembled. All mammograms were originally interpreted as negative by radiologists. In the next subsequent mammography screening, 402 women were diagnosed with breast cancer and 642 remained negative. An existing CAD scheme was applied 'as is' to process each image. From the CAD-generated results, four detection features were computed from each image: the total number of (1) initial detection seeds and (2) final detected false-positive regions, and the (3) average and (4) sum of the detection scores. Then, by combining the features computed from the two bilateral images of the left and right breasts in either the craniocaudal or mediolateral oblique view, two logistic regression models were trained and tested using a leave-one-case-out cross-validation method to predict the likelihood of each testing case being positive in the next subsequent screening. The new prediction model yielded a maximum prediction accuracy with an area under the ROC curve of AUC = 0.65 ± 0.017 and a maximum adjusted odds ratio of 4.49 with a 95% confidence interval of (2.95, 6.83). The results also showed an increasing trend in the adjusted odds ratios with the risk prediction scores (p < 0.01). Thus, this study demonstrated that CAD-generated false-positives may carry valuable information, which should be further explored to identify and/or develop more effective imaging markers for predicting short-term breast cancer risk.
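The leave-one-case-out evaluation of a logistic regression model described above can be sketched as follows. This is an illustrative sketch only: the feature matrix is assumed to hold the bilateral CAD features (four per breast, concatenated per view), the plain gradient-descent fit is a generic stand-in for whatever solver the study used, and all function names are our own.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=500):
    """Plain gradient-descent logistic regression; returns weights
    (last entry is the bias term)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))      # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)      # gradient step
    return w

def predict_proba(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def loco_scores(X, y):
    """Leave-one-case-out: refit on all-but-one case, then score the
    single held-out case; every case gets an unbiased risk score."""
    scores = np.empty(len(y))
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        w = fit_logistic(X[mask], y[mask])
        scores[i] = predict_proba(X[i:i + 1], w)[0]
    return scores
```

The resulting per-case scores are what an ROC analysis (yielding the reported AUC) and the quartile-based odds-ratio trend would be computed from.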