Mammographic density is an important risk factor for breast cancer. In recent research, percentage density assessed visually using visual analogue scales (VAS) showed stronger risk prediction than existing automated density measures, suggesting that readers may recognize relevant image features not yet captured by hand-crafted algorithms. With deep learning, it may be possible to encapsulate this knowledge in an automatic method. We have built convolutional neural networks (CNNs) to predict density VAS scores from full-field digital mammograms. The CNNs are trained using whole-image mammograms, each labeled with the average VAS score of two independent readers. Each CNN learns a mapping between mammographic appearance and VAS score so that, at test time, it can predict the VAS score for an unseen image. Networks were trained using 67,520 mammographic images from 16,968 women, and for model selection we used a dataset of 73,128 images. Two case-control sets, one of contralateral mammograms of screen-detected cancers and one of prior images of women with cancers detected subsequently, each matched to controls on age, menopausal status, parity, HRT and BMI, were used to evaluate performance on breast cancer prediction. In the case-control sets, odds ratios of cancer in the highest versus lowest quintile of percentage density were 2.49 (95% CI: 1.59 to 3.96) for screen-detected cancers and 4.16 (2.53 to 6.82) for priors, with matched concordance indices of 0.587 (0.542 to 0.627) and 0.616 (0.578 to 0.655), respectively. There was no significant difference between reader VAS and predicted VAS for the prior test set (likelihood ratio chi-square, p = 0.134). Our fully automated method shows promising results for cancer risk prediction and is comparable with human performance.
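The odds ratios quoted above compare the highest and lowest density quintiles. As a rough illustration of how such a figure and its 95% interval arise, the following sketch computes an unmatched odds ratio with a Wald confidence interval from a 2x2 table. This is a simplification: the study uses matched case-control sets, which would normally call for a conditional analysis, and the function name, its arguments, and the counts in the usage line are hypothetical, not taken from the paper.

```python
import math


def odds_ratio_wald(cases_high, controls_high, cases_low, controls_low, z=1.96):
    """Unmatched odds ratio (highest vs lowest quintile) with a Wald interval.

    Assumption: a plain 2x2 table with all four counts > 0. The matched
    design described in the abstract is NOT modelled here.
    """
    odds_ratio = (cases_high * controls_low) / (controls_high * cases_low)
    # Standard error of log(OR) for a 2x2 table.
    se_log = math.sqrt(
        1 / cases_high + 1 / controls_high + 1 / cases_low + 1 / controls_low
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_wald(40, 20, 20, 40)
```

The interval is symmetric on the log scale, which is why reported bounds such as 1.59 to 3.96 are not symmetric around the point estimate.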
Abstract. We model a virtual scientific community in which authors publish and cite articles. Citations are attributed according to a preferential attachment mechanism. From the numerical simulations, the h-index can be computed. This bottom-up approach reproduces real bibliometric data well. We consider two versions of our model. (1) The single-scientist model is controlled by two parameters, which can be tuned to reproduce the h-index of many real scientists. Moreover, this model shows how the h-index grows with the number of citations for a fixed number of articles. We also define an average h-index that can be used to compare the scientific productivity of institutions of different sizes. (2) The multi-scientist model considers a population of scientists and allows us to study the impact on the community of removing citations from low h-index researchers. Simulations on real bibliometric data, as well as the predictions of the model, show that the h-index ecosystem can be strongly affected by such filtering.
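The two ingredients named above, preferential attachment and the h-index, can be sketched in a few lines. This is a minimal illustration, not the authors' model: the function names, the choice of `n_articles` and `n_citations` as inputs, and the "+1" offset that lets uncited articles receive a first citation are all assumptions made here.

```python
import random


def h_index(citations):
    """Largest h such that at least h articles have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def simulate_citations(n_articles, n_citations, seed=0):
    """Distribute citations over a fixed set of articles.

    Assumption: each new citation picks an article with probability
    proportional to (current citations + 1), a simple form of
    preferential attachment.
    """
    rng = random.Random(seed)
    counts = [0] * n_articles
    for _ in range(n_citations):
        weights = [c + 1 for c in counts]
        target = rng.choices(range(n_articles), weights=weights, k=1)[0]
        counts[target] += 1
    return counts


# Usage: h-index of one simulated scientist with 50 articles.
simulated = simulate_citations(n_articles=50, n_citations=500)
simulated_h = h_index(simulated)
```

Rerunning `simulate_citations` with increasing `n_citations` and a fixed `n_articles` gives a quick way to look at how the h-index grows with the citation count in this toy setting.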