The strengths of the database presented here, INbreast, lie in the fact that it was built with full-field digital mammograms (as opposed to digitized mammograms), presents a wide variability of cases, and is made publicly available together with precise annotations. We believe that this database can be a reference for future work centered on or related to breast cancer imaging.
The performance evaluation of imputation algorithms often involves the generation of missing values. Missing values can be inserted in only one feature (univariate configuration) or in several features (multivariate configuration) at different percentages (missing rates) and according to distinct missing mechanisms, namely, missing completely at random, missing at random, and missing not at random. Since the missing data generation process defines the basis for the imputation experiments (configuration, missing rate, and missing mechanism), it is essential that it is appropriately applied; otherwise, conclusions derived from ill-defined setups may be invalid. The goal of this paper is to review the different approaches to synthetic missing data generation found in the literature and discuss their practical details, elaborating on their strengths and weaknesses. Our analysis revealed that creating missing at random and missing not at random scenarios in datasets comprising qualitative features is the most challenging issue in the related work and, therefore, should be the focus of future work in the field.
Index terms: data preprocessing, missing data, missing data generation, missing data mechanisms.
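As a rough illustration of the three mechanisms in a univariate configuration, the sketch below inserts missing values into one numeric feature. It is not taken from the reviewed paper: the function name `insert_missing`, its parameters, and the rule of removing the values in the rows with the lowest driver values are assumptions chosen for simplicity, and many other generation schemes exist.

```python
import numpy as np

def insert_missing(X, target, rate, mechanism="MCAR", driver=None, seed=0):
    """Return a copy of X with NaNs inserted in column `target`.

    Hypothetical helper for illustration only.
    MCAR: rows are chosen uniformly at random.
    MAR:  rows with the lowest values of another observed column (`driver`).
    MNAR: rows with the lowest values of the target column itself.
    """
    rng = np.random.default_rng(seed)
    X = np.array(X, dtype=float)      # np.array copies, so the input is untouched
    n = X.shape[0]
    k = int(round(rate * n))          # number of values to remove

    if mechanism == "MCAR":
        idx = rng.choice(n, size=k, replace=False)
    elif mechanism == "MAR":
        if driver is None or driver == target:
            raise ValueError("MAR needs a driver column different from target")
        idx = np.argsort(X[:, driver], kind="stable")[:k]
    elif mechanism == "MNAR":
        idx = np.argsort(X[:, target], kind="stable")[:k]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    X[idx, target] = np.nan
    return X

# Toy data: column 0 increases, column 1 decreases.
data = np.column_stack([np.arange(10), np.arange(10)[::-1]])
mcar = insert_missing(data, target=0, rate=0.3, mechanism="MCAR")
mar = insert_missing(data, target=0, rate=0.3, mechanism="MAR", driver=1)
mnar = insert_missing(data, target=0, rate=0.3, mechanism="MNAR")
```

In this toy setup the MAR variant removes the target values in the rows where the driver column is smallest, while the MNAR variant removes the smallest target values themselves. A multivariate configuration could be approximated by calling the function once per target feature.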
The cDNA coding for protein kinase CK1α has been cloned from a Xenopus laevis cDNA library. The derived amino acid sequence of the protein contains 337 amino acids and has a calculated molecular mass of 38 874 Da. The sequence is identical to that of the human CK1α and to the bovine CK1α, except that it is 12 amino acids longer than the latter protein. Southern blotting with a 264-bp probe demonstrates that four or more fragments are obtained upon digestion of genomic DNA with EcoRI and HindIII, suggesting that X. laevis possesses a family of related CK1 genes. CK1α was expressed in Escherichia coli as a glutathione transferase fusion protein (GT-CK1α) and certain of its characteristics were determined. The recombinant GT-CK1α fusion protein was found to have apparent Km values for ATP (12 µM), casein (1.5 mg/ml) and the specific peptide substrate RRKDLHDDEEDEAMSITA (180 µM) which are similar to those of the rat liver CK1 enzyme. The recombinant CK1α activity is weakly inhibited by heparin, but strongly inhibited by poly(Glu80:Tyr20). This inhibition is competitive and shows an approximate Ki of 0.5 µM. CK1α can phosphorylate the tyrosine residues of poly(Glu80:Tyr20) and the tyrosine residue in the synthetic peptide RRREEEYEEEE. This kinase preparation also autophosphorylates on serine and threonine and, weakly, on tyrosine.
Breast cancer is one of the most publicized malignant diseases, because of its high incidence and prevalence, but principally due to its physical and psychological invasiveness. The study of this disease using computer science tools often resorts to image segmentation. Image segmentation, although extensively studied, is still an open problem. Shortest path algorithms are extensively used to tackle this problem. There are, however, applications where the starting and ending positions of the shortest path need to be constrained, defining a closed contour enclosing a previously detected seed. Mass and calcification segmentation in mammograms and areola segmentation in digital images are two particular examples of interest within the field of breast cancer research. Usually the closed contour computation is addressed by transforming the image into polar coordinates, where the closed contour is transformed into an open contour between two opposite margins. In this work, after illustrating some of the limitations of this approach, we show how to compute the closed contour in the original coordinate space. After defining a directed acyclic graph appropriate for this task, we address the main difficulty in operating in the original coordinate space. Since short paths collapsing onto the seed point are naturally favored, we modulate the cost of the edges to counterbalance this bias. A thorough evaluation is conducted with datasets from the breast cancer field. The algorithm is shown to be fast and reliable and suffers no loss in resolution.
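The conventional polar-coordinate approach mentioned above can be sketched as a dynamic-programming shortest path between the left and right margins of a cost image. This is a minimal sketch, not the method proposed in the abstract: the function name `min_cost_path_polar`, the layout of `cost` (rows as radius, columns as angle), and the rule that the radius changes by at most one row per column are all assumptions. The sketch also does not force the first and last radius to coincide, so the contour it returns is not guaranteed to close.

```python
import numpy as np

def min_cost_path_polar(cost):
    """Minimum-cost left-to-right path in a polar cost image.

    cost[r, t] is the cost of placing the contour at radius index r for
    angle index t. Between consecutive columns the radius index may change
    by at most one. Returns (total_cost, rows), one row index per column.
    """
    cost = np.asarray(cost, dtype=float)
    n_r, n_t = cost.shape
    acc = np.full((n_r, n_t), np.inf)       # accumulated cost
    back = np.zeros((n_r, n_t), dtype=int)  # best predecessor row
    acc[:, 0] = cost[:, 0]

    for t in range(1, n_t):
        for r in range(n_r):
            lo, hi = max(r - 1, 0), min(r + 1, n_r - 1)
            prev = lo + int(np.argmin(acc[lo:hi + 1, t - 1]))
            acc[r, t] = acc[prev, t - 1] + cost[r, t]
            back[r, t] = prev

    # Backtrack from the cheapest end point on the right margin.
    r = int(np.argmin(acc[:, -1]))
    total = float(acc[r, -1])
    rows = [r]
    for t in range(n_t - 1, 0, -1):
        r = int(back[r, t])
        rows.append(r)
    rows.reverse()
    return total, rows

# Toy cost image: a zero-cost band at radius index 2.
toy = np.full((5, 8), 10.0)
toy[2, :] = 0.0
total, rows = min_cost_path_polar(toy)
```

On the toy image the path follows the zero-cost band. In practice the cost would be derived from image gradients sampled along rays from the seed, which is where resampling into polar coordinates can cost resolution.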
We should not focus only on sensitivity in the segmentation phase: if we ignored the FP rate and aimed solely at higher sensitivity, the learning algorithm would be biased toward false positives, and sensitivity would then decrease dramatically in the false positive reduction phase. Therefore, we should treat mass detection as a cost-sensitive problem, because misclassification costs are not equal in this type of problem.
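One common way to make a decision cost sensitive, shown here only as a minimal sketch and not as the method of the quoted work, is to predict the class with the lower expected cost given an estimated probability. The function name `cost_sensitive_label` and the cost values are hypothetical.

```python
def cost_sensitive_label(p_mass, cost_fp, cost_fn):
    """Return 1 (mass) when predicting 'mass' has the lower expected cost.

    p_mass:  estimated probability that the region is a mass.
    cost_fp: cost of flagging a normal region as a mass.
    cost_fn: cost of missing a true mass.
    Predicting 'normal' costs p_mass * cost_fn in expectation;
    predicting 'mass' costs (1 - p_mass) * cost_fp.
    """
    return 1 if p_mass * cost_fn > (1.0 - p_mass) * cost_fp else 0

# With a missed mass nine times as costly as a false alarm, the implied
# probability threshold is cost_fp / (cost_fp + cost_fn) = 0.1.
flag_unequal = cost_sensitive_label(0.2, cost_fp=1.0, cost_fn=9.0)
flag_equal = cost_sensitive_label(0.2, cost_fp=1.0, cost_fn=1.0)
```

With unequal costs the same probability of 0.2 is flagged as a mass, whereas with equal costs it is not, which is the trade-off between sensitivity and false positive rate that the passage describes.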