Inês Domingues scite author profile

The performance evaluation of imputation algorithms often involves the generation of missing values. Missing values can be inserted in only one feature (univariate configuration) or in several features (multivariate configuration) at different percentages (missing rates) and according to distinct missing mechanisms, namely, missing completely at random, missing at random, and missing not at random. Since the missing data generation process defines the basis for the imputation experiments (configuration, missing rate, and missing mechanism), it is essential that it is appropriately applied; otherwise, conclusions derived from ill-defined setups may be invalid. The goal of this paper is to review the different approaches to synthetic missing data generation found in the literature and discuss their practical details, elaborating on their strengths and weaknesses. Our analysis revealed that creating missing at random and missing not at random scenarios in datasets comprising qualitative features is the most challenging issue in the related work and, therefore, should be the focus of future work in the field.

Index Terms: Data preprocessing, missing data, missing data generation, missing data mechanisms.
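The three mechanisms named in the abstract above can be illustrated with a minimal Python sketch for the univariate configuration on numeric data. This is not the paper's procedure: the function name `ampute_univariate`, its parameters, and the simple rank-based rule (remove the rows with the lowest values of a driving feature) are assumptions chosen for illustration only. The literature reviewed by the paper covers many other generation schemes.

```python
import numpy as np


def ampute_univariate(X, target, rate, mechanism="MCAR", driver=None, seed=0):
    """Return a copy of X with a fraction `rate` of column `target` set to NaN.

    Hypothetical helper for illustration:
      MCAR: rows are chosen uniformly at random.
      MAR:  rows with the lowest values of an observed column `driver`.
      MNAR: rows with the lowest values of the target column itself.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).copy()
    n_rows = X.shape[0]
    n_missing = int(round(rate * n_rows))

    if mechanism == "MCAR":
        rows = rng.choice(n_rows, size=n_missing, replace=False)
    elif mechanism == "MAR":
        if driver is None or driver == target:
            raise ValueError("MAR needs an observed driver column other than target")
        rows = np.argsort(X[:, driver])[:n_missing]
    elif mechanism == "MNAR":
        rows = np.argsort(X[:, target])[:n_missing]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    X[rows, target] = np.nan
    return X


# Usage: 20% missing values in column 0 under each mechanism.
data = np.random.default_rng(1).normal(size=(200, 3))
mcar = ampute_univariate(data, target=0, rate=0.2, mechanism="MCAR")
mar = ampute_univariate(data, target=0, rate=0.2, mechanism="MAR", driver=1)
mnar = ampute_univariate(data, target=0, rate=0.2, mechanism="MNAR")
```

The deterministic cut-off keeps the sketch short. A probabilistic rule (for example, a missingness probability that increases with the driver value) would be an equally valid assumption, and qualitative features, which the abstract identifies as the hardest case, are not handled here.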


The Recombinant α Isoform of Protein Kinase CK1 from Xenopus laevis can Phosphorylate Tyrosine in Synthetic Substrates

Pulgar, Tapia, Vignolo, et al. 1996

European Journal of Biochemistry


The cDNA coding for protein kinase CK1α has been cloned from a Xenopus laevis cDNA library. The derived amino acid sequence of the protein contains 337 amino acids and has a calculated molecular mass of 38 874 Da. The sequence is identical to that of the human CK1α and to the bovine CK1α, except that it is 12 amino acids longer than the latter protein. Southern blotting with a 264-bp probe demonstrates that four or more fragments are obtained upon digestion of genomic DNA with EcoRI and HindIII, suggesting that X. laevis possesses a family of related CK1 genes. CK1α was expressed in Escherichia coli as a glutathione transferase fusion protein (GT-CK1α) and certain of its characteristics were determined. The recombinant GT-CK1α fusion protein was found to have apparent Km values for ATP (12 µM), casein (1.5 mg/ml) and the specific peptide substrate RRKDLHDDEEDEAMSITA (180 µM) which are similar to those of the rat liver CK1 enzyme. The recombinant CK1α activity is weakly inhibited by heparin, but strongly inhibited by poly(Glu80:Tyr20). This inhibition is competitive and shows an approximate Ki of 0.5 µM. CK1α can phosphorylate the tyrosine residues of poly(Glu80:Tyr20) and the tyrosine residue in the synthetic peptide RRREEEYEEEE. This kinase preparation also autophosphorylates in serine, threonine and weakly in tyrosine.


Closed Shortest Path in the Original Coordinates with an Application to Breast Cancer

Cardoso, Domingues, Oliveira 2015

Int. J. Patt. Recogn. Artif. Intell.


Breast cancer is one of the most publicized malignant diseases, because of its high incidence and prevalence, but principally because of its physical and psychological invasiveness. The study of this disease with computer science tools often relies on image segmentation. Image segmentation, although extensively studied, is still an open problem, and shortest path algorithms are widely used to tackle it. There are, however, applications where the starting and ending positions of the shortest path need to be constrained, defining a closed contour enclosing a previously detected seed. Mass and calcification segmentation in mammograms and areola segmentation in digital images are two particular examples of interest within the field of breast cancer research. Usually the closed contour computation is addressed by transforming the image into polar coordinates, where the closed contour is transformed into an open contour between two opposite margins. In this work, after illustrating some of the limitations of this approach, we show how to compute the closed contour in the original coordinate space. After defining a directed acyclic graph appropriate for this task, we address the main difficulty in operating in the original coordinate space. Since small paths collapsing in the seed point are naturally favored, we modulate the cost of the edges to counterbalance this bias. A thorough evaluation is conducted with datasets from the breast cancer field. The algorithm is shown to be fast and reliable and suffers no loss in resolution.
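For orientation, the conventional polar-coordinate approach that the abstract above contrasts with can be sketched as follows. This is not the paper's original-coordinates algorithm. The function names `to_polar` and `open_shortest_path`, the nearest-neighbour resampling, and the rule that the radius may change by at most one step per angle are all assumptions made for illustration. The sketch also does not force the path to start and end at the same radius, which a true closed contour requires.

```python
import numpy as np


def to_polar(cost, seed, n_radii, n_angles):
    """Resample a 2-D cost image around `seed` (row, col) by nearest neighbour.

    Output rows index the radius (1..n_radii), columns index the angle.
    """
    r = np.arange(1, n_radii + 1)[:, None]
    theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)[None, :]
    rows = np.rint(seed[0] + r * np.sin(theta)).astype(int)
    cols = np.rint(seed[1] + r * np.cos(theta)).astype(int)
    rows = np.clip(rows, 0, cost.shape[0] - 1)
    cols = np.clip(cols, 0, cost.shape[1] - 1)
    return cost[rows, cols]


def open_shortest_path(polar_cost):
    """Minimum-cost path from the left margin to the right margin.

    Dynamic programming; returns one radius index per angle column.
    """
    n_r, n_a = polar_cost.shape
    acc = polar_cost.astype(float).copy()      # accumulated cost
    back = np.zeros((n_r, n_a), dtype=int)     # best predecessor row
    for j in range(1, n_a):
        for i in range(n_r):
            lo, hi = max(i - 1, 0), min(i + 1, n_r - 1)
            k = lo + int(np.argmin(acc[lo:hi + 1, j - 1]))
            back[i, j] = k
            acc[i, j] = polar_cost[i, j] + acc[k, j - 1]
    # Backtrack from the cheapest cell in the last column.
    path = [int(np.argmin(acc[:, -1]))]
    for j in range(n_a - 1, 0, -1):
        path.append(int(back[path[-1], j]))
    return path[::-1]
```

Resampling onto a radius-by-angle grid is where resolution can be lost, which is one motivation the abstract gives for working in the original coordinate space instead.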


Assessment of a novel mass detection algorithm in mammograms

et al. 2013


We should not insist only on sensitivity in the segmentation phase: if we ignored the FP rate and aimed solely at higher sensitivity, the learning algorithm would be biased toward false positives, and sensitivity would decrease dramatically in the false positive reduction phase. Therefore, we should treat mass detection as a cost-sensitive problem, because misclassification costs are not equal in this type of problem.



Inês Domingues

Cross-Validation for Imbalanced Datasets: Avoiding Overoptimistic and Overfitting Approaches [Research Frontier]

INbreast

Using deep learning techniques in medical imaging: a systematic review of applications on CT and PET

Influence of Data Distribution in Missing Data Imputation

Generating Synthetic Missing Data: A Review by Missing Mechanism
