Both class imbalance in datasets and parameter selection for the Support Vector Machine (SVM) are crucial to software defect prediction. However, no existing work addresses the two problems simultaneously. To fill this gap, a hybrid multi-objective cuckoo search under-sampled software defect prediction model based on SVM (HMOCS-US-SVM) is proposed. First, a hybrid multi-objective cuckoo search with dynamical local search (HMOCS) is used to simultaneously select the non-defective samples and optimize the SVM parameters. Then, three under-sampling methods for the decision region range are proposed to select the non-defective modules. In the simulation, three indicators, namely the false positive rate (pf), the probability of detection (pd), and G-mean, are employed to measure the performance of the proposed algorithm. In addition, eight datasets from the Promise database are selected to verify the proposed software defect prediction model. Compared with the results of eight prediction models, the proposed method proves effective at solving the software defect prediction problem.

KEYWORDS
class imbalance, hybrid multi-objective cuckoo search, software defect prediction, SVM, under-sampled

INTRODUCTION
With the advancement of the networked society, software has been widely applied in many areas of life, such as banking systems, biopharmaceutical engineering, and traffic signal command. Therefore, increasing attention has been paid to the quality of software products.1 Generally speaking, software quality mainly includes five aspects: reliability, understandability, availability, maintainability, and effectiveness.2 In particular, reliability is an important factor in software defects.3 Software defects are errors introduced during software development, which can lead to faults, failure, collapse, and even endanger human life and property.4 Therefore, finding as many defects as possible is particularly important. The core of software defect prediction (SDP)5 is to extract the characteristic attributes that indicate defect tendency in historical software modules, so as to predict the type or number of defects in new software projects.

Class imbalance (CIB) in datasets is an unavoidable problem in SDP: 80% of the defects are concentrated in 20% of the modules.6 However, traditional classification algorithms7 assume relatively balanced datasets and are not suitable for imbalanced ones. This means that the classification algorithm is biased toward the non-defective modules.8 Therefore, how to alleviate the imbalance of datasets is a major problem in SDP. To tackle the CIB problem, existing research can be roughly divided into cost-sensitive methods,9 ensemble methods,10 and sampling methods.11

• Cost-sensitive algorithms12 solve the imbalanced problems by modifying algorithms, which means that the method improves the accuracy of classificatio...
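The three indicators named in the abstract (pd, pf, and G-mean) can be illustrated with a small sketch. It assumes the definitions commonly used in the defect prediction literature, pd = TP / (TP + FN), pf = FP / (FP + TN), and G-mean = sqrt(pd × (1 − pf)); the excerpt above does not spell these formulas out, and the function name `defect_metrics` and the example counts are hypothetical.

```python
import math


def defect_metrics(tp, fn, fp, tn):
    """Return (pd, pf, g_mean) from confusion-matrix counts.

    Assumed definitions (not stated in the excerpt above):
      pd     = TP / (TP + FN)   probability of detection (recall on defective modules)
      pf     = FP / (FP + TN)   false positive rate on non-defective modules
      g_mean = sqrt(pd * (1 - pf))
    """
    pd_rate = tp / (tp + fn) if (tp + fn) else 0.0
    pf_rate = fp / (fp + tn) if (fp + tn) else 0.0
    g_mean = math.sqrt(pd_rate * (1.0 - pf_rate))
    return pd_rate, pf_rate, g_mean


# Hypothetical imbalanced test set: 10 defective and 100 non-defective modules.
pd_rate, pf_rate, g_mean = defect_metrics(tp=8, fn=2, fp=10, tn=90)
print(f"pd={pd_rate:.3f} pf={pf_rate:.3f} G-mean={g_mean:.3f}")
```

Under these definitions, G-mean rewards a model only when it detects defective modules and keeps false alarms low at the same time, which is why it is often preferred over plain accuracy on imbalanced data.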
Protein-protein interactions (PPIs) are useful for understanding signaling cascades, predicting protein function, associating proteins with disease and fathoming drug mechanism of action. Currently, only ∼ 10% of human PPIs may be known, and about one-third of human proteins have no known interactions. We introduce FpClass, a data mining-based method for proteome-wide PPI prediction. At an estimated false discovery rate of 60%, we predicted 250,498 PPIs among 10,531 human proteins; 10,647 PPIs involved 1,089 proteins without known interactions. We experimentally tested 233 high- and medium-confidence predictions and validated 137 interactions, including seven novel putative interactors of the tumor suppressor p53. Compared to previous PPI prediction methods, FpClass achieved better agreement with experimentally detected PPIs. We provide an online database of annotated PPI predictions (http://ophid.utoronto.ca/fpclass/) and the prediction software (http://www.cs.utoronto.ca/~juris/data/fpclass/).
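The figures in this abstract can be checked with simple arithmetic. The sketch below assumes that the estimated false discovery rate can be read as the expected fraction of predictions that are false; the helper names are hypothetical. Note that the experimentally tested set contained only high- and medium-confidence predictions, so its validation rate is not directly comparable to the rate implied by the overall FDR.

```python
def expected_true_predictions(n_predictions, fdr):
    """Expected number of correct predictions, reading FDR as the expected false fraction."""
    return n_predictions * (1.0 - fdr)


def validation_rate(n_validated, n_tested):
    """Fraction of experimentally tested predictions that were validated."""
    return n_validated / n_tested


# Numbers taken from the abstract above.
print(round(expected_true_predictions(250_498, 0.60)))  # roughly 100,000 interactions
print(f"{validation_rate(137, 233):.1%}")                # share of tested predictions validated
```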
Our data suggest that AR may provide another specific definition of breast cancer subtypes and reveal a potential role in DCIS progression. These findings may help develop new therapies.
Background: Experimental evidence suggests that matrix metalloproteinase-13 (MMP-13) protein may promote breast tumor progression. However, its relevance to the progression of human breast cancer is yet to be established. Furthermore, it is not clear whether MMP-13 can be used as an independent breast cancer biomarker. This study was conducted to assess the expression profile of MMP-13 protein in invasive breast carcinomas to determine its diagnostic and prognostic significance, as well as its correlation with other biomarkers including estrogen receptor (ER), progesterone receptor (PR), Her-2/neu, MMP-2, MMP-9, tissue inhibitor of MMP-1 and -2 (TIMP-1 and TIMP-2).
To understand the underlying mechanism(s) for the effect of exercise at different intensities on T cell and DNA vaccination responses, we treated mice in a training protocol with regular moderate-intensity exercise (MIE) or prolonged, exhaustive high-intensity exercise (HIE). After 6 weeks of training, splenocytes were isolated to evaluate cytokine expression and T-regulatory (Treg) cell proportion by RT-PCR and FACS, respectively. Another set of mice that completed the same training protocol was used to determine DNA vaccination responses. These mice were immunized three times with HBV DNA vaccine at 2-week intervals and euthanized on day 14 after the last immunization. Serum and splenocytes were isolated to determine humoral and cell-mediated immunity (CMI). Results showed that HIE increased anti-inflammatory cytokine expression and CD4(+) CD25(+) Treg cell proportion. Further, HIE decreased IFN-γ expression, T-lymphocyte proliferation, and antigen-specific cytotoxic response in HBV DNA vaccine-immunized mice. MIE did not change anti-inflammatory cytokine expression or CD4(+) CD25(+) Treg cell proportion but increased pro-inflammatory cytokine expression and augmented antigen-specific CMI. Thus, MIE might lower the risk of cancer and infectious illness by enhancing pro-inflammatory responses. By contrast, HIE might increase the risk of common infections, such as upper respiratory tract infection, due to an up-regulation of CD4(+) CD25(+) Treg cells and anti-inflammatory responses.
The axillary lymph node status remains the most valuable prognostic factor for breast cancer patients. However, approximately 20-30% of node-positive patients remain free of distant metastases within 15-30 years. It is important to develop molecular markers that can predict the risk of distant metastasis and to develop patient-tailored therapy strategies. We hypothesize that the lymph node metastases may represent the most metastatic fraction of the primary cancers. Therefore, we used microarrays to identify genes differentially expressed between primary tumors and their paired lymph node metastasis samples collected from 26 patients. A set of 79 genes differentially expressed between primary cancers and metastasis samples was identified that correctly separated most of the primary cancers from the lymph node metastases. Decreased expression of matrix metalloproteinase 2, fibronectin, osteoblast specific factor 2, and collagen type XI alpha 1 in lymph node metastases was further confirmed by real-time RT-PCR performed on 30 specimen pairs. This set of genes also classified 35 primary cancers into two groups with different prognosis: a "high risk group" and a "low risk group." Patients in the "high risk group" had a 4.65-fold hazard ratio (95% CI 1.02-21.13, P = 0.047) of developing a distant metastasis within 43 months compared with the "low risk group." This suggests that the gene signature consisting of 79 genes differentially expressed between primary cancers and lymph node metastases could also predict the clinical outcome of node-positive patients, and that the molecular classification based on the gene signature could guide patient-tailored therapy.
Phyllodes tumor is an uncommon biphasic breast tumor, with the ability to recur and metastasize, and it behaves biologically like a stromal neoplasm. Traditionally, phyllodes tumors are graded by the use of a set of histologic data into benign, borderline, and malignant. In most series, all phyllodes tumors may recur, but only the borderline and malignant phyllodes tumors metastasize. On the basis of histologic features, prediction of behavior is difficult. The expression of many biological markers, including p53, hormone receptors, proliferation markers, angiogenesis group of markers, c-kit, CD10 and epidermal growth factor receptor have been explored, and many have been shown to be variably expressed, depending on the grade of the tumor. These markers are, however, of limited value in predicting the behavior of the tumor. Recently investigators have reported a plethora of genetic changes in phyllodes tumors, the most consistent of which seems to be 1q gain by comparative genomic hybridization. Some candidate genes have been mapped to various sites, and preliminary data suggest that some of these changes may be related to recurrence. It is foreseeable that more exciting data will be generated to help us to understand the etiology and pathogenesis of phyllodes tumor.
This paper reports the first part of a project that aims to develop a knowledge extraction and knowledge discovery system that extracts causal knowledge from textual databases. In this initial study, we develop a method to identify and extract cause-effect information that is explicitly expressed in medical abstracts in the Medline database. A set of graphical patterns were constructed that indicate the presence of a causal relation in sentences, and which part of the sentence represents the cause and which part represents the effect. The patterns are matched with the syntactic parse trees of sentences, and the parts of the parse tree that match with the slots in the patterns are extracted as the cause or the effect.
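The idea of matching patterns with slots against parse trees can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not describe the pattern formalism, so the nested-tuple tree format, the `?cause` / `?effect` slot syntax, and the function names are hypothetical, and the parse tree is written by hand rather than produced by a parser.

```python
def leaves(tree):
    """Collect the words under a node of a (label, child, ...) tuple tree."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def match(pattern, tree, slots):
    """Match a pattern against a tree; '?name' slots capture a whole subtree."""
    if isinstance(pattern, str):
        if pattern.startswith("?"):
            slots[pattern[1:]] = " ".join(leaves(tree))
            return True
        return isinstance(tree, str) and pattern == tree
    # A structural pattern node needs the same label and the same number of children.
    if isinstance(tree, str) or pattern[0] != tree[0] or len(pattern) != len(tree):
        return False
    return all(match(p, t, slots) for p, t in zip(pattern[1:], tree[1:]))


def extract_causal(pattern, tree):
    """Return {'cause': ..., 'effect': ...} if the pattern matches, else None."""
    slots = {}
    return slots if match(pattern, tree, slots) else None


# Hypothetical pattern: "<cause> causes <effect>".
PATTERN = ("S", "?cause", ("VP", ("V", "causes"), "?effect"))

# Hand-written parse tree for "Smoking causes lung cancer".
TREE = ("S",
        ("NP", ("N", "Smoking")),
        ("VP", ("V", "causes"), ("NP", ("A", "lung"), ("N", "cancer"))))

print(extract_causal(PATTERN, TREE))
```

A sentence whose verb is not the one named in the pattern yields `None`, so a practical system would presumably need many such patterns, one per way of expressing causation.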