In the k Nearest Neighbor (kNN) classifier, a query instance is classified according to the most frequent class among its nearest neighbors in the training set. On imbalanced datasets, kNN becomes biased towards the majority class of the training space. To address this problem, we propose the Proximity weighted Evidential kNN classifier. In this method, each neighbor of a query instance is treated as a piece of evidence from which we calculate the probability of the class label given the feature values, giving more preference to minority instances. This evidence is then discounted by the proximity of the neighbor, prioritizing closer instances in the local neighborhood. The discounted pieces of evidence are then combined using the Dempster-Shafer theory of evidence. A rigorous experiment over 30 benchmark imbalanced datasets shows that our method outperforms 12 popular methods. In pairwise comparisons with these 12 methods, our method wins on 29 datasets in the best case and on at least 19 datasets in the worst case. More importantly, according to the Friedman test, the proposed method ranks higher than all other methods in terms of AUC at the 5% level of significance.
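The combination step described above can be illustrated with a minimal sketch. Here each neighbor contributes a simple mass function that assigns mass to its own class, discounted by an exponential proximity factor exp(-γ·d), with the remaining mass left on the whole frame of discernment; the masses are fused with Dempster's rule. The Euclidean distance, the exp(-γ·d) discount, and the final max-mass decision are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def dempster_combine(m1, m2, classes):
    """Dempster's rule for mass functions whose focal elements are
    singletons plus the full frame of discernment ("theta")."""
    conflict = sum(m1[a] * m2[b] for a in classes for b in classes if a != b)
    norm = 1.0 - conflict  # renormalize after discarding conflicting mass
    out = {c: (m1[c] * m2[c] + m1[c] * m2["theta"] + m1["theta"] * m2[c]) / norm
           for c in classes}
    out["theta"] = m1["theta"] * m2["theta"] / norm
    return out

def evidential_knn_predict(X_train, y_train, x_query, k=5, gamma=1.0):
    """Sketch: classify x_query by fusing its k nearest neighbors as
    proximity-discounted pieces of evidence (assumed discount: exp(-gamma*d))."""
    classes = np.unique(y_train)
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # start from total ignorance: all mass on the frame of discernment
    combined = {c: 0.0 for c in classes}
    combined["theta"] = 1.0
    for i in np.argsort(dists)[:k]:
        alpha = np.exp(-gamma * dists[i])  # closer neighbor -> stronger evidence
        m = {c: 0.0 for c in classes}
        m[y_train[i]] = alpha
        m["theta"] = 1.0 - alpha           # uncommitted (ignorant) mass
        combined = dempster_combine(combined, m, classes)
    # decide for the class carrying the largest combined mass
    return max(classes, key=lambda c: combined[c])
```

Because a distant majority neighbor receives a small discount factor, its evidence stays mostly uncommitted, which is how closer minority instances can dominate the fused belief.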
Mutual Information (MI) based feature selection methods are popular due to their ability to capture nonlinear relationships among variables. However, existing works rarely address the error (bias) that arises from estimating MI with finite samples. To the best of our knowledge, none of the existing methods address the bias issue for the high-order interaction term, which is essential for a better approximation of joint MI. In this paper, we first calculate the amount of bias of this term. Moreover, to select features using a χ²-based search, we also show that this term follows a χ² distribution. Based on these two theoretical results, we propose Discretization and feature Selection based on bias corrected Mutual information (DSbM). DSbM is extended by adding simultaneous forward selection and backward elimination (DSbMfb). We demonstrate the superiority of DSbM over four state-of-the-art methods in terms of accuracy and the number of selected features on twenty benchmark datasets. Experimental results also demonstrate that DSbM outperforms the existing methods in terms of accuracy, Pareto optimality and the Friedman test. We also observe that, compared to DSbM, on some datasets DSbMfb selects fewer features and achieves higher accuracy.
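The finite-sample bias discussed above has a classical two-variable analogue that makes the idea concrete: the plug-in estimate of I(X; Y) from n discrete samples overestimates the true MI by roughly (|X|−1)(|Y|−1)/(2n) nats, and under independence 2n·Î follows a χ² distribution with (|X|−1)(|Y|−1) degrees of freedom, which motivates χ²-based selection thresholds. The sketch below shows this standard first-order (Miller-Madow-style) correction; the paper's contribution concerns the analogous result for the high-order interaction term, which this simple sketch does not cover.

```python
import numpy as np

def plugin_mi(x, y):
    """Plug-in (maximum-likelihood) estimate of I(X; Y) in nats
    for two discrete sample vectors x and y."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p_ab = np.mean((x == a) & (y == b))  # joint frequency
            if p_ab > 0:
                p_a, p_b = np.mean(x == a), np.mean(y == b)
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def corrected_mi(x, y):
    """Subtract the first-order finite-sample bias of the plug-in MI:
    approximately (|X|-1)(|Y|-1) / (2n) nats."""
    n = len(x)
    kx, ky = len(np.unique(x)), len(np.unique(y))
    return plugin_mi(x, y) - (kx - 1) * (ky - 1) / (2.0 * n)
```

Even for independent variables the plug-in estimate is strictly positive on finite samples, which is exactly why an uncorrected MI-based search tends to over-select features.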
For software quality assurance, software defect prediction (SDP) has drawn a great deal of attention in recent years. Its goal is to reduce verification cost, time and effort by predicting defective modules efficiently. In SDP, proper attribute selection plays a significant role. However, selecting proper attributes and representing them efficiently are very challenging due to the lack of a standard set of attributes. To address these issues, we introduce Selection of Attributes with Log filtering (SAL) to select a proper set of attributes. Our proposed attribute selection process can effectively select the best set of attributes, which are relevant for discriminating between defective and non-defective software modules. Further, we adopt log filtering to pre-process the input data. We have evaluated the proposed attribute selection method on several widely used publicly available datasets. The simulation results demonstrate that our method is more effective at improving the accuracy of SDP than existing state-of-the-art methods.
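The log-filtering pre-processing step mentioned above can be sketched as replacing each numeric metric value v with ln(v + ε), which compresses the heavy right tails typical of software metrics such as lines of code or complexity counts. The relevance ranking shown after it is purely illustrative (a simple |correlation|-with-label score), not the paper's SAL criterion; `eps` and the ranking function are assumptions for the sketch.

```python
import numpy as np

def log_filter(X, eps=1e-6):
    """Log filtering: map each nonnegative metric value v to ln(v + eps)
    to compress the skewed distributions of software metrics."""
    return np.log(np.asarray(X, dtype=float) + eps)

def rank_attributes(X, y):
    """Illustrative relevance ranking (not the paper's SAL criterion):
    score each log-filtered attribute by the absolute correlation of its
    values with the binary defect label, highest first."""
    Xf = log_filter(X)
    y = np.asarray(y, dtype=float)
    scores = [abs(np.corrcoef(Xf[:, j], y)[0, 1]) for j in range(Xf.shape[1])]
    return np.argsort(scores)[::-1]  # indices of attributes, best first
```

A classifier would then be trained only on the top-ranked, log-filtered attributes, which is the general shape of a filter-style attribute selection pipeline for SDP.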