Data imbalance is a thorny issue in machine learning. SMOTE is a widely used oversampling method for imbalanced learning, but it suffers from sample overlapping, noise interference, and blind neighbor selection. To address these problems, we present a new oversampling method, OS-CCD, based on a new concept, the classification contribution degree, which determines the number of synthetic samples SMOTE generates for each positive sample. OS-CCD follows the spatial distribution of the original samples along the class boundary while avoiding oversampling from noisy points. Experiments on twelve benchmark datasets demonstrate that OS-CCD outperforms six classical oversampling methods in terms of accuracy, F1-score, AUC, and ROC curves.
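The interpolation step at the heart of SMOTE, which OS-CCD reweights per positive sample, can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, and the classification contribution degree itself (which would set `n_synthetic` per sample) is not reproduced here.

```python
import numpy as np

def smote_oversample(X_pos, n_synthetic, k=5, rng=None):
    """Minimal SMOTE-style interpolation: repeatedly pick a positive
    sample, one of its k nearest positive neighbors, and a random
    point on the line segment between them."""
    rng = np.random.default_rng(rng)
    # Pairwise distances among positive samples; exclude self-matches.
    d = np.linalg.norm(X_pos[:, None, :] - X_pos[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]  # k nearest neighbor indices
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(len(X_pos))          # a positive seed sample
        j = rng.choice(neighbors[i])          # one of its neighbors
        gap = rng.random()                    # interpolation coefficient in [0, 1)
        synthetic.append(X_pos[i] + gap * (X_pos[j] - X_pos[i]))
    return np.array(synthetic)
```

In OS-CCD, `n_synthetic` would vary per seed sample according to its classification contribution degree, concentrating synthesis near the class boundary and away from noisy points.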
The back-propagation (BP) algorithm is usually used to train convolutional neural networks (CNNs) and has driven great progress in image classification. It updates weights by gradient descent, so the farther a sample is from the target, the greater its contribution to the weight change. As a result, the influence of correctly classified samples that lie close to the classification boundary is diminished. This paper defines the classification confidence as the degree to which a sample belongs to its correct category, and divides the samples of each category into danger and safe sets according to a dynamic classification confidence threshold. A new learning algorithm is then presented that penalizes the loss function with danger samples only, rather than all samples, so that the CNN pays more attention to danger samples and learns effective information more accurately. Experiments on the MNIST dataset and three sub-datasets of CIFAR-10 show that on MNIST the accuracy of the unimproved CNN (Non-improve CNN) reached 99.246%, while that of PCNN reached 99.3%; on the three CIFAR-10 sub-datasets, the accuracies of Non-improve CNN are 96.15%, 88.93%, and 94.92%, respectively, while those of PCNN are 96.44%, 89.37%, and 95.22%, respectively.
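The danger/safe split and the penalized loss can be sketched as below. All function names are illustrative, and the choice of dynamic threshold (here, the per-category mean confidence) is an assumption standing in for the paper's actual rule:

```python
import numpy as np

def softmax(z):
    """Numerically stable row-wise softmax."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def danger_mask(logits, labels):
    """Confidence = predicted probability of the true class.
    A sample is 'danger' if its confidence falls below the mean
    confidence of its own category (a simple dynamic threshold;
    the paper's threshold rule may differ)."""
    p = softmax(logits)
    conf = p[np.arange(len(labels)), labels]
    mask = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = labels == c
        mask[idx] = conf[idx] < conf[idx].mean()
    return mask

def penalized_loss(logits, labels, lam=0.5):
    """Cross-entropy plus an extra penalty term over danger samples
    only, so the network attends more to borderline samples."""
    p = softmax(logits)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12)
    mask = danger_mask(logits, labels)
    penalty = ce[mask].mean() if mask.any() else 0.0
    return ce.mean() + lam * penalty
```

Because the penalty term is non-negative, the penalized loss is never smaller than plain cross-entropy; its gradient upweights exactly the low-confidence ("danger") samples.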
Scratches, usually generated while polishing the silicon wafer surface, are one of the major yield-loss factors in the semiconductor manufacturing industry. To determine the source of scratches in real time and reduce yield loss, it is critical for manufacturers to match and identify scratches of the same type automatically. This paper presents an improved K-nearest-neighbors (KNN) algorithm to address this issue. First, a skeleton extraction method is used to depict the main lines of the scratches. A clustering protocol is then applied as a preliminary step to group these main lines so that essential endpoint features, such as distance, slope, and curvature, can be extracted. During feature extraction, a dynamic coordinate system is introduced, which greatly reduces the distortions arising from large differences in tangent magnitude. Finally, the MSML-KNN algorithm for intelligent matching of similar scratches is formulated. Experimental results show that the proposed matching method for wafer scratches has good adaptability and robustness.
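The endpoint-feature extraction and KNN matching steps can be sketched as follows. This is a simplified stand-in, not the MSML-KNN algorithm itself: the feature set is reduced to length and slope angle, and the function names are hypothetical. Representing the slope as an angle via `arctan2` is one way to avoid the tangent-magnitude blow-up that motivates the paper's dynamic coordinate system.

```python
import numpy as np

def line_features(p0, p1):
    """Endpoint features for a scratch main line: length and slope
    angle, measured relative to the first endpoint so the features
    are translation-invariant."""
    d = np.asarray(p1, float) - np.asarray(p0, float)
    length = np.hypot(d[0], d[1])
    angle = np.arctan2(d[1], d[0])  # slope as an angle avoids tan() blow-up
    return np.array([length, angle])

def knn_match(query, catalog, labels, k=3):
    """Match a query scratch to the majority label among its
    k nearest catalog scratches in feature space."""
    dists = np.linalg.norm(catalog - query, axis=1)
    nearest = np.argsort(dists)[:k]
    vals, counts = np.unique(labels[nearest], return_counts=True)
    return vals[np.argmax(counts)]
```

In practice the catalog rows would hold the richer endpoint features (distance, slope, curvature) extracted from the clustered main lines.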
The cover image is based on the Original Article TNF‐α increases the risk of bleeding in patients after CAR T‐cell therapy: A bleeding model based on a real‐world study of Chinese CAR T Working Party by Jiaqian Qi et al., https://doi.org/10.1002/hon.2931.