Szilvia Szeghalmy scite author profile

Szilvia Szeghalmy

5 Publications

25 Citation Statements Received

79 Citation Statements Given


Affiliations

University of Debrecen

Publications


A Comparative Study of the Use of Stratified Cross-Validation and Distribution-Balanced Stratified Cross-Validation in Imbalanced Learning

Szeghalmy

Fazekas

2023

Sensors


Nowadays, the solution to many practical problems relies on machine learning tools. However, compiling an appropriate training data set for real-world classification problems is challenging because collecting the right amount of data for each class is often difficult or even impossible. In such cases, we can easily face the problem of imbalanced learning. There are many methods in the literature for solving the imbalanced learning problem, so how to compare their performance has become a serious question. Inadequate validation techniques can provide misleading results (e.g., due to data shift), which has led to the development of validation methods designed for imbalanced data sets, such as stratified cross-validation (SCV) and distribution optimally balanced SCV (DOB-SCV). Previous studies have shown that higher classification performance scores (AUC) can be achieved on imbalanced data sets using DOB-SCV instead of SCV. We investigated the effect of oversamplers on this difference. The study was conducted on 420 data sets, involving several sampling methods and the DTree, kNN, SVM, and MLP classifiers. We point out that DOB-SCV often provides slightly higher F1 and AUC values for classification combined with sampling. However, the results also show that the selection of the sampler–classifier pair is more important for classification performance than the choice between the DOB-SCV and SCV techniques.


Digital Measurement of Myelofibrosis Associated Platelet Derived Growth Factor Receptor Beta (PDGFR Beta) Expression in Bone Marrow Biopsies

Szeghalmy

Bedekovics

Méhes

et al. 2013

CIT


In daily routine, reticulin silver staining is used on bone marrow biopsy samples as the gold standard for characterizing myelofibrosis; however, this method does not provide information about the prefibrotic stage. Recently, a specific immunohistochemical method was introduced that may overcome this weakness of reticulin staining. Activated fibroblasts responsible for stromal proliferation are highlighted by increased PDGFR β expression, which can be demonstrated by immunohistochemistry in bone marrow samples. With this staining, the prefibrotic stage becomes detectable and information about disease activity is available. During the development of a new staining method, it is important to prove its reliability and usability. In this paper we introduce a digital image processing method to measure parenchymal damage in digitized histological slides that can aid correct interpretation of the staining.


A Highly Adaptive Oversampling Approach to Address the Issue of Data Imbalance

Szeghalmy

Fazekas

2022

Computers


Data imbalance is a serious problem in machine learning that can be alleviated at the data level by balancing the class distribution with sampling. In the last decade, several sampling methods have been published to address the shortcomings of the initial ones, such as noise sensitivity and incorrect neighbor selection. Based on the review of the literature, it has become clear to us that the algorithms achieve varying performance on different data sets. In this paper, we present a new oversampler that has been developed based on the key steps and sampling strategies identified by analyzing dozens of existing methods and that can be fitted to various data sets through an optimization process. Experiments were performed on a number of data sets, which show that the proposed method had a similar or better effect on the performance of SVM, DTree, kNN and MLP classifiers compared with other well-known samplers found in the literature. The results were also confirmed by statistical tests.


Detection of lanes and traffic signs painted on road using on-board camera

Bente

Szeghalmy

Fazekas

2018


Image analysis of platelet derived growth factor receptor-beta (PDGFRβ) expression to determine the grade and dynamics of myelofibrosis in bone marrow biopsy samples

et al. 2014


