Unbalanced data processing using oversampling: Machine Learning

Viloria, Amelec; Lezama, Omar Bonerge Píneda; Mercado-Caruzo, Nohora

doi:10.1016/j.procs.2020.07.018

Cited by 27 publications

(15 citation statements)

References 16 publications


“…As one may observe, there were only 2 lane detection errors; for vehicle classification, in turn, there was a bias towards classifying vehicles as cars. We tried to circumvent this problem using techniques for dealing with unbalanced training sets (i.e., training sets in which classes have distinct frequencies), such as oversampling the least frequent classes [28], but the results did not change significantly.…”

Section: Results (mentioning)

confidence: 99%

UWB Radar Applied to Lane Occupation and Vehicle Classification

Perotoni, Bordin, Castilho et al. 2023

JCIS

This article describes the use of a commercial UWB radar for vehicle classification and lane occupation detection using real-world data acquired in an urban environment. We compare two radar image processing schemes: one based on deep learning using raw data produced by the radar, and a second method employing traditional machine learning algorithms using features extracted from raw data. We verify experimentally that both schemes lead to reasonably accurate estimates without the need of large training sets.

“…Explicitly, only 950 Eurobarometer survey participants admitted their participation in the undeclared economy from the supply side, which introduces a substantial risk of models being biased towards the negative outcome. To address this issue, during the training phase, we applied the random oversampling scheme with weights inversely proportional to class frequencies (see Fernández Hilario et al 2018; Viloria et al 2020).…”

Section: Methods (mentioning)

confidence: 99%

What do we really know about the drivers of undeclared work? An evaluation of the current state of affairs using machine learning

Franić

2022

AI & Soc

It is nowadays widely understood that undeclared work cannot be efficiently combated without a holistic view on the mechanisms underlying its existence. However, the question remains whether we possess all the pieces of the holistic puzzle. To fill the gap, in this paper, we test if the features so far known to affect the behaviour of taxpayers are sufficient to detect noncompliance with outstanding precision. This is done by training seven supervised machine learning models on the compilation of data from the 2019 Special Eurobarometer on undeclared work and relevant figures from other sources. The conducted analysis not only attests to the completeness of our knowledge concerning the drivers of undeclared work but also paves the way for wide usage of artificial intelligence in monitoring and confronting this detrimental practice. The study, however, exposes the necessity of having at our disposal considerably larger datasets compared to those currently available if successful real-world applications of machine learning are to be achieved in this field. Alongside the apparent theoretical contribution, this paper is thus also expected to be of particular importance for policymakers, whose efforts to tackle tax evasion will have to be expedited in the period after the COVID-19 pandemic.
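
The citation statement above describes random oversampling with weights inversely proportional to class frequencies. A minimal sketch of that idea follows, assuming it means drawing training examples with replacement so that each example's sampling weight is 1 divided by the size of its class. The function names, the seed, and the toy 95/5 label split are hypothetical illustrations, not taken from the cited papers.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per item: 1 / (size of that item's class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def random_oversample(samples, labels, n_draws, seed=0):
    """Draw n_draws (sample, label) pairs with replacement.

    Each item is drawn with probability proportional to its
    inverse-class-frequency weight, so every class is expected to
    receive roughly the same share of the draws.
    """
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    idx = rng.choices(range(len(labels)), weights=weights, k=n_draws)
    return [samples[i] for i in idx], [labels[i] for i in idx]


# Toy data (hypothetical): 95 negative and 5 positive training examples.
labels = [0] * 95 + [1] * 5
samples = list(range(len(labels)))

weights = inverse_frequency_weights(labels)
resampled_x, resampled_y = random_oversample(samples, labels, n_draws=10_000)
class_counts = Counter(resampled_y)
print(class_counts)  # the two classes should appear in roughly equal numbers
```

As in the quoted passage, a scheme like this would be applied to the training split only, so that evaluation data keep their original class distribution.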

“…The imbalanced proportion of normal and DR images in Big Data has been identified as one of the main challenges for the algorithms. This can commonly cause overfitting problems [11], as there is a high performance of DR grading in training data, but low performance in the testing data.…”

Section: A Dataset (mentioning)

confidence: 99%

Development of Revised ResNet-50 for Diabetic Retinopathy Detection

Lin

2023

Preprint

Diabetic retinopathy (DR) produces bleeding, exudation, and new blood vessel formation conditions. DR can damage the retinal blood vessels and cause vision loss or even blindness. If DR is detected early, ophthalmologists can use lasers to create tiny burns around the retinal tears to inhibit bleeding and prevent the formation of new blood vessels, in order to prevent deterioration of the disease. Imaging examination is a valuable technique for doctors in the prediction and treatment of diseases. The rapid improvement of deep learning has made image recognition an effective technology; it can avoid misjudgments caused by different doctors’ evaluations and help doctors to predict the condition quickly. Therefore, this paper adopts visualization and preprocessing in the ResNet-50 model to improve module calibration, to enable the model to predict DR accurately.
