An oversampling method for multi-class imbalanced data based on composite weights

Deng, Mingyang; Guo, Yuanhao; Wang, Chang; Wu, Fuwei

doi:10.1371/journal.pone.0259227

Cited by 12 publications

(10 citation statements)

References 40 publications


“…The statistical level of significance was defined as α = 0.05 for all tests. In the implementation of the logistic regression the oversampling procedure was applied in some cases to improve the significance of imbalanced data [ 29 – 31 ]. Cases where oversampling was applied are marked with (by o.s.).…”

Section: Methods (mentioning)

confidence: 99%

Association of medial collateral ligament complex injuries with anterior cruciate ligament ruptures based on posterolateral tibial plateau injuries

et al. 2023


Background: The combined injury of the medial collateral ligament complex and the anterior cruciate ligament (ACL) is the most common two ligament injury of the knee. Additional injuries to the medial capsuloligamentous structures are associated with rotational instability and a high failure rate of ACL reconstruction. The study aimed to analyze the specific pattern of medial injuries and their associated risk factors, with the goal of enabling early diagnosis and initiating appropriate therapeutic interventions, if necessary. Results: Between January 2017 and December 2018, 151 patients with acute ACL ruptures with a mean age of 32 ± 12 years were included in this study. The MRIs performed during the acute phase were analyzed by four independent investigators—two radiologists and two orthopedic surgeons. The trauma impact on the posterolateral tibial plateau and associated injuries to the medial complex (POL, dMCL, and sMCL) were examined and revealed an injury to the medial collateral ligament complex in 34.4% of the patients. The dMCL was the most frequently injured structure (92.2%). A dMCL injury was significantly associated with an increase in trauma severity at the posterolateral tibial plateau (p < 0.02) and additional injuries to the sMCL (OR 4.702, 95% CL 1.3–133.3, p = 0.03) and POL (OR 20.818, 95% CL 5.9–84.4, p < 0.0001). Isolated injuries to the sMCL were not observed. Significant risk factors for acquiring an sMCL injury were age (p < 0.01) and injury to the lateral meniscus (p < 0.01). Conclusion: In about one-third of acute ACL ruptures the medial collateral ligament complex is also injured. This might be associated with an increased knee laxity as well as anteromedial rotational instability. Also, this might be associated with an increased risk for failure of revision ACL reconstruction. In addition, we show risk factors and predictors that point to an injury of medial structures and facilitate their diagnosis. This should help physicians and surgeons to precisely diagnose and to assess its scope in order to initiate proper therapies. With this in mind, we would like to draw attention to a frequently occurring combination injury, the so-called “unlucky triad” (ACL, MCL, and lateral meniscus). Level of evidence: Level III, retrospective cohort study.



“…Sorting the classes according to a hyperplane that depicts relative relationships among points concerning the influence of space surrounding each point can estimate whether it is a target or an outlier [42]. All these methods are strongly affected by many factors: the number of extracted classes and their belonging clusters [43], the local density estimation and the local reachability among connected points, boundaries that separate clusters [44], and local outliers [45].…”

Section: Related Work (mentioning)

confidence: 99%

Automatic Clustering and Classification of Coffee Leaf Diseases Based on an Extended Kernel Density Estimation Approach

Hasan, Yusuf, Rahim, et al. 2023. Plants


The current methods of classifying plant disease images are mainly affected by the training phase and the characteristics of the target dataset. Collecting plant samples during different leaf life cycle infection stages is time-consuming. However, these samples may have multiple symptoms that share the same features but with different densities. The manual labelling of such samples demands exhaustive labour work that may contain errors and corrupt the training phase. Furthermore, the labelling and the annotation consider the dominant disease and neglect the minor disease, leading to misclassification. This paper proposes a fully automated leaf disease diagnosis framework that extracts the region of interest based on a modified colour process, according to which syndrome is self-clustered using an extended Gaussian kernel density estimation and the probability of the nearest shared neighbourhood. Each group of symptoms is presented to the classifier independently. The objective is to cluster symptoms using a nonparametric method, decrease the classification error, and reduce the need for a large-scale dataset to train the classifier. To evaluate the efficiency of the proposed framework, coffee leaf datasets were selected to assess the framework performance due to a wide variety of feature demonstrations at different levels of infections. Several kernels with their appropriate bandwidth selector were compared. The best probabilities were achieved by the proposed extended Gaussian kernel, which connects the neighbouring lesions in one symptom cluster, where there is no need for any influencing set that guides toward the correct cluster. Clusters are presented with an equal priority to a ResNet50 classifier, so misclassification is reduced with an accuracy of up to 98%.


“…Imbalanced can cause problems in the classification task because the model can overfit the majority class and under-fit the minority class [25]. To solve that problem, in this step, the re-sampling technique is applied [26]. The re-sampling technique that is applied is oversampling technique.…”

Section: Imbalance Dataset (mentioning)

confidence: 99%

Rule-based Disease Classification using Text Mining on Symptoms Extraction from Electronic Medical Records in Indonesian

Sangaji, Pamungkas, Nugroho, et al. 2022. KINETIK


Recently, electronic medical record (EMR) has become the source of many insights for clinicians and hospital management. EMR stores much important information and new knowledge regarding many aspects for hospital and clinician competitive advantage. It is valuable not only for mining data patterns saved in it regarding the patient symptoms, medication, and treatment, but also it is the box deposit of many new strategies and future trends in the medical world. However, EMR remains a challenge for many clinicians because of its unstructured form. Information extraction helps in finding valuable information in unstructured data. In this paper, information on disease symptoms in the form of text data is the focus of this study. Only the highest prevalence rate of diseases in Indonesia, such as tuberculosis, malignant neoplasm, diabetes mellitus, hypertensive, and renal failure, are analyzed. Pre-processing techniques such as data cleansing and correction play a significant role in obtaining the features. Since the amount of data is imbalanced, SMOTE technique is implemented to overcome this condition. The process of extracting symptoms from EMR data uses a rule-based algorithm. Two algorithms were implemented to classify the disease based on the features, namely SVM and Random Forest. The result showed that the rule-based symptoms extraction works well in extracting valuable information from the unstructured EMR. The classification accuracy was 78% for SVM and 89% for Random Forest.
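
The abstract above mentions SMOTE as the remedy for class imbalance. As a rough illustration of the idea only, the following is a simplified, standard-library sketch of SMOTE-style oversampling: each synthetic point is interpolated between a minority sample and one of its nearest same-class neighbours. The function name `smote_like_oversample`, the parameter `k`, and the toy data are assumptions made for this example; this is neither the citing authors' implementation nor the composite-weights method of the cited article.

```python
import random
from collections import Counter


def smote_like_oversample(X, y, k=2, seed=0):
    """Balance classes by interpolating between same-class neighbours.

    Simplified SMOTE-style sketch (hypothetical helper, not a library API).
    X is a list of numeric feature lists, y the matching list of labels.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # grow every class to the largest class size
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            if not others:
                # A single-sample class can only be duplicated.
                X_out.append(list(base))
                y_out.append(label)
                continue
            # Keep the k nearest same-class neighbours (squared Euclidean distance).
            others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
            neighbour = rng.choice(others[:k])
            t = rng.random()  # interpolation factor in [0, 1)
            X_out.append([a + t * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy data (made up for illustration): four majority points, two minority points.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [5.0, 5.0], [5.2, 5.1]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = smote_like_oversample(X, y)
print(Counter(y_bal))  # both classes now have four samples
```

Unlike plain duplication, the synthetic minority points fall on the segments between existing minority samples, which is the property that distinguishes SMOTE-style methods from random oversampling.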

