Chun-Bo Sim scite author profile

In this article we focus on the detection of possible outliers based on the widely used boxplot procedures. The outliers in a set of data are defined to be a subset of observations that appear to be inconsistent with the remaining observations. We identify the outliers by constructing a boxplot with its lower fence (LF) and upper fence (UF) either (a) satisfying the requirement that if the given sample is outlier-free, then the probability that one or more of the sample data would fall outside the region (LF, UF) is equal to a prescribed small value α, or (b) taken to be the tolerance limits, derived from an outlier-free random sample, within which a specified large proportion β of the sampled population would be asserted to fall with a given large probability γ. Exact expressions that can be routinely used to evaluate the constants needed in the construction of the boxplot's outlier region for samples taken from the family of location-scale distributions are obtained for both procedures. This article shows that the commonly constructed boxplot is in general inappropriate for detecting outliers in normal and especially exponential samples. We recommend that the graphical boxplot be constructed based on knowledge of the underlying distribution of the dataset and by controlling the risk of labeling regular observations as outliers.
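The risk described in (a) can be illustrated with a small simulation. The following is only a sketch, not the article's exact expressions: it assumes the conventional fences LF = Q1 - k·IQR and UF = Q3 + k·IQR with k = 1.5, and it estimates by Monte Carlo how often an outlier-free sample has at least one point outside (LF, UF). The function names `boxplot_fences` and `some_outside_rate`, the sample size, and the trial count are all illustrative assumptions.

```python
import random
import statistics


def boxplot_fences(sample, k=1.5):
    """Return (LF, UF) = (Q1 - k*IQR, Q3 + k*IQR) for a sample."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def some_outside_rate(draw, n, k=1.5, trials=2000, seed=0):
    """Estimate P(at least one of n outlier-free draws falls outside the fences).

    `draw` is a hypothetical callback that takes a random.Random instance
    and returns one observation from the assumed underlying distribution.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        lf, uf = boxplot_fences(sample, k)
        if any(x < lf or x > uf for x in sample):
            hits += 1
    return hits / trials


# Outlier-free samples of size 50 from a normal and an exponential population.
normal_rate = some_outside_rate(lambda rng: rng.gauss(0.0, 1.0), n=50)
expo_rate = some_outside_rate(lambda rng: rng.expovariate(1.0), n=50)
print(f"normal: {normal_rate:.3f}  exponential: {expo_rate:.3f}")
```

Under these assumptions, both estimated rates come out far above a small prescribed α, and the exponential rate is the larger of the two, which is the kind of mismatch the abstract points to when it calls the common boxplot inappropriate for such samples.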

Prediction Data Processing Scheme using an Artificial Neural Network and Data Clustering for Big Data

Jung, Kim, Sim. 2016. IJECE.

Various types of derivative information have been increasing exponentially, based on mobile devices and social networking sites (SNSs), and the information technologies utilizing them have also been developing rapidly. Technologies to classify and analyze such information are as important as data generation. This study concentrates on data clustering through principal component analysis and K-means algorithms to analyze and classify user data efficiently. We propose a technique of changing the cluster choice before cluster processing in the existing K-means practice into a variable cluster choice through principal component analysis, and expanding the scope of data clustering. The technique also applies an artificial neural network learning model for user recommendation and prediction from the clustered data. The proposed processing model for predicted data generated results that improved the existing artificial neural network–based data clustering and learning model by approximately 9.25%.

A Novel on Automatic K Value for Efficiency Improvement of K-means Clustering

Jung, Kim, Lim, et al. 2017.

Compression artifacts removal by signal adaptive weighted sum technique

Kim, Sim. 2011. IEEE Trans. Consumer Electron.

A Study on Mushroom Growth Environment Analysis System based on Machine Learning for Efficient Operation of Mushroom Plantation

Kim, Jung, Sim. 2017.

Much technology research is being conducted on integration with artificial intelligence technologies in the 6th Industry, a recent topic of ICT convergence technology in the agriculture and life industry. Of the artificial intelligence technologies, machine learning in particular requires Big Data analysis techniques in the agriculture and life industry, which has various types of data and uses many different methodologies. This study set out to propose a Big Data-based integrated system to manage and analyze a mushroom growth environment for the efficient management of mushroom plantations.

A Study on Mushroom Pest and Diseases Analysis System Implementation based on Convolutional Neural Networks for Smart Farm

Kim, Jung, So, et al. 2017. IJCA.

A Study on an Enhanced Autonomous Driving Simulation Model Based on Reinforcement Learning Using a Collision Prevention Model

2021

This paper set out to revise and improve existing autonomous driving models using reinforcement learning, thus proposing a reinforced autonomous driving prediction model. The paper conducted training for a reinforcement learning model using DQN, a reinforcement learning algorithm. The main aim of this paper was to reduce the time spent on training and improve self-driving performance. Rewards for reinforcement learning agents were developed to mimic human driving behavior as much as possible. High rewards were given for greater distance travelled within lanes and higher speed. Negative rewards were given when a vehicle crossed into other lanes or had a collision. Performance evaluation was carried out in urban environments without pedestrians. The performance test results show that the model with the collision prevention model exhibited faster performance improvement within the same time compared to when the model was not applied. However, vulnerabilities to factors such as pedestrians and vehicles approaching from the side were not addressed, and the lack of stability in the definition of compensation functions and limitations with respect to the excessive use of memory were shown.
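The reward scheme described in this abstract can be sketched as a single per-step function. This is only an illustration under stated assumptions: the abstract gives no numeric values, so the weights and penalties, the `StepInfo` fields, and the function name `step_reward` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Per-step simulator readings; all field names are hypothetical."""
    distance_in_lane_m: float  # distance travelled inside the lane this step
    speed_mps: float           # current speed in metres per second
    left_lane: bool            # True if the vehicle crossed into another lane
    collided: bool             # True if the vehicle had a collision


def step_reward(info, w_dist=1.0, w_speed=0.1,
                lane_penalty=5.0, crash_penalty=100.0):
    """Reward shaped as in the abstract: positive for in-lane distance and
    speed, negative for lane crossings and collisions (values assumed)."""
    if info.collided:
        return -crash_penalty
    if info.left_lane:
        return -lane_penalty
    return w_dist * info.distance_in_lane_m + w_speed * info.speed_mps


# A clean step earns a positive reward; a collision dominates everything else.
clean = step_reward(StepInfo(2.0, 10.0, left_lane=False, collided=False))
crash = step_reward(StepInfo(2.0, 10.0, left_lane=False, collided=True))
print(clean, crash)
```

In a DQN training loop, a value like this would be what the environment returns for each transition. Making the collision penalty much larger than any single-step gain is one common design choice, assumed here rather than reported by the authors.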

Outlier Labeling With Boxplot Procedures