Chin-Kuan Ho scite author profile

Chin-Kuan Ho

5 Publications

14 Citation Statements Received

312 Citation Statements Given

How they've been cited

How they cite others

230

312

Affiliations

Asia Pacific University of Technology & Innovation, Multimedia University

Publications

A Hybrid Distance Measure for Clustering Expressed Sequence Tags Originating from the Same Gene Family

Phon-Amnuaisuk

2012

PLoS ONE

Background: Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt alignment-free distance measures, which tend to yield acceptable clustering accuracies in reasonable computational time. Although these clustering methods work satisfactorily on a majority of EST datasets, they share a common weakness: they are prone to deliver unsatisfactory clustering results when dealing with ESTs from genes derived from the same family. The root cause is that the distance measures applied to them are not sensitive enough to separate these closely related genes. Methodology/Principal Findings: We propose a hybrid distance measure that combines the global and local features extracted from ESTs, with the aim of addressing the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on ten EST datasets, and the clustering results are compared with two alignment-free EST clustering tools, i.e. wcd and PEACE. The clustering results indicate that the proposed hybrid distance measure performs relatively better (in terms of clustering accuracy) than both EST clustering tools. Conclusions/Significance: The clustering results provide support for the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement in clustering accuracy on the experimental datasets supports the claim that the sensitivity of the hybrid distance measure is sufficient to solve the clustering problem.
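The abstract above pairs an alignment-free distance with DBSCAN. As a rough illustration only, the sketch below clusters toy sequences with a textbook DBSCAN over a cosine distance on k-mer counts. This is not the paper's hybrid measure: the k-mer cosine distance is a simple stand-in for a global alignment-free feature, and the function names (`kmer_profile`, `kmer_distance`, `dbscan`), the parameter values, and the toy reads are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Count overlapping k-mers in a sequence (a global, alignment-free feature)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Cosine distance between k-mer count vectors (assumed stand-in measure)."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    dot = sum(pa[m] * pb[m] for m in pa)
    norm = sqrt(sum(v * v for v in pa.values())) * sqrt(sum(v * v for v in pb.values()))
    return 1.0 - dot / norm if norm else 1.0


def dbscan(items, dist, eps, min_pts):
    """Textbook DBSCAN over an arbitrary distance function; label -1 marks noise."""
    labels = [None] * len(items)

    def neighbours(i):
        # A point counts as its own neighbour because dist(x, x) is ~0.
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# Toy reads (hypothetical): two groups with unrelated k-mer content.
reads = [
    "ACGTACGTACGT", "ACGTACGTACGA", "CGTACGTACGTA",
    "TTTTGGGGTTTTGGGG", "TTTGGGGTTTTGGGGT", "GGGGTTTTGGGGTTTT",
]
labels = dbscan(reads, kmer_distance, eps=0.3, min_pts=2)
```

With these toy reads the first three and the last three sequences fall into two separate clusters. A global k-mer profile like this one is exactly the kind of measure the abstract describes as too coarse for closely related gene families, which is the motivation given for adding local features.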

Front-end deep learning web apps development and deployment: a review

Goh

Abas

2022

Appl Intell

Machine learning and deep learning models are commonly developed using programming languages such as Python, C++, or R and deployed as web apps delivered from a back-end server or as mobile apps installed from an app store. Recently, however, front-end technologies and JavaScript libraries, such as TensorFlow.js, have been introduced to make machine learning more accessible to researchers and end-users. Using JavaScript, TensorFlow.js can define, train, and run new or existing pre-trained machine learning models entirely in the browser on the client side, which improves the user experience through interaction while preserving privacy. Deep learning models deployed in front-end browsers must be small, have fast inference, and ideally be interactive in real time. Therefore, the emphasis in development and deployment is different. This paper aims to review the development and deployment of these deep learning web apps to raise awareness of the recent advancements and encourage more researchers to take advantage of this technology for their own work. First, the rationale behind the deployment stack (front-end, JavaScript, and TensorFlow.js) is discussed. Then, the development approach for obtaining deep learning models that are optimized and suitable for front-end deployment is described. The article also presents current web applications divided into seven categories to show deep learning potential on the front end. These include web apps for deep learning playground; pose detection and gesture tracking; music and art creation; expression detection and facial recognition; video segmentation; image and signal analysis; and healthcare diagnosis, recognition, and identification.

A new optimization driven clustering algorithm for large circuits

Ding

Ho

Irwin

Utilizing data sampling techniques on algorithmic fairness for customer churn prediction with data imbalance problems

Maw

Haw

2021

F1000Res

Background: Customer churn prediction (CCP) refers to detecting which customers are likely to cancel the services provided by a service provider, for example, internet services. The class imbalance problem (CIP) in machine learning occurs when there is a huge difference in the samples of the positive class compared to the negative class. It is one of the major obstacles in CCP as it deteriorates performance in the classification process. Utilizing data sampling techniques (DSTs) helps to resolve the CIP to some extent. Methods: In this paper, we review the effect of using DSTs on algorithmic fairness, i.e., to investigate whether the results pose any discrimination between male and female groups and compare the results before and after using DSTs. Three real-world datasets with unequal balancing rates were prepared and four ubiquitous DSTs were applied to them. Six popular classification techniques were utilized in the classification process. Both classifier performance and algorithmic fairness are evaluated with notable metrics. Results: The results indicated that the Random Forest classifier outperforms other classifiers in all three datasets and that using the SMOTE and ADASYN techniques causes more discrimination in the female group. The rate of unintentional discrimination seems to be higher in the original data of extremely unbalanced datasets under the following classifiers: Logistic Regression, LightGBM, and XGBoost. Conclusions: Algorithmic fairness has become a broadly studied area in recent years, yet there is very little systematic study on the effect of using DSTs on algorithmic fairness. This study presents important findings to further the use of algorithmic fairness in CCP research.

Utilizing data sampling techniques on algorithmic fairness for customer churn prediction with data imbalance problems

Maw

Haw

2022

F1000Res

Background: Customer churn prediction (CCP) refers to detecting which customers are likely to cancel the services provided by a service provider, for example, internet services. The class imbalance problem (CIP) in machine learning occurs when there is a huge difference in the samples of the positive class compared to the negative class. It is one of the major obstacles in CCP as it deteriorates performance in the classification process. Utilizing data sampling techniques (DSTs) helps to resolve the CIP to some extent. Methods: In this paper, we review the effect of using DSTs on algorithmic fairness, i.e., to investigate whether the results pose any discrimination between male and female groups and compare the results before and after using DSTs. Three real-world datasets with unequal balancing rates were prepared and four ubiquitous DSTs were applied to them. Six popular classification techniques were utilized in the classification process. Both classifier performance and algorithmic fairness are evaluated with notable metrics. Results: The results indicated that the Random Forest classifier outperforms other classifiers in all three datasets and that using the SMOTE and ADASYN techniques causes more discrimination in the female group. The rate of unintentional discrimination seems to be higher in the original data of extremely unbalanced datasets under the following classifiers: Logistic Regression, LightGBM, and XGBoost. Conclusions: Algorithmic fairness has become a broadly studied area in recent years, yet there is very little systematic study on the effect of using DSTs on algorithmic fairness. This study presents important findings to further the use of algorithmic fairness in CCP research.
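To make the two ideas in this abstract concrete, here is a minimal sketch of a data sampling step and a group fairness check. It uses plain random oversampling rather than SMOTE or ADASYN (which synthesize new minority samples instead of duplicating existing ones), and it uses the demographic parity difference as the fairness measure. The abstract does not name its metrics, so that choice, the function names, the `churn` field, and the toy data are all assumptions for illustration.

```python
import random


def random_oversample(rows, label_key="churn", seed=0):
    """Duplicate minority-class rows at random until both classes are the same size."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(rows)  # nothing to sample from; return the data unchanged
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def positive_rate_gap(predictions, groups, a="female", b="male"):
    """Demographic parity difference: P(pred=1 | group a) - P(pred=1 | group b).

    Assumes both groups appear at least once in `groups`.
    """
    def rate(g):
        picked = [p for p, grp in zip(predictions, groups) if grp == g]
        return sum(picked) / len(picked)

    return rate(a) - rate(b)


# Hypothetical toy data: 8 non-churners and 2 churners.
rows = [{"id": i, "churn": 0} for i in range(8)] + [{"id": 8, "churn": 1}, {"id": 9, "churn": 1}]
balanced = random_oversample(rows)

# Hypothetical predictions for three female and three male customers.
gap = positive_rate_gap([1, 1, 0, 0, 1, 0],
                        ["female", "female", "female", "male", "male", "male"])
```

A gap of zero would mean both groups are predicted to churn at the same rate. Comparing the gap for a model trained on `rows` against one trained on `balanced` mirrors the before-and-after comparison the abstract describes.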
