Hamed Valizadegan scite author profile

We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the Minimal Predictive Temporal Patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin induced thrombocytopenia. The results demonstrate the benefit of our approach in efficiently learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems.

show abstract

A Pattern Mining Approach for Classifying Multivariate Temporal Data

Batal

Valizadegan

Cooper

et al. 2011

View full text Add to dashboard Cite

We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the minimal predictive temporal patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin induced thrombocytopenia. The results demonstrate the benefit of our approach in learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems.

show abstract

Semi-Supervised Boosting for Multi-Class Classification

Valizadegan

Jin

Jain

View full text Add to dashboard Cite

Abstract. Most semi-supervised learning algorithms have been designed for binary classification, and are extended to multi-class classification by approaches such as one-against-the-rest. The main shortcoming of these approaches is that they are unable to exploit the fact that each example is only assigned to one class. Additional problems with extending semi-supervised binary classifiers to multi-class problems include imbalanced classification and different output scales of different binary classifiers. We propose a semi-supervised boosting framework, termed Multi-Class Semi-Supervised Boosting (MCSSB), that directly solves the semi-supervised multi-class learning problem. Compared to the existing semi-supervised boosting methods, the proposed framework is advantageous in that it exploits both classification confidence and similarities among examples when deciding the pseudo-labels for unlabeled examples. Empirical study with a number of UCI datasets shows that the proposed MCSSB algorithm performs better than the state-of-theart boosting algorithms for semi-supervised learning.

show abstract

ExoMiner: A Highly Accurate and Explainable Deep Learning Classifier That Validates 301 New Exoplanets

Valizadegan

Martinho

Wilkens

et al. 2022

ApJ

View full text Add to dashboard Cite

The Kepler and Transiting Exoplanet Survey Satellite (TESS) missions have generated over 100,000 potential transit signals that must be processed in order to create a catalog of planet candidates. During the past few years, there has been a growing interest in using machine learning to analyze these data in search of new exoplanets. Different from the existing machine learning works, ExoMiner, the proposed deep learning classifier in this work, mimics how domain experts examine diagnostic tests to vet a transit signal. ExoMiner is a highly accurate, explainable, and robust classifier that (1) allows us to validate 301 new exoplanets from the MAST Kepler Archive and (2) is general enough to be applied across missions such as the ongoing TESS mission. We perform an extensive experimental study to verify that ExoMiner is more reliable and accurate than the existing transit signal classifiers in terms of different classification and ranking metrics. For example, for a fixed precision value of 99%, ExoMiner retrieves 93.6% of all exoplanets in the test set (i.e., recall = 0.936), while this rate is 76.3% for the best existing classifier. Furthermore, the modular design of ExoMiner favors its explainability. We introduce a simple explainability framework that provides experts with feedback on why ExoMiner classifies a transit signal into a specific class label (e.g., planet candidate or not planet candidate).

show abstract

Learning classification models with soft-label information

Nguyen

Valizadegan

Hauskrecht

2014

J Am Med Inform Assoc

View full text Add to dashboard Cite

show abstract

Learning classification models from multiple experts

Valizadegan¹,

Nguyen²,

Hauskrecht³

2013

Journal of Biomedical Informatics

View full text Add to dashboard Cite

Building classification models from clinical data using machine learning methods often relies on labeling of patient examples by human experts. Standard machine learning framework assumes the labels are assigned by a homogeneous process. However, in reality the labels may come from multiple experts and it may be difficult to obtain a set of class labels everybody agrees on; it is not uncommon that different experts have different subjective opinions on how a specific patient example should be classified. In this work we propose and study a new multi-expert learning framework that assumes the class labels are provided by multiple experts and that these experts may differ in their class label assessments. The framework explicitly models different sources of disagreements and lets us naturally combine labels from different human experts to obtain: (1) a consensus classification model representing the model the group of experts converge to, as well as, and (2) individual expert models. We test the proposed framework by building a model for the problem of detection of the Heparin Induced Thrombocytopenia (HIT) where examples are labeled by three experts. We show that our framework is superior to multiple baselines (including standard machine learning framework in which expert differences are ignored) and that our framework leads to both improved consensus and individual expert models.

show abstract

Vector Autoregressive Model-Based Anomaly Detection in Aviation Systems

Melnyk

Matthews

Valizadegan

et al. 2016

Journal of Aerospace Information Systems

View full text Add to dashboard Cite

Detecting anomalies in datasets, where each data object is a multivariate time series (MTS), possibly of different length for each data object, is emerging as a key problem in certain domains. We consider the problem in the context of aviation safety, where data objects are flights of various durations, and the MTS corresponds to sensor readings. The goal then is to detect anomalous flight segments, due to mechanical, environmental, or human factors. In this paper, we present a general framework for anomaly detection in such settings, by representing each MTS using a vector autoregressive exogenous (VARX) model, constructing a distance matrix among the objects based on their respective VARX models, and finally detecting anomalies based on the object dissimilarities. The framework is scalable, due to the inherent parallel nature of most computations, and can be used to perform online anomaly detection. Experimental results on a real flight dataset illustrate that the framework can detect different types of multivariate anomalies along with the key parameters involved.

show abstract

Kernel Based Detection of Mislabeled Training Examples

Valizadegan

Tan

2007

View full text Add to dashboard Cite

The problem of identifying mislabeled training examples has been examined in several studies, with a variety of approaches developed for editing the training data to obtain better classifiers. Many of these approaches involve applying an individual or an ensemble of classifiers to the training set and filtering the mislabeled examples based on their consistency with respect to the classifier's outputs. In this study, we formulate mislabeled detection as an optimization problem and introduce a kernel-based approach for filtering the mislabeled examples. Experimental results using a variety of data sets from the UCI data repository demonstrate the effectiveness of our proposed method, compared to existing nearest-neighbor and ensemble-based filtering schemes.

show abstract

12 3

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.