Andrew Yatsko scite author profile

Missing values may be present in data without undermining its use for diagnostic or classification purposes, but they compromise the application of readily available software. Surrogate entries can remedy the situation, although the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; in particular, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation. Some classifiers can be reinvented as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are demonstrated to have the required aptness. An approach is implemented whereby the entered missing values are not necessarily a close match of the true data; however, they are intended to cause the least hindrance for classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represents a number of related conditions, taking the Cartesian product of class values of the underlying sub-problems allows narrowing down the selection of missing value substitutes. Real-world data examples, some publicly available, are used for testing. The proposed and benchmark methods are compared by classifying the data before and after missing value imputation, indicating a significant improvement.
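As a rough illustration of how a nearest-neighbour classifier can double as a data completion method on nominal data, the sketch below fills each missing cell from the most similar row that has a value for that attribute. It is a generic sketch, not the method from the paper: the function names, the overlap distance, and the toy rows are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Row = List[Optional[str]]  # None marks a missing nominal value


def overlap_distance(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> float:
    """Fraction of mismatches over attributes known in both rows (1.0 if none shared)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 1.0
    return sum(x != y for x, y in shared) / len(shared)


def impute_nearest(rows: List[Row]) -> List[Row]:
    """Return a copy of rows with each missing cell taken from the closest donor row."""
    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that actually have a value for attribute j.
            donors = [r for k, r in enumerate(rows) if k != i and r[j] is not None]
            if donors:
                best = min(donors, key=lambda r: overlap_distance(row, r))
                completed[i][j] = best[j]
    return completed


# Toy, made-up rows (already discretized); the second row lacks its third attribute.
rows = [
    ["high", "yes", "old"],
    ["high", "yes", None],
    ["low", "no", "young"],
]
filled = impute_nearest(rows)
```

Here the incomplete row agrees with the first row on both known attributes, so it borrows that row's third value. Distances are computed on the original rows only, so earlier substitutions do not influence later ones.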

Data-analytically derived flexible HbA1c thresholds for type 2 diabetes mellitus diagnostic

Stranieri, Yatsko, Jelinek et al. 2015, AIR

Glycated haemoglobin (HbA1c) is now more commonly used as an alternative to the fasting plasma glucose and oral glucose tolerance tests for the identification of Type 2 Diabetes Mellitus (T2DM) because it is easily obtained using point-of-care technology and represents long-term blood sugar levels. According to WHO guidelines, HbA1c values of 6.5% or above are required for a diagnosis of T2DM. However, outcomes of a large number of trials with HbA1c have been inconsistent across the clinical spectrum, and further research is required to determine the efficacy of HbA1c testing in identification of T2DM. Medical records from a diabetes screening program in Australia illustrate that many patients could be classified as diabetic if other clinical indicators are included, even though the HbA1c result does not exceed 6.5%. This suggests that a cutoff of 6.5% for the general population may be too simple and may miss individuals at risk or with already overt, undiagnosed diabetes. In this study, data mining algorithms have been applied to identify markers that can be used with HbA1c. The results indicate that T2DM is best classified by HbA1c at 6.2%, a cutoff lower than the currently recommended one. Assuming a flexible threshold, the cutoff can be lower still if, in addition to HbA1c being high, the rule is conditioned on oxidative stress or inflammation being present, atherogenicity or adiposity being high, or hypertension being diagnosed, among other markers.
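The idea of a flexible threshold conditioned on co-markers can be sketched as a simple rule. This is only an illustration of the rule's shape, not the model from the study: the `Markers` fields, the `flag_t2dm` function, and in particular the relaxed cutoff passed in the usage lines are hypothetical, since the abstract does not state how much lower the cutoff becomes.

```python
from dataclasses import dataclass


@dataclass
class Markers:
    hba1c: float                    # HbA1c in percent
    oxidative_stress: bool = False  # co-markers named in the abstract
    inflammation: bool = False
    hypertension: bool = False


def flag_t2dm(m: Markers, base_cutoff: float, relaxed_cutoff: float) -> bool:
    """Flag T2DM at the base cutoff, or at the relaxed cutoff when a co-marker is present."""
    if relaxed_cutoff > base_cutoff:
        raise ValueError("relaxed_cutoff must not exceed base_cutoff")
    has_comarker = m.oxidative_stress or m.inflammation or m.hypertension
    return m.hba1c >= base_cutoff or (has_comarker and m.hba1c >= relaxed_cutoff)


# 6.2 is the cutoff reported in the abstract; 6.0 is a placeholder, not a study result.
above_base = flag_t2dm(Markers(hba1c=6.3), base_cutoff=6.2, relaxed_cutoff=6.0)
below_base = flag_t2dm(Markers(hba1c=6.1), base_cutoff=6.2, relaxed_cutoff=6.0)
with_comarker = flag_t2dm(Markers(hba1c=6.1, hypertension=True), base_cutoff=6.2, relaxed_cutoff=6.0)
```

Both cutoffs are required arguments so that no default value is mistaken for a clinical recommendation.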

Data analytics identify glycated haemoglobin co-markers for type 2 diabetes mellitus diagnosis

Jelinek, Stranieri, Yatsko et al. 2016, Computers in Biology and Medicine

Indexing adult obesity by waist-to-height and weight-to-height ratios

Yatsko 2017, JBEI

To date, extensive evidence exists that the waist circumference to height ratio (WCHR) provides a better measure of obesity than the body mass index (BMI). While weight and height are routinely obtained to calculate BMI, waist circumference, despite being easily acquired, is often overlooked because the screening protocols, particularly for diabetes, demand BMI. This creates an obstacle for the application of WCHR, a more definite measure than BMI for the diagnosis of many obesity-linked metabolic disorders such as diabetes, cardiovascular disease and hypertension. This article is intended to fill the gap in the literature by providing a conversion from BMI to WCHR for five adult age categories. A strong linearity between the measures is demonstrated, and WCHR thresholds equivalent to the BMI thresholds are provided to identify normality, overweight and obesity, as well as other points. The analysis is based on data from the National Health and Nutrition Examination Survey (NHANES). Different forms of BMI are also discussed and a strong linearity between them is demonstrated. An obesity index based on a simple weight to height ratio, scaled to match the standard levels, is proposed. The equivalence between the proposed and existing obesity indices is tested on the original data with promising results.
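A linear conversion between BMI and WCHR amounts to fitting a straight line to paired measurements. The sketch below computes both indices and fits the line by ordinary least squares. It is a minimal sketch under stated assumptions: the function names are illustrative, the fitted points are synthetic, and no coefficient here comes from the article or from NHANES.

```python
from typing import Sequence, Tuple


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


def wchr(waist_cm: float, height_cm: float) -> float:
    """Waist circumference to height ratio (both in the same unit)."""
    return waist_cm / height_cm


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, exactly collinear points used only to exercise the fit.
slope, intercept = fit_line([20.0, 25.0, 30.0], [0.40, 0.50, 0.60])


def bmi_to_wchr(bmi_value: float) -> float:
    """Map a BMI value to WCHR using the line fitted above (synthetic coefficients)."""
    return slope * bmi_value + intercept
```

With real paired data, the same fit could be repeated per age category to obtain one conversion line for each.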

Capped K-NN Editing in Definition Lacking Environments

Stranieri, Yatsko, Golden et al. 2013, JPRR

Missing Data Imputation for Individualised CVD Diagnostic and Treatment

Venkatraman, Yatsko, Stranieri et al. 2016

Weighting features by the value displacement rebound

Yatsko 2020, AIR

Learning from examples draws on similarity, a concept whose formalisation leads to the notion of instance space. Continuous spaces are easier to embrace since, unlike discrete ones, they can often be seen as hyper-constructs of 3D. Unsurprisingly, instance-based learning methods are more developed for continuous domains than for discrete ones. The value difference metric (VDM) is one of the few examples of metrics for discrete spaces. Reports on the utility of VDM are mixed. In this paper VDM is compared with another approach where data features are weighted by the Information Gain. Some vulnerabilities of VDM are identified. A weighting method that is nothing like VDM, although inspired by it, is proposed. The results favour the new weighting scheme, and its utility for health diagnostics is illustrated.
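For readers unfamiliar with VDM, the sketch below shows its commonly cited simplified form for a single nominal attribute: the distance between two values is the sum, over classes, of the absolute difference between their class-conditional probabilities raised to a power q. The function names and toy data are assumptions for illustration, and the sketch does not attempt the weighting scheme proposed in the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Sequence

Probs = Dict[Hashable, Dict[Hashable, float]]


def class_conditionals(values: Sequence[Hashable], labels: Sequence[Hashable]) -> Probs:
    """Estimate P(class | attribute value) from paired observations."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    return {
        v: {c: cnt[c] / sum(cnt.values()) for c in classes}
        for v, cnt in counts.items()
    }


def vdm(probs: Probs, x: Hashable, y: Hashable, q: int = 1) -> float:
    """Value difference between two values of one attribute."""
    return sum(abs(probs[x][c] - probs[y][c]) ** q for c in probs[x])


# Toy data: one nominal attribute and a binary class label.
values = ["a", "a", "b", "b", "c", "c"]
labels = ["pos", "pos", "neg", "neg", "pos", "neg"]
probs = class_conditionals(values, labels)

d_ab = vdm(probs, "a", "b")  # values with opposite class profiles
d_ac = vdm(probs, "a", "c")  # "c" is split evenly between the classes
```

Two values are close under VDM when they predict the classes similarly, regardless of what the values themselves are. A distance between whole instances would sum this quantity over attributes.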

Personalised measures of obesity using waist to height ratios from an Australian health screening program

et al. 2019

Objectives: The aim of the current study is to generate waist circumference to height ratio cut-off values for obesity categories from a model of the relationship between body mass index and waist circumference to height ratio. We compare the waist circumference to height ratio discovered in this way with cut-off values currently prevalent in practice that were originally derived using pragmatic criteria. Method: Personalized data including age, gender, height, weight, waist circumference and presence of diabetes, hypertension and cardiovascular disease for 847 participants over eight years were assembled from participants attending a rural Australian health review clinic (DiabHealth). Obesity was classified based on the conventional body mass index measure (weight/height²) and compared to the waist circumference to height ratio. Correlations between the measures were evaluated on the screening data, and independently on data from the National Health and Nutrition Examination Survey that included age categories. Results: This article recommends waist circumference to height ratio cut-off values based on an Australian rural sample and verified using the National Health and Nutrition Examination Survey database that facilitates the classification of obesity in clinical practice. Gender independent cut-off values are provided for waist circumference to height ratio that identify healthy (waist circumference to height ratio ≥0.45), overweight (0.53) and the three obese (0.60, 0.68, 0.75) categories verified on the National Health and Nutrition Examination Survey dataset. A strong linearity between the waist circumference to height ratio and the body mass index measure is demonstrated. Conclusion: The recommended waist circumference to height ratio cut-off values provided a useful index for assessing stages of obesity and risk of chronic disease for improved healthcare in clinical practice.
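The cut-off values quoted in the abstract can be applied with a simple lookup. The sketch below assumes each value is the lower bound of its category, which is one reading of the abstract; the category labels, including the label for ratios under 0.45, and the function name are illustrative choices rather than terms from the study.

```python
from bisect import bisect_right

# Cut-offs as listed in the abstract, read here as lower bounds of each category.
CUTOFFS = [0.45, 0.53, 0.60, 0.68, 0.75]
LABELS = [
    "below healthy range",  # label assumed; the abstract does not name this band
    "healthy",
    "overweight",
    "obese I",
    "obese II",
    "obese III",
]


def wchr_category(waist_cm: float, height_cm: float) -> str:
    """Classify a waist circumference to height ratio against the cut-offs."""
    if waist_cm <= 0 or height_cm <= 0:
        raise ValueError("measurements must be positive")
    ratio = waist_cm / height_cm
    # bisect_right counts the cut-offs that are <= ratio, which indexes the label.
    return LABELS[bisect_right(CUTOFFS, ratio)]


example = wchr_category(waist_cm=80, height_cm=175)  # ratio is about 0.457
```

Because the cut-offs are gender independent, the lookup needs only the two measurements.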

Diagnostic with incomplete nominal/discrete data