Deep neural networks can extract clinical information, such as diabetic retinopathy status and individual characteristics (e.g. age and sex), from retinal images. Here, we report the first study to train deep learning models with retinal images from 3,000 Qatari citizens participating in the Qatar Biobank study. We investigated whether fundus images can predict cardiometabolic risk factors, such as age, sex, blood pressure, smoking status, glycaemic status, total lipid panel, sex steroid hormones and bioimpedance measurements. Additionally, the role of age and sex as mediating factors when predicting cardiometabolic risk factors from fundus images was studied. Person-level predictions were made by combining information from optic disc-centred and macula-centred images of both eyes, using deep learning models based on the MobileNet-V2 architecture. An accurate prediction was obtained for age (mean absolute error (MAE): 2.78 years) and sex (area under the curve: 0.97), while an acceptable performance was achieved for systolic blood pressure (MAE: 8.96 mmHg), diastolic blood pressure (MAE: 6.84 mmHg), haemoglobin A1c (MAE: 0.61%), relative fat mass (MAE: 5.68 units) and testosterone (MAE: 3.76 nmol/L). We discovered that age and sex were mediating factors when predicting cardiometabolic risk factors from fundus images. We found that deep learning models indirectly predict sex when trained for testosterone. For blood pressure, haemoglobin A1c and relative fat mass, an influence of age and sex was observed. However, the achieved performance cannot be fully explained by the influence of age and sex. In conclusion, we confirm that age and sex can be predicted reliably from a fundus image and that unique information is stored in the retina that relates to blood pressure, haemoglobin A1c and relative fat mass. Future research should focus on stratification when predicting person characteristics from a fundus image.
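As an illustration of the person-level evaluation described above, the following minimal sketch averages per-image predictions into one value per person and then computes the mean absolute error. The abstract does not state how the four images were combined, so plain averaging is an assumption here, and the function names and numbers are made up for the example.

```python
from statistics import mean


def person_level_predictions(image_preds):
    """Average per-image predictions into one prediction per person.

    ASSUMPTION: simple averaging; the abstract does not specify the
    combination rule. image_preds maps a person id to the predictions for
    that person's images (e.g. disc- and macula-centred, both eyes).
    """
    return {pid: mean(preds) for pid, preds in image_preds.items()}


def mean_absolute_error(y_true, y_pred):
    """MAE over the persons present in y_true."""
    return mean(abs(y_true[pid] - y_pred[pid]) for pid in y_true)


# Made-up age predictions (years), four images per person.
image_preds = {
    "p1": [41.0, 43.0, 40.0, 44.0],
    "p2": [58.0, 61.0, 60.0, 57.0],
}
true_age = {"p1": 40.0, "p2": 62.0}

person_preds = person_level_predictions(image_preds)
mae = mean_absolute_error(true_age, person_preds)
print(person_preds, mae)
```

With these toy values the two person-level errors are 2 and 3 years, so the MAE is 2.5 years.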
Introduction: The eye offers potential for the diagnosis of Alzheimer’s disease (AD), with retinal imaging techniques being explored to quantify amyloid accumulation and aspects of neurodegeneration. To assess these changes, this proof-of-concept study combined hyperspectral imaging and optical coherence tomography to build a classification model to differentiate between AD patients and controls. Methods: In a memory clinic setting, patients with a diagnosis of clinically probable AD (n = 10) or biomarker-proven AD (n = 7) and controls (n = 22) underwent non-invasive retinal imaging with an easy-to-use hyperspectral snapshot camera that collects information from 16 spectral bands (460–620 nm, 10-nm bandwidth) in one capture. The individuals were also imaged using optical coherence tomography to assess retinal nerve fiber layer (RNFL) thickness. Dedicated image preprocessing was followed by machine learning to discriminate between both groups. Results: Hyperspectral data and RNFL thickness data were used in a linear discriminant classification model to discriminate between AD patients and controls. Nested leave-one-out cross-validation resulted in a fair accuracy, providing an area under the receiver operating characteristic curve of 0.74 (95% confidence interval [0.60–0.89]). Inner loop results showed that the inclusion of the RNFL features improved the area under the receiver operating characteristic curve: for the most informative region assessed, the average area under the curve was 0.70 (95% confidence interval [0.55–0.86]) without and 0.79 (95% confidence interval [0.65–0.93]) with the RNFL features. The robust statistics used in this study reduce the risk of overfitting and partly compensate for the limited sample size. Conclusions: This study in a memory-clinic-based cohort supports the potential of hyperspectral imaging and suggests an added value of combining retinal imaging modalities. Standardization and longitudinal data on fully amyloid-phenotyped cohorts are required to elucidate the relationship between retinal structure and cognitive function and to evaluate the robustness of the classification model.
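The nested leave-one-out scheme mentioned in the Results can be sketched as follows. This is not the study's pipeline: a nearest-centroid score stands in for the linear discriminant model, the inner loop is assumed to choose between candidate feature sets, and all data and names are hypothetical.

```python
from statistics import mean


def centroid_score(train_X, train_y, x, features):
    """Score x by how much closer it lies to the patient centroid (label 1)
    than to the control centroid (label 0), using only the given features.
    NOTE: a stand-in for a linear discriminant classifier."""
    def centroid(label):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        return [mean(r[f] for r in rows) for f in features]

    def sqdist(c):
        return sum((x[f] - cf) ** 2 for f, cf in zip(features, c))

    return sqdist(centroid(0)) - sqdist(centroid(1))


def auc(scores, labels):
    """AUC as the fraction of correctly ranked (patient, control) pairs;
    ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loo_scores(X, y, features):
    """Leave-one-out scores: each sample is scored by a model fit on the rest."""
    out = []
    for i in range(len(X)):
        tr_X, tr_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        out.append(centroid_score(tr_X, tr_y, X[i], features))
    return out


def nested_loo(X, y, candidate_feature_sets):
    """Outer loop: hold out one sample. Inner loop: pick the feature set with
    the best leave-one-out AUC on the remaining samples only, so the held-out
    sample never influences model selection."""
    outer_scores = []
    for i in range(len(X)):
        tr_X, tr_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        best = max(candidate_feature_sets,
                   key=lambda fs: auc(loo_scores(tr_X, tr_y, fs), tr_y))
        outer_scores.append(centroid_score(tr_X, tr_y, X[i], best))
    return auc(outer_scores, y)


# Made-up data: feature 0 separates the groups, feature 1 is noise.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [1.1, 6.0],
     [3.0, 4.5], [3.2, 5.5], [2.9, 3.5], [3.1, 5.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
feature_sets = [(0,), (1,), (0, 1)]
nested_auc = nested_loo(X, y, feature_sets)
print(nested_auc)
```

Because selection happens only inside the inner loop, the outer AUC is an estimate of performance on unseen individuals, which is why this design is often preferred for small cohorts.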
Metrics that capture changes in the retinal microvascular structure are relevant in the context of cardiometabolic disease development. The microvascular topology is typically quantified using monofractals, although it obeys more complex multifractal rules. We study mono- and multifractals of the retinal microvasculature in relation to cardiometabolic factors. Methods: The cross-sectional retrospective study used data from 3000 Middle Eastern participants in the Qatar Biobank. A total of 2333 fundus images (78%) passed quality control and were used for further analysis. The monofractal dimension (D_f) and five multifractal metrics were associated with cardiometabolic factors using multiple linear regression and were studied in clinically relevant subgroups. Results: D_f and multifractals decrease with age, and D_f is lower in males than in females. In models corrected for age and sex, D_f is significantly associated with BMI, insulin, systolic blood pressure, glycated haemoglobin (HbA1c), albumin, LDL and total cholesterol concentrations. Multifractals are negatively associated with systolic and diastolic blood pressure, glucose and the WHO/ISH cardiovascular risk score. D_f was higher, and multifractal curve asymmetry was lower, in diabetic patients (HbA1c > 6.5%) compared to healthy individuals (HbA1c < 5.7%). Insulin resistance (insulin ≥ 23 mcU/mL) was associated with significantly lower D_f values. Conclusion: One or more fractal metrics are associated with sex, age, BMI, systolic and diastolic blood pressure and biochemical blood measurements in a Middle Eastern population study. Follow-up studies aiming at investigating retinal microvascular changes in relation to cardiometabolic risk should analyse both monofractal and multifractal metrics for a more comprehensive microvascular picture.
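For readers unfamiliar with monofractal metrics, the sketch below estimates a fractal dimension by box counting, one common estimator. The abstract does not say which estimator the study used, so this choice, the function names and the toy shapes are assumptions; multifractal metrics are not covered.

```python
import math


def box_count(pixels, size):
    """Number of size x size boxes containing at least one foreground pixel."""
    return len({(x // size, y // size) for x, y in pixels})


def box_counting_dimension(pixels, sizes):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Sanity checks on shapes with known dimension (64 x 64 pixel grid).
sizes = [1, 2, 4, 8, 16, 32]
line = {(x, 0) for x in range(64)}
square = {(x, y) for x in range(64) for y in range(64)}

print(box_counting_dimension(line, sizes))    # close to 1
print(box_counting_dimension(square, sizes))  # close to 2
```

A segmented vessel tree would be passed in as the set of vessel pixel coordinates; its estimate falls between the values for the line and the filled square.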
Clustering is inherently ill-posed: there often exist multiple valid clusterings of a single dataset, and without any additional information a clustering system has no way of knowing which clustering it should produce. This motivates the use of constraints in clustering, as they allow users to communicate their interests to the clustering system. Active constraint-based clustering algorithms select the most useful constraints to query, aiming to produce a good clustering using as few constraints as possible. We propose COBRA, an active method that first over-clusters the data by running K-means with a K that is intended to be too large, and subsequently merges the resulting small clusters into larger ones based on pairwise constraints. In its merging step, COBRA is able to keep the number of pairwise queries low by maximally exploiting constraint transitivity and entailment. We experimentally show that COBRA outperforms the state of the art in terms of clustering quality and runtime, without requiring the number of clusters in advance.
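The merging step can be illustrated with a simplified sketch. It assumes the over-clustering has already been done (for example by K-means) and that each small cluster is summarised by a representative point; pairs are queried from closest to farthest, and a query is skipped whenever its answer already follows from earlier must-link or cannot-link answers. The function name `cobra_merge`, the `oracle` callback and the data are hypothetical, and the published algorithm may order and resolve queries differently.

```python
from itertools import combinations


def cobra_merge(representatives, oracle):
    """Merge small clusters using pairwise queries.

    representatives: one point (tuple of floats) per small cluster.
    oracle(i, j): True for must-link, False for cannot-link.
    Returns (cluster label per small cluster, number of queries asked).
    """
    n = len(representatives)
    parent = list(range(n))
    cannot = set()  # frozensets of cluster roots known to be cannot-link

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def sqdist(i, j):
        return sum((a - b) ** 2
                   for a, b in zip(representatives[i], representatives[j]))

    queries = 0
    for i, j in sorted(combinations(range(n), 2), key=lambda p: sqdist(*p)):
        ri, rj = find(i), find(j)
        if ri == rj or frozenset((ri, rj)) in cannot:
            continue  # answer already follows from earlier constraints
        queries += 1
        if oracle(i, j):
            parent[rj] = ri
            # Cannot-links of the absorbed cluster now apply to the merged one.
            cannot = {frozenset(ri if r == rj else r for r in pair)
                      for pair in cannot}
        else:
            cannot.add(frozenset((ri, rj)))

    roots = [find(i) for i in range(n)]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots], queries


# Made-up example: five small clusters that belong to two true clusters.
reps = [(0.0,), (1.0,), (3.0,), (10.0,), (12.0,)]
truth = [0, 0, 0, 1, 1]
labels, n_queries = cobra_merge(reps, lambda i, j: truth[i] == truth[j])
print(labels, n_queries)
```

In this toy run only 4 of the 10 possible pairs are queried; the remaining answers are implied by the constraints already collected.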
Semi-supervised clustering methods incorporate a limited amount of supervision into the clustering process. Typically, this supervision is provided by the user in the form of pairwise constraints. Existing methods use such constraints in one of the following ways: they adapt their clustering procedure, their similarity metric, or both. All of these approaches operate within the scope of individual clustering algorithms. In contrast, we propose to use constraints to choose between clusterings generated by very different unsupervised clustering algorithms, run with different parameter settings. We empirically show that this simple approach often outperforms existing semi-supervised clustering methods.
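The selection idea can be sketched in a few lines: score each candidate clustering by the number of pairwise constraints it satisfies and keep the best one. The candidate names and label vectors below are invented rather than produced by real algorithm runs, and counting satisfied constraints is an assumed scoring rule.

```python
def satisfied(labels, must_link, cannot_link):
    """Count the constraints that a clustering (list of labels) satisfies."""
    ok = sum(labels[i] == labels[j] for i, j in must_link)
    ok += sum(labels[i] != labels[j] for i, j in cannot_link)
    return ok


def select_clustering(candidates, must_link, cannot_link):
    """Return the name of the candidate satisfying the most constraints."""
    return max(candidates,
               key=lambda name: satisfied(candidates[name],
                                          must_link, cannot_link))


# Hypothetical candidate clusterings of six instances.
candidates = {
    "kmeans_k2": [0, 0, 0, 1, 1, 1],
    "kmeans_k3": [0, 0, 1, 1, 2, 2],
    "dbscan_eps0.5": [0, 0, 0, 0, 1, 1],
}
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 4), (1, 2)]

best = select_clustering(candidates, must_link, cannot_link)
print(best)
```

Here `kmeans_k3` satisfies all four constraints, while the other two candidates satisfy two and three.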
Constraint-based clustering algorithms exploit background knowledge to construct clusterings that are aligned with the interests of a particular user. This background knowledge is often obtained by allowing the clustering system to pose pairwise queries to the user: should these two elements be in the same cluster or not? Answering yes results in a must-link constraint, no in a cannot-link. Ideally, the user should be able to answer a couple of these queries, inspect the resulting clustering, and repeat these two steps until a satisfactory result is obtained. Such an interactive clustering process requires the clustering system to satisfy three requirements: (1) it should be able to present a reasonable (intermediate) clustering to the user at any time, (2) it should produce good clusterings given few queries, i.e. it should be query-efficient, and (3) it should be time-efficient. We present COBRAS, an approach to clustering with pairwise constraints that satisfies these requirements. COBRAS constructs clusterings of super-instances, which are local regions in the data in which all instances are assumed to belong to the same cluster. By dynamically refining these super-instances during clustering, COBRAS is able to produce clusterings at increasingly fine-grained levels of granularity. It quickly produces good high-level clusterings, and is able to refine them to find more detailed structure as more queries are answered. In our experiments we demonstrate that COBRAS is the only method able to produce good solutions at all stages of the clustering process at fast runtimes, and hence the most suitable method for interactive clustering.
Clustering is ubiquitous in data analysis, including analysis of time series. It is inherently subjective: different users may prefer different clusterings for a particular dataset. Semi-supervised clustering addresses this by allowing the user to provide examples of instances that should (not) be in the same cluster. This paper studies semi-supervised clustering in the context of time series. We show that COBRAS, a state-of-the-art semi-supervised clustering method, can be adapted to this setting. We refer to this approach as COBRAS-TS. An extensive experimental evaluation supports the following claims: (1) COBRAS-TS far outperforms the current state of the art in semi-supervised clustering for time series, and thus presents a new baseline for the field; (2) COBRAS-TS can identify clusters with separated components; (3) COBRAS-TS can identify clusters that are characterized by small local patterns; (4) a small amount of semi-supervision can greatly improve clustering quality for time series; (5) the choice of the clustering algorithm matters (contrary to earlier claims in the literature).