An estimated 3 billion people lack access to dermatological care globally. Artificial intelligence (AI) may aid in triaging skin diseases and identifying malignancies. However, most AI models have not been assessed on images of diverse skin tones or uncommon diseases. Thus, we created the Diverse Dermatology Images (DDI) dataset—the first publicly available, expertly curated, and pathologically confirmed image dataset with diverse skin tones. We show that state-of-the-art dermatology AI models exhibit substantial limitations on the DDI dataset, particularly on dark skin tones and uncommon diseases. We find that dermatologists, who often label AI datasets, also perform worse on images of dark skin tones and uncommon diseases. Fine-tuning AI models on the DDI images closes the performance gap between light and dark skin tones. These findings identify important weaknesses and biases in dermatology AI that should be addressed for reliable application to diverse patients and diseases.
Datacenter disaggregation provides numerous benefits to both the datacenter operator and the application designer. However, switching from the server-centric model to a disaggregated model requires developing new programming abstractions that can achieve high performance while benefiting from the greater elasticity. To explore the limits of datacenter disaggregation, we study an application area that near-maximally benefits from current server-centric datacenters: dense linear algebra. We build NumPyWren, a system for linear algebra built on a disaggregated serverless programming model, and LAmbdaPACK, a companion domain-specific language designed for serverless execution of highly parallel linear algebra algorithms. We show that, for a number of linear algebra algorithms such as matrix multiply, singular value decomposition, Cholesky decomposition, and QR decomposition, NumPyWren's performance (completion time) is within a factor of 2 of optimized server-centric MPI implementations, and has up to 15% greater compute efficiency (total CPU-hours), while providing fault tolerance.
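To make the abstract concrete, the sketch below shows a serial blocked (tiled) Cholesky factorization in NumPy. This is not NumPyWren's implementation; it is a minimal illustration of the kind of tile-level task graph that a system like NumPyWren/LAmbdaPACK parallelizes, where the panel solves and trailing-submatrix updates within each step are independent tasks that could run on separate serverless workers.

```python
import numpy as np

def blocked_cholesky(A, b=2):
    """Serial blocked Cholesky: returns lower-triangular L with A = L @ L.T,
    for symmetric positive-definite A whose size is a multiple of tile size b.
    Each tile-level operation below is an independently schedulable task."""
    n = A.shape[0]
    A = A.astype(float).copy()          # work on a copy; updated in place
    L = np.zeros_like(A)
    for k in range(0, n, b):
        kk = slice(k, k + b)
        # 1. Factor the diagonal tile.
        L[kk, kk] = np.linalg.cholesky(A[kk, kk])
        # 2. Panel solve: tiles below the diagonal (independent of each other).
        for i in range(k + b, n, b):
            ii = slice(i, i + b)
            L[ii, kk] = np.linalg.solve(L[kk, kk], A[ii, kk].T).T
        # 3. Trailing-submatrix update (all tile updates are independent).
        for i in range(k + b, n, b):
            ii = slice(i, i + b)
            for j in range(k + b, i + b, b):
                jj = slice(j, j + b)
                A[ii, jj] -= L[ii, kk] @ L[jj, kk].T
    return L
```

In a disaggregated setting, steps 2 and 3 of each iteration expose the parallelism that serverless workers can exploit, at the cost of shipping tiles through shared storage.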
A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We investigate this overparameterized regime in linear regression, where all solutions that minimize training error interpolate the data, including noise. We characterize the fundamental generalization (mean-squared) error of any interpolating solution in the presence of noise, and show that this error decays to zero with the number of features. Thus, overparameterization can be explicitly beneficial in ensuring harmless interpolation of noise. We discuss two root causes for poor generalization that are complementary in nature: signal "bleeding" into a large number of alias features, and overfitting of noise by parsimonious feature selectors. For the sparse linear model with noise, we provide a hybrid interpolating scheme that mitigates both these issues and achieves order-optimal MSE over all possible interpolating solutions.

arXiv:1903.09139v2 [cs.LG] 9 Sep 2019

2. We provide a Fourier-theoretic interpretation of concurrent analyses [6-10] of the minimum ℓ2-norm interpolator.
3. We show (Theorem 2) that parsimonious interpolators (like the ℓ1-minimizing interpolator and its relatives) suffer the complementary problem of overfitting pure noise.
4. We construct two-step hybrid interpolators that successfully recover signal and harmlessly fit noise, achieving the order-optimal rate of test MSE among all interpolators (Proposition 1 and all its corollaries).

Related work
We discuss prior work in three categories: a) overparameterization in deep neural networks, b) interpolation of high-dimensional data using kernels, and c) high-dimensional linear regression. We then recap work on overparameterized linear regression that is concurrent to ours.
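The minimum ℓ2-norm interpolator discussed above can be demonstrated in a few lines. The sketch below (my own illustration, not the paper's code) sets up an overparameterized linear regression with a sparse ground-truth signal and noisy labels, and computes the minimum-norm solution via the pseudoinverse; it interpolates the noisy training data exactly, which is the regime the paper analyzes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 200                       # n samples, d features: overparameterized (d > n)
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[0] = 1.0                      # sparse ground-truth signal
y = X @ w_true + 0.1 * rng.standard_normal(n)   # noisy labels

# Minimum ℓ2-norm interpolator: w = X^T (X X^T)^{-1} y, computed via the pseudoinverse.
w_l2 = np.linalg.pinv(X) @ y

train_err = np.max(np.abs(X @ w_l2 - y))   # ≈ 0: interpolates the noise exactly
```

Because X has full row rank (almost surely for Gaussian features with d > n), every label, noise included, is fit exactly; whether that interpolation is "harmless" for test error is precisely the paper's question.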
Recent interest in overparameterization
Conventional statistical wisdom is that using more parameters in one's model than data points leads to poor generalization. This wisdom is corroborated in theory by worst-case generalization bounds on such overparameterized models following from VC-theory in classification [2] and ill-conditioning in least-squares regression [5]. It is, however, contradicted in practice by the notable recent trend of empirically successful overparameterized deep neural networks. For example, the commonly used CIFAR-10 dataset contains 60000 images, but the number of parameters in all the neural networks achieving state-of-the-art performance on CIFAR-10 is at least 1.5 million [4]. These neural networks have the ability to memorize pure noise; somehow, they are still able to generalize well when trained with meaningful data. Since the publication of this observation [4,11], the machine learning community has seen a flurry of activity to attempt to explain this phenomenon, both for classification and regression problems, in neural networks. The problem is challenging for three core reasons:
1. The optimization landscape for l...