We describe a new approach that improves the training of generative adversarial nets (GANs) for synthesizing diverse images from a text input. Our approach is based on the conditional version of GANs and expands on previous work leveraging an auxiliary task in the discriminator. Our generated images are not limited to certain classes and do not suffer from mode collapse while semantically matching the text input. A key to our training methods is how to form positive and negative training examples with respect to the class label of a given image. Instead of selecting random training examples, we perform negative sampling based on the semantic distance from a positive example in the class. We evaluate our approach using the Oxford-102 flower dataset, adopting the inception score and multi-scale structural similarity index (MS-SSIM) metrics to assess discriminability and diversity of the generated images. The empirical results indicate greater diversity in the generated images, especially when we gradually select more negative training examples closer to a positive example in the semantic space.
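The distance-based negative sampling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sample_negatives`, the `hard_fraction` schedule parameter, the use of Euclidean distance, and the toy 2-D embeddings are all assumptions made for illustration. The idea is that a growing share of the negatives is taken from the candidates closest to the positive example in the embedding space, while the remainder is drawn at random.

```python
import math
import random


def sample_negatives(positive_vec, candidates, n, hard_fraction, rng):
    """Pick n negatives for one positive example (hypothetical helper).

    candidates: list of (name, vector) pairs taken from classes other
    than the positive example's class.
    hard_fraction: share of the n negatives taken from the candidates
    nearest to the positive; the rest are sampled at random.
    """
    # Rank candidates from semantically closest to farthest.
    ranked = sorted(candidates, key=lambda item: math.dist(positive_vec, item[1]))
    n_hard = min(n, round(n * hard_fraction))
    hard = ranked[:n_hard]
    rest = ranked[n_hard:]
    # Fill the remaining slots with random, non-repeated candidates.
    easy = rng.sample(rest, min(n - n_hard, len(rest)))
    return hard + easy


# Toy data: four negative candidates around a positive at the origin.
candidates = [("a", (1.0, 0.0)), ("b", (5.0, 5.0)),
              ("c", (0.0, 2.0)), ("d", (9.0, 9.0))]
rng = random.Random(0)
late_epoch = sample_negatives((0.0, 0.0), candidates, 2, 1.0, rng)
early_epoch = sample_negatives((0.0, 0.0), candidates, 2, 0.0, rng)
```

Raising `hard_fraction` over the course of training is one way to "gradually select more negative training examples closer to a positive example"; the actual schedule used in the paper is not stated in the abstract.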
Recent approaches based on generative adversarial networks (GANs) can automatically synthesize realistic images from descriptive text. Despite their fair overall quality, the generated images often show visible flaws and lack structural definition for the object of interest. In this paper, we aim to extend the state of the art in GAN-based text-to-image synthesis by improving the perceptual quality of generated images. In contrast to previous work, our synthetic image generator optimizes perceptual loss functions that measure pixel, feature activation, and texture differences against a natural image. We present synthetic images of birds and flowers generated from text descriptions that are visually more compelling than those of some of the most prominent existing work.
We present a clustering-based language model using word embeddings for text readability prediction. Presumably, a Euclidean semantic space hypothesis holds true for word embeddings whose training is done by observing word co-occurrences. We argue that clustering with word embeddings in the metric space should yield feature representations in a higher semantic space appropriate for text regression. Also, by representing features in terms of histograms, our approach can naturally address documents of varying lengths. An empirical evaluation using the Common Core Standards corpus reveals that the features formed on our clustering-based language model significantly improve the previously known results for the same corpus in readability prediction. We also evaluate the task of sentence matching based on semantic relatedness using the Wiki-SimpleWiki corpus and find that our features lead to superior matching performance.
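The histogram representation mentioned above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the helper names, the toy 2-D embeddings, and the two precomputed centroids are assumptions, and the clustering step itself (for example k-means over the embedding vocabulary) is assumed to have been run beforehand. Each word is assigned to its nearest cluster by Euclidean distance, and the document becomes a normalized histogram over clusters, so its length equals the number of clusters regardless of document length.

```python
import math


def nearest_centroid(vec, centroids):
    """Index of the centroid closest to vec in Euclidean distance."""
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))


def cluster_histogram(tokens, embeddings, centroids):
    """Normalized histogram of cluster assignments for one document.

    Tokens without an embedding are skipped. The result always has
    len(centroids) entries, whatever the document length.
    """
    counts = [0] * len(centroids)
    for token in tokens:
        vec = embeddings.get(token)
        if vec is None:
            continue
        counts[nearest_centroid(vec, centroids)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(centroids)
    return [c / total for c in counts]


# Toy embeddings and centroids (hypothetical values for illustration).
embeddings = {"cat": (0.0, 1.0), "dog": (1.0, 0.0), "galaxy": (9.0, 10.0)}
centroids = [(0.0, 0.0), (10.0, 10.0)]

short_doc = cluster_histogram(["cat", "dog", "galaxy"], embeddings, centroids)
long_doc = cluster_histogram(["cat"] * 50 + ["galaxy"] * 50, embeddings, centroids)
```

Both documents map to vectors of the same dimension, which is what lets a single regression model handle texts of varying lengths.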
BACKGROUND: In-field triage tools for trauma patients are limited by availability of information, linear risk classification, and a lack of confidence reporting. We therefore set out to develop and test a machine learning algorithm that can overcome these limitations by accurately and confidently making predictions to support in-field triage in the first hours after traumatic injury. METHODS: Using an American College of Surgeons Trauma Quality Improvement Program-derived database of truncal and junctional gunshot wound (GSW) patients (aged 16-60 years), we trained an information-aware Dirichlet deep neural network (field artificial intelligence triage). Using supervised training, field artificial intelligence triage was trained to predict shock and the need for major hemorrhage control procedures or early massive transfusion (MT) using GSW anatomical locations, vital signs, and patient information available in the field. In parallel, a confidence model was developed to predict the true-class probability (scale of 0-1), indicating the likelihood that the prediction made was correct, based on the values and interconnectivity of input variables. RESULTS: A total of 29,816 patients met all the inclusion criteria. Shock, major surgery, and early MT were identified in 13.0%, 22.4%, and 6.3% of the included patients, respectively. Field artificial intelligence triage achieved mean areas under the receiver operating characteristic curve of 0.89, 0.86, and 0.82 for prediction of shock, early MT, and major surgery, respectively, for 80/20 train-test splits over 1,000 epochs. Mean predicted true-class probability for errors/correct predictions was 0.25/0.87 for shock, 0.30/0.81 for MT, and 0.24/0.69 for major surgery. CONCLUSION: Field artificial intelligence triage accurately identifies potential shock in truncal GSW patients and predicts their need for MT and major surgery, with a high degree of certainty. The presented model is an important proof of concept. Future iterations will use an expansion of databases to refine and validate the model, further adding to its potential to improve triage in the field, both in civilian and military settings.
Coherent change detection using paired synthetic aperture radar images is often performed using a classical coherence estimator that is invariant to the true variances of the populations underlying each paired sample. While attractive, this estimator is biased and requires a significant number of samples to yield good performance. Increasing sample size often results in decreased image resolution. Thus, we propose the use of Berger's coherence estimate because, with the same number of pixels, the estimator effectively doubles the sample support without sacrificing resolution when the underlying population variances are equal or near equal. A potential drawback of this approach is that it is not invariant, since its distribution depends on the pixel pair population variances. While Berger's estimator is inherently sensitive to the inequality of population variances, we propose a method of insulating the detector from this sensitivity. A two-stage change statistic is introduced that combines a non-coherent intensity change statistic, given by the sample variance ratio, with the alternative Berger estimator, which assumes equal population variances. The first-stage detector identifies pixel pairs that have non-equal variances as changes caused by the displacement of a sizable object. The pixel pairs identified in the first stage as having equal or near-equal variances are used as input to the second stage. The second-stage test uses the alternative Berger coherence estimator to detect subtle changes such as tire tracks and footprints. We show experimentally that the proposed method yields higher contrast SAR change detection images. This work is sponsored by the United States Air Force under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.
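The two estimators and the two-stage test described in this abstract can be sketched as below. This is an illustrative reading, not the authors' code: the function names, the thresholds `ratio_threshold` and `coherence_threshold`, and the returned labels are hypothetical, and the estimator forms are the commonly cited ones (the classical sample coherence magnitude, and Berger's equal-variance form, which normalizes by the average of the two sample powers instead of their geometric mean).

```python
import math


def _sums(f, g):
    """Cross-product sum and the two sample powers of paired complex pixels."""
    cross = sum(a * b.conjugate() for a, b in zip(f, g))
    power_f = sum(abs(a) ** 2 for a in f)
    power_g = sum(abs(b) ** 2 for b in g)
    return cross, power_f, power_g


def classical_coherence(f, g):
    """|sum f g*| / sqrt(sum|f|^2 * sum|g|^2); invariant to relative scaling."""
    cross, power_f, power_g = _sums(f, g)
    if power_f == 0 or power_g == 0:
        return 0.0
    return abs(cross) / math.sqrt(power_f * power_g)


def berger_coherence(f, g):
    """2|sum f g*| / (sum|f|^2 + sum|g|^2); assumes equal population variances."""
    cross, power_f, power_g = _sums(f, g)
    if power_f + power_g == 0:
        return 0.0
    return 2 * abs(cross) / (power_f + power_g)


def two_stage_change(f, g, ratio_threshold=2.0, coherence_threshold=0.5):
    """Stage 1: sample variance ratio; stage 2: Berger coherence.

    Both thresholds are hypothetical placeholders, not values from the paper.
    """
    _, power_f, power_g = _sums(f, g)
    low, high = min(power_f, power_g), max(power_f, power_g)
    ratio = math.inf if low == 0 and high > 0 else (high / low if low > 0 else 1.0)
    if ratio > ratio_threshold:
        return "large change"      # unequal variances: sizable object displaced
    if berger_coherence(f, g) < coherence_threshold:
        return "subtle change"     # equal variances but low coherence
    return "no change"


window = [1 + 1j, 2 - 1j, -1 + 0.5j]
scaled = [3 * z for z in window]
decorrelated_a = [1 + 0j, 1j, -1 + 0j, -1j]
decorrelated_b = [1 + 0j, -1j, -1 + 0j, 1j]
```

Scaling one image leaves the classical estimate at 1 but lowers Berger's estimate, which shows the dependence on unequal variances that the first stage is meant to screen out.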