We present an approach to improve statistical machine translation of image descriptions by multimodal pivots defined in visual space. The key idea is to perform image retrieval over a database of images that are captioned in the target language, and use the captions of the most similar images for crosslingual reranking of translation outputs. Our approach does not depend on the availability of large amounts of in-domain parallel data, but only relies on available large datasets of monolingually captioned images, and on state-of-the-art convolutional neural networks to compute image similarities. Our experimental evaluation shows improvements of 1 BLEU point over strong baselines.
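The reranking idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the cosine similarity over precomputed image feature vectors, the unigram-overlap score, the linear interpolation with `weight`, and all function names are hypothetical stand-ins, not the paper's actual model.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two feature vectors (e.g. CNN activations)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def overlap(hypothesis, caption):
    """Clipped unigram precision of the hypothesis against one caption."""
    hyp = Counter(hypothesis.lower().split())
    cap = Counter(caption.lower().split())
    total = sum(hyp.values())
    matched = sum(min(count, cap[word]) for word, count in hyp.items())
    return matched / total if total else 0.0


def rerank(hypotheses, query_image, captioned_images, k=2, weight=0.5):
    """Rerank (text, mt_score) hypotheses using captions of the k most similar images.

    captioned_images is a list of (feature_vector, target_language_caption).
    The interpolation weight is a hypothetical tuning parameter.
    """
    neighbours = sorted(
        captioned_images,
        key=lambda item: cosine(query_image, item[0]),
        reverse=True,
    )[:k]
    pivots = [caption for _, caption in neighbours]

    def combined(hypothesis):
        text, mt_score = hypothesis
        pivot_score = (
            sum(overlap(text, c) for c in pivots) / len(pivots) if pivots else 0.0
        )
        return (1 - weight) * mt_score + weight * pivot_score

    return sorted(hypotheses, key=combined, reverse=True)


# Toy data: two beach images and one unrelated image, all captioned in English.
database = [
    ([1.0, 0.0], "a dog runs on the beach"),
    ([0.9, 0.1], "a dog plays in the sand"),
    ([0.0, 1.0], "a man rides a bicycle"),
]
hypotheses = [
    ("a hound runs at the bank", 0.55),
    ("a dog runs on the beach", 0.50),
]
best = rerank(hypotheses, [1.0, 0.05], database)[0][0]
```

In this toy setup the translation model alone prefers the first hypothesis, while the captions retrieved through the image pull the second one to the top.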
Cross-lingual information retrieval (CLIR) is a document retrieval task where the documents are written in a language different from that of the user's query. This is a challenging problem for data-driven approaches due to the general lack of labeled training data. We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. Further, we present a simple yet effective neural learning-to-rank model that shares representations across languages and reduces the data requirement. This model can exploit training data in, for example, Japanese-English CLIR to improve the results of Swahili-English CLIR.
This paper describes the system submitted by the University of Heidelberg to the Shared Task on Word-level Quality Estimation at the 2015 Workshop on Statistical Machine Translation. The submitted system combines a continuous-space deep neural network that learns a bilingual feature representation from scratch with a linear combination of the manually defined baseline features provided by the task organizers. A combination of these orthogonal information sources shows significant improvements over the combined systems, and produces very competitive F1-scores for predicting word-level translation quality.
Sepsis is the leading cause of death in non-coronary intensive care units. Moreover, a delay of antibiotic treatment of patients with severe sepsis by only a few hours is associated with increased mortality. This insight makes accurate models for early prediction of sepsis a key task in machine learning for healthcare. Previous approaches have achieved high AUROC by learning from electronic health records where sepsis labels were defined automatically following established clinical criteria. We argue that the practice of incorporating the clinical criteria that are used to automatically define ground truth sepsis labels as features of severity scoring models is inherently circular and compromises the validity of the proposed approaches. We propose to create an independent ground truth for sepsis research by exploiting implicit knowledge of clinical practitioners via an electronic questionnaire which records attending physicians' daily judgements of patients' sepsis status. We show that despite its small size, our dataset allows us to achieve state-of-the-art AUROC scores. An inspection of learned weights for standardized features of the linear model lets us infer potentially surprising feature contributions and allows us to interpret seemingly counterintuitive findings.
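Two quantities mentioned in this abstract, AUROC and standardized features, can be made concrete with a short sketch. It assumes binary labels (1 = sepsis, 0 = no sepsis) and uses the pairwise-comparison definition of AUROC; the function names and toy numbers are hypothetical and not taken from the paper.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def standardize(column):
    """Rescale a feature column to zero mean and unit (population) variance."""
    mean = sum(column) / len(column)
    std = (sum((x - mean) ** 2 for x in column) / len(column)) ** 0.5
    return [(x - mean) / std if std else 0.0 for x in column]


# Toy example: four patients with model scores and reference labels.
area = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
z_scores = standardize([1.0, 2.0, 3.0])
```

Standardizing puts all features on a common scale, which is what makes the magnitudes of linear-model weights comparable across features.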
We present an approach to cross-language retrieval that combines dense knowledge-based features and sparse word translations. Both feature types are learned directly from relevance rankings of bilingual documents in a pairwise ranking framework. In large-scale experiments for patent prior art search and cross-lingual retrieval in Wikipedia, our approach yields considerable improvements over learning-to-rank with either only dense or only sparse features, and over very competitive baselines that combine state-of-the-art machine translation and retrieval.
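As a rough illustration of pairwise ranking over combined sparse and dense features, the sketch below trains a linear scorer with a margin-based perceptron-style update. This is an assumed stand-in, not the learning algorithm from the paper; the feature templates, the `category_overlap` dense feature, and the hyperparameters are all hypothetical.

```python
from collections import defaultdict


def features(query, doc, dense):
    """Sparse query-word/document-word indicators plus named dense features."""
    feats = defaultdict(float)
    for q in query.split():
        for d in doc.split():
            feats[("pair", q, d)] += 1.0
    for name, value in dense.items():
        feats[("dense", name)] += value
    return feats


def score(weights, feats):
    """Linear score: dot product of the weight vector and the feature vector."""
    return sum(weights[key] * value for key, value in feats.items())


def train(pairs, epochs=10, lr=0.1, margin=1.0):
    """Learn weights from (relevant_feats, irrelevant_feats) preference pairs.

    Whenever the relevant document does not beat the irrelevant one by the
    margin, weights move towards the relevant and away from the irrelevant.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for better, worse in pairs:
            if score(weights, better) - score(weights, worse) < margin:
                for key, value in better.items():
                    weights[key] += lr * value
                for key, value in worse.items():
                    weights[key] -= lr * value
    return weights


# Toy German query with one relevant and one irrelevant English document.
query = "patent suche"
relevant = features(query, "patent search", {"category_overlap": 0.8})
irrelevant = features(query, "cooking recipe", {"category_overlap": 0.1})
weights = train([(relevant, irrelevant)])
gap = score(weights, relevant) - score(weights, irrelevant)
```

Because sparse and dense features share one weight vector, a single update adjusts both the learned word-pair associations and the contribution of the dense signal.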
Background: Multidrug-resistant (MDR) pathogens represent an emerging challenge in end-stage liver disease and in liver transplant recipients. Methods: We evaluated the impact of MDR bacteria upon clinical outcomes in patients with end-stage liver disease (n = 777) at the time of enrollment on the liver transplant (LTx) waiting list, after first LTx (n = 645), and after second LTx (n = 128). Results: Colonization/infection with MDR bacteria was present in 72/777 patients on the waiting list, in 98/645 patients at first LTx, and in 46/128 patients at second LTx. While on the LTx waiting list, the time until first hydropic decompensation (p = 0.021), hepatic encephalopathy (p < 0.001) and hepatorenal syndrome (p < 0.001) was reduced in the presence of MDR bacteria, which remained an independent risk factor of poor survival in multivariate analysis (p < 0.001). Following first and second liver transplant, MDR bacteria were associated with an increased risk of infection-related deaths (first LTx: p < 0.001; second LTx: p = 0.037) and reduced actuarial survival (first LTx: p < 0.001; second LTx: p = 0.046). Conclusions: We showed that MDR pathogens are associated with poor outcomes before, after first and after recurrent LTx.
Background: Sepsis is the leading cause of death in the intensive care unit (ICU). Expediting its diagnosis, largely determined by clinical assessment, improves survival. Predictive and explanatory modelling of sepsis in the critically ill commonly bases both outcome definition and predictions on clinical criteria for consensus definitions of sepsis, leading to circularity. As a remedy, we collected ground truth labels for sepsis. Methods: In the Ground Truth for Sepsis Questionnaire (GTSQ), senior attending physicians in the ICU documented daily their opinion on each patient’s condition regarding sepsis as a five-category working diagnosis and nine related items. Working diagnosis groups were described and compared and their SOFA-scores analyzed with a generalized linear mixed model. Agreement and discriminatory performance measures for clinical criteria of sepsis and GTSQ labels as reference class were derived. Results: We analyzed 7291 questionnaires and 761 complete encounters from the first survey year. Editing rates for all items were > 90%, and responses were consistent with current understanding of critical illness pathophysiology, including sepsis pathogenesis. Interrater agreement for presence and absence of sepsis was almost perfect but only slight for suspected infection. ICU mortality was 19.5% in encounters with SIRS as the “worst” working diagnosis compared to 5.9% with sepsis and 5.9% with severe sepsis without differences in admission and maximum SOFA. Compared to sepsis, proportions of GTSQs with SIRS plus acute organ dysfunction were equal and macrocirculatory abnormalities higher (p < 0.0001). SIRS proportionally ranked above sepsis in daily assessment of illness severity (p < 0.0001). Separate analyses of neurosurgical referrals revealed similar differences. Discriminatory performance of Sepsis-1/2 and Sepsis-3 compared to GTSQ labels was similar with sensitivities around 70% and specificities 92%. Essentially no difference between the prevalence of SIRS and SOFA ≥ 2 yielded sensitivities and specificities for detecting sepsis onset close to 55% and 83%, respectively. Conclusions: GTSQ labels are a valid measure of sepsis in the ICU. They reveal suspicion of infection as an unclear clinical concept and refute an illness severity hierarchy in the SIRS-sepsis-severe sepsis spectrum. Ground truth challenges the accuracy of Sepsis-1/2 and Sepsis-3 in detecting sepsis onset. It is an indispensable intermediate step towards advancing diagnosis and therapy in the ICU and, potentially, other health care settings.
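The discriminatory performance measures reported in this abstract follow the standard definitions of sensitivity and specificity against a reference label. A minimal sketch, assuming binary per-day labels where the reference plays the role of the GTSQ label and the prediction that of a clinical criterion; the function name and the toy data are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) of predictions against reference labels."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # criterion fires, sepsis present
    fn = sum(1 for r, p in pairs if r and not p)      # criterion misses sepsis
    tn = sum(1 for r, p in pairs if not r and not p)  # criterion correctly silent
    fp = sum(1 for r, p in pairs if not r and p)      # criterion fires without sepsis
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Toy example with eight patient-days.
reference = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(reference, predicted)
```

Sensitivity is the share of reference-positive days the criterion detects, and specificity is the share of reference-negative days it correctly leaves unflagged.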