Visual attributes are great means of describing images or scenes, in a way both humans and computers understand. In order to establish a correspondence between images and to be able to compare the strength of each property between images, relative attributes were introduced. However, since their introduction, hand-crafted and engineered features were used to learn increasingly complex models for the problem of relative attributes. This limits the applicability of those methods for more realistic cases. We introduce a deep neural network architecture for the task of relative attribute prediction. A convolutional neural network (ConvNet) is adopted to learn the features by including an additional layer (ranking layer) that learns to rank the images based on these features. We adopt an appropriate ranking loss to train the whole network in an end-to-end fashion. Our proposed method outperforms the baseline and state-of-the-art methods in relative attribute prediction on various coarse and fine-grained datasets. Our qualitative results along with the visualization of the saliency maps show that the network is able to learn effective features for each specific attribute. Source code of the proposed method is available at https
Purpose To develop a three-dimensional (3D) deep learning algorithm to detect glaucoma using spectral-domain optical coherence tomography (SD-OCT) optic nerve head (ONH) cube scans and validate its performance on ethnically diverse real-world datasets and on cropped ONH scans. Methods In total, 2461 Cirrus SD-OCT ONH scans of 1012 eyes were obtained from the Glaucoma Clinic Imaging Database at the Byers Eye Institute, Stanford University, from March 2010 to December 2017. A 3D deep neural network was trained and tested on this unique raw OCT cube dataset to identify a multimodal definition of glaucoma excluding other concomitant retinal disease and optic neuropathies. A total of 1022 scans of 363 glaucomatous eyes (207 patients) and 542 scans of 291 normal eyes (167 patients) from Stanford were included in training, and 142 scans of 48 glaucomatous eyes (27 patients) and 61 scans of 39 normal eyes (23 patients) were included in the validation set. A total of 3371 scans (Cirrus SD-OCT) from four different countries were used for evaluation of the model: the non overlapping test dataset from Stanford (USA) consisted of 694 scans: 241 scans from 113 normal eyes of 66 patients and 453 scans of 157 glaucomatous eyes of 89 patients. The datasets from Hong Kong (total of 1625 scans; 666 OCT scans from 196 normal eyes of 99 patients and 959 scans of 277 glaucomatous eyes of 155 patients), India (total of 672 scans; 211 scans from 147 normal eyes of 98 patients and 461 scans from 171 glaucomatous eyes of 101 patients), and Nepal (total of 380 scans; 158 scans from 143 normal eyes of 89 patients and 222 scans from 174 glaucomatous eyes of 109 patients) were used for external evaluation. The performance of the model was then evaluated on manually cropped scans from Stanford using a new algorithm called DiagFind . The ONH region was cropped by identifying the appropriate zone of the image in the expected location relative to Bruch's Membrane Opening (BMO) using a commercially available imaging software. Subgroup analyses were performed in groups stratified by eyes, myopia severity of glaucoma, and on a set of glaucoma cases without field defects. Saliency maps were generated to highlight the areas the model used to make a prediction. The model's performance was compared to that of a glaucoma specialist using all available information on a subset of cases. Results The 3D deep learning system achieved area under the curve (AUC) values of 0.91 (95% CI, 0.90–0.92), 0.80 (95% CI, 0.78–0.82), 0.94 (95% CI, 0.93–0.96), and 0.87 (95% CI, 0.85–0.90) on Stanford, Hong Kong, India, and Nepal datasets, respectively, to detect perimetric glaucoma and AUC values of 0.99 (95% CI, 0.97–1.00), 0.96 (95% CI, 0.93–1.00), and 0.92 (95% CI, 0.89–0.95) on severe, moderate, and mild myopia cases, respectively, and an AUC of 0.77 on cropped scans. The model achieved an AUC value of 0.92 (95% CI, 0.90–0.93) versus that of the h...
Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this language. The availability of high-quality evaluation datasets is a necessity for reliable assessment of the progress on different NLU tasks and domains. We introduce ParsiNLU, the first benchmark in Persian language that includes a range of language understanding tasks—reading comprehension, textual entailment, and so on. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers. This results in over 14.5k new instances across 6 distinct NLU tasks. Additionally, we present the first results on state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, which provides valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope ParsiNLU fosters further research and advances in Persian language understanding.1
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.