Cross‐modal fusion encoder via graph neural network for referring image segmentation
Yuqing Zhang,
Yong Zhang,
Xinglin Piao
et al.
Abstract: Referring image segmentation identifies object masks in images under the guidance of natural language expressions. Many notable cross‐modal decoders have been devoted to this task, but these models face two key challenges. First, they usually fail to extract fine‐grained boundary and gradient information from images. Second, they usually fail to explore language associations among image pixels. In this work, a Multi‐scale Gradient balance…
Glaucoma poses a significant threat to vision: it can cause irreversible damage and, in severe cases, permanent blindness. Accurate segmentation of the optic cup (OC) and optic disc (OD) is essential in glaucoma screening. In this study, a novel OC and OD segmentation approach is proposed. It is based on U‐Net and optimized by introducing cardinality dimensions. Attention gates are added to reinforce salient features while suppressing irrelevant information, and a convolutional block attention module (CBAM) is integrated into the decoder so that the network homes in on useful information in both the channel and spatial dimensions. An image processing procedure is also proposed for image normalization and enhancement. Together, these changes increase the accuracy of the model. The model is evaluated on the ORIGA and REFUGE datasets, where it outperforms state‐of‐the‐art methods in OC and OD segmentation. In addition, with the proposed image processing, cup‐to‐disc ratio (CDR) prediction on a batch of 155 in‐house fundus images yields an absolute CDR error of 0.099, which is 0.04 lower than when only conventional processing is applied.
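The CDR error reported above can be illustrated with a minimal sketch. The abstract does not say how the CDR is measured, so this example assumes the common vertical definition (vertical cup diameter divided by vertical disc diameter) computed from binary segmentation masks, and it assumes the reported figure is a mean absolute error over images. The function names `vertical_extent`, `cup_to_disc_ratio`, and `mean_absolute_cdr_error` are hypothetical and are not taken from the paper.

```python
def vertical_extent(mask):
    """Return the number of rows spanned by nonzero pixels in a 2D binary mask."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        return 0
    return rows[-1] - rows[0] + 1


def cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical CDR (assumed definition): cup height divided by disc height."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height


def mean_absolute_cdr_error(predicted, reference):
    """Mean absolute difference between predicted and reference CDR values."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy 6x6 masks: the disc spans rows 1 to 4, the cup spans rows 2 to 3.
disc = [[0] * 6, [0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 0],
        [0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 0], [0] * 6]
cup = [[0] * 6, [0] * 6, [0, 0, 1, 1, 0, 0],
       [0, 0, 1, 1, 0, 0], [0] * 6, [0] * 6]

cdr = cup_to_disc_ratio(cup, disc)  # 2 rows / 4 rows = 0.5
```

Under these assumptions, a lower mean absolute error means the CDR derived from the predicted masks lies closer to the reference CDR.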