2017
DOI: 10.1007/978-3-319-66179-7_37

TandemNet: Distilling Knowledge from Medical Images Using Diagnostic Reports as Optional Semantic References

Abstract: In this paper, we introduce the semantic knowledge of medical images from their diagnostic reports to provide an inspirational network training and an interpretable prediction mechanism with our proposed novel multimodal neural network, namely TandemNet. Inside TandemNet, a language model is used to represent report text, which cooperates with the image model in a tandem scheme. We propose a novel dual-attention model that facilitates high-level interactions between visual and semantic information and effectiv…
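The dual-attention idea sketched in the abstract, attending over visual and semantic features separately and then fusing them for prediction, can be illustrated with a minimal toy example. This is a sketch only, not the paper's exact formulation: the scoring vectors `w_v` and `w_s` are hypothetical stand-ins for the learned dual-attention parameters, and fusion is reduced to simple concatenation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def dual_attention(V, S, w_v, w_s):
    """Toy dual attention: weight image regions V (N x d) and report
    tokens S (M x d) separately, then fuse the attended features.

    w_v and w_s are hypothetical learned scoring vectors of shape (d,).
    """
    a_v = softmax(V @ w_v)             # attention weights over N image regions
    a_s = softmax(S @ w_s)             # attention weights over M report tokens
    z_v = a_v @ V                      # (d,) attended visual feature
    z_s = a_s @ S                      # (d,) attended semantic feature
    return np.concatenate([z_v, z_s])  # fused (2d,) representation

rng = np.random.default_rng(0)
V = rng.standard_normal((4, 6))        # 4 image regions, 6-dim features
S = rng.standard_normal((3, 6))        # 3 report tokens, 6-dim features
z = dual_attention(V, S, rng.standard_normal(6), rng.standard_normal(6))
print(z.shape)  # (12,)
```

Consistent with the "optional semantic references" of the title, the report branch can be dropped at test time, leaving only the visual attention path.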

Cited by 49 publications (28 citation statements); references 11 publications.
“…In the testing stage, the input is a fundus image only, and the output is a probabilistic map of the lesion types in the image. Zhang et al. proposed a multimodal network that jointly learns from medical images and their diagnostic reports, in which semantic information interacts with visual information to improve the image understanding ability by teaching the network to distill informative features.…”
Section: Expanding Datasets For Deep Learning
confidence: 99%
“…In the testing stage, the input is a fundus image only, and the output is a probabilistic map of the lesion types in the image. Zhang et al. [334] proposed a multimodal network that jointly learns from medical images and their diagnostic reports, in which semantic information interacts with visual information to improve the image understanding ability by teaching the network to distill informative features. Applied to bladder cancer images and the corresponding diagnostic reports, the network demonstrated improved performance compared to a baseline CNN that uses only image information for training.…”
Section: Data Annotation Via Mining Text Reports
confidence: 99%
“…The attention mechanism has been explored for image captioning [17], voice activity detection [18], speech emotion recognition [19], and question answering [20]. In biomedical imaging, attention has been used for report generation [21], disease classification [22], [23], organ segmentation [24], and localization [25]. In [26], the authors introduced an attention mechanism for macular OCT classification; the proposed deep network requires a large number of model parameters, but its performance evaluation is limited.…”
Section: Introduction
confidence: 99%