Transformer-Based Disease Identification for Small-Scale Imbalanced Capsule Endoscopy Dataset

Bai, Long; Wang, Liangyu; Chen, Tong; Zhao, Yunhui; Ren, Hongliang

doi:10.3390/electronics11172747

Cited by 21 publications

(6 citation statements)

References 56 publications

“…The transformer is currently a state-of-the-art model for computer vision and NLP tasks. As a result, recent works [13,15] have introduced the application of transformers for processing and analyzing WCE images. These studies have demonstrated the effectiveness of utilizing the transformer architecture in achieving high performance when applied to WCE images.…”

Section: Discussion (mentioning)

confidence: 99%

Video Analysis of Small Bowel Capsule Endoscopy Using a Transformer Network

Oh, Kim et al. 2023, Diagnostics

Although wireless capsule endoscopy (WCE) detects small bowel diseases effectively, it has some limitations. For example, the reading process can be time consuming due to the numerous images generated per case and the lesion detection accuracy may rely on the operators’ skills and experiences. Hence, many researchers have recently developed deep-learning-based methods to address these limitations. However, they tend to select only a portion of the images from a given WCE video and analyze each image individually. In this study, we note that more information can be extracted from the unused frames and the temporal relations of sequential frames. Specifically, to increase the accuracy of lesion detection without depending on experts’ frame selection skills, we suggest using whole video frames as the input to the deep learning system. Thus, we propose a new Transformer-architecture-based neural encoder that takes the entire video as the input, exploiting the power of the Transformer architecture to extract long-term global correlation within and between the input frames. Subsequently, we can capture the temporal context of the input frames and the attentional features within a frame. Tests on benchmark datasets of four WCE videos showed 95.1% sensitivity and 83.4% specificity. These results may significantly advance automated lesion detection techniques for WCE images.

“…Furthermore, the vision Transformer (ViT), a model that modified the original transformer for computer vision, has also performed well in image classification [12]. Because ViT performs well in computer vision tasks, some studies have employed the transformer architecture to analyze WCE images [13][14][15].…”

Section: Introduction (mentioning)

confidence: 99%

Video Analysis of Small Bowel Capsule Endoscopy Using a Transformer Network

Oh, Kim et al. 2023, Diagnostics

“…The HiFuse Tiny, HiFuse Small, and HiFuse Base models attained accuracy rates of 84.85%, 85.00%, and 84.35%, respectively. Bai et al. [35] improved a ViT-based architecture for the classification of wireless capsule endoscopy images. They obtained 79.15% accuracy with the Kvasir-Capsule dataset utilized to evaluate the performance of the ViT-based architecture.…”

Section: Related Work (mentioning)

confidence: 99%

Spatial-attention ConvMixer architecture for classification and detection of gastrointestinal diseases using the Kvasir dataset

Demirbaş, Üzen, Fırat 2024, Health Inf Sci Syst

Gastrointestinal (GI) disorders, encompassing conditions like cancer and Crohn’s disease, pose a significant threat to public health. Endoscopic examinations have become crucial for diagnosing and treating these disorders efficiently. However, the subjective nature of manual evaluations by gastroenterologists can lead to potential errors in disease classification. In addition, the difficulty of diagnosing diseased tissues in GI and the high similarity between classes made the subject a difficult area. Automated classification systems that use artificial intelligence to solve these problems have gained traction. Automatic detection of diseases in medical images greatly benefits in the diagnosis of diseases and reduces the time of disease detection. In this study, we suggested a new architecture to enable research on computer-assisted diagnosis and automated disease detection in GI diseases. This architecture, called Spatial-Attention ConvMixer (SAC), further developed the patch extraction technique used as the basis of the ConvMixer architecture with a spatial attention mechanism (SAM). The SAM enables the network to concentrate selectively on the most informative areas, assigning importance to each spatial location within the feature maps. We employ the Kvasir dataset to assess the accuracy of classifying GI illnesses using the SAC architecture. We compare our architecture’s results with Vanilla ViT, Swin Transformer, ConvMixer, MLPMixer, ResNet50, and SqueezeNet models. Our SAC method gets 93.37% accuracy, while the other architectures get respectively 79.52%, 74.52%, 92.48%, 63.04%, 87.44%, and 85.59%. The proposed spatial attention block improves the accuracy of the ConvMixer architecture on the Kvasir, outperforming the state-of-the-art methods with an accuracy rate of 93.37%.

“…In this circumstance, other compensating sensors like GSR should activate to detect the stress and pain condition for pain report. Machine learning techniques (Bai et al., 2021; Bai et al., 2022) may also be employed to help the system learn the pain feature of particular patients, enabling more real-time feedback when the pain features appear in patients’ daily activities.…”

Section: Limitation and Future Work (mentioning)

confidence: 99%

Rethinking pain communication of patients with Alzheimer’s disease through E-textile interaction design

Li, Bai, Mao et al. 2023, Front. Physiol.

Self Cite

Older individuals are easily prone to chronic pain. Due to the complexity of chronic pain, most elderly often have difficulty expressing pain to others to seek assistance, especially those with Alzheimer’s disease (AD). The caregivers cannot instantly discover the patients’ pain condition and provide timely pain management. This project applies physiological signal sensing technology to help AD patients express the presence of pain non-verbally. We embed sensors on patients’ handkerchiefs to identify the patient’s abnormal physical activity when pain occurs. Next, we translate the physiological signal into qualitative light alert to send to caregivers and indicate the pain occurrence condition. Then, utilizing multi-sensory stimulation intervention, we create an electronic textile (e-textile) tool to help caregivers effectively support patients in pain. And thus to create a two-way pain communication between caregivers and the patients. Pain perception can be independent of subjective expressions and tangibly perceived by others through our textile prototype. The e-textile handkerchiefs also bring up a new guide to facilitate communication for caregivers when their patients. We contribute the design insights of building a bio-sensing and e-textile system with considering the pain communication needs, patients’ pain behaviors and preference of objects. Our e-textile system may contribute to pain communication bio-sensing tool design for special elderly groups, especially those with weakened cognition and communication abilities. We provide a new approach to dealing with the pain of AD patients for healthcare professionals.
