Background
Various techniques have been proposed in the literature for phase and tool recognition from laparoscopic videos. By comparison, research on multilabel annotation of still frames is limited.
Methods
We describe a framework for multilabel annotation of images extracted from laparoscopic cholecystectomy (LC) videos based on multi‐instance multiple‐label learning. Each image is treated as a bag of features extracted from local regions after coarse segmentation. A method based on variational Bayesian Gaussian mixture models (VBGMM) is proposed for bag representation. Three techniques based on different feature extraction and bag representation models are employed for comparison.
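As an illustration only, the bag-of-regions idea can be sketched as below. The sketch assumes a mixture has already been fitted; the weights, means, and spherical variances are hypothetical placeholders, not values from this work. It maps a bag of region features to a fixed-length vector by averaging the per-region component responsibilities. The VBGMM-based representation used in the framework may differ in its details.

```python
import math

def responsibilities(x, weights, means, variances):
    """Posterior probability of each spherical Gaussian component for one feature vector."""
    d = len(x)
    log_p = []
    for w, mu, var in zip(weights, means, variances):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        log_p.append(math.log(w)
                     - 0.5 * d * math.log(2 * math.pi * var)
                     - 0.5 * sq_dist / var)
    # Normalise in log space for numerical stability.
    m = max(log_p)
    exp_p = [math.exp(lp - m) for lp in log_p]
    total = sum(exp_p)
    return [e / total for e in exp_p]

def bag_vector(bag, weights, means, variances):
    """Represent a bag (list of region feature vectors) as mean component responsibilities."""
    acc = [0.0] * len(weights)
    for x in bag:
        r = responsibilities(x, weights, means, variances)
        acc = [a + ri for a, ri in zip(acc, r)]
    return [a / len(bag) for a in acc]

# Hypothetical 2-component mixture over 2-D region features (assumed, not fitted here).
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0, 0.0], [5.0, 5.0]]
VARIANCES = [1.0, 1.0]

# One image = one bag of three region descriptors (made-up numbers).
bag = [[0.1, -0.2], [4.9, 5.1], [5.2, 4.8]]
vec = bag_vector(bag, WEIGHTS, MEANS, VARIANCES)
```

The resulting vector has one entry per mixture component and sums to one, so bags with different numbers of regions become comparable inputs for a multilabel classifier.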
Results
Four anatomical structures (abdominal wall, gallbladder, fat, and liver bed) and a tool‐like object (specimen bag) were annotated in 482 images. Our method achieved the best performance on single‐label accuracy: 0.87 (highest) and 0.69 (lowest). Moreover, the performance was >20% higher in terms of four multilabel classification error metrics (one‐error, ranking loss, Hamming loss, and coverage).
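For readers unfamiliar with the four error metrics, the following is a minimal sketch using their commonly cited definitions; the exact variants used in the evaluation are not stated here, so treat these as assumptions. The label matrix and scores are made-up toy data with five labels, not results from this study, and the 0.5 threshold for Hamming loss is likewise an assumption.

```python
def hamming_loss(y_true, scores, threshold=0.5):
    """Fraction of (image, label) pairs predicted incorrectly after thresholding."""
    errors = sum((s >= threshold) != bool(t)
                 for yt, sc in zip(y_true, scores)
                 for t, s in zip(yt, sc))
    return errors / (len(y_true) * len(y_true[0]))

def one_error(y_true, scores):
    """Fraction of images whose top-scored label is not a true label."""
    misses = sum(yt[max(range(len(sc)), key=sc.__getitem__)] == 0
                 for yt, sc in zip(y_true, scores))
    return misses / len(y_true)

def coverage(y_true, scores):
    """Average depth in the ranking needed to cover all true labels, minus one."""
    total = 0
    for yt, sc in zip(y_true, scores):
        order = sorted(range(len(sc)), key=lambda j: -sc[j])
        rank = {label: pos + 1 for pos, label in enumerate(order)}
        total += max(rank[j] for j, t in enumerate(yt) if t) - 1
    return total / len(y_true)

def ranking_loss(y_true, scores):
    """Average fraction of (true, false) label pairs that are ordered wrongly."""
    total = 0.0
    for yt, sc in zip(y_true, scores):
        pos = [j for j, t in enumerate(yt) if t]
        neg = [j for j, t in enumerate(yt) if not t]
        if not pos or not neg:
            continue  # undefined for this image; contributes zero
        bad = sum(sc[p] <= sc[n] for p in pos for n in neg)
        total += bad / (len(pos) * len(neg))
    return total / len(y_true)

# Toy data: 2 images, 5 labels (binary ground truth and classifier scores).
Y = [[1, 0, 1, 0, 0], [0, 1, 0, 0, 1]]
S = [[0.9, 0.2, 0.4, 0.6, 0.1], [0.3, 0.8, 0.1, 0.2, 0.7]]

hl = hamming_loss(Y, S)
oe = one_error(Y, S)
cov = coverage(Y, S)
rl = ranking_loss(Y, S)
```

All four are error measures, so lower values indicate better multilabel performance.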
Conclusions
Our approach provides an accurate and efficient image representation for multilabel classification of still images captured in LC.