“…The extraction and comprehension of text in images play a critical role in many computer vision applications. Text spotting algorithms have progressed significantly in recent years [33,42,45,49,67], specifically within the task of detecting [2,28,36,63] and recognizing [5,12,40,41,59] individual text instances in images. Previously, defining the geometric layout [7,9,24,62] of extracted textual content occurred independent of text spotting and remained focused on document images.…”