“…Motivated by the successes of [180], [181], [68], Su et al. [131] used the histogram of oriented gradients (HOG) feature [182] in their STR system to construct sequential features of word images. Later, CNNs [138], [140], [141], [143], [72] were widely used in the feature representation stage, such as VGGNet [183], [116], [178], [133], [71], [137], [162]. To obtain stronger feature representations, more complex neural networks have been applied in STR algorithms, such as ResNet [184], [135], [145], [148], [73], [151], [166], [152], [154], [48], [155], [157], [158], [161], [163], [164] and DenseNet [185], [150], [160].…”