“…We compare the performance of our AGOS with three handcrafted features (PLSA, BOW, LDA) [17], [87], three typical CNN models (AlexNet, VGG, GoogLeNet) [17], [87], twenty-two recent CNN-based state-of-the-art approaches (MIDCNet [2], RANet [29], APNet [88], SPPNet [20], DCNN [28], TEXNet [89], MSCP [18], VGG+FV [21], DSENet [45], MS2AP [46], MSDFF [47], CADNet [48], LSENet [5], GBNet [49], MBLANet [50], MG-CAP [51], Contourlet CNN [52], STHP [53], SAGM [54], DARTS [55], LML [56], GCSANet [57]), one RNN-based approach (ARCNet [25]), two autoencoder-based approaches (SGUFL [59], PARTLETS [58]) and two GAN-based approaches (MARTA [60], AGAN [61]). Results with the ResNet-50, ResNet-101 and DenseNet-121 backbones are all reported for fair evaluation, as some recent methods [47], [48] use much deeper networks as backbones.…”