“…For the inference-phase comparison, we train our model on both the training and validation datasets with 960 × 720 resolution input. We compare our models against non-real-time algorithms, including SegNet (Badrinarayanan et al, 2017), Deeplab (Chen et al, 2015), RTA (Huang et al, 2018), Dilate8 (Yu and Koltun, 2016), PSPNet (Zhao et al, 2017), VideoGCRF (Chandra et al, 2018), and DenseDecoder (Bilinski and Prisacariu, 2018), and against real-time algorithms, including ENet (Paszke et al, 2016), ICNet (Zhao et al, 2018a), DABNet (Li et al, 2019a), DFANet (Li et al, 2019b), SwiftNet (Orsic et al, 2019), and BiSeNetV1 (Yu et al, 2018a). BiSeNetV2 achieves much faster inference speed than the other methods.…”