“… SL1, smooth L1 loss; CE, cross-entropy loss; DV, dual view loss; PL, patient-level loss; CS, cost-sensitive loss [40]; L1, L1 loss; CE(5 class), mean loss over 5 classes (one versus others); MMoE, multi-gate mixture-of-experts [41]; GMP, generalized mean pooling [42]; OHEM, online hard example mining [43, 44]; CV, cross-validation; O, oversampling; ES, early stopping; TL, transfer learning; TTA, test-time augmentation [45, 46]; PLT, pseudo-labeled and labeled training. …”