“…Among the surveyed 31 symmetric approaches, direct approaches operated on the feature representations across domains by minimizing their differences (via mutual information [63], maximum mean discrepancy [46, 49, 64], Euclidean distance [65, 66, 67, 68, 69, 70, 71], Wasserstein distance [72], and average likelihood [73]), maximizing their correlation [74, 75] or covariance [36], and introducing sparsity with L1/L2 norms [42, 76]. On the other hand, indirect approaches were applied via adversarial training [28, 41, 54, 77, 78, 79, 80, 81, 82, 83, 84, 85], and knowledge distillation [86].…”
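
To make the "direct" family more concrete, the following is a minimal sketch of two of the discrepancy measures named in the excerpt: the maximum mean discrepancy and a Euclidean distance between domain-level feature means. It is not taken from any of the cited works. The function names, the Gaussian (RBF) kernel, the `gamma` value, and the synthetic two-dimensional features are all assumptions chosen for illustration. In a real model, a term like this would typically be added to the training loss and minimized so that source and target features become harder to tell apart.

```python
import math
import random


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    return (
        mean_kernel(source, source, gamma)
        + mean_kernel(target, target, gamma)
        - 2.0 * mean_kernel(source, target, gamma)
    )


def mean_euclidean_distance(source, target):
    """Euclidean distance between the mean feature vectors of two domains."""
    dim = len(source[0])
    mu_s = [sum(x[i] for x in source) / len(source) for i in range(dim)]
    mu_t = [sum(x[i] for x in target) / len(target) for i in range(dim)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_s, mu_t)))


# Hypothetical feature sets: one source domain, one target drawn from the
# same distribution, and one target whose mean is shifted.
rng = random.Random(0)
source = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(50)]
aligned = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(50)]
shifted = [[rng.gauss(3.0, 1.0) for _ in range(2)] for _ in range(50)]

print("MMD^2, aligned target:", mmd_squared(source, aligned))
print("MMD^2, shifted target:", mmd_squared(source, shifted))
print("Mean distance, shifted target:", mean_euclidean_distance(source, shifted))
```

Under these assumptions, the shifted target should yield a clearly larger discrepancy than the aligned one, which is the signal a direct approach drives toward zero. The indirect approaches in the excerpt differ in that no explicit distance is computed: a domain discriminator (adversarial training) or a teacher model (knowledge distillation) supplies the alignment signal instead.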