…Two types of loss corrections are proposed: forward and backward. The forward loss multiplies the model predictions by the noise transition matrix to match them to the noisy labels, while the backward loss multiplies the computed loss values by the inverse of the noise transition matrix to recover an unbiased estimate of the clean loss [51]. A minimal sketch of both corrections is given after the taxonomy below.

Noise Model Based Methods
1. Noisy Channel
   a. Extra layer: linear fully connected layer [42], [43], softmax layer [44]
   b. Separate network: estimating noise type [45], masking [46], quality embedding [47]
   c. Explicit calculation: EM [26], [48], [49], conditional independence [50], forward & backward loss [51], unsupervised generative model [52], Bayesian form [53]
2. Label Noise Cleansing
   a. Using a reference set: train a cleaner on a reference set [54], [55], clean based on extracted features [56], teacher cleans for student [57], ensemble of networks [58]
   b. Not using a reference set: moving average of network predictions [59], consistency loss [60], ensemble of networks [61], prototypes [62], random split [63], confidence policy [64]
3. Sample Choosing
   a. Self-consistency: consistency with the model [65], consistency with a moving average of the model [66], dividing into two subsets [68], graph-based [67]
   b. Curriculum learning: screening loss [69], teacher-student [70], selecting uncertain samples [71], extra layer with a similarity matrix [72], curriculum loss [73], data complexity [74], partial labels [75]
   c. Multiple classifiers: consistency of networks [76], co-teaching [77]–[79]
   d. Active learning: relabel hard …
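To make the two corrections concrete, the sketch below (PyTorch; function names are illustrative, and a known or pre-estimated row-stochastic transition matrix T with T[i, j] = P(noisy label = j | clean label = i) is assumed) shows how each correction enters the cross-entropy loss, in the spirit of the forward and backward corrections of [51].

```python
import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_labels, T):
    """Forward correction: push the clean-posterior predictions through
    the noise transition matrix T, so the model is trained to match the
    observed noisy labels."""
    clean_probs = F.softmax(logits, dim=1)   # estimated p(clean class | x)
    noisy_probs = clean_probs @ T            # p(noisy class | x) = p(clean | x) T
    return F.nll_loss(torch.log(noisy_probs + 1e-12), noisy_labels)

def backward_corrected_loss(logits, noisy_labels, T):
    """Backward correction: multiply the vector of per-class losses by
    T^{-1}, giving an unbiased estimate of the loss on clean labels
    (requires T to be invertible)."""
    per_class_loss = -F.log_softmax(logits, dim=1)   # loss for every class c
    T_inv = torch.linalg.inv(T)
    # Row of T^{-1} indexed by each sample's noisy label, dotted with
    # that sample's per-class loss vector.
    corrected = (T_inv[noisy_labels] * per_class_loss).sum(dim=1)
    return corrected.mean()

# Hypothetical usage with symmetric noise of rate 0.1 over 10 classes.
num_classes = 10
T = 0.9 * torch.eye(num_classes) + 0.1 / num_classes
logits = torch.randn(32, num_classes)
noisy_labels = torch.randint(0, num_classes, (32,))
loss_f = forward_corrected_loss(logits, noisy_labels, T)
loss_b = backward_corrected_loss(logits, noisy_labels, T)
```

Note the asymmetry: the forward correction transforms the predictions and keeps the loss intact, whereas the backward correction transforms the loss values themselves, which is why only the latter needs T to be invertible.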