2019
DOI: 10.48550/arxiv.1908.02983
Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning

Abstract: Semi-supervised learning, i.e. jointly learning from labeled and unlabeled samples, is an active research topic due to its key role in relaxing human annotation constraints. In the context of image classification, recent advances in learning from unlabeled samples are mainly focused on consistency regularization methods that encourage invariant predictions for different perturbations of unlabeled samples. We, conversely, propose to learn from unlabeled data by generating soft pseudo-labels using the network predict…
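The abstract's notion of "soft pseudo-labels from network predictions" can be illustrated with a minimal numpy sketch: rather than committing to a hard argmax class, the full softmax distribution over classes is kept as the target. The `temperature` parameter here is a hypothetical knob (not stated on this page) that sharpens or smooths the distribution.

```python
import numpy as np

def soft_pseudo_labels(logits, temperature=1.0):
    """Turn raw network outputs into soft pseudo-labels via a softmax.

    `temperature` is an illustrative assumption: values < 1 sharpen
    the distribution, values > 1 smooth it.
    """
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

# Hypothetical predictions for two unlabelled images over three classes.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.2, 3.0]])
pseudo = soft_pseudo_labels(logits)   # each row sums to 1
```

Each row of `pseudo` is a valid probability distribution, so it can be plugged into a cross-entropy-style loss in place of a one-hot label.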

Cited by 34 publications (84 citation statements)
References 19 publications
“…Additionally, depending on the accuracy of the pseudo-labels, we increase the amount of labelled data the model has access to and reduce overfitting to the initially small label set. There are many ways to incorporate unlabelled data / pseudo-label pairs into the loss function, but the most common are either to create a specific loss term for the unlabelled data / pseudo-label pairs [12], [18] or to use composite batches containing both labelled and unlabelled data while keeping the standard supervised classification loss [20], [33].…”
Section: B. Pseudo-labelling Techniques
confidence: 99%
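The second strategy the quote describes (composite batches with the ordinary supervised loss) can be sketched as follows. This is a minimal numpy illustration, not the implementation from any of the cited works: labelled samples contribute their ground-truth one-hot labels, unlabelled samples contribute their (possibly soft) pseudo-labels, and a single standard cross-entropy is applied to the stacked batch.

```python
import numpy as np

def cross_entropy(probs, targets, eps=1e-12):
    # Mean cross-entropy between predicted class probabilities and
    # (possibly soft) target distributions; eps guards against log(0).
    return -np.mean(np.sum(targets * np.log(probs + eps), axis=1))

def composite_batch_loss(probs_l, y_l, probs_u, pseudo_u):
    # Stack labelled and unlabelled samples into one composite batch and
    # apply the ordinary supervised loss to all of them: true labels for
    # the labelled part, pseudo-labels for the unlabelled part.
    probs = np.vstack([probs_l, probs_u])
    targets = np.vstack([y_l, pseudo_u])
    return cross_entropy(probs, targets)

# Hypothetical mini-batch: one labelled and one pseudo-labelled sample.
probs_l  = np.array([[0.9, 0.1]])   # model output, labelled sample
y_l      = np.array([[1.0, 0.0]])   # ground-truth one-hot label
probs_u  = np.array([[0.8, 0.2]])   # model output, unlabelled sample
pseudo_u = np.array([[1.0, 0.0]])   # pseudo-label (e.g. argmax one-hot)
loss = composite_batch_loss(probs_l, y_l, probs_u, pseudo_u)
```

The alternative strategy (a dedicated loss term for the unlabelled pairs) would instead compute the two cross-entropies separately and combine them with a weighting coefficient.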
“…As pointed out by Arazo et al. [20], there is a potential pitfall in this style of approach: networks are often wrong, and a neural network can overfit to its own incorrectly guessed pseudo-labels, a process termed confirmation bias.…”
Section: B. Pseudo-labelling Techniques
confidence: 99%
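A toy numpy sketch (not from the paper) makes the confirmation-bias mechanism concrete: if a classifier mislabels an unlabelled point and then trains on that hard pseudo-label, the gradient step makes it *more* confident in its own mistake. All weights and data here are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linear classifier and one unlabelled point whose TRUE class is 0,
# but which the current weights misclassify as class 1.
w = np.array([1.0, -0.5])          # hypothetical current weights
x = np.array([0.8, 0.2])           # unlabelled sample, true class 0
p = sigmoid(w @ x)                 # P(class 1) > 0.5, so the guess is wrong
pseudo = float(p > 0.5)            # hard pseudo-label: 1 (incorrect)

# One cross-entropy gradient step on the wrong pseudo-label...
lr = 0.5
grad = (p - pseudo) * x            # d(loss)/dw for logistic regression
w = w - lr * grad
p_after = sigmoid(w @ x)
# ...increases P(class 1) further: the model reinforces its own error.
```

Repeating such updates lets the incorrect pseudo-label dominate, which is why pseudo-labelling methods typically pair the basic scheme with some mechanism to limit this feedback loop.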