Object tracking belongs to active research areas in computer vision. We are interested in matching-based trackers exploiting deep machine learning known as Siamese trackers. Their powerful capabilities stem from similarity learning. This tracking paradigm is promising due to its inherent balance between performance and efficiency, so trackers of this type are suitable for real-time generic object tracking. There is an upsurge in research interest in Siamese trackers and the lack of available specialized surveys in this category. In this survey, we aim to identify and elaborate on the most significant challenges the Siamese trackers face. Our goal is to answer what design decisions the authors made and what problems they attempted to solve in the first place. We thus perform an in-depth analysis of the core principles on which Siamese trackers operate with a discussion of incentives behind them. Besides, we provide an up-todate qualitative and quantitative comparison of the prominent Siamese trackers on established benchmarks. Among other things, we discuss current trends in developing Siamese trackers. Our survey could help absorb the details about the underlying principles of Siamese trackers and the challenges they face.INDEX TERMS visual object tracking, deep learning, Siamese neural networks, similarity learning, fully convolutional networks
One-hot encoding is the prevalent method used in neural networks to represent multi-class categorical data. Its success stems from its ease of use and interpretability as a probability distribution when accompanied by a softmax activation function. However, one-hot encoding leads to very high dimensional vector representations when the categorical data’s cardinality is high. The Hamming distance in one-hot encoding is equal to two from the coding theory perspective, which does not allow detection or error-correcting capabilities. Binary coding provides more possibilities for encoding categorical data into the output codes, which mitigates the limitations of the one-hot encoding mentioned above. We propose a novel method based on Zadeh fuzzy logic to train binary output codes holistically. We study linear block codes for their possibility of separating class information from the checksum part of the codeword, showing their ability not only to detect recognition errors by calculating non-zero syndrome, but also to evaluate the truth-value of the decision. Experimental results show that the proposed approach achieves similar results as one-hot encoding with a softmax function in terms of accuracy, reliability, and out-of-distribution performance. It suggests a good foundation for future applications, mainly classification tasks with a high number of classes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.