Capturing visual similarity among images is at the core of many computer vision and pattern recognition tasks. This problem can be formulated within the paradigm of metric learning. Most research in the area has focused on improving loss functions and similarity measures. However, because they ignore geometric structure, existing methods often yield sub-optimal results. Several recent methods therefore exploit the Wasserstein distance between batches of samples to characterize spatial geometry. Although these approaches achieve enhanced performance, aggregating over batches blunts the Wasserstein distance's superior measurement capability and incurs high computational complexity. To address this limitation, we propose a novel Deep Wasserstein Metric Learning framework, which employs the Wasserstein distance to precisely capture the relationships among images under ranking-based loss functions such as contrastive loss and triplet loss. Our method computes the distance between images directly, capturing geometry at a finer granularity than the batch level. Furthermore, we introduce a new efficient algorithm based on the Sinkhorn approximation and a Wasserstein measure coreset. Experimental results demonstrate the improvements of our framework over various baselines across applications and benchmark datasets.
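The Sinkhorn approximation mentioned above refers to entropy-regularized optimal transport, computed by alternating marginal-scaling iterations. A minimal sketch of the standard Sinkhorn iteration on two histograms (illustrative only, not the paper's implementation; the function name and parameters are our own):

```python
import numpy as np

def sinkhorn_distance(a, b, C, eps=0.05, n_iters=200):
    """Entropy-regularized OT cost between histograms a and b
    under ground-cost matrix C, via Sinkhorn's scaling iterations."""
    K = np.exp(-C / eps)             # Gibbs kernel of the ground cost
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # scale columns to match marginal b
        u = a / (K @ v)              # scale rows to match marginal a
    P = u[:, None] * K * v[None, :]  # approximate transport plan
    return float(np.sum(P * C))      # transport cost of the plan

# toy example: two histograms on a 5-bin grid with squared-distance cost
x = np.arange(5, dtype=float)
C = (x[:, None] - x[None, :]) ** 2
a = np.array([0.5, 0.5, 0.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 0.0, 0.5, 0.5])
d_far = sinkhorn_distance(a, b, C)          # mass moves 3 bins -> cost near 9
d_same = sinkhorn_distance(a, a.copy(), C)  # identical histograms -> near 0
```

As the regularization `eps` shrinks, the returned cost approaches the exact Wasserstein distance at the price of slower, less stable iterations; this trade-off is what makes the approximation attractive for distances between individual images rather than batch aggregates.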
This paper introduces a novel Robust Regression (RR) model, named Sinkhorn regression, which imposes Sinkhorn distances on both the loss function and the regularization. Traditional RR methods search for an element-wise loss function (e.g., an Lp-norm) to characterize the errors so that outlying data have a relatively smaller influence on the regression estimator. Because they neglect geometric information, they often yield suboptimal results in practical applications. To address this problem, we use a cross-bin distance function, the Sinkhorn distance, to capture the geometric structure of real data. The Sinkhorn distance is invariant to translation, rotation, and scaling, so our method is more robust to variations in the data than traditional regression models. Meanwhile, we leverage the Kullback-Leibler divergence to relax the marginal constraints of the proposed model into an unbalanced formulation that accommodates more types of features. In addition, we propose an efficient algorithm to solve the relaxed model and establish complete statistical guarantees for it under mild conditions. Experiments on five publicly available microarray data sets and one mass spectrometry data set demonstrate the effectiveness and robustness of our method.
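The KL relaxation of the marginal constraints described above yields an unbalanced optimal-transport problem. A rough illustration in the style of standard unbalanced Sinkhorn scaling iterations (a sketch under our own assumptions, not the paper's algorithm; `rho` is a hypothetical penalty strength):

```python
import numpy as np

def unbalanced_sinkhorn_plan(a, b, C, eps=0.1, rho=100.0, n_iters=500):
    """Entropic OT plan whose marginal constraints are relaxed by a
    KL penalty of strength rho, via damped scaling iterations."""
    K = np.exp(-C / eps)             # Gibbs kernel of the ground cost
    fi = rho / (rho + eps)           # damping exponent from the KL relaxation
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** fi      # soft fit of the row marginal
        v = (b / (K.T @ u)) ** fi    # soft fit of the column marginal
    return u[:, None] * K * v[None, :]

# toy 3-bin example: with a large rho the KL penalty is nearly a hard
# constraint, so the plan's marginals almost recover a and b
x = np.arange(3, dtype=float)
C = (x[:, None] - x[None, :]) ** 2
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.3, 0.5])
P = unbalanced_sinkhorn_plan(a, b, C)
```

Shrinking `rho` lets the plan's marginals drift away from `a` and `b`, which is what allows an unbalanced formulation to accommodate features whose masses need not match exactly.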