Triplet loss is widely used as the objective function in image-text retrieval tasks. However, because it treats all triplets equally, it suffers from slow convergence and suboptimal retrieval performance. In this article, we address these problems by weighting triplets according to the relative similarities among the training samples. Specifically, we present three weighting functions that assign an appropriate weight to each selected informative triplet in order to accelerate convergence. We evaluate our approach on two widely used benchmark datasets, Flickr30k and MSCOCO, where it outperforms previous methods.
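To make the idea of weighting triplets concrete, the following is a minimal sketch, not the method from the article. The abstract does not specify the three weighting functions, so the exponential weight used here, the parameter names `margin` and `beta`, the choice of the hardest negative as the "informative" triplet, and the function name `weighted_triplet_loss` are all illustrative assumptions. The sketch covers only the image-to-text direction.

```python
import math


def weighted_triplet_loss(sim, margin=0.2, beta=2.0):
    """Hinge-based triplet loss with a per-triplet weight (illustrative only).

    sim[i][j] is the similarity between image i and text j; matched pairs
    lie on the diagonal. Requires at least two samples.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Assumption: the "informative" triplet is the one formed with the
        # hardest negative, i.e. the most similar non-matching text.
        neg = max(sim[i][j] for j in range(n) if j != i)
        hinge = max(0.0, margin - pos + neg)
        if hinge > 0.0:
            # Hypothetical weighting function: grows as the negative gets
            # closer to (or overtakes) the positive. With beta = 0 every
            # weight is 1 and the plain, unweighted triplet loss is recovered.
            weight = math.exp(beta * (neg - pos))
            total += weight * hinge
    return total / n


if __name__ == "__main__":
    # Image 0 is more similar to the wrong text (0.6) than to its own (0.5).
    sim = [[0.5, 0.6],
           [0.1, 0.9]]
    print("unweighted:", weighted_triplet_loss(sim, beta=0.0))
    print("weighted:  ", weighted_triplet_loss(sim, beta=2.0))
```

In this sketch a violating triplet contributes more to the loss the further the negative overtakes the positive, which is one plausible way for a weighting scheme to concentrate gradient on hard cases; the functions actually proposed in the article may differ.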