“…Therefore, our framework shares the same spirit as traditional label-efficiency research. Data valuation. Beyond active learning, the literature offers many techniques for quantifying the importance of individual samples, e.g., the influence function [Koh and Liang 2017] and its variants [Wu, Weimer, and Davidson 2021], Glister [Killamsetty et al. 2021], HOST-CP [Das et al. 2021], TracIn [Pruthi et al. 2020], DVRL [Yoon, Arik, and Pfister 2020], and the Data Shapley value [Ghorbani and Zou 2019]. However, the Data Shapley value [Ghorbani and Zou 2019] is computationally very expensive, while the other methods assume that a set of "clean" validation samples (or meta samples) is given, so these methods are not suitable for our framework (we discuss the Data Shapley value and its extensions in more detail in Appendix "Appendix: more related work").…”