“…Over the past few years, we have witnessed tremendous progress in vision-based action detection [37,57,85,17,40,2,68,69,61,4,30,34,62,81,90,56]. This success is largely attributed to deep neural networks, which demonstrate superior performance in several computer vi- … Unlike the prior UDA method [76], which uses class-based mixed sampling to generate augmented mixed images, our mixed sampling algorithm randomly samples image patches based on the number of action instances present in the source frames.…”
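
The excerpt only states that patches are sampled at random according to the number of action instances in the source frame, so the following is a minimal sketch of that idea rather than the authors' algorithm. It assumes that each action instance is given as an integer bounding box, that roughly half of the instances are pasted onto a target frame, and that frames are plain nested lists. The function name `instance_mixed_sample` and the `ratio` parameter are hypothetical.

```python
import math
import random


def instance_mixed_sample(src_frame, src_boxes, tgt_frame, rng, ratio=0.5):
    """Paste a random subset of source instance patches onto a target frame.

    src_frame, tgt_frame: equally sized 2D lists of pixel values.
    src_boxes: list of (x1, y1, x2, y2) integer boxes, one per action instance.
    ratio: assumed fraction of instances to paste (not specified in the excerpt).
    Returns the mixed frame, a binary mix mask, and the chosen instance indices.
    """
    height, width = len(tgt_frame), len(tgt_frame[0])
    mixed = [row[:] for row in tgt_frame]          # copy of the target frame
    mask = [[0] * width for _ in range(height)]    # 1 where source pixels land

    n = len(src_boxes)
    if n == 0:
        return mixed, mask, []

    # The number of sampled patches depends on the instance count (assumption:
    # at least one instance, otherwise ceil(ratio * n) of them).
    k = max(1, math.ceil(ratio * n))
    chosen = sorted(rng.sample(range(n), k))

    for i in chosen:
        x1, y1, x2, y2 = src_boxes[i]
        for y in range(y1, y2):
            for x in range(x1, x2):
                mixed[y][x] = src_frame[y][x]
                mask[y][x] = 1
    return mixed, mask, chosen


if __name__ == "__main__":
    rng = random.Random(0)
    src = [[1] * 6 for _ in range(4)]
    tgt = [[0] * 6 for _ in range(4)]
    boxes = [(0, 0, 2, 2), (3, 1, 5, 3), (0, 3, 1, 4)]
    mixed, mask, chosen = instance_mixed_sample(src, boxes, tgt, rng)
    print(chosen)
    for row in mixed:
        print(row)
```

By contrast, a class-based scheme would choose which classes to copy; here the unit of sampling is the individual instance, so a frame with more instances contributes more patches.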