“…To compare directly with other early action prediction models [2], [3], [23], [24], [25], [40] on RGB videos only, we tested our model on the UCF101 dataset [48], which contains 13,320 unconstrained RGB videos from 101 action classes. For evaluation, we followed the same experimental settings as in [24], [25]: we used the first 15 groups of videos for training, the next 3 groups for validation, and the rest for testing (note that these groups were pre-partitioned in [25]).…”
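
The group-based partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the group index is read from the standard UCF101 file naming pattern `v_<Class>_g<group>_c<clip>.avi`, and the helper names `split_for_group` and `assign_split` are hypothetical.

```python
import re

# Assumption: UCF101 file names follow v_<Class>_g<group>_c<clip>.avi,
# so the group index can be read directly from the name.
_GROUP_RE = re.compile(r"_g(\d+)_c\d+")


def split_for_group(group: int) -> str:
    """Map a group index to a split (hypothetical helper).

    Groups 1-15 go to training, 16-18 to validation, and all
    remaining groups to testing, as described in the text.
    """
    if group < 1:
        raise ValueError(f"invalid group index: {group}")
    if group <= 15:
        return "train"
    if group <= 18:
        return "val"
    return "test"


def assign_split(filename: str) -> str:
    """Return 'train', 'val', or 'test' for a UCF101-style file name."""
    match = _GROUP_RE.search(filename)
    if match is None:
        raise ValueError(f"no group index found in: {filename}")
    return split_for_group(int(match.group(1)))


if __name__ == "__main__":
    for name in ("v_ApplyEyeMakeup_g01_c01.avi",
                 "v_ApplyEyeMakeup_g16_c02.avi",
                 "v_YoYo_g25_c05.avi"):
        print(name, "->", assign_split(name))
```

Splitting by group rather than by individual clip keeps every clip from one group in the same partition, so clips from the same group never appear in both training and testing.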