“…In this respect, various benchmarks have been proposed to measure robustness under distribution shifts [9,14,23,25,26,29,30,45,48,50], and this problem has been extensively studied across a broad range of research fields [3,4,10,15,16,24,38,39,40,43,52,55,62]. Among these, work on benchmarking robustness [23] and on resolving scene bias [10,42] or distribution shift [43,59] is the most closely related to our problem setup. Unlike the aforementioned works, we are the first to explore the background shift issue in the CSLR task, using a newly synthesized benchmark.…”