“…It finds many applications in real-world scenarios, e.g., better management of crowd gatherings, safety and security, and averting undesirable incidents. Many deep learning-based image-only schemes [39,42,41,17,19,60,32] have been proposed to date, ranging from single- and multi-branch networks [60,39,41] and multi-regressor-based models [42] to trellis networks [19]. Although they show reasonable performance on regular images, they fail to generalize well in many practical scenarios, such as poor illumination and lighting conditions, noise, severe occlusion, and low-resolution images, where visual…”