We propose a method for improving detection precision (mAP) using prior knowledge of the scene geometry: we assume the scene is a plane with objects placed on it. We focus on autonomous robots, where, given the robot's dimensions and the inclination angles of the camera, it is possible to predict the spatial scale for each pixel of the input frame. Using a slightly modified YOLOv3-tiny, we demonstrate that detection supplemented by the scale channel, hereafter referred to as S, outperforms standard RGB-based detection while adding only a small computational overhead.
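To make the idea of a per-pixel scale channel concrete, the following is a minimal sketch of how such a map might be computed under a flat-ground assumption. It is not the authors' implementation: the function name `ground_plane_scale_map`, the pinhole intrinsics (`fx`, `fy`, `cy`), the zero-roll assumption, and the definition of scale as metres per pixel (depth along the optical axis divided by `fx`) are all assumptions made here for illustration.

```python
import numpy as np

def ground_plane_scale_map(width, height, fx, fy, cy, cam_height, pitch):
    """Hypothetical per-pixel scale map for a camera above a flat ground plane.

    Assumptions (not from the paper): pinhole camera, zero roll, camera
    mounted `cam_height` metres above the plane and pitched down by `pitch`
    radians. Scale is taken to be metres per pixel, depth / fx.
    Pixels at or above the horizon get scale 0.
    """
    eps = 1e-9
    v = np.arange(height, dtype=np.float64)
    # Normalised image y-coordinate of each row (positive points down).
    y_c = (v - cy) / fy
    # Downward component of the viewing ray after pitching the camera;
    # the ray's component along the optical axis is 1.
    down = y_c * np.cos(pitch) + np.sin(pitch)
    # The ray meets the ground where its downward travel equals cam_height.
    depth = np.where(down > eps, cam_height / np.maximum(down, eps), 0.0)
    scale_per_row = depth / fx
    # With zero roll the scale depends only on the image row.
    return np.tile(scale_per_row[:, None], (1, width)).astype(np.float32)

# Stack the scale channel S onto an RGB frame to form a 4-channel input.
rgb = np.zeros((6, 8, 3), dtype=np.float32)
s = ground_plane_scale_map(8, 6, fx=100.0, fy=100.0, cy=3.0,
                           cam_height=1.0, pitch=0.0)
rgbs = np.dstack([rgb, s])  # shape (6, 8, 4)
```

Because the map depends only on the camera geometry, it could be computed once and reused for every frame as long as the camera's height and inclination stay fixed.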