This paper proposes a control framework for positioning a robot in range and orientation with respect to a sound source using a binaural system. To this end, a sensor-based control framework is built upon interaural level difference (ILD) cues. ILD is known to be implicitly related not only to the sound source azimuth but also to the sound source distance. We emphasize the latter property by introducing the concept of an ILD annulus. A sensor-based task is then designed in order to approach a sound source. The method is validated in simulation and in real-world experiments performed on a humanoid robot.

Index Terms: Robot Audition, Sensor-based Control

I. INTRODUCTION

In robot audition, motion control from a binaural setup is generally based on estimating the azimuth and/or elevation angles through a localization paradigm. Distance estimation is seldom performed because it is more complex than azimuth and elevation estimation. Range positioning is thus generally addressed by systems equipped with an array of more than two microphones, using the interaural time difference (ITD) [1], [2], the interaural level difference (ILD) [3], or both cues [4]. However, this approach is outside the scope of this paper, which focuses on binaural setups. Alternatively, in the field of active audition, distance is recovered by fusing acoustic measurements taken from different positions of the robot, as is mostly done in the state of the art of binaural robot audition [5], [6], [7], [8], [9].

In contrast, the method we develop in this paper is inspired by the psychoacoustics literature on distance estimation by human listeners. Several studies [10], [11], [12] performed on human subjects demonstrate that binaural cues convey not only orientation information but also distance.
In particular, a correlation between ILD variation and distance has been shown: ILD increases rapidly when approaching a sound source in the near field. In robot audition, such results are exploited in [13], where distance is estimated from a pre-learned sequence of features/distance bins. More specifically, relevant results based on ILD are shown for sound sources