We optimize a six-degree-of-freedom (6-DOF) hovering policy using reinforcement meta-learning. The policy maps flash LIDAR measurements directly to on/off spacecraft body-frame thrust commands, allowing hovering at a fixed position and attitude in the asteroid body-fixed reference frame. Importantly, the policy does not require position and velocity estimates, and it can operate in environments with unknown dynamics, without an asteroid shape model or navigation aids. Indeed, during optimization the agent is confronted with a new randomly generated asteroid for each episode, ensuring that it does not learn a particular asteroid's shape, texture, or environmental dynamics. This allows the deployed policy to generalize well to novel asteroid characteristics, which we demonstrate in our experiments. The hovering controller has the potential to simplify mission planning by allowing asteroid body-fixed hovering immediately upon the spacecraft's arrival at an asteroid. This in turn simplifies shape model generation and allows resource mapping via remote sensing immediately upon arrival at the target asteroid.

spacecraft's position and velocity [6]. Lee et al. demonstrate 6-DOF hovering using a control law developed in the Lie group SE(3) [7], but again assume that the spacecraft's state can be inferred and require an estimate of the environmental dynamics. Gaudet and Furfaro developed a 3-DOF hovering controller using reinforcement learning [8] and showed improved transient response compared to an LQR controller; however, the method assumes that the spacecraft's position and velocity can be inferred. None of this work treats the case in which a spacecraft arriving at an asteroid must be able to hover immediately in the body-fixed frame when (1) there is no knowledge of the environmental dynamics and (2)
there is no existing shape model that a navigation system can use to infer the spacecraft's position and velocity.

In this work we focus on the body-fixed hovering problem, where the spacecraft can be commanded to hover at its current position and attitude. By body-fixed, we mean that the spacecraft's position remains fixed with respect to the asteroid's surface. In contrast, when hovering in the asteroid-centered inertial reference frame, the asteroid rotates below the spacecraft. Unlike previous work, we do not assume that the spacecraft can infer position and velocity from measurements (as is possible with a preexisting shape model), and we treat the environmental dynamics as unknown (except to the extent required for thruster sizing). The goal is to remain at a constant asteroid body-fixed position and attitude from the commencement of the hovering maneuver. We assume that the spacecraft is equipped with a flash LIDAR system, gyroscopes that can measure the change in the spacecraft's attitude from the initiation of the hovering maneuver, and rate gyros that measure rotational velocity. We further assume that these sensors can provide measurements every 6 s. At the start of the hovering maneuve...