“…At each time step, we sample candidate paths for the robot and evaluate utilities over those paths. Since the robot must approach the target object to grasp it, we sample $N_b$ base poses near the approximate target object position $p_{\text{target}}$, within a radius related to the robot's high-reachability area [30], [31]. These poses serve as goals for our path generation, $\{p^{\text{goal}_i}_{\text{base}}\}_{i=0}^{N_b}$. Note that the camera poses at the sampled base goal poses are randomly selected within a field of view such that the robot always "looks" at the bounding box area of the target object.…”
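The goal-sampling step described above can be illustrated with a minimal planar sketch. It is one possible reading of the passage, not the authors' implementation. The function name `sample_base_goals`, the annulus bounds `r_min` and `r_max` (standing in for the high-reachability radius), the bounding-box half-width `bbox_half_width`, and the camera half field of view `half_fov` are all hypothetical. The 2D simplification and the uniform distributions are likewise assumptions.

```python
import math
import random


def sample_base_goals(p_target, n_b, r_min, r_max, bbox_half_width, half_fov, rng=None):
    """Sample n_b planar base goal poses (x, y, camera_yaw) around p_target.

    Every parameter here is an illustrative assumption, not a value from the source.
    """
    rng = rng or random.Random()
    tx, ty = p_target
    goals = []
    for _ in range(n_b):
        # Draw a position uniformly over the area of the annulus [r_min, r_max].
        r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
        phi = rng.uniform(-math.pi, math.pi)
        x = tx + r * math.cos(phi)
        y = ty + r * math.sin(phi)

        # Direction from the sampled base position toward the target.
        bearing = math.atan2(ty - y, tx - x)

        # Angular half-extent of the (assumed circular) bounding box seen from this pose.
        half_extent = math.atan2(bbox_half_width, r)

        # Largest yaw offset that still keeps the whole bounding box in view.
        slack = max(0.0, half_fov - half_extent)
        yaw = bearing + rng.uniform(-slack, slack)

        # Wrap the yaw to [-pi, pi].
        yaw = math.atan2(math.sin(yaw), math.cos(yaw))
        goals.append((x, y, yaw))
    return goals
```

For example, `sample_base_goals((1.0, 2.0), 8, 0.4, 0.9, 0.1, math.radians(30))` returns eight candidate goals, each of which could then be handed to a path generator. The camera yaw is randomized only within the slack left after the bounding box is fitted into the field of view, which is how this sketch interprets the requirement that the robot always "looks" at the target's bounding box.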