Ultrasound (US) imaging is beneficial for kidney diagnosis; however, obtaining the target image involves sophisticated tasks that must be performed by physicians. We propose a target-image search strategy for robotic kidney US imaging that combines visual servoing with deep learning-based image evaluation. The search strategy is designed to mimic the motion axes along which physicians move the US probe. The position of the US probe is controlled along each motion axis while the acquired US images are evaluated using an anatomical feature extraction method based on instance segmentation with YOLACT++, which allows the system to search for an optimal target image. The proposed approach was validated through phantom studies. The results showed that it could find the target kidney images with errors of 2.88±1.76 mm and 2.75±3.36°. Thus, the proposed method enables accurate identification of the target image, which highlights its potential for application in autonomous kidney US imaging.
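The axis-by-axis search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the axis names, the candidate offsets, the number of sweeps, and the `toy_score` function are all hypothetical. In particular, `toy_score` is only a stand-in for the image evaluation, which in the paper is derived from YOLACT++ instance segmentation of the acquired US image, and the visual servoing controller is omitted entirely.

```python
from typing import Callable, Dict, Sequence, Tuple

Pose = Dict[str, float]  # hypothetical probe pose: axis name -> value


def axis_wise_search(
    start: Pose,
    axes: Sequence[str],
    offsets: Sequence[float],
    score: Callable[[Pose], float],
    sweeps: int = 2,
) -> Tuple[Pose, float]:
    """Search one motion axis at a time, keeping the best-scoring pose."""
    best = dict(start)
    best_score = score(best)
    for _ in range(sweeps):
        for axis in axes:
            base = best[axis]  # offsets are relative to the value at sweep start
            for delta in offsets:
                candidate = dict(best)
                candidate[axis] = base + delta
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score = candidate, candidate_score
    return best, best_score


# Hypothetical stand-in for the image evaluation: the score peaks at an
# assumed target pose. A real system would score the acquired US image.
TARGET: Pose = {"x": 3.0, "tilt": -2.0}


def toy_score(pose: Pose) -> float:
    return -sum((pose[k] - TARGET[k]) ** 2 for k in TARGET)


found, found_score = axis_wise_search(
    start={"x": 0.0, "tilt": 0.0},
    axes=["x", "tilt"],
    offsets=[-2.0, -1.0, 0.0, 1.0, 2.0],
    score=toy_score,
)
```

In this toy setting the search moves along `x` first, then `tilt`, and repeats the sweep so that an axis can be refined after the others have changed. Any real scoring function would be noisier than `toy_score`, so step sizes and stopping criteria would need to be chosen accordingly.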