Background Selecting an appropriate sampling position is a key step in renal artery ultrasound examination, because a proper spectral waveform is needed to evaluate renal artery blood flow; this step is challenging for inexperienced physicians. Using deep learning (DL), this study models sampling position selection as an object detection task on color Doppler sonography (CDS) images to assist renal artery ultrasound scanning.

Methods A total of 2004 patients who underwent renal artery ultrasound examination at Peking Union Medical College Hospital from August 2017 to December 2019 were included. CDS images from these patients were classified by scanning position into four categories: abdominal aorta (AO), normal renal artery (NRA), renal artery stenosis (RAS), and intrarenal interlobular artery (IRA). The images were then randomly split into a model training dataset (N = 6661 images), a parameter optimization dataset (N = 441), and a clinical validation dataset (N = 1243). Seven DL object detection models were trained and evaluated: three two-stage models (Faster R-CNN, Cascade R-CNN, and Double Head R-CNN) and four one-stage models (RetinaNet, YOLOv3, FoveaBox, and Deformable DETR). The predictive accuracy of sampling position selection was calculated as the indicator of each model's efficiency. For each model, 10 trained results were obtained, and the differences in efficiency among the seven models were compared using the independent two-sample t-test.

Results The Double Head R-CNN model achieved significantly higher average accuracy than the other methods on both the parameter optimization and clinical validation datasets (89.3 ± 0.6% and 88.5 ± 0.3%, respectively; P < 0.001). The three two-stage DL object detection models performed better than RetinaNet, FoveaBox, and Deformable DETR (P < 0.001).
On the clinical validation data, the predictive accuracies of the Double Head R-CNN model on the four image types (AO, NRA, RAS, and IRA) were 86.5 ± 1.1%, 90.4 ± 0.1%, 84.7 ± 1.0%, and 88.8 ± 0.6%, respectively, all significantly higher than those of the other methods (P < 0.001). In addition, the Double Head R-CNN model performed better on NRA and IRA images than on RAS and AO images (P < 0.001).

Conclusions The DL object detection model achieves good predictive validity and shows promise for helping physicians improve the accuracy of sampling position selection during renal artery ultrasound examination.
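
The model comparison described in the Methods (10 trained results per model, compared with an independent two-sample t-test) can be illustrated with a minimal sketch. It assumes the pooled-variance (Student) form of the test, which the abstract does not specify. The function name `two_sample_t` and the accuracy lists are hypothetical placeholders, not values from the study.

```python
import math
import statistics


def two_sample_t(a, b):
    """Return (t statistic, degrees of freedom) for Student's
    independent two-sample t-test with pooled variance.

    Assumption: equal variances in both groups; the abstract does not
    state which variant of the test was used.
    """
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.mean(a), statistics.mean(b)
    # statistics.variance uses the n - 1 (sample) denominator.
    var1, var2 = statistics.variance(a), statistics.variance(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Hypothetical placeholder accuracies for 10 training runs of two models.
# These numbers are made up for illustration only.
runs_model_a = [0.885, 0.889, 0.882, 0.887, 0.884, 0.888, 0.883, 0.886, 0.890, 0.881]
runs_model_b = [0.861, 0.858, 0.865, 0.859, 0.863, 0.857, 0.862, 0.860, 0.864, 0.856]

t_stat, dof = two_sample_t(runs_model_a, runs_model_b)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

Converting the t statistic into a P-value requires the t distribution with `dof` degrees of freedom; in practice a statistics library such as SciPy (`scipy.stats.ttest_ind`) handles both steps.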