In order to estimate bayberry yield, a lightweight bayberry target detection count model, YOLOv7-CS, based on YOLOv7, was proposed to address the issues of slow detection and recognition speed, as well as low recognition rate, of high-density bayberry targets under complex backgrounds. In this study, 8990 bayberry images were used for experiments. The training set, validation set, and test set were randomly recreated in a ratio of 8:1:1. The new network was developed with SPD-Conv detection head modules to extract features at various scales, to better capture small and indistinct bayberry targets. To improve accuracy and achieve a lightweight design, a CNxP module that replaces the backbone’s ELAN structure is proposed. We propose a global attention mechanism (GAM) in the intermediate layers of the network, to enhance cross-dimensional interactions, and a new pyramid pooling module called SPPFCSPC, to extend the field of perception and improve boundary detection accuracy. Finally, we combine the Wise-IoU function to enhance the network’s ability to identify overlapping and occluded objects. Compared with the SSD, Faster-RCNN, DSSD, and YOLOv7X target detection algorithms, YOLOv7-CS increases mAP 0.5 by 35.52%, 56.74%, 12.36%, and 7.05%. Compared with basic YOLOv7, mAP 0.5 increased from 5.43% to 90.21%, while mAP 0.95 increased from 13.2% to 54.67%. This parameter is reduced by 17.3 m. Ablation experiments further show that the designed module improves the accuracy of bayberry detection, reduces parameter counts, and makes bayberry image detection more accurate and effective.