Target detection in infrared images captured by unmanned aerial vehicles (UAVs) suffers from low accuracy, caused by complex ground environments and uneven target scales, and from high computational complexity. To address these issues, this study proposes PHSI-RTDETR, a lightweight algorithm for detecting small targets in UAV aerial infrared images. First, an improved backbone feature extraction network is built from the lightweight RPConv-Block module proposed in this paper; it captures small-target features effectively and reduces model complexity and computational burden while improving accuracy. Second, the HiLo attention mechanism is combined with the intra-scale feature interaction module to form an AIFI-HiLo module, which is integrated into the hybrid encoder to sharpen the model's focus on dense targets and reduce missed and false detections. Third, the slimneck-SSFF architecture is adopted for cross-scale feature fusion; its GSConv and VoVGSCSP modules improve adaptability to infrared targets of various scales and yield richer semantic information while reducing network computation. Finally, the original GIoU loss is replaced with the Inner-GIoU loss, which uses a scaling factor to control the auxiliary bounding boxes, speeding up convergence and improving detection accuracy for small targets. Experimental results show that, compared with RT-DETR, PHSI-RTDETR reduces model parameters by 30.55% and floating-point operations by 17.10%, while increasing detection precision by 3.81% and detection speed by 13.39%; mAP50 reaches 82.58%, demonstrating the model's potential for UAV infrared small target detection.
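To make the role of the scaling factor in the Inner-GIoU loss concrete, the following is a minimal sketch in plain Python, not the implementation used in this work. It assumes the commonly cited Inner-IoU formulation, in which both boxes are rescaled about their own centers by a ratio and the loss is taken as L_GIoU + IoU - IoU_inner. The function names, the `(x1, y1, x2, y2)` box layout, and the default ratio of 0.75 are illustrative assumptions.

```python
# Hypothetical sketch of an Inner-GIoU style loss for two axis-aligned boxes.
# Boxes are (x1, y1, x2, y2); all names and defaults here are assumptions.

def _area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def _intersection(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a, b, eps=1e-9):
    inter = _intersection(a, b)
    union = _area(a) + _area(b) - inter
    return inter / (union + eps)


def giou(a, b, eps=1e-9):
    inter = _intersection(a, b)
    union = _area(a) + _area(b) - inter
    # Area of the smallest box enclosing both a and b.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / (union + eps) - (enclose - union) / (enclose + eps)


def scale_box(box, ratio):
    """Auxiliary box: same center as `box`, width and height multiplied by `ratio`."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * ratio / 2.0
    half_h = (box[3] - box[1]) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_iou(a, b, ratio):
    # IoU computed on the rescaled auxiliary boxes instead of the originals.
    return iou(scale_box(a, ratio), scale_box(b, ratio))


def inner_giou_loss(pred, target, ratio=0.75):
    # Assumed form: L_Inner-GIoU = L_GIoU + IoU - IoU_inner.
    l_giou = 1.0 - giou(pred, target)
    return l_giou + iou(pred, target) - inner_iou(pred, target, ratio)


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    for r in (0.5, 1.0, 1.5):
        print(f"ratio={r}: loss={inner_giou_loss(pred, target, r):.4f}")
```

With a ratio of 1 the auxiliary boxes coincide with the originals and the sketch reduces to the plain GIoU loss. Under this formulation, a ratio below 1 shrinks the auxiliary boxes and a ratio above 1 enlarges them, which changes how strongly a given overlap is penalized.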