Accurately mapping farmlands is important for precision-agriculture practices. Unmanned Aerial Vehicles (UAVs) equipped with multispectral cameras are commonly used to map vegetation in these areas. However, separating plantation fields from the remaining objects in a multispectral scene is a difficult task for most traditional algorithms, and deep learning methods that perform semantic segmentation could improve the overall outcome. To the best of our knowledge, the performance of deep networks for semantic segmentation in UAV-based multispectral imagery is still unknown in the agricultural context, especially for arboreous vegetation types such as citrus orchards. Here, we evaluate state-of-the-art deep learning methods to semantically segment citrus trees in multispectral images. For this purpose, we used a multispectral camera operating in the green (530-570 nm), red (640-680 nm), red-edge (730-740 nm), and near-infrared (770-810 nm) spectral regions. We evaluated the performance of five state-of-the-art pixelwise methods: FCN, U-Net, SegNet, DeepLabV3+, and DDCN. Our results indicate that the evaluated methods performed similarly on the proposed task, returning F1-scores between 94.00% (FCN and U-Net) and 94.42% (DDCN). We also measured the inference time per mapped area; although the DDCN method was the slowest, a qualitative analysis showed that it performed better in heavily shadowed areas. We conclude that the semantic segmentation of citrus orchards is highly achievable with deep neural networks. The state-of-the-art deep learning methods investigated here proved to be equally suitable for this task, providing fast solutions with inference times varying from 0.98 to 4.36 minutes per hectare. This approach could be incorporated into similar research and contribute to decision-making and accurate mapping of plantation fields.
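The F1-scores reported above are the standard pixelwise metric for binary segmentation (the harmonic mean of precision and recall over all pixels). As an illustration only, and not the authors' evaluation code, a minimal Python/NumPy sketch of how such a score is computed from a predicted mask and a ground-truth mask:

```python
import numpy as np

def pixelwise_f1(pred, truth):
    """Pixelwise F1-score for a binary segmentation mask.

    pred, truth: boolean arrays of the same shape, where True marks
    pixels labeled as the target class (e.g. citrus canopy).
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # true positives
    fp = np.sum(pred & ~truth)   # false positives
    fn = np.sum(~pred & truth)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy 4x4 masks with one false positive and one false negative
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 1]], dtype=bool)
pred  = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=bool)
print(round(pixelwise_f1(pred, truth), 3))  # -> 0.8
```

In the study itself this metric is computed over entire orthomosaic test regions rather than toy masks, but the arithmetic is the same.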