A pixel-based segmentation method was demonstrated to be confounded by panicle developmental stage when estimating the extent of flowering in mango. Categorization of panicles into three developmental stages was therefore undertaken with a single-stage and a two-stage deep learning framework (YOLO and R2CNN, respectively), using either upright or rotated bounding boxes. For a validation image set and for total panicle count, the models MangoYOLO(-upright), MangoYOLO-rotated, YOLOv3-rotated, R2CNN(-rotated) and R2CNN-upright achieved: (i) root mean square errors (RMSEs) of 25.6, 16.0, 15.4, 25.8 and 32.3 panicles per tree image, (ii) mean average precision (mAP) scores of 72.2, 69.1, 65.0, 62.5 and 70.9% and (iii) weighted F1-scores of 76.5, 76.1, 74.9, 74.0 and 82.0, respectively. For a test set of images involving a different orchard and cultivar and use of a different camera, the R² between machine vision and human counts of panicles per tree was 0.86, 0.80, 0.83, 0.81 and 0.76 for the same models, respectively. Thus, the models generalised well, but with no consistent benefit from use of rotated over upright bounding boxes. While the YOLOv3-rotated model was superior in terms of total panicle count, the R2CNN-upright model was more accurate for panicle stage classification. To demonstrate practical application, panicle counts were made weekly for an orchard of 994 trees, with a peak detection routine applied to document multiple flowering events.
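As an illustration of the count-based measures named above, the following minimal Python sketch computes an RMSE and an R² between machine vision and human panicle counts, and flags peaks in a weekly count series. It is not the authors' implementation: the function names (`rmse`, `r_squared`, `find_peaks`), the `min_height` parameter and all example values are hypothetical, R² is assumed here to be the squared Pearson correlation (equivalent to the R² of a simple linear regression), and the peak rule (a strict local maximum above a threshold) is a stand-in for whichever peak detection routine was actually used.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length count series."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(predicted, observed):
    """Squared Pearson correlation (assumed definition of R^2 here)."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("R^2 is undefined for a constant series")
    return cov ** 2 / (var_p * var_o)


def find_peaks(series, min_height=0.0):
    """Indices of strict local maxima at or above min_height (hypothetical rule)."""
    return [
        i
        for i in range(1, len(series) - 1)
        if series[i] > series[i - 1]
        and series[i] > series[i + 1]
        and series[i] >= min_height
    ]


if __name__ == "__main__":
    # Hypothetical per-tree panicle counts, not data from the study.
    machine_counts = [42, 55, 61, 30, 78]
    human_counts = [40, 60, 58, 35, 80]
    print("RMSE:", rmse(machine_counts, human_counts))
    print("R^2:", r_squared(machine_counts, human_counts))

    # Hypothetical weekly orchard-level counts showing two flowering events.
    weekly_counts = [120, 480, 910, 640, 300, 520, 760, 410, 150]
    print("Peak weeks (0-indexed):", find_peaks(weekly_counts, min_height=500))
```

In this sketch a lower RMSE indicates closer agreement in absolute count, whereas R² only measures how well the two series co-vary, so a model can rank well on one measure and less well on the other.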