Machine vision plays a great role in localizing strawberries in a complex orchard or greenhouse for picking robots. Due to the variety of each strawberry (shape, size, and color) and occlusions of strawberries by leaves and stems, precisely locating each strawberry brings a great challenge to the vision system of picking robots. Several methods have been developed for localizing strawberries, based on the well-known Mask R-CNN network, which, however, are not efficient running on the picking robots. In this paper, we propose a simple and highly efficient framework for strawberry instance segmentation running on low-power devices for picking robots, termed StrawSeg. Instead of using the common paradigm of “detection-then-segment”, we directly segment each strawberry in a single-shot manner without relying on object detection. In our model, we design a novel feature aggregation network to merge features with different scales, which employs a pixel shuffle operation to increase the resolution and reduce the channels of features. Experiments on the open-source dataset StrawDI_Db1 demonstrate that our model can achieve a good trade-off between accuracy and inference speed on a low-power device.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.