Online 3D Bin Packing with Constrained Deep Reinforcement Learning

Zhao, Hang; She, Qijin; Zhu, Chenyang; Yang, Yin; Xu, Kai

doi:10.48550/arxiv.2006.14978

Cited by 5 publications (13 citation statements)

References 40 publications (40 reference statements)


“…However, it would be convenient to know the maximum score for evaluation purposes. For this reason, [2] constructs three types of sequences, CUT-1, CUT-2 and Random Sequence (RS). CUT-1 and CUT-2 items are first generated via cutting-stock, that is to say, a bin sized cuboid is randomly and recursively 'cut' until the sliced items match the size constraints.…”

Section: Datasets (mentioning)

confidence: 99%
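The cutting-stock generation described in the statement above can be sketched as follows. This is a minimal illustration, not the procedure from [2]: the function name `cut_bin`, the uniform choice of cut axis and cut position, and the stopping rule (stop as soon as every side fits the size limit) are assumptions, and the ordering of items that distinguishes CUT-1 from CUT-2 is not modeled.

```python
import random


def cut_bin(bin_size=(10, 10, 10), min_side=2, max_side=5, seed=None):
    """Recursively cut a bin-sized cuboid into items with sides in [min_side, max_side].

    Hypothetical sketch: assumes max_side + 1 >= 2 * min_side so that every
    oversized side can be split into two pieces of at least min_side.
    Only item dimensions are returned; positions are not tracked.
    """
    rng = random.Random(seed)
    pending = [tuple(bin_size)]
    items = []
    while pending:
        box = pending.pop()
        # Axes along which the piece is still longer than the size limit.
        oversized = [axis for axis in range(3) if box[axis] > max_side]
        if not oversized:
            items.append(box)
            continue
        axis = rng.choice(oversized)
        # Pick a cut position that leaves both pieces at least min_side long.
        cut = rng.randint(min_side, box[axis] - min_side)
        first, second = list(box), list(box)
        first[axis] = cut
        second[axis] = box[axis] - cut
        pending.append(tuple(first))
        pending.append(tuple(second))
    return items


items = cut_bin(seed=1)
total_volume = sum(w * l * h for w, l, h in items)
```

Because the items come from slicing one full bin, their volumes sum to the bin volume, so a perfect packing exists and the maximum score is known in advance.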

“…All these three datasets consist of 2000 and 100 sequences for training and testing respectively. The item dimensions vary in the range between [2,5] in all three dimensions, forming a set of 64 different items, while the bin resolution is 10 × 10 × 10.…”

Section: Datasets (mentioning)

confidence: 99%
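The figure of 64 different items follows from four possible side lengths (2, 3, 4, 5) per axis, giving 4 × 4 × 4 = 64 combinations. The short sketch below enumerates them, assuming integer side lengths; the names `ITEM_SET` and `utilization` are illustrative and do not come from the cited papers.

```python
from itertools import product

SIDES = range(2, 6)            # side lengths 2, 3, 4, 5 (integers assumed)
BIN_VOLUME = 10 * 10 * 10      # bin resolution 10 x 10 x 10

# Every (width, length, height) combination of the allowed side lengths.
ITEM_SET = list(product(SIDES, repeat=3))


def utilization(packed_items):
    """Fraction of the bin volume occupied by the packed items."""
    return sum(w * l * h for w, l, h in packed_items) / BIN_VOLUME


item_count = len(ITEM_SET)
full_bin = utilization([(5, 5, 5)] * 8)   # eight 5x5x5 cubes fill the bin
```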

“…In our work, we opt for a more practical definition based on [2], where the decisions are irreversible and items are delivered in sequence one by one (online), such that we give special attention to the immediate items B ⊂ I. In practice, a conveyor belt carries the item sequence to a robotic arm located at the head of the line.…”

Footnote in the citing paper: This work is supported, in part, by "3D Bin Packing with Deep Reinforcement Learning" project funded by Hyundai Robotics Co. Ltd., in part, by "Edge Brain Based Intelligent Manufacturing" project IITP-2022-0-00067, in part, by AI Graduate School Program, Grant No.2019-0-00421, and by ICT Consilience Program, IITP-2020-0-01821, of the Institute of Information and Communication Technology Planning Evaluation (IITP), sponsored by the Korean Ministry of Science and Information Technology (MSIT). Authors from the Artificial Intelligence School, Sungkyunkwan University, Suwon, South Korea. Corresponding author: Sukhan Lee, Lsh1@skku.edu

Section: Introduction (mentioning)

confidence: 99%

“…[20], inspired from previously mentioned works, presented a solution for the 3D-BPP relying on Pointer Net. Another relevant work is presented in [2], the authors introduced an on-policy model-free DRL agent composed of an actor, a critic and a predictor, to predict action probabilities, value and feasibility mask respectively. Multiple on-policy and off-policy algorithms were trained, being the on-policy method [10] the one with the highest performance.…”

Section: Introduction (mentioning)

confidence: 99%
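The statement above mentions an actor that outputs action probabilities and a predictor that outputs a feasibility mask. One common way to combine the two is sketched below; the excerpt does not say how [2] applies the mask, so the renormalized masked softmax and the name `masked_policy` are assumptions made for illustration only.

```python
import math


def masked_policy(logits, feasible):
    """Softmax over actor logits, restricted to actions flagged as feasible.

    Hypothetical sketch: infeasible actions receive probability zero and the
    remaining probabilities are renormalized to sum to one.
    """
    if not any(feasible):
        raise ValueError("no feasible action available")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(logit for logit, ok in zip(logits, feasible) if ok)
    weights = [math.exp(logit - peak) if ok else 0.0
               for logit, ok in zip(logits, feasible)]
    total = sum(weights)
    return [weight / total for weight in weights]


# Three candidate placements; the second one is predicted infeasible.
probs = masked_policy([1.0, 2.0, 3.0], [True, False, True])
```

Masking before sampling means the agent never selects a placement the predictor rejects, at the cost of depending on the accuracy of that predictor.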


Online 3D Bin Packing Reinforcement Learning Solution with Buffer

Puche, Lee. 2022. Preprint.

The 3D Bin Packing Problem (3D-BPP) is one of the most demanded yet challenging problems in industry, where an agent must pack variable size items delivered in sequence into a finite bin with the aim to maximize the space utilization. It represents a strongly NP-Hard optimization problem such that no solution has been offered to date with high performance in space utilization. In this paper, we present a new reinforcement learning (RL) framework for a 3D-BPP solution for improving performance. First, a buffer is introduced to allow multi-item action selection. By increasing the degree of freedom in action selection, a more complex policy that results in better packing performance can be derived. Second, we propose an agnostic data augmentation strategy that exploits both bin and item symmetries for improving sample efficiency. Third, we implement a model-based RL method adapted from the popular algorithm AlphaGo, which has shown superhuman performance in zero-sum games. Our adaptation is capable of working in single-player and score-based environments. In spite of the fact that AlphaGo versions are known to be computationally heavy, we manage to train the proposed framework with a single thread and GPU, while obtaining a solution that outperforms the state-of-the-art results in space utilization.


“…The final architecture will be considered feasible only if all the blocks produce a flat surface and two cliffs are stably connected - fluctuations on the bridge surface or a tiny misplacement of blocks can result in a complete failure. Our task is much more challenging than similar assembly tasks like bin-packing, where a partial score can be achieved even if the bin is not fully packed [6], [7].…”

Section: Introduction (mentioning)

confidence: 99%

Learning to Design and Construct Bridge without Blueprint

Li, Kong, Li et al. 2021. Preprint.

Interpretability of rectangle packing solutions with Monte Carlo tree search

Galán López, González García, García Díaz et al. 2024. J Heuristics.

Packing problems have been studied for a long time and have great applications in real-world scenarios. In recent times, with problems in the industrial world increasing in size, exact algorithms are often not a viable option and faster approaches are needed. We study Monte Carlo tree search, a random sampling algorithm that has gained great importance in literature in the last few years. We propose three approaches based on MCTS and its integration with metaheuristic algorithms or deep learning models to obtain approximated solutions to packing problems that are also interpretable by means of MCTS exploration and from which knowledge can be extracted. We focus on two-dimensional rectangle packing problems in our experimentation and use several well known benchmarks from literature to compare our solutions with existing approaches and offer a view on the potential uses for knowledge extraction from our method. We manage to match the quality of state-of-the-art methods, with improvements in time with respect to some of them and greater interpretability.
