Online 3D Bin Packing with Constrained Deep Reinforcement Learning

Zhao, Hang; She, Qijin; Zhu, Chenyang; Yang, Yin; Xu, Kai

doi:10.1609/aaai.v35i1.16155

Cited by 68 publications

(38 citation statements)

References 37 publications

“…This work also discusses multiple constraints and criteria that should be considered in online BPPs. Zhao et al discretise the bottom of a bin as the grid, with grid points to be selected for locating the front-left-bottom corner of a box, and optimise space utilisation rate in 3D-BPP by an on-policy actor-critic algorithm [21,109]. This method is limited to integer constrained box sizes and employ an additional neural network to predict infeasible grid points for pruning solution space.…”

Section: On-line BPPs

mentioning

confidence: 99%
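The grid discretisation described in the statement above can be illustrated with a small sketch. It is not the cited authors' implementation: the function names `feasible_positions` and `place` are hypothetical, the bin floor is modelled as an integer height map, and feasibility is reduced to two simplifying assumptions (the box must rest on a perfectly flat footprint and must not exceed the bin height). The cited method instead uses a learned network to predict infeasible grid points.

```python
from typing import List, Tuple

Box = Tuple[int, int, int]  # (length, width, height), integer sizes only


def feasible_positions(heightmap: List[List[int]], box: Box,
                       bin_height: int) -> List[List[bool]]:
    """Mask of grid cells where the box's front-left-bottom corner may go.

    Assumption (not from the source): a placement is feasible only if the
    footprint is perfectly flat and the box top stays within the bin.
    """
    length, width, height = box
    rows, cols = len(heightmap), len(heightmap[0])
    mask = [[False] * cols for _ in range(rows)]
    for x in range(rows - length + 1):
        for y in range(cols - width + 1):
            footprint = [heightmap[i][j]
                         for i in range(x, x + length)
                         for j in range(y, y + width)]
            base = max(footprint)
            flat = min(footprint) == base
            mask[x][y] = flat and base + height <= bin_height
    return mask


def place(heightmap: List[List[int]], box: Box, x: int, y: int) -> None:
    """Put the box with its front-left-bottom corner at (x, y), in place."""
    length, width, height = box
    base = max(heightmap[i][j]
               for i in range(x, x + length)
               for j in range(y, y + width))
    for i in range(x, x + length):
        for j in range(y, y + width):
            heightmap[i][j] = base + height


if __name__ == "__main__":
    floor = [[0] * 3 for _ in range(3)]   # empty 3 x 3 bin floor
    print(feasible_positions(floor, (2, 2, 1), bin_height=2))
    place(floor, (2, 2, 1), 0, 0)
    print(floor)
```

A policy would choose among the cells marked `True`; masking the rest is what prunes the action space. Because positions and sizes are grid indices, the sketch shares the integer-size limitation noted in the statement.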

“…This method is limited to integer constrained box sizes and employ an additional neural network to predict infeasible grid points for pruning solution space. To handle continuous sizes of boxes, the authors also propose a tree search-based learning method, where leaf nodes represent candidate placements for the next item, and they use attention-based neural network to select a leaf node and spread the tree [18]. Yang et al reflect the promising candidate actions (derived from heuristics) as MDP rewards for the agent, so as to guide the policy learning via PPO [107].…”

Section: On-line BPPs

mentioning

confidence: 99%

“…However, very few current works learn to decide on all three aspects [20,66], since it will cause the issue of the large action space. This could be more intractable when continuous positions are considered [18]. From the perspective of DRL, such large action space could be potentially tackled by the algorithms for continuous action space, for example, DDPG, TD3, and decomposition techniques.…”

Section: Future Directions

mentioning

confidence: 99%

“…Despite very early attempts that couple machine learning (ML) with COPs [3][4][5], recent deep learning-based methods present promising results, which are comparable to classic optimisation methods [6][7][8]. Their success has inspired miscellaneous deep models to cope with specific COPs, for example, vehicle routing [9][10][11][12][13], scheduling [14][15][16][17], bin packing [18][19][20][21], or enhance performance in solving general COPs, for example, integer programming [22][23][24][25][26][27][28] and constraint programming [29][30][31]. The corresponding literature on the methods for specific COPs are reviewed respectively in refs [32][33][34][35].…”

mentioning

confidence: 99%

A review on learning to solve combinatorial optimisation problems in manufacturing

Zhang et al. 2023, IET Collab Intel Manufact

An efficient manufacturing system is key to maintaining a healthy economy today. With the rapid development of science and technology and the progress of human society, the modern manufacturing system is becoming increasingly complex, posing new challenges to both academia and industry. Ever since the beginning of industrialisation, leaps in manufacturing technology have always accompanied technological breakthroughs from other fields, for example, mechanics, physics, and computational science. Recently, machine learning (ML) technology, one of the crucial subjects of artificial intelligence, has made remarkable progress in many areas. This study thoroughly reviews how ML, specifically deep (reinforcement) learning, motivates new ideas for addressing challenging problems in manufacturing systems. We collect the literature targeting three aspects: scheduling, packing, and routing, which correspond to three pivotal cooperative production links of today's manufacturing system, that is, production, packing, and logistics respectively. For each aspect, we first present and discuss the state-of-the-art research. Then we summarise and analyse the development trends and point out future research opportunities and challenges.

KEYWORDS: bin packing, combinatorial optimisation, deep reinforcement learning, job shop scheduling, manufacturing systems, vehicle routing

INTRODUCTION

Combinatorial optimisation problems (COPs), as one important branch of mathematical optimisation, have practical applications in many fields, such as communication, transportation, and manufacturing, and have aroused broad research in industrial engineering, computer science, and operations research. Due to the NP (non-deterministic polynomial-time) hardness, finding their optimal solutions is challenging. Specifically, the discrete solution space in COPs renders the optimisation less efficient, without the guidance of gradient as in continuous optimisation. Meanwhile, the complexity of searching the (near-)optimal solution(s) among feasible solutions could exponentially increase as the problem scale grows. Classic methods, including exact algorithms and (meta-)heuristics, generally depend on massive expertise and tuning work to solve specific problems. They are…

Cong Zhang, Yaoxin Wu, and Yining Ma contributed equally.

“…In order to efficiently generate the optimal sequence and placement of objects, heuristic methods [1], [3], [4] with the greedy objective minimize the object stack heights in the packing boxes. Since the greedy search results in sub-optimal solution and high computational cost for object arrangement, data-driven methods [2], [5], [6] employ the reinforcement learning framework for bin packing. However, the objects for packing in realistic applications are usually irregular.…”

Section: Introduction

mentioning

confidence: 99%
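The greedy objective mentioned in the statement above (minimising stack height) can be sketched as follows. This is an illustrative heuristic only, not the method of any of the cited works: the name `greedy_lowest_placement` is hypothetical, the bin is an integer height map, and the box is assumed to rest on the highest point under its footprint with no stability check.

```python
from typing import List, Optional, Tuple


def greedy_lowest_placement(heightmap: List[List[int]],
                            box: Tuple[int, int, int]
                            ) -> Optional[Tuple[int, int, int]]:
    """Return (top, x, y) for the corner position giving the lowest box top.

    Assumptions (not from the source): axis-aligned cuboid, no rotation,
    no stability check; ties are broken by scan order. Returns None when
    the footprint does not fit inside the grid.
    """
    length, width, height = box
    rows, cols = len(heightmap), len(heightmap[0])
    best = None
    for x in range(rows - length + 1):
        for y in range(cols - width + 1):
            # The box rests on the highest cell under its footprint.
            rest = max(heightmap[i][j]
                       for i in range(x, x + length)
                       for j in range(y, y + width))
            top = rest + height
            if best is None or top < best[0]:
                best = (top, x, y)
    return best


if __name__ == "__main__":
    stacks = [[2, 2, 0],
              [2, 2, 0],
              [0, 0, 0]]
    print(greedy_lowest_placement(stacks, (1, 1, 1)))
```

Each decision scans every candidate cell and looks only one item ahead, which is one way to read the cost and sub-optimality that the statement attributes to greedy search.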

Planning Irregular Object Packing via Hierarchical Reinforcement Learning

Huang, Wang, Zhou et al. 2023, IEEE Robot. Autom. Lett.

Object packing by autonomous robots is an important challenge in warehouses and the logistics industry. Most conventional data-driven packing planning approaches focus on regular cuboid packing, which are usually heuristic and limit the practical use in realistic applications with everyday objects. In this paper, we propose a deep hierarchical reinforcement learning approach to simultaneously plan packing sequence and placement for irregular object packing. Specifically, the top manager network infers packing sequence from six principal view heightmaps of all objects, and then the bottom worker network receives heightmaps of the next object to predict the placement position and orientation. The two networks are trained hierarchically in a self-supervised Q-Learning framework, where the rewards are provided by the packing results based on the top height, object volume and placement stability in the box. The framework repeats sequence and placement planning iteratively until all objects have been packed into the box or no space remains for unpacked items. We compare our approach with existing robotic packing methods for irregular objects in a physics simulator. Experiments show that our approach can pack more objects with less time cost than the state-of-the-art packing methods of irregular objects. We also implement our packing plan with a robotic manipulator to show the generalization ability in the real world.

A Large-Scale Tobacco 3D Bin Packing Model Based on Dual-Task Learning of Group Blocks

Li, Wang 2022, Artificial Intelligence
