Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF), an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one, in dense and highly structured environments typical of real-world warehouse operations. Effectively solving LMAPF in such environments requires expensive coordination between agents as well as frequent replanning, a daunting task for existing coupled and decoupled approaches alike. To achieve considerable agent coordination without compromising reactivity or scalability, we introduce PRIMAL 2, a distributed reinforcement learning framework for LMAPF in which agents learn fully decentralized policies to reactively plan paths online in a partially observable world. We extend our previous work, which was effective in low-density, sparsely occupied worlds, to highly structured and constrained worlds by identifying behaviors and conventions that improve implicit agent coordination and by enabling their learning through a novel local agent observation and various training aids. We present extensive results of PRIMAL 2 in both MAPF and LMAPF environments with up to 1024 agents and compare its performance to complete state-of-the-art planners. We experimentally observe that agents successfully learn to follow ideal conventions and can exhibit selfless coordinated maneuvers that maximize joint rewards. We find that PRIMAL 2 not only significantly surpasses our previous work but also performs on par with, and even outperforms, state-of-the-art planners in terms of throughput.
Designing polymeric membranes with high solute−solute selectivity and permeance is important but technically challenging. The existing industrial interfacial polymerization (IP) process for fabricating polyamide-based polymeric membranes is largely empirical and requires extensive trial-and-error experimentation to identify optimal fabrication conditions from a wide candidate space for separating a given solute pair. Herein, we developed a novel multitask machine learning (ML) model based on an artificial neural network (ANN) with skip connections and selectivity regularization to guide the design of polyamide membranes. Using limited sets of lab-collected data, we obtained satisfactory model performance over four iterations by introducing human expert experience into the online learning process. Four membranes fabricated under conditions guided by the model exceeded the present upper bound for mono/divalent ion selectivity and permeance of polymeric membranes. Moreover, we obtained new mechanistic insights into membrane design through feature analysis of the model. Our work demonstrates an ML approach that represents a paradigm shift in the design of high-performance polymeric membranes.