In automatic parking motion planning, multiple objectives, including safety, comfort, parking efficiency, and final parking performance, should be considered. Most current research relies on parking data from expert drivers or on human prior knowledge. However, it is challenging to obtain a large amount of high-quality expert driver data. Furthermore, neither expert driver data nor human prior knowledge guarantees optimal multi-objective parking performance. In this paper, we propose a model-based reinforcement learning method that learns a parking policy from data by iteratively executing data generation, data evaluation, and network training. The trained network guides the data generation cycle in the subsequent iteration. With this method, we can largely dispense with human experience and learn parking strategies autonomously and quickly. The learned strategies ensure the multi-objective optimality of the above requirements in the parking process. First, an environment model that approximates the actual environment is established, and learning is accelerated through simulated interaction between the agent and the environment model. To make the system independent of expert data and prior knowledge, a data generation algorithm combining Monte Carlo Tree Search (MCTS) with longitudinal and lateral policies is proposed. Then, to meet the multi-objective demands mentioned above, a reward function is constructed to evaluate and filter the parking data. Finally, a neural network is used to learn the parking strategy from the filtered data. In a real-vehicle test benchmarked against a mass-produced parking system, the proposed method achieves better parking efficiency and places lower requirements on the initial parking pose, thereby verifying the algorithm's superiority.
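As a rough illustration of the data evaluation and filtering step described in the abstract above, the sketch below scores generated parking episodes with a weighted multi-objective reward and keeps only the best feasible ones for training. The `ParkingEpisode` fields, the weights, the clearance limit, and the keep fraction are all hypothetical choices made for this example; the abstract does not specify the reward function.

```python
from dataclasses import dataclass


@dataclass
class ParkingEpisode:
    """Summary of one generated parking episode (all fields are hypothetical)."""
    min_clearance_m: float      # smallest distance to any obstacle (safety)
    max_jerk_mps3: float        # peak longitudinal jerk (comfort)
    duration_s: float           # time needed to park (efficiency)
    final_pose_error_m: float   # distance from the target pose (final performance)


def episode_reward(ep, weights=(1.0, 0.2, 0.05, 2.0), clearance_limit_m=0.2):
    """Weighted multi-objective reward; weights and limit are assumed values."""
    if ep.min_clearance_m < clearance_limit_m:
        return float("-inf")  # treat near-collisions as infeasible
    w_safe, w_comfort, w_time, w_final = weights
    return (w_safe * ep.min_clearance_m
            - w_comfort * ep.max_jerk_mps3
            - w_time * ep.duration_s
            - w_final * ep.final_pose_error_m)


def filter_episodes(episodes, keep_fraction=0.5):
    """Drop infeasible episodes, then keep the top fraction ranked by reward."""
    feasible = [ep for ep in episodes if episode_reward(ep) != float("-inf")]
    feasible.sort(key=episode_reward, reverse=True)
    if not feasible:
        return []
    n_keep = max(1, int(len(feasible) * keep_fraction))
    return feasible[:n_keep]


episodes = [
    ParkingEpisode(0.50, 1.0, 20.0, 0.05),  # clean and accurate
    ParkingEpisode(0.10, 0.5, 15.0, 0.02),  # too close to an obstacle
    ParkingEpisode(0.40, 3.0, 35.0, 0.20),  # slow and uncomfortable
    ParkingEpisode(0.45, 1.5, 25.0, 0.10),  # acceptable
]
training_data = filter_episodes(episodes)
```

In an iterative scheme like the one the abstract outlines, the retained episodes would feed the network training step, and the retrained network would then guide the next round of data generation.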
Reinforcement learning (RL) is a promising direction in automated parking systems (APSs), as integrating planning and tracking control using RL can potentially maximize the overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and model-based RL in APSs cannot learn continuously. In this paper, a data-efficient RL method is constructed that learns from data using a model-based approach. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks are trained on self-generated data to provide the search probability of each tree branch and the final reward for each state. Data efficiency is enhanced by weighting exploration with parking trajectory returns, by an adaptive exploration scheme, and by experience augmentation with imaginary rollouts. A novel training pipeline is also used to train the initial action guidance network and the state value network without human demonstrations. Compared with path planning and path-following methods, the proposed integrated method can flexibly coordinate the longitudinal and lateral motion to park in a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint CarSim and MATLAB simulation, which shows that the algorithm converges within a few iterations. Finally, experiments on a real vehicle platform further verify the effectiveness of the proposed method. Compared with obtaining rewards using simulation, the proposed method achieves a better final parking attitude and success rate.
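The abstract above describes a tree search in which one network supplies branch probabilities and another supplies state values. A minimal sketch of how such a search can pick and update branches is given below, assuming the AlphaZero-style PUCT selection rule; the abstract does not state which selection rule is used, and the `Node` class, the constant `c_puct`, and the action names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Search-tree node; `prior` would come from the branch-probability network."""
    prior: float
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def mean_value(self):
        # Average backed-up value; unvisited nodes default to zero.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent, child, c_puct=1.5):
    """Mean value plus a prior-weighted exploration bonus (PUCT, assumed rule)."""
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + bonus


def select_action(parent, c_puct=1.5):
    """Return the action whose child has the highest PUCT score."""
    return max(parent.children,
               key=lambda a: puct_score(parent, parent.children[a], c_puct))


def backup(path, leaf_value):
    """Propagate a leaf estimate up the visited path. In a truncated search the
    estimate comes from a value network rather than a rollout to the end."""
    for node in path:
        node.visit_count += 1
        node.value_sum += leaf_value


root = Node(prior=1.0, visit_count=10)
root.children["steer_left"] = Node(prior=0.6, visit_count=8, value_sum=4.0)
root.children["steer_right"] = Node(prior=0.4, visit_count=2, value_sum=1.0)
chosen = select_action(root)
backup([root, root.children[chosen]], leaf_value=0.9)
```

Both children have the same mean value here, so the less-visited branch wins on its exploration bonus despite its lower prior.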
A simplified two-dimensional axisymmetric model was established based on a typical continental sedimentary basin in China to simulate the thermal evolution of the wellbore and reservoir during CO2 injection, taking the lithological heterogeneity of the reservoir into account. By comparison with two simple one-dimensional theoretical models, the influence of lithological heterogeneity on the distribution of the CO2 mass flow rate along the wellbore depth is identified. The results suggest that the interaction of multiple layers in the heterogeneous reservoir influences the distribution of the CO2 mass flow rate along the wellbore depth and thereby affects the corresponding temperature and pressure evolution in the wellbore and reservoir. Layer burial depth (or relative location), porosity, permeability, and thickness are all important factors that affect the CO2 mass flow rate in the wellbore. The variation of the CO2 mass flow rate in the wellbore changes the temperature of the CO2 flowing into each layer by affecting heat extraction from the rocks, CO2 compressibility, and potential energy loss, and it determines the CO2 injection pressure by altering the CO2 hydrostatic pressure and the frictional pressure drop. This study may deepen our understanding of CO2 flow and thermal evolution in actual heterogeneous reservoirs and supplement knowledge on the underground injection of liquids (especially CO2) into formations such as deep saline aquifers, depleted oil and gas reservoirs, and coal beds.
Gut bacteria contain 150 times more genes than humans, and these genes are vital for health. Several studies have revealed that gut bacteria are associated with disease status and influence human behavior and mentality. Because it remains unclear whether human brain injury alters the gut bacteria, we tested 20 fecal samples from patients with cerebral intraparenchymal hemorrhage and corresponding healthy controls using metagenomic shotgun sequencing. The composition of the patients’ gut bacteria changed significantly at the phylum level; Verrucomicrobiota was the phylum that specifically colonized the patients’ gut. Functional alterations were also observed in the patients’ gut bacteria, including high metabolic activity for nutrients or neuroactive compounds, strong antibiotic resistance, and lower virulence factor diversity. Changes in transcription and metabolism were more evident in the differential species than in the non-differential species between groups, and this is the primary factor contributing to the functional alteration in patients with cerebral intraparenchymal hemorrhage.
Beamforming based on microphone array measurements is a popular method for identifying sound sources. However, beamforming has many limitations that restrict its precision, and this research addresses them. To separate the contributions coming from the two sides of the microphone array more accurately, an innovative beamforming method based on a double-layer microphone array, called functional generalized inverse beamforming (FGIB), is proposed to improve beamforming performance. This method, which uses an a priori beamforming regularization matrix and a matrix function to redefine the inverse problem, combines the advantages of generalized inverse beamforming (GIB) and functional beamforming. Because FGIB needs fewer iterations than GIB, its computational efficiency is greatly improved. The dynamic range of the proposed method improves modestly as the order v increases. Furthermore, the sidelobes gradually disappear and the mainlobes become narrower. Both simulations and experiments show that the sources are correctly located and separated. The proposed FGIB demonstrates good performance compared with other beamforming methods in terms of resolution and sidelobe level.
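To make the role of the order concrete, the following sketch implements plain functional beamforming, one of the two ingredients named in the abstract above, and not FGIB itself. It evaluates b = (g^H C^(1/v) g)^v for unit-norm steering vectors g and a cross-spectral matrix C. The line-array geometry, frequency, single noise-free source, and the eigenvalue floor are assumptions made for this example.

```python
import numpy as np


def steering_vector(mic_x, angle_rad, wavenumber):
    """Unit-norm far-field steering vector for a line array (assumed geometry)."""
    g = np.exp(-1j * wavenumber * mic_x * np.sin(angle_rad))
    return g / np.linalg.norm(g)


def functional_beamforming(csm, steering, order):
    """Evaluate (g^H C^(1/order) g)^order for every row g of `steering`."""
    eigvals, eigvecs = np.linalg.eigh(csm)
    # Assumed numerical safeguard: zero out round-off eigenvalues, which the
    # 1/order root would otherwise inflate.
    eigvals = np.where(eigvals > 1e-12 * eigvals.max(), eigvals, 0.0)
    csm_root = (eigvecs * eigvals ** (1.0 / order)) @ eigvecs.conj().T
    quad = np.einsum("ij,jk,ik->i", steering.conj(), csm_root, steering).real
    return np.clip(quad, 0.0, None) ** order


# Hypothetical setup: 16 microphones over 1 m, 2 kHz, one unit-power source at 20 degrees.
mic_x = np.linspace(-0.5, 0.5, 16)
wavenumber = 2 * np.pi * 2000.0 / 343.0
angles = np.deg2rad(np.arange(-90, 91))
steering = np.array([steering_vector(mic_x, a, wavenumber) for a in angles])
source_index = 110  # entry of `angles` equal to 20 degrees
csm = np.outer(steering[source_index], steering[source_index].conj())

map_order_1 = functional_beamforming(csm, steering, order=1)  # conventional map
map_order_8 = functional_beamforming(csm, steering, order=8)
```

With order 1 the expression reduces to conventional beamforming. For this single-source case the peak keeps its level as the order grows while every off-source value shrinks, which matches the sidelobe suppression and mainlobe narrowing that the abstract attributes to a higher order.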
Automated parking systems (APSs) that explicitly consider the time efficiency of motion have received considerable attention in recent years. The trajectory planning module in these APSs delivers a parking trajectory, which is expected to be precisely tracked by the tracking module. However, the reference points of commonly used trackers are selected in the spatial domain, resulting in significant tracking errors for trajectories with temporal information. In this paper, a tracking control method called ILC-MPC, which combines model predictive control (MPC) and iterative learning control (ILC), was proposed to improve the spatiotemporal tracking accuracy of the autonomous vehicle. ILC was utilized for longitudinal compensation using the error signal between the historical and expected speeds. Accordingly, the error model in the longitudinal direction was simplified to decrease the number of decision variables in the MPC. Simulation experiments using CarSim were carried out to compare the proposed method with open-loop control, a linear quadratic regulator (LQR), and a pure MPC with a computing time similar to that of ILC-MPC. ILC-MPC converged within a few iterations of the learning process and achieved the highest tracking accuracy in the spatiotemporal domain among the compared methods.
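The longitudinal compensation idea in the abstract above can be sketched with a basic P-type ILC law, in which the command for the next trial is the previous command plus a gain times the previous speed error. This shows the ILC part only, not the MPC. The first-order speed model, the gain, and the reference profile are assumptions for this example, not the controller from the paper.

```python
import numpy as np


def simulate_speed(accel_cmd, dt=0.1, drag=1.0, v0=0.0):
    """Hypothetical first-order longitudinal model: v' = u - drag * v."""
    v = np.empty(len(accel_cmd) + 1)
    v[0] = v0
    for t, u in enumerate(accel_cmd):
        v[t + 1] = v[t] + dt * (u - drag * v[t])
    return v


def ilc_speed_compensation(v_ref, n_iter=30, gain=1.0, dt=0.1, drag=1.0):
    """P-type ILC: u_next[t] = u[t] + gain * (v_ref[t+1] - v[t+1])."""
    u = np.zeros(len(v_ref) - 1)
    rms_history = []
    for _ in range(n_iter):
        v = simulate_speed(u, dt, drag, v0=v_ref[0])
        err = v_ref - v
        # Record the error of the trial that was just run, before updating.
        rms_history.append(float(np.sqrt(np.mean(err[1:] ** 2))))
        # The error one step ahead corrects the command that caused it.
        u = u + gain * err[1:]
    return u, rms_history


# Hypothetical speed profile: accelerate to 1 m/s and brake to rest within 5 s.
time_s = np.arange(51) * 0.1
v_ref = 0.5 * (1.0 - np.cos(2.0 * np.pi * time_s / time_s[-1]))
accel_cmd, rms_history = ilc_speed_compensation(v_ref)
```

With these assumed values the tracking error is non-increasing from one trial to the next. A larger gain can produce large transient errors before convergence, so the gain has to be matched to the plant.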