Experience replay enables reinforcement learning agents to memorize and reuse past experiences, just as humans recall memories relevant to the situation at hand. Contemporary off-policy algorithms either replay past experiences uniformly or rely on a rule-based replay strategy, which may be sub-optimal. In this work, we consider learning a replay policy that optimizes the cumulative reward. Replay learning is challenging because the replay memory is noisy and large and the cumulative reward is unstable. To address these issues, we propose a novel experience replay optimization (ERO) framework which alternately updates two policies: the agent policy and the replay policy. The agent policy is updated to maximize the cumulative reward based on the replayed data, while the replay policy is updated to provide the agent with the most useful experiences. Experiments on various continuous control tasks demonstrate the effectiveness of ERO, empirically showing the promise of experience replay learning for improving the performance of off-policy reinforcement learning algorithms.
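The alternating update described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the linear replay-policy scorer, the REINFORCE-style update, and the stand-in reward-improvement signal are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy replay buffer: each "transition" is just a feature vector.
buffer = rng.normal(size=(100, 4))

# Replay policy: a linear scorer whose sigmoid output gives each
# transition's replay probability (a hypothetical parametrization).
w = np.zeros(4)

def replay_probs(features, w):
    """Per-transition replay score under the replay policy."""
    return 1.0 / (1.0 + np.exp(-features @ w))

def sample_batch(features, w, batch_size=8):
    """Sample a mini-batch with probability proportional to the scores."""
    p = replay_probs(features, w)
    p = p / p.sum()
    return rng.choice(len(features), size=batch_size, replace=False, p=p)

# Alternating updates: an agent step on the replayed data, then a
# replay-policy step driven by the change in cumulative reward.
for step in range(50):
    idx = sample_batch(buffer, w)
    # ... the agent update on buffer[idx] would go here ...
    reward_improvement = rng.normal()  # stand-in for the true signal
    # REINFORCE-style update on the replay policy (illustrative only):
    # gradient of log-sigmoid scores of the replayed transitions.
    p = replay_probs(buffer[idx], w)
    grad = ((1.0 - p)[:, None] * buffer[idx]).mean(axis=0)
    w += 1e-2 * reward_improvement * grad
```

In a full implementation the stand-in signal would be replaced by the measured change in the agent's cumulative reward after training on the replayed batch.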
The echo-enabled harmonic generation (EEHG) scheme holds promise for efficiently generating intense coherent radiation at very high harmonics of a conventional ultraviolet seed laser. We report the lasing of the EEHG free-electron laser (FEL) at an extreme ultraviolet (EUV) wavelength with a seeded FEL facility, the Shanghai soft x-ray FEL. For the first time, we have benchmarked the basic theory of EEHG by measuring the bunching factor distributions over one octave down to the EUV region. Our results demonstrate the key advantages of the EEHG FEL, i.e., generation of very high harmonics with a small laser-induced energy spread and insensitivity to beam imperfections, and mark a great step towards fully coherent x rays with the EEHG scheme.
Graph data are pervasive in many real-world applications. Recently, increasing attention has been paid to graph neural networks (GNNs), which aim to model local graph structures and capture hierarchical patterns by aggregating information from neighbors with stackable network modules. Motivated by the observation that different nodes often require different numbers of aggregation iterations to fully capture the structural information, in this paper we propose to explicitly sample diverse iterations of aggregation for different nodes to boost the performance of GNNs. It is challenging to develop an effective aggregation strategy for each node, given complex graphs and sparse features. Moreover, it is not straightforward to derive an efficient algorithm, since we need to feed the sampled nodes into different numbers of network layers. To address these challenges, we propose Policy-GNN, a meta-policy framework that models the sampling procedure and the message passing of GNNs as a combined learning process. Specifically, Policy-GNN uses a meta-policy to adaptively determine the number of aggregations for each node. The meta-policy is trained with deep reinforcement learning (RL) by exploiting feedback from the model. We further introduce parameter sharing and a buffer mechanism to boost training efficiency. Experimental results on three real-world benchmark datasets suggest that Policy-GNN significantly outperforms state-of-the-art alternatives, showing the promise of aggregation optimization for GNNs.
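The core idea, a meta-policy choosing a per-node aggregation depth, can be sketched minimally. This is an illustrative toy, not Policy-GNN itself: the mean-neighbor aggregation, the bandit-style action values, and the random stand-in feedback are assumptions replacing the paper's deep RL agent and model-derived reward.

```python
import numpy as np

rng = np.random.default_rng(1)

K = 3                       # maximum number of aggregation iterations
num_nodes = 20
A = (rng.random((num_nodes, num_nodes)) < 0.2).astype(float)
A = np.maximum(A, A.T) + np.eye(num_nodes)  # symmetric adjacency + self-loops
D_inv = 1.0 / A.sum(axis=1, keepdims=True)  # row normalization
X = rng.normal(size=(num_nodes, 5))         # node features

def aggregate(X, A, D_inv, k):
    """k rounds of mean-neighbor aggregation (one simple message-passing form)."""
    H = X
    for _ in range(k):
        H = D_inv * (A @ H)
    return H

# Meta-policy: per-node action values over depths 1..K, updated from
# feedback (a stand-in for the RL signal Policy-GNN derives from the model).
Q = np.zeros((num_nodes, K))
eps, lr = 0.1, 0.5
for episode in range(100):
    for v in range(num_nodes):
        # Epsilon-greedy choice of aggregation depth for node v.
        k = rng.integers(K) if rng.random() < eps else int(Q[v].argmax())
        h_v = aggregate(X, A, D_inv, k + 1)[v]
        feedback = rng.normal()  # placeholder for model-based feedback
        Q[v, k] += lr * (feedback - Q[v, k])

chosen = Q.argmax(axis=1) + 1  # per-node aggregation depth
```

In the actual framework the feedback would come from the downstream model's performance, and parameter sharing across depths plus a buffer mechanism keep training efficient.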
Graph neural networks (GNNs) have been successfully applied to graph-structured data. Given a specific scenario, extensive human expertise and laborious trial-and-error are usually required to identify a suitable GNN architecture. This is because the performance of a GNN architecture is significantly affected by the choice of graph convolution components, such as the aggregation function and the hidden dimension. Neural architecture search (NAS) has shown its potential in discovering effective deep architectures for learning tasks in image and language modeling. However, existing NAS algorithms cannot be directly applied to the GNN search problem. First, the search space of GNNs differs from those in existing NAS work. Second, the representation learning capacity of a GNN architecture varies significantly with slight architecture modifications, which hurts the search efficiency of traditional search methods. Third, widely used NAS techniques such as parameter sharing may become unstable for GNNs. To bridge the gap, we propose the automated graph neural networks (AGNN) framework, which aims to find an optimal GNN architecture within a predefined search space. A reinforcement learning based controller is designed to greedily validate architectures via small steps. AGNN has a novel parameter sharing strategy that enables homogeneous architectures to share parameters, based on a carefully designed homogeneity definition. Experiments on real-world benchmark datasets demonstrate that the GNN architecture identified by AGNN achieves the best performance compared with existing handcrafted models and traditional search methods.
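An RL controller sampling from a discrete architecture search space can be sketched as below. This is a generic policy-gradient toy, not AGNN's controller: the three-component search space, the per-component softmax logits, and the random stand-in validation score are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical GNN search space: discrete choices per component.
space = {
    "aggregator": ["mean", "max", "sum"],
    "hidden_dim": [16, 32, 64],
    "activation": ["relu", "tanh"],
}

# Controller: independent softmax logits per component, trained with a
# simple REINFORCE rule (a stand-in for AGNN's RL controller).
logits = {k: np.zeros(len(v)) for k, v in space.items()}

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample_architecture():
    """Sample one architecture and remember the chosen indices."""
    arch, idxs = {}, {}
    for k, v in space.items():
        i = rng.choice(len(v), p=softmax(logits[k]))
        arch[k], idxs[k] = v[i], i
    return arch, idxs

def validate(arch):
    """Stand-in for training and validating the sampled GNN."""
    return rng.normal()  # placeholder validation score

baseline, lr = 0.0, 0.1
for step in range(200):
    arch, idxs = sample_architecture()
    score = validate(arch)
    baseline = 0.9 * baseline + 0.1 * score   # moving-average baseline
    advantage = score - baseline
    for k, i in idxs.items():                 # REINFORCE update per component
        p = softmax(logits[k])
        grad = -p
        grad[i] += 1.0
        logits[k] += lr * advantage * grad

best = {k: space[k][int(np.argmax(logits[k]))] for k in space}
```

In AGNN the validation score comes from actually training the candidate on the task, and the homogeneity-based parameter sharing lets similar candidates reuse weights rather than train from scratch.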
A hard X-ray Split-and-Delay Line (SDL) under construction for the Materials Imaging and Dynamics station at the European X-Ray Free-Electron Laser (XFEL) is presented. The device aims to provide pairs of X-ray pulses with a variable time delay ranging from -10 ps to 800 ps in a photon energy range from 5 to 10 keV for photon correlation and X-ray pump-probe experiments. A custom-designed mechanical motion system with active feedback control ensures that the high demands on stability and accuracy can be met and the design goals achieved. Using special radiation configurations of the European XFEL's SASE-2 undulator (SASE: Self-Amplified Spontaneous Emission), two-color hard X-ray pump-probe schemes with varying photon energy separations have been proposed. Simulations indicate that more than 10 photons on the sample per pulse pair and up to about 10% photon energy separation can be achieved in the hard X-ray region using the SDL.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and indicate whether the citing article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.