Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search

Tian, Yuan; Wang, Qin; Huang, Zhiwu; Li, Wen; Dai, Dengxin; Yang, Minghao; Wang, Jun; Fink, Olga

doi:10.1007/978-3-030-58571-6_11

Cited by 48 publications

(33 citation statements)

References 24 publications


“…
Dai et al [99] | Data adapted pruning for efficient neural architecture search | DA-NAS | 2020 | ECCV | Gradient based | Classification
Tian et al [100] | Efficient and effective GAN architecture search | E2GAN | 2020 | ECCV | Reinforcement learning | GAN
Chu et al [101] | Fair differentiable architecture search | FairDARTS | 2020 | ECCV | Gradient based | Classification
Hu et al [102] | Three-freedom neural architecture search | TF-NAS | 2020 | ECCV | Gradient based | Classification
Hu et al [103] | Angle-based search space shrinking | ABS | 2020 | ECCV | Other | Classification
Yu et al [104] | Barrier penalty neural architecture search | BP-NAS | 2020 | ECCV | Other | Classification
Wang et al [105] | Attention cell search for video classification | AttentionNAS | 2020 | ECCV | Other | Video classification
Bulat et al [106] | Binary architecTure search | BATS | 2020 | ECCV | Other | Classification
Yu et al [107] | Neural architecture search with big single-stage models | BigNAS | 2020 | ECCV | Gradient based | Classification
Guo et al [108] | Single path one-shot neural architecture search with uniform sampling | Single-Path-SuperNet | 2020 | ECCV | Evolutionary algorithm | Classification
Liu et al [109] | Unsupervised neural architecture search | UnNAS | 2020 | ECCV | Gradient based | Classification
get tasks, which can solve large GPU memory consumption problems and long computation time of the NAS method. Liu et al [67] proposed the method of DARTS for effective structure search.…”

Section: Gradient Based Classification (mentioning)

confidence: 99%

2D and 3D Palmprint and Palm Vein Recognition Based on Neural Architecture Search

Jia, Xia, Min, et al. 2021

Int. J. Autom. Comput.


Palmprint recognition and palm vein recognition are two emerging biometric technologies. Over the past two decades, many traditional methods have been proposed for both and have achieved impressive results. In recent years, deep learning has gradually become the mainstream recognition technology because of its excellent recognition performance, and some researchers have tried to use convolutional neural networks (CNNs) for palmprint recognition and palm vein recognition. However, the architectures of these CNNs have mostly been developed manually by human experts, which is a time-consuming and error-prone process. To overcome the shortcomings of manually designed CNNs, neural architecture search (NAS) has become an important research direction in deep learning. NAS aims to solve the parameter adjustment problem of deep learning models and sits at the intersection of optimization and machine learning. NAS technology represents the future development direction of deep learning. However, NAS has not yet been well studied for palmprint recognition and palm vein recognition. In this paper, to investigate NAS-based 2D and 3D palmprint recognition and palm vein recognition in depth, we conduct a performance evaluation of twenty representative NAS methods on five 2D palmprint databases, two palm vein databases, and one 3D palmprint database. Experimental results show that some NAS methods can achieve promising recognition results. Notably, among the evaluated NAS methods, ProxylessNAS achieves the best recognition performance.


“…In adversarial training, the discriminator and generator compete, forcing the generator to produce high-quality output that can fool the discriminator. Adversarial training is usually successful in image generation (Karras et al 2019; Tian et al 2020; Gong et al 2019), but has made limited contributions to natural language processing tasks (Wiseman and Rush 2016; Yang et al 2018; Yu et al 2017), mainly due to the difficulty in propagating error signals from discriminator to generator through discrete generated natural language tokens. Yu et al (2017) alleviate such difficulties by employing reinforcement learning methods for sequence generation.…”

Section: Related Work (mentioning)

confidence: 99%
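The citation statement above notes that error signals cannot be propagated directly through discrete tokens, and that reinforcement learning is one way around this. As a rough illustration only, the sketch below uses a score-function (REINFORCE-style) update on a toy three-token "generator". The function name `train_reinforce`, the fixed reward table standing in for discriminator scores, the running-average baseline, and all hyperparameters are assumptions made for this example; none of them come from the cited papers.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE loop: a categorical 'generator' over three tokens.

    The rewards list is a hypothetical stand-in for discriminator scores;
    a real discriminator would be trained jointly with the generator.
    """
    rng = random.Random(seed)
    rewards = [0.1, 0.2, 0.9]   # assumed per-token discriminator scores
    logits = [0.0, 0.0, 0.0]    # generator parameters
    baseline = 0.0              # running mean reward, reduces variance

    for _ in range(steps):
        probs = softmax(logits)
        # Sampling a discrete token is the non-differentiable step.
        token = rng.choices(range(len(logits)), weights=probs)[0]
        advantage = rewards[token] - baseline
        # Score-function gradient: d log p(token) / d logit_k
        # equals 1[k == token] - probs[k] for a softmax policy.
        for k in range(len(logits)):
            grad_log_p = (1.0 if k == token else 0.0) - probs[k]
            logits[k] += lr * advantage * grad_log_p
        baseline += 0.05 * (rewards[token] - baseline)

    return softmax(logits)


if __name__ == "__main__":
    print(train_reinforce())
```

The update never differentiates through the sampled token itself: the reward only scales the gradient of the log-probability, which is why this family of estimators applies to discrete outputs.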

“…In this paper, we propose an Open IE system with Generative Adversarial Networks (Goodfellow et al 2014) architecture. GANs are a promising framework for alleviating the exposure bias problem and have recently shown remarkable promise in many tasks, such as machine translation (Wiseman and Rush 2016; Yang et al 2018; Yu et al 2017), and especially in image generation (Karras et al 2019; Tian et al 2020; Gong et al 2019). Besides the typical sequence-to-sequence model (implemented by Transformer and output the sequence with separators) to address the Open IE problem.…”

Section: Introduction (mentioning)

confidence: 99%

Generative adversarial networks for open information extraction

Han, Wang 2021

Adv. in Comp. Int.


Open information extraction (Open IE) is a core task of natural language processing (NLP). Although many efforts have been made in this area, many problems still need to be tackled. First, conventional Open IE approaches use a set of handcrafted patterns to extract relational tuples from the corpus. Second, many NLP tools are employed in their procedure; therefore, they face error propagation. To address these problems, and inspired by the recent success of Generative Adversarial Networks (GANs), we employ an adversarial training architecture and name it Adversarial-OIE. In Adversarial-OIE, the training of the Open IE model is assisted by a discriminator, which is a convolutional neural network (CNN) model. The goal of the discriminator is to differentiate the extraction results generated by the Open IE model from the training data. The goal of the Open IE model is to produce high-quality triples that fool the discriminator. A policy gradient method is leveraged to co-train the Open IE model and the discriminator. In particular, an insufficiently trained discriminator usually leads to instability in GAN training. We use distant supervision to generate training data for the Adversarial-OIE model to solve this problem. An empirical study on two large benchmark datasets shows that our approach significantly outperforms many existing baselines.


“…Combined with powerful function approximators, such as deep neural networks, RL methods can work with complex large-state spaces. RL methods have been applied to various control problems in robotics [16][17][18], water systems management [19], computational biology [20], and AutoML [21]. RL methods have multiple advantages over traditional methods: (1) RL agents can learn to solve tasks without any knowledge of the underlying model.…”

Section: Introduction (mentioning)

confidence: 99%

Learning to Calibrate Battery Models in Real-Time with Deep Reinforcement Learning

et al. 2021

Self Cite


Lithium-ion (Li-I) batteries have recently become pervasive and are used in many physical assets. For the effective management of the batteries, reliable predictions of the end-of-discharge (EOD) and end-of-life (EOL) are essential. Many detailed electrochemical models have been developed for the batteries. Their parameters are calibrated before they are taken into operation and are typically not re-calibrated during operation. However, the degradation of batteries increases the reality gap between the computational models and the physical systems and leads to inaccurate predictions of EOD/EOL. The current calibration approaches are either computationally expensive (model-based calibration) or require large amounts of ground truth data for degradation parameters (supervised data-driven calibration). This is often infeasible for many practical applications. In this paper, we introduce a reinforcement learning-based framework for reliably inferring calibration parameters of battery models in real time. Most importantly, the proposed methodology does not need any labeled data samples of observations and the ground truth parameters. The experimental results demonstrate that our framework is capable of inferring the model parameters in real time with better accuracy compared to approaches based on unscented Kalman filters. Furthermore, our results show better generalizability than supervised learning approaches even though our methodology does not rely on ground truth information during training.

