2021 Data Compression Conference (DCC)
DOI: 10.1109/dcc50243.2021.00009

A Dual-Critic Reinforcement Learning Framework for Frame-Level Bit Allocation in HEVC/H.265

Cited by 12 publications (10 citation statements)
References 6 publications
“…Gao et al [2016] utilize a game theory method to allocate CTU-level bit allocation and optimize for SSIM in HEVC. More recently, Reinforcement learning approaches in rate control have also been proposed to HEVC , Ho et al, 2021, Chen et al, 2018. Mao et al [2020] use an imitation learning approach on evolutionary search based policy [Salimans et al, 2017] with a feedback-based correction for rate control in VP9.…”
Section: Related Work (mentioning)
confidence: 99%
“…Different from the single-critic approaches [1,2,3,4], Ho et al [5] learn two separate critics, one for estimating the distortion reward r_D and the other for the rate reward r_R. They introduce a dual-critic learning algorithm that trains the RL agent by alternating the rate critic with the distortion critic according to how the RL agent behaves in encoding a GOP.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this paper, we propose an action-constrained RL framework through Neural Frank-Wolfe Policy Optimization (NFWPO). Similar to the dual-critic idea [5], our scheme includes a rate critic and a distortion critic. However, unlike [5], the rate critic is utilized to specify a state-dependent feasible set, i.e.…”
Section: Introduction (mentioning)
confidence: 99%