Online Selective Kernel-Based Temporal Difference Learning

Chen, Xingguo; Wang, Ruili

doi:10.1109/tnnls.2013.2270561

Cited by 24 publications

(7 citation statements)

References 44 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…After 43 trials, we count the number of trials which received a positive reward, and the success rate is averaged over 50 Monte Carlo runs. The performance of the Q -KTD algorithm is compared with Q -learning via time delayed neural net ( Q -TDNN) and the online selective kernel-based temporal difference learning algorithm ( Q -OSKTD) [23] in Figure 9. Note that TDNN is a conventional approach to function approximation and has already been applied to RLBMI experiments for neural decoding [1, 2].…”

Section: Experimental Results On Neural Decodingmentioning

confidence: 99%

“…These methods, known as kernel sparsification methods, can be applied to the KTD algorithm to control the growth of the terms in the function expansion, also known as filter size. Popular examples of kernel sparsification methods are the approximate linear dependence (ALD) [19], Surprise criterion [32], Quantization approach [21], and the kernel distance based method [23]. The main idea of sparsification is to only consider a reduced set of samples, called the dictionary, to represent the function of interest.…”

Section: Online Sparsificationmentioning

confidence: 99%

“…Otherwise, the new input state x ( n ) is absorbed by the closest existing unit. Very similar to the quantization approach, the method presented in [23] applies a distance threshold criterion in the RKHS. The kernel distance based criterion given a state dictionary D ( n − 1) adds a new unit when the new input state x ( n ) satisfies following condition;

\begin{matrix} \min_{x_{i} \in D (n - 1)} {‖ϕ (x (n)) - ϕ (x_{i})‖}^{2} > μ_{1} . \end{matrix}

For some kernels such as Gaussian, the Quantization method and the kernel distance based criterion can be shown to be equivalent.…”

Section: Online Sparsificationmentioning

confidence: 99%

“…A Quantization approach proposed in [21] has been used in KTD( λ ) [22]. In a similar flavor, the kernel distance based online sparsification method was proposed for a KTD algorithm in [23]. Note that ALD is O ( n 2 ) complexity, whereas quantization and kernel distances are O ( n ).…”

Section: Introductionmentioning

confidence: 99%

“…The main difference between the quantization approach and the kernel distance is the space where the distances are computed. Quantization approach uses criterion of input space distances whereas kernel distance computes them in the RKHS associated with the kernel [23]. …”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Kernel Temporal Differences for Neural Decoding

Bae

Giraldo

Pohlmeyer

et al. 2015

Computational Intelligence and Neuroscience

View full text Add to dashboard Cite

We study the feasibility and capability of the kernel temporal difference (KTD)(λ) algorithm for neural decoding. KTD(λ) is an online, kernel-based learning algorithm, which has been introduced to estimate value functions in reinforcement learning. This algorithm combines kernel-based representations with the temporal difference approach to learning. One of our key observations is that by using strictly positive definite kernels, algorithm's convergence can be guaranteed for policy evaluation. The algorithm's nonlinear functional approximation capabilities are shown in both simulations of policy evaluation and neural decoding problems (policy improvement). KTD can handle high-dimensional neural states containing spatial-temporal information at a reasonable computational complexity allowing real-time applications. When the algorithm seeks a proper mapping between a monkey's neural states and desired positions of a computer cursor or a robot arm, in both open-loop and closed-loop experiments, it can effectively learn the neural state to action mapping. Finally, a visualization of the coadaptation process between the decoder and the subject shows the algorithm's capabilities in reinforcement learning brain machine interfaces.

show abstract

Section: Experimental Results On Neural Decodingmentioning

confidence: 99%