Second Order Training and Sizing for the Multilayer Perceptron

Tyagi, Kanishka; Nguyen, Son; Rawat, Rohit; Manry, M.T.

doi:10.1007/s11063-019-10116-7

Cited by 12 publications

(2 citation statements)

References 115 publications

(135 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…These methods typically involve using a constructive algorithm to grow the neural architecture first, then prune the subsequent architecture, or simultaneously grow and prune during the learning process [25]. While using a hybrid approach is appealing and has had great success [16,[26][27][28][29][30][31]; the focus of the present article is on constructive algorithms.…”

Section: Introductionmentioning

confidence: 99%

Dynamic multilayer growth: Parallel vs. sequential approaches

Ross,

Berberian,

Nikolla

et al. 2024

PLoS ONE

View full text Add to dashboard Cite

The decision of when to add a new hidden unit or layer is a fundamental challenge for constructive algorithms. It becomes even more complex in the context of multiple hidden layers. Growing both network width and depth offers a robust framework for leveraging the ability to capture more information from the data and model more complex representations. In the context of multiple hidden layers, should growing units occur sequentially with hidden units only being grown in one layer at a time or in parallel with hidden units growing across multiple layers simultaneously? The effects of growing sequentially or in parallel are investigated using a population dynamics-inspired growing algorithm in a multilayer context. A modified version of the constructive growing algorithm capable of growing in parallel is presented. Sequential and parallel growth methodologies are compared in a three-hidden layer multilayer perceptron on several benchmark classification tasks. Several variants of these approaches are developed for a more in-depth comparison based on the type of hidden layer initialization and the weight update methods employed. Comparisons are then made to another sequential growing approach, Dynamic Node Creation. Growing hidden layers in parallel resulted in comparable or higher performances than sequential approaches. Growing hidden layers in parallel promotes growing narrower deep architectures tailored to the task. Dynamic growth inspired by population dynamics offers the potential to grow the width and depth of deeper neural networks in either a sequential or parallel fashion.

show abstract

Section: Introductionmentioning

confidence: 99%

Dynamic multilayer growth: Parallel vs. sequential approaches

Ross,

Berberian,

Nikolla

et al. 2024

PLoS ONE

View full text Add to dashboard Cite

show abstract

“…In the GAN network [23], the KL divergence is substituted into the objective function to solve the minimax game problem. Mapping to the tracking problem, we build a loss function based on minimizing the information loss of the KL divergence, and the model can be optimized and updated by solving the minimum value of the loss function [24]. In [25], the KL divergence is minimized to train the regression network.…”

Section: Introductionmentioning

confidence: 99%

Variational Online Learning Correlation Filter for Visual Tracking

Wang,

Liu,

Deng

2024

Mathematics

View full text Add to dashboard Cite

Recently, discriminative correlation filters (DCF) have been successfully applied for visual tracking. However, traditional DCF trackers tend to separately solve boundary effect and temporal degradation problems in the tracking process. In this paper, a variational online learning correlation filter (VOLCF) is proposed for visual tracking to improve the robustness and accuracy of the tracking process. Unlike previous methods, which use only first-order temporal constraints, this approach leads to overfitting and filter degradation. First, beyond the standard filter training requirement, our proposed VOLCF method introduces a model confidence term, which leverages the temporal information of adjacent frames during filter training. Second, to ensure the consistency of the temporal and spatial characteristics of the video sequence, the model introduces Kullback–Leibler (KL) divergence to obtain the second-order information of the filter. In contrast to traditional target tracking models that rely solely on first-order feature information, this approach facilitates the acquisition of a generalized connection between the previous and current filters. As a result, it incorporates joint-regulated filter updating. Through quantitative and qualitative analyses of the experiment, it proves that the VOLCF model has excellent tracking performance.

show abstract