Abstract: Deep Echo State Networks (DeepESNs) recently extended the applicability of Reservoir Computing (RC) methods towards the field of deep learning. In this paper we study the impact of constrained reservoir topologies in the architectural design of deep reservoirs, through numerical experiments on several RC benchmarks. The major outcome of our investigation is to show the remarkable effect, in terms of predictive performance gain, achieved by the synergy between a deep reservoir construction and a structured orga…
“…The presented results differ by multiple orders of magnitude from e.g. Gallicchio and Micheli [9], who reached NARMA10 ≈ 10⁻⁴ ± 10⁻⁵, MG17 ≈ 10⁻⁹ ± 10⁻¹⁰, and MG30 ≈ 10⁻⁸ ± 10⁻⁹. The parameters in the aforementioned work were tuned manually and the sparse topology ended up as the worst of the four.…”
Section: Results (contrasting)
confidence: 92%
“…Some authors (e.g., [25]) use the same constant for all the nonzero connection weights in the ring, chain, and permutation topologies instead of generating the values from N(μ_res, σ²_res) as is the case for the sparse topology (e.g., [9]). In other words, the reservoir matrix W can be expressed as λW_b, where W_b is a binary matrix and λ is the desired constant.…”
Section: Topologies (mentioning)
confidence: 99%
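As an illustration of the construction in the quoted passage, the following numpy sketch builds a ring reservoir as W = λW_b and, for contrast, a sparse reservoir with weights drawn from N(μ_res, σ²_res). The function names, the density argument, and all numeric defaults are illustrative rather than taken from the cited papers.

```python
import numpy as np

def ring_reservoir(n, lam):
    """Ring topology: W = lam * W_b, where W_b is the binary adjacency
    matrix of one directed cycle through all n neurons."""
    # W_b[i, i-1] = 1 for every i, with W_b[0, n-1] closing the ring.
    W_b = np.roll(np.eye(n), 1, axis=0)
    return lam * W_b

def sparse_reservoir(n, mu, sigma, density, rng=None):
    """Sparse topology: each connection exists with probability `density`
    and its weight is drawn from N(mu, sigma^2); the remaining entries are zero."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random((n, n)) < density
    return rng.normal(mu, sigma, size=(n, n)) * mask
```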
“…It is worth noting that the sparse topology has O(n²) parameters where n is the number of neurons, whereas the ring, chain, and permutation topologies have only O(n) parameters. Analogously to other papers (e.g., [9], [31]), we compare topologies with the same number of neurons, not the same number of parameters.…”
Section: Topologies (mentioning)
confidence: 99%
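A short self-contained snippet (with illustrative values for λ, μ_res, σ_res, and the connection density) that makes the O(n) versus O(n²) parameter counts concrete for n = 500 neurons:

```python
import numpy as np

n = 500
rng = np.random.default_rng(0)

# Ring: one nonzero weight per neuron -> O(n) free parameters.
W_ring = 0.9 * np.roll(np.eye(n), 1, axis=0)              # lambda = 0.9 is illustrative

# Sparse: up to n^2 weights (here 10% density) -> O(n^2) free parameters.
W_sparse = rng.normal(0.0, 0.1, (n, n)) * (rng.random((n, n)) < 0.1)

print(np.count_nonzero(W_ring))    # 500
print(np.count_nonzero(W_sparse))  # roughly 25,000
```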
“…Unfortunately, researchers have not yet converged to a single and easily comparable performance measure, and even though there exist widely used benchmark tasks, many authors have developed their own modifications or parametrizations. Unless specified otherwise, we will use the measures from Gallicchio and Micheli [9].…”
Section: Benchmarks (mentioning)
confidence: 99%
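For readers unfamiliar with the benchmark, the sketch below generates a NARMA10 sequence using the parametrization most common in the ESN literature; as the quote notes, individual papers vary the details, so the constants here should be read as one typical choice rather than the exact setup of [9].

```python
import numpy as np

def narma10(T, rng=None):
    """Generate T steps of the NARMA10 task in its common parametrization:
    y[t+1] = 0.3*y[t] + 0.05*y[t]*sum(y[t-9..t]) + 1.5*u[t-9]*u[t] + 0.1,
    with inputs u[t] drawn uniformly from [0, 0.5]."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(0.0, 0.5, size=T)
    y = np.zeros(T)
    for t in range(9, T - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * np.sum(y[t - 9:t + 1])   # last 10 outputs
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return u, y
```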
“…The hyperparameters of each reservoir topology are optimized so that the instantiated network maximizes its performance on one of the benchmark tasks. Similarly to [9], the experiment uses ESNs with 500 reservoir neurons, regardless of the topology. The reservoir weights are generated from a normal distribution N(μ_res, σ²_res), feedback weights from a uniform distribution U(−ω_fb, ω_fb), and input weights from U(−ω_in, ω_in).…”
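A minimal sketch of the weight initialization described in the quote, assuming a single input and a single feedback signal; the default hyperparameter values are placeholders, since these are exactly the quantities being tuned.

```python
import numpy as np

def init_esn_weights(n=500, n_in=1, mu_res=0.0, sigma_res=0.1,
                     omega_in=0.1, omega_fb=0.1, rng=None):
    """Draw ESN weight matrices as described in the quoted setup:
    reservoir weights from N(mu_res, sigma_res^2), input weights from
    U(-omega_in, omega_in), feedback weights from U(-omega_fb, omega_fb).
    All default values here are placeholders, not the tuned settings."""
    rng = np.random.default_rng() if rng is None else rng
    W = rng.normal(mu_res, sigma_res, size=(n, n))        # topology mask applied separately
    W_in = rng.uniform(-omega_in, omega_in, size=(n, n_in))
    W_fb = rng.uniform(-omega_fb, omega_fb, size=(n, 1))
    return W, W_in, W_fb
```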
Echo State Networks represent a type of recurrent neural network with a large randomly generated reservoir and a small number of readout connections trained via linear regression. The most common topology of the reservoir is a fully connected network of up to thousands of neurons. Over the years, researchers have introduced a variety of alternative reservoir topologies, such as a circular network or a linear path of connections. When comparing the performance of different topologies or other architectural changes, it is necessary to tune the hyperparameters for each topology separately, since their properties may differ significantly. The hyperparameter tuning is usually carried out manually by selecting the best performing set of parameters from a sparse grid of predefined combinations. Unfortunately, this approach may lead to underperforming configurations, especially for sensitive topologies. We propose an alternative approach to hyperparameter tuning based on the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Using this approach, we have improved multiple topology comparison results by orders of magnitude, suggesting that topology alone does not play as important a role as properly tuned hyperparameters.
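A hedged sketch of how such CMA-ES-based tuning could look, using the pycma package as one possible implementation; the objective function below is a hypothetical stand-in for building an ESN from the candidate hyperparameters, training the linear readout, and returning the validation error.

```python
import cma  # pycma package: one possible CMA-ES implementation

def esn_validation_error(hyperparams):
    """Placeholder objective (hypothetical). In the real experiment this would
    decode the candidate vector (e.g. mu_res, sigma_res, omega_in, omega_fb),
    instantiate the reservoir, train the linear readout, and return the
    validation error on the chosen benchmark task."""
    mu_res, sigma_res, omega_in, omega_fb = hyperparams
    # Stand-in smooth function so the sketch is runnable end to end.
    return mu_res ** 2 + (sigma_res - 0.1) ** 2 + (omega_in - 0.1) ** 2 + (omega_fb - 0.1) ** 2

x0 = [0.0, 0.2, 0.2, 0.2]     # illustrative starting hyperparameters
sigma0 = 0.05                 # initial CMA-ES step size
es = cma.CMAEvolutionStrategy(x0, sigma0)
while not es.stop():
    candidates = es.ask()     # sample a population of hyperparameter vectors
    es.tell(candidates, [esn_validation_error(c) for c in candidates])
best_hyperparams = es.result.xbest
```

In the actual study, each topology would get its own CMA-ES run per benchmark, so that every topology is compared at its own best-found configuration rather than at a shared grid point.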