2021
DOI: 10.48550/arxiv.2102.10094
Preprint
Formal Language Theory Meets Modern NLP

William Merrill
Cited by 2 publications (3 citation statements)
References 14 publications
“…We vary m, the dimension of the hidden state, in the range [3, 32], use the RMSProp optimizer [49] with smoothing constant α = 0.99, and vary the learning rate in the range [10⁻², 10⁻³]. For each language we train models for 100 epochs with a batch size of 32.…”
Section: φ RS
confidence: 99%
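The training setup quoted above relies on RMSProp, which scales each parameter update by a running average of squared gradients governed by the smoothing constant α. A minimal sketch of one such update step, assuming nothing about the cited paper's actual models (the toy parameter vector and constant gradient are purely illustrative):

```python
import numpy as np

def rmsprop_step(theta, grad, cache, lr=1e-2, alpha=0.99, eps=1e-8):
    """One RMSProp update: maintain an exponential moving average of squared
    gradients (smoothing constant alpha) and normalize the step by its root."""
    cache = alpha * cache + (1 - alpha) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(cache) + eps)
    return theta, cache

# Toy example: a 3-dimensional parameter vector and a constant gradient.
theta = np.zeros(3)
cache = np.zeros(3)
grad = np.ones(3)
theta, cache = rmsprop_step(theta, grad, cache)
```

With a fresh cache, the very first step is large relative to lr because the squared-gradient average starts near zero; this is why the quoted setup's choice of learning rate in [10⁻², 10⁻³] interacts with α.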
“…RNNs can be viewed as dynamical systems, and many works have used this viewpoint to study RNNs, e.g., [22,23,24,25]. Other related work includes relations to kernel methods, e.g., [26,27,28], linear RNNs [29], saturated RNNs [30,31,32], and echo state networks [33,34]. Several other works discuss the expressive power of Transformers, the novel sequence-to-sequence models [35,36].…”
Section: Introduction
confidence: 99%
“…This memory store is a vector of hidden units h called the hidden state. Connections between RNNs and automata have been studied extensively (Chen et al., 2018; Peng et al., 2018; Merrill, 2019; Merrill et al., 2020; Merrill, 2021).…”
Section: Stack RNNs
confidence: 99%
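The hidden state h described above is the RNN's only memory: at each time step it is recomputed from the previous state and the current input symbol. A minimal Elman-style sketch of this update, with all weights, dimensions, and the one-hot input sequence chosen purely for illustration (none of it comes from the cited papers):

```python
import numpy as np

def rnn_step(h, x, W_hh, W_xh, b):
    """One Elman RNN step: the hidden state h is the model's memory store,
    updated from the previous state and the current input vector."""
    return np.tanh(W_hh @ h + W_xh @ x + b)

# Illustrative dimensions: hidden size m = 4, input alphabet size 3.
rng = np.random.default_rng(0)
m, d = 4, 3
W_hh = 0.1 * rng.normal(size=(m, m))
W_xh = 0.1 * rng.normal(size=(m, d))
b = np.zeros(m)

h = np.zeros(m)          # initial hidden state
for x in np.eye(d):      # feed a short one-hot input sequence
    h = rnn_step(h, x, W_hh, W_xh, b)
```

Because the state is squashed through tanh, every coordinate of h stays in (-1, 1); the automata connections cited above study which languages such a bounded recurrent memory can recognize.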