Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
DOI: 10.18653/v1/n18-1205

Recurrent Neural Networks as Weighted Language Recognizers

Abstract: We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages. We focus on the single-layer, ReLU-activation, rational-weight RNNs with softmax, which are commonly used in natural language processing applications. We show that most problems for such RNNs are undecidable, including consistency, equivalence, minimization, and the determination of the highest-weighted string. However, for consistent RNNs the last prob…
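The model class named in the abstract (single-layer, ReLU-activation, rational-weight RNN with a softmax output) can be made concrete with a small sketch. The code below is a hypothetical illustration, not the paper's construction; all parameter names, shapes, and the random weights are assumptions, and the string weight is taken to be the product of per-symbol softmax probabilities followed by an end-of-string probability.

```python
import numpy as np

# Minimal sketch (assumed names and shapes) of the model class in the abstract:
# a single-layer Elman-style RNN with ReLU activation and a softmax output
# over the vocabulary.  The weight of a string is taken to be the product of
# the per-symbol softmax probabilities, ending with an end-of-string symbol.

rng = np.random.default_rng(0)
V, H = 5, 8                       # vocabulary size (EOS has id 0), hidden size
W_xh = rng.normal(size=(H, V))    # input-to-hidden weights
W_hh = rng.normal(size=(H, H))    # hidden-to-hidden weights
b_h = np.zeros(H)
W_hy = rng.normal(size=(V, H))    # hidden-to-output weights
b_y = np.zeros(V)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def string_weight(symbols, eos=0):
    """Weight assigned to a string: product of next-symbol probabilities,
    including the probability of EOS after the last symbol."""
    h = np.zeros(H)
    weight = 1.0
    for s in list(symbols) + [eos]:
        p = softmax(W_hy @ h + b_y)                      # next-symbol distribution
        weight *= p[s]
        x = np.eye(V)[s]                                 # one-hot input
        h = np.maximum(0.0, W_xh @ x + W_hh @ h + b_h)   # ReLU state update
    return weight

print(string_weight([2, 3, 1]))   # weight of the string "2 3 1"
```

Under this reading, questions such as which string receives the highest weight range over all strings of all lengths, which is where the paper's undecidability results apply.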

Cited by 42 publications (60 citation statements). References 17 publications (19 reference statements).

Citation statements (ordered by relevance):
“…A famous result by Siegelmann and Sontag (1992; 1994), and its extension in (Siegelmann, 1999), demonstrates that an Elman-RNN (Elman, 1990) with a sigmoid activation function, rational weights and infinite precision states can simulate a Turing-machine in real-time, making RNNs Turing-complete. Recently, Chen et al. (2017) extended the result to the ReLU activation function. However, these constructions (a) assume reading the entire input into the RNN state and only then performing the computation, using unbounded time; and (b) rely on having infinite precision in the network states.…”
Section: Introduction (mentioning)
confidence: 93%
“…However, these constructions (a) assume reading the entire input into the RNN state and only then performing the computation, using unbounded time; and (b) rely on having infinite precision in the network states. As argued by Chen et al. (2017), this is not the model of RNN computation used in NLP applications. Instead, RNNs are often used by feeding an input sequence into the RNN one item at a time, each immediately returning a state vector that corresponds to a prefix of the sequence and which can be passed as input for a subsequent feed-forward prediction network operating in constant time.…”
Section: Introduction (mentioning)
confidence: 99%
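The usage pattern described in the excerpt above (feed one item at a time, read off a state vector for each prefix, run a constant-time feed-forward predictor on it) can be sketched as follows; the names, shapes, and random parameters are illustrative assumptions rather than any specific cited model.

```python
import numpy as np

# Hypothetical sketch of the incremental usage pattern described in the excerpt:
# the RNN consumes one input item per step and immediately yields a state
# vector for the prefix read so far; a constant-time feed-forward head then
# makes a prediction from each such state.

rng = np.random.default_rng(1)
D, H, C = 4, 8, 3                              # input dim, hidden dim, classes
W_xh = rng.normal(size=(H, D))
W_hh = rng.normal(size=(H, H))
W_hc = rng.normal(size=(C, H))                 # feed-forward prediction head

def step(h, x):
    """One online step: new state for the current prefix (ReLU Elman update)."""
    return np.maximum(0.0, W_xh @ x + W_hh @ h)

def predict(h):
    """Constant-time prediction from a prefix state."""
    z = W_hc @ h
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(H)
for x in rng.normal(size=(6, D)):              # a toy sequence of 6 input items
    h = step(h, x)                             # state available immediately
    print(predict(h).argmax())                 # per-prefix prediction
```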
“…Ott et al. (2018) argued that uncertainty caused by noisy training data may play a role. Chen et al. (2018) showed that the consistent best string problem for RNNs is decidable. We provide an alternative DFS algorithm that relies on the monotonic nature of model scores rather than consistency, and that often converges in practice.…”
Section: Related Work (mentioning)
confidence: 99%
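The search alluded to in the excerpt above relies on a simple monotonicity fact: in a locally normalized model, extending a prefix multiplies its score by a probability at most 1, so no extension can score higher than the prefix itself. The sketch below shows a depth-first search that prunes on this bound; it is a hedged illustration under that assumption, not the cited authors' implementation, and `next_probs` is a hypothetical callback returning the model's next-symbol distribution for a prefix.

```python
# Hedged sketch of an exact depth-first search for the highest-scoring string
# under a locally normalized model.  `next_probs(prefix)` is a hypothetical
# callback returning a dict {symbol: probability} for the next position.

def exact_search(next_probs, eos, max_len=20):
    best = {"score": 0.0, "string": None}

    def dfs(prefix, score):
        # Monotonicity: every extension multiplies `score` by a probability
        # <= 1, so a prefix that scores no better than the best complete
        # hypothesis found so far can never catch up and is pruned.
        if score <= best["score"]:
            return
        for sym, p in next_probs(prefix).items():
            s = score * p
            if sym == eos:
                if s > best["score"]:
                    best["score"], best["string"] = s, list(prefix)
            elif len(prefix) < max_len:
                dfs(prefix + (sym,), s)

    dfs((), 1.0)
    return best["string"], best["score"]

# Toy model: symbols {1, 2} plus EOS = 0 with a fixed next-symbol distribution.
toy = lambda prefix: {0: 0.5, 1: 0.3, 2: 0.2}
print(exact_search(toy, eos=0, max_len=5))     # -> ([], 0.5): empty string wins
```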
“…Other works have studied the expressive power of RNNs, in particular in the context of WFSAs or HMMs (Cleeremans et al., 1989; Giles et al., 1992; Visser et al., 2001; Chen et al., 2018). In this work we relate CNNs to WFSAs, showing that a one-layer CNN with max-pooling can be simulated by a collection of linear-chain WFSAs.…”
Section: Related Work (mentioning)
confidence: 95%