Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1028

Bridging CNNs, RNNs, and Weighted Finite-State Machines

Abstract: Recurrent and convolutional neural networks comprise two distinct families of models that have proven to be useful for encoding natural language utterances. In this paper we present SoPa, a new model that aims to bridge these two approaches. SoPa combines neural representation learning with weighted finite-state automata (WFSAs) to learn a soft version of traditional surface patterns. We show that SoPa is an extension of a one-layer CNN, and that such CNNs are equivalent to a restricted version of SoPa, and ac…
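The equivalence claimed in the abstract between one-layer CNNs and restricted WFSAs can be made concrete in a few lines. The sketch below is illustrative only (names, dimensions, and the random data are assumptions, not the authors' implementation): a single CNN filter with additive per-position scores and max-pooling yields exactly the same score as a linear-chain WFSA run under the max-sum (Viterbi) semiring.

```python
# Sketch: one CNN filter with max-pooling == a linear-chain WFSA scored in
# the max-sum (Viterbi) semiring. Illustrative names and dimensions only.
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 3                                   # embedding dim, filter width
W = rng.standard_normal((n, d))               # one filter = n per-position scorers
tokens = rng.standard_normal((10, d))         # a 10-token "sentence"

# CNN view: slide the filter, sum per-position scores, max-pool over windows.
cnn = max(sum(W[j] @ tokens[t + j] for j in range(n))
          for t in range(len(tokens) - n + 1))

# WFSA view: n+1 states in a chain; edge j consumes a token with score W[j] @ x.
alpha = np.full(n + 1, -np.inf)               # alpha[i]: best path ending in state i
alpha[0] = 0.0
best = -np.inf
for x in tokens:
    shifted = alpha[:-1] + W @ x              # advance one state per consumed token
    alpha = np.concatenate(([0.0], shifted))  # state 0 restarts at every position
    best = max(best, alpha[-1])               # reaching the final state scores a span

assert np.isclose(cnn, best)                  # identical by construction
```

Each CNN window corresponds to the unique accepting path that consumes exactly that window's tokens, which is why the two maxima coincide.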

Cited by 16 publications (19 citation statements). References 53 publications.
“…To apply their method, one must modify the structure of an RNN before training, whereas our method requires no special RNN structure and can be applied to already-trained RNNs. Schwartz, Thomson, and Smith (2018) introduce a neural network architecture that can represent (restricted forms of) CNNs and RNNs. WFAs could also be expressed by their architecture, but extracting automata is outside their scope.…”
Section: Related Work (mentioning)
confidence: 99%
“…Our definition is equivalent, giving the weight functions the value 0 wherever they were undefined. ε-transitions can be handled with a slight modification (Schwartz et al., 2018). Note, though, that if A contains a cycle of ε-transitions, then either K must follow the star semiring laws (Kuich and Salomaa, 1986), or the number of consecutive ε-transitions allowed must be capped.…”
Section: ε-transitions (mentioning)
confidence: 99%
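The cap on consecutive ε-transitions mentioned in this excerpt is straightforward to realize when transitions are stored as score matrices. Below is a hedged sketch in the max-plus semiring; `mp_matmul`, `eps_closure`, and the toy machine are hypothetical names and data, not an API from either cited paper.

```python
# Sketch: bound the number of consecutive epsilon-transitions, the
# alternative to requiring a star semiring when the WFSA has eps-cycles.
import numpy as np

NEG_INF = -np.inf

def mp_matmul(A, B):
    """Max-plus matrix product: out[i, j] = max_k (A[i, k] + B[k, j])."""
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def eps_closure(E, cap):
    """Best score of taking 0..cap consecutive epsilon-moves.
    E[i, j] is the epsilon score from state i to state j (NEG_INF if absent)."""
    n = E.shape[0]
    closure = np.full((n, n), NEG_INF)
    np.fill_diagonal(closure, 0.0)        # zero eps-moves: stay put at score 0
    power = closure                       # E^0 in the max-plus semiring
    for _ in range(cap):
        power = mp_matmul(power, E)       # exactly one more eps-move
        closure = np.maximum(closure, power)
    return closure

# Toy machine: 3 states, an eps-edge 0 -> 1 and a positive-weight eps-cycle 1 <-> 2.
E = np.full((3, 3), NEG_INF)
E[0, 1], E[1, 2], E[2, 1] = 0.5, 0.2, 0.3
print(eps_closure(E, cap=4))              # finite despite the cycle, thanks to the cap
```

Composing each per-token transition matrix with this closure (in the same semiring) guarantees that no run ever chains more than `cap` ε-moves, so scores stay finite even on positive-weight ε-cycles.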
“…So far our discussion has centered on B, a two-state WFSA capturing unigram patterns (Example 5). In the same spirit as going from unigram to n-gram features, one can use WFSAs with more states to capture longer patterns (Schwartz et al., 2018). In this section we augment B by introducing more states, and explore its relationship to some neural architectures motivated by n-gram features.…”
Section: More Than Two States (mentioning)
confidence: 99%
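To make the "more states" construction concrete, here is a sketch of growing a two-state unigram machine into a (k+1)-state pattern matcher, with self-loops added so matches may skip over gap tokens. The scorers `W_main` and `w_loop` are stand-ins for learned parameters; this is an illustration in the spirit of the cited work, not the exact model from either paper.

```python
# Sketch: a (k+1)-state linear WFSA with self-loops, scored in max-sum.
import numpy as np

rng = np.random.default_rng(1)
d, k = 8, 3                                  # embedding dim, pattern length
W_main = rng.standard_normal((k, d))         # edge i -> i+1 consumes one token
w_loop = rng.standard_normal(d)              # shared self-loop scorer (gap tokens)

def best_match(word_vecs):
    alpha = np.full(k + 1, -np.inf)          # alpha[i]: best partial match in state i
    alpha[0] = 0.0                           # the empty prefix sits in the start state
    best = -np.inf
    for x in word_vecs:
        advance = alpha[:-1] + W_main @ x    # consume x on a main edge i -> i+1
        stay = alpha + w_loop @ x            # consume x on a self-loop (a gap)
        alpha = np.maximum(stay, np.concatenate(([-np.inf], advance)))
        alpha[0] = max(alpha[0], 0.0)        # a match may start at any position
        best = max(best, alpha[-1])          # state k reached: full pattern matched
    return best

print(best_match(rng.standard_normal((12, d))))
```

Dropping the self-loop term recovers the contiguous n-gram matcher sketched earlier, i.e., the restricted, CNN-equivalent special case.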