Critical Behavior in Physics and Probabilistic Formal Languages

Lin, Henry W.; Tegmark, Max

doi:10.3390/e19070299

Cited by 74 publications

(121 citation statements)

References 54 publications

Mentioning: 111

“…The TRW of the language areas therefore appears to be in the 5-7 word range. As alluded to in the Introduction, this relatively local linguistic processing is likely driven by the statistical properties of natural language, where most semantic/syntactic dependencies are local (e.g., Futrell et al, 2015), and PMI falls off quite sharply as a function of inter-word distance (e.g., Lin & Tegmark, 2017). We can further speculate that linguistic chunks of this size are sufficient to express clause-level meanings, where clauses describe events - salient and meaningful semantic units in our experience with the world (e.g., Zacks & Tversky, 2001).…”

Section: The Temporal Receptive Window Of the Language Areas

mentioning

confidence: 99%

“…The degree of local combinability can be formally estimated using tools from information theory (Shannon & Weaver, 1963). Naturalistic linguistic input is characterized by relatively high pointwise mutual information (PMI) among words within a local linguistic context, and it falls off for word pairs spanning longer distances (e.g., Li, 1990; Lin & Tegmark, 2017; Futrell, Qian, Gibson, Fedorenko, & Blank, 2019). Our local-word-swap manipulation maintained approximately the same level of local mutual information as that observed in typical linguistic input.…”

mentioning

confidence: 90%
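The quantity these statements refer to can be illustrated with a minimal sketch, assuming a plain list of tokens as input. The function name `pmi_at_distance` is hypothetical and comes from none of the cited papers. It computes the average pointwise mutual information of token pairs exactly `d` positions apart, which equals the plug-in estimate of mutual information at that distance. On small samples this estimator is biased upward, so treat it as an illustration rather than the method used in any of the studies.

```python
import math
from collections import Counter

def pmi_at_distance(tokens, d):
    """Frequency-weighted average PMI (in bits) of token pairs d positions apart.

    This is the plug-in estimate of the mutual information between a token
    and the token d positions later. Assumes 1 <= d < len(tokens).
    """
    pairs = list(zip(tokens, tokens[d:]))
    n = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)    # marginal of the first token
    right_counts = Counter(b for _, b in pairs)   # marginal of the second token
    total = 0.0
    for (a, b), c in pair_counts.items():
        # PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) )
        pmi = math.log2(c * n / (left_counts[a] * right_counts[b]))
        total += (c / n) * pmi
    return total

# Illustrative input only: a strictly alternating sequence is fully
# predictable, so the estimate at distance 1 is close to 1 bit.
tokens = ["a", "b"] * 50
print(pmi_at_distance(tokens, 1))
```

Evaluating the function over a range of `d` on a real corpus gives the decay curve that the statements describe.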


Composition is the core driver of the language-selective network

Mollica, Siegelman, Diachek, et al. 2018

Preprint


The fronto-temporal language network responds robustly and selectively to sentences. But the features of linguistic input that drive this response and the computations these language areas support remain debated. Two key features of sentences are typically confounded in natural linguistic input: words in sentences a) are semantically and syntactically combinable into phrase- and clause-level meanings, and b) occur in an order licensed by the language's grammar. Inspired by recent psycholinguistic work establishing that language processing is robust to word order violations, we hypothesized that the core linguistic computation is composition, and, thus, can take place even when the word order violates the grammatical constraints of the language. This hypothesis predicts that a linguistic string should elicit a sentence-level response in the language network as long as the words in that string can enter into dependency relationships as in typical sentences. We tested this prediction across two fMRI experiments (total N=47) by introducing a varying number of local word swaps into naturalistic sentences, leading to progressively less syntactically well-formed strings. Critically, local dependency relationships were preserved because combinable words remained close to each other. As predicted, word order degradation did not decrease the magnitude of the BOLD response in the language network, except when combinable words were so far apart that composition among nearby words was highly unlikely. This finding demonstrates that composition is robust to word order violations, and that the language regions respond as strongly as they do to naturalistic linguistic input as long as composition can take place.


“…They show that an irreducible and aperiodic Markov process, with non-degenerate eigenvalues, cannot produce critical behaviour because I decays exponentially. This phenomenon is seen in a number of cases, including hidden and semi-Markov models [1,25]. In the literature, such behaviour is superficially dealt with by increasing the state space to include symbols from the past, which does not address the main issue [25] with Markov models; lack of memory.…”

mentioning

confidence: 93%
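The exponential decay described in this statement can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-state transition matrix chosen only for illustration; the helper names are not from the cited papers. It computes the exact mutual information between symbols `d` steps apart in a stationary Markov chain. For this matrix the second eigenvalue is 0.7, so at large separations the mutual information should shrink by a factor of roughly 0.7² = 0.49 per step.

```python
import math

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def stationary(P, iters=1000):
    """Stationary distribution by power iteration.

    Assumes the chain is irreducible and aperiodic, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def markov_mutual_information(P, d):
    """Exact mutual information (bits) between X_t and X_{t+d}, for d >= 1."""
    pi = stationary(P)
    Pd = P
    for _ in range(d - 1):
        Pd = mat_mul(Pd, P)              # Pd becomes the d-step transition matrix
    mi = 0.0
    for a in range(len(P)):
        for b in range(len(P)):
            joint = pi[a] * Pd[a][b]     # P(X_t = a, X_{t+d} = b)
            if joint > 0:
                # joint / (pi[a] * pi[b]) simplifies to Pd[a][b] / pi[b]
                mi += joint * math.log2(Pd[a][b] / pi[b])
    return mi

# Hypothetical transition matrix (rows sum to 1), used only as an example.
P = [[0.9, 0.1],
     [0.2, 0.8]]
for d in (1, 2, 4, 8, 16):
    print(d, markov_mutual_information(P, d))
```

A power-law decay, by contrast, would show a ratio between successive values that approaches 1 as `d` grows instead of settling at a constant below 1.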

“…This phenomenon is seen in a number of cases, including hidden and semi-Markov models [1,25]. In the literature, such behaviour is superficially dealt with by increasing the state space to include symbols from the past, which does not address the main issue [25] with Markov models; lack of memory. This analysis shows that GeoLife dataset consists of considerably higher number of long-range correlations, compared to the PrivaMov dataset and the NMDC dataset.…”

mentioning

confidence: 93%

On the Inability of Markov Models to Capture Criticality in Human Mobility

Kulkarni, Mahalunkar, Garbinato, et al. 2019

Lecture Notes in Computer Science


We examine the non-Markovian nature of human mobility by exposing the inability of Markov models to capture criticality in human mobility. In particular, the assumed Markovian nature of mobility was used to establish a theoretical upper bound on the predictability of human mobility (expressed as a minimum error probability limit), based on temporally correlated entropy. Since its inception, this bound has been widely used and empirically validated using Markov chains. We show that recurrent-neural architectures can achieve significantly higher predictability, surpassing this widely used upper bound. In order to explain this anomaly, we shed light on several underlying assumptions in previous research works that have resulted in this bias. By evaluating the mobility predictability on real-world datasets, we show that human mobility exhibits scale-invariant long-range correlations, bearing similarity to a power-law decay. This is in contrast to the initial assumption that human mobility follows an exponential decay. This assumption of exponential decay coupled with Lempel-Ziv compression in computing Fano's inequality has led to an inaccurate estimation of the predictability upper bound. We show that this approach inflates the entropy, consequently lowering the upper bound on human mobility predictability. We finally highlight that this approach tends to overlook long-range correlations in human mobility. This explains why recurrent-neural architectures that are designed to handle long-range structural correlations surpass the previously computed upper bound on mobility predictability.


“…Their claim that the LSTM model is capable of capturing long-range dependencies is thus only supported by such qualitative evidence, without giving a deep insight in the characteristics of the generated documents. Lin and Tegmark [16] compared natural language texts with those generated by Markov models and LSTMs, exploiting metrics coming from information theory. Their analysis shows that LSTMs are capable of capturing correlations that Markov models instead fail to represent, yet the range of correlations they consider is still quite limited (up to 1,000 characters).…”

Section: Introduction

mentioning

confidence: 99%

Natural Language Statistical Features of LSTM-Generated Texts

Lippi, Montemurro, Esposti, et al. 2019

IEEE Trans. Neural Netw. Learning Syst.


Long Short-Term Memory (LSTM) networks have recently shown remarkable performance in several tasks dealing with natural language generation, such as image captioning or poetry composition. Yet, only few works have analyzed text generated by LSTMs in order to quantitatively evaluate to which extent such artificial texts resemble those generated by humans. We compared the statistical structure of LSTM-generated language to that of written natural language, and to those produced by Markov models of various orders. In particular, we characterized the statistical structure of language by assessing word-frequency statistics, long-range correlations, and entropy measures. Our main finding is that while both LSTM and Markov-generated texts can exhibit features similar to real ones in their word-frequency statistics and entropy measures, LSTM-texts are shown to reproduce long-range correlations at scales comparable to those found in natural language. Moreover, for LSTM networks a temperature-like parameter controlling the generation process shows an optimal value, for which the produced texts are closest to real language, consistent across the different statistical features investigated.
