Departing from traditional linguistic models, advances in deep learning have resulted in a new class of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest that autoregressive DLMs provide a new and biologically feasible computational framework for studying the neural basis of language.
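To make the first two principles concrete, the sketch below estimates a pre-onset next-word probability, the corresponding post-onset surprise, and a contextual embedding using GPT-2 from the Hugging Face transformers library; the choice of model, the example sentence, and the variable names are illustrative assumptions, not the study's actual pipeline.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Stand-in autoregressive DLM; the study's actual model and context window may differ.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "I do not like green eggs and"  # hypothetical context heard so far
next_word = " ham"                        # the word actually arriving at onset

with torch.no_grad():
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    out = model(ctx_ids, output_hidden_states=True)
    logits = out.logits[0, -1]            # (1) pre-onset prediction over the vocabulary
    probs = torch.softmax(logits, dim=-1)
    # (3) contextual embedding of the final context word, from the last hidden layer
    embedding = out.hidden_states[-1][0, -1]

next_id = tokenizer(next_word).input_ids[0]      # first sub-word token of the incoming word
p_next = probs[next_id].item()                   # predicted probability of the incoming word
surprise = -torch.log(probs[next_id]).item()     # (2) post-onset surprise (negative log-probability)

print(f"p(next word) = {p_next:.4f}, surprise = {surprise:.2f} nats")
```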
Departing from rule-based linguistic models, advances in deep learning have resulted in a new class of autoregressive deep language models (DLMs). These models are trained using a self-supervised next-word prediction task. We provide empirical evidence for the connection between autoregressive DLMs and the human language faculty using a spoken narrative and electrocorticographic recordings. Behaviorally, we demonstrate that humans have a remarkable capacity for word prediction in natural contexts and that, given a sufficient context window, DLMs can attain human-level prediction performance. Leveraging DLM embeddings, we demonstrate that the brain constantly and spontaneously predicts the identity of the next word in natural speech, hundreds of milliseconds before it is perceived. Finally, we demonstrate that contextual embeddings derived from autoregressive DLMs capture neural representations of the unique, context-specific meaning of words in the narrative. Our findings suggest that DLMs provide a novel, biologically feasible computational framework for studying the neural basis of language.
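As a rough illustration of the kind of encoding analysis alluded to above (not the authors' released code), the sketch below maps per-word contextual embeddings onto per-word electrode responses with cross-validated ridge regression; the array shapes and the random placeholder data are assumptions standing in for real DLM embeddings and ECoG recordings.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_words, dim, n_electrodes = 500, 768, 10            # 768 = GPT-2 hidden size (assumed)
embeddings = rng.standard_normal((n_words, dim))      # placeholder contextual embedding per heard word
ecog = rng.standard_normal((n_words, n_electrodes))   # placeholder per-word neural response per electrode

scores = []
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(embeddings):
    ridge = Ridge(alpha=1.0).fit(embeddings[train], ecog[train])
    pred = ridge.predict(embeddings[test])
    # correlation between predicted and observed responses, per electrode, then averaged
    r = [np.corrcoef(pred[:, e], ecog[test][:, e])[0, 1] for e in range(n_electrodes)]
    scores.append(np.mean(r))

print(f"mean cross-validated encoding correlation: {np.mean(scores):.3f}")
```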