Summary Heat causes protein misfolding and aggregation, and in eukaryotic cells triggers aggregation of proteins and RNA into stress granules. We have carried out extensive proteomic studies to quantify heat-triggered aggregation and subsequent disaggregation in budding yeast, identifying more than 170 endogenous proteins aggregating within minutes of heat shock in multiple subcellular compartments. We demonstrate that these aggregated proteins are not misfolded and destined for degradation. Stable-isotope labeling reveals that even severely aggregated endogenous proteins are disaggregated without degradation during recovery from shock, contrasting with the rapid degradation observed for exogenous thermolabile proteins. Although aggregation likely inactivates many cellular proteins, in the case of a heterotrimeric aminoacyl-tRNA synthetase complex, the aggregated proteins remain active with unaltered fidelity. We propose that most heat-induced aggregation of mature proteins reflects the operation of an adaptive, autoregulatory process of functionally significant aggregate assembly and disassembly that aids cellular adaptation to thermal stress.
The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining. Reformulating tasks as fillin-the-blanks problems (e.g., cloze tests) is a natural approach for gauging such knowledge, however, its usage is limited by the manual effort and guesswork required to write suitable prompts. To address this, we develop AUTOPROMPT, an automated method to create prompts for a diverse set of tasks, based on a gradient-guided search. Using AUTO-PROMPT, we show that masked language models (MLMs) have an inherent capability to perform sentiment analysis and natural language inference without additional parameters or finetuning, sometimes achieving performance on par with recent state-of-the-art supervised models. We also show that our prompts elicit more accurate factual knowledge from MLMs than the manually created prompts on the LAMA benchmark, and that MLMs can be used as relation extractors more effectively than supervised relation extraction models. These results demonstrate that automatically generated prompts are a viable parameter-free alternative to existing probing methods, and as pretrained LMs become more sophisticated and capable, potentially a replacement for finetuning.
Adversarial examples highlight model vulnerabilities and are useful for evaluation and interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens that trigger a model to produce a specific prediction when concatenated to any input from a dataset. We propose a gradientguided search over tokens which finds short trigger sequences (e.g., one word for classification and four words for language modeling) that successfully trigger the target prediction. For example, triggers cause SNLI entailment accuracy to drop from 89.94% to 0.55%, 72% of "why" questions in SQuAD to be answered "to kill american people", and the GPT-2 language model to spew racist output even when conditioned on non-racial contexts. Furthermore, although the triggers are optimized using white-box access to a specific model, they transfer to other models for all tasks we consider. Finally, since triggers are input-agnostic, they provide an analysis of global model behavior. For instance, they confirm that SNLI models exploit dataset biases and help to diagnose heuristics learned by reading comprehension models.
Neuronal avalanches are a form of spontaneous activity widely observed in cortical slices and other types of nervous tissue, both in vivo and in vitro. They are characterized by irregular, isolated population bursts when many neurons fire together, where the number of spikes per burst obeys a power law distribution. We simulate, using the Gillespie algorithm, a model of neuronal avalanches based on stochastic single neurons. The network consists of excitatory and inhibitory neurons, first with all-to-all connectivity and later with random sparse connectivity. Analyzing our model using the system size expansion, we show that the model obeys the standard Wilson-Cowan equations for large network sizes ( neurons). When excitation and inhibition are closely balanced, networks of thousands of neurons exhibit irregular synchronous activity, including the characteristic power law distribution of avalanche size. We show that these avalanches are due to the balanced network having weakly stable functionally feedforward dynamics, which amplifies some small fluctuations into the large population bursts. Balanced networks are thought to underlie a variety of observed network behaviours and have useful computational properties, such as responding quickly to changes in input. Thus, the appearance of avalanches in such functionally feedforward networks indicates that avalanches may be a simple consequence of a widely present network structure, when neuron dynamics are noisy. An important implication is that a network need not be “critical” for the production of avalanches, so experimentally observed power laws in burst size may be a signature of noisy functionally feedforward structure rather than of, for example, self-organized criticality.
Although pretrained Transformers such as BERT achieve high accuracy on indistribution examples, do they generalize to new distributions? We systematically measure out-of-distribution (OOD) generalization for seven NLP datasets by constructing a new robustness benchmark with realistic distribution shifts. We measure the generalization of previous models including bag-of-words models, ConvNets, and LSTMs, and we show that pretrained Transformers' performance declines are substantially smaller. Pretrained transformers are also more effective at detecting anomalous or OOD examples, while many previous models are frequently worse than chance. We examine which factors affect robustness, finding that larger models are not necessarily more robust, distillation can be harmful, and more diverse pretraining data can enhance robustness. Finally, we show where future work can improve OOD robustness.
Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture the abilities a dataset is intended to test. We propose a more rigorous annotation paradigm for NLP that helps to close systematic gaps in the test data. In particular, after a dataset is constructed, we recommend that the dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets. Contrast sets provide a local view of a model's decision boundary, which can be used to more accurately evaluate a model's true linguistic capabilities. We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets (e.g., DROP reading comprehension, UD parsing, and IMDb sentiment analysis). Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets-up to 25% in some cases. We release our contrast sets as new evaluation benchmarks and encourage future dataset construction efforts to follow similar annotation processes.
Networks of neurons produce diverse patterns of oscillations, arising from the network's global properties, the propensity of individual neurons to oscillate, or a mixture of the two. Here we describe noisy limit cycles and quasi-cycles, two related mechanisms underlying emergent oscillations in neuronal networks whose individual components, stochastic spiking neurons, do not themselves oscillate. Both mechanisms are shown to produce gamma band oscillations at the population level while individual neurons fire at a rate much lower than the population frequency. Spike trains in a network undergoing noisy limit cycles display a preferred period which is not found in the case of quasi-cycles, due to the even faster decay of phase information in quasi-cycles. These oscillations persist in sparsely connected networks, and variation of the network's connectivity results in variation of the oscillation frequency. A network of such neurons behaves as a stochastic perturbation of the deterministic Wilson-Cowan equations, and the network undergoes noisy limit cycles or quasi-cycles depending on whether these have limit cycles or a weakly stable focus. These mechanisms provide a new perspective on the emergence of rhythmic firing in neural networks, showing the coexistence of population-level oscillations with very irregular individual spike trains in a simple and general framework.
The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks. Currently, most NLP models treat numbers in text in the same way as other tokens-they embed them as distributed vectors. Is this enough to capture numeracy? We begin by investigating the numerical reasoning capabilities of a state-of-the-art question answering model on the DROP dataset. We find this model excels on questions that require numerical reasoning, i.e., it already captures numeracy. To understand how this capability emerges, we probe token embedding methods (e.g., BERT, GloVe) on synthetic list maximum, number decoding, and addition tasks. A surprising degree of numeracy is naturally present in standard embeddings. For example, GloVe and word2vec accurately encode magnitude for numbers up to 1,000. Furthermore, character-level embeddings are even more precise-ELMo captures numeracy the best for all pre-trained methods-but BERT, which uses sub-word units, is less exact.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.