Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture the abilities the dataset is intended to test. We propose a more rigorous annotation paradigm for NLP that helps to close systematic gaps in the test data. In particular, after a dataset is constructed, we recommend that the dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets. Contrast sets provide a local view of a model's decision boundary, which can be used to more accurately evaluate a model's true linguistic capabilities. We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets (e.g., DROP reading comprehension, UD parsing, and IMDb sentiment analysis). Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets, by up to 25% in some cases. We release our contrast sets as new evaluation benchmarks and encourage future dataset construction efforts to follow similar annotation processes.
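The gap that contrast sets are meant to expose can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `keyword_rule` model, the four example sentences, and their labels are hypothetical and are not taken from the released contrast sets. The sketch only shows how a simple decision rule can score well on original test instances yet fail on small, label-changing perturbations of them.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, gold label)


def keyword_rule(text: str) -> str:
    """Hypothetical shortcut model: predicts 'pos' iff the word 'great' appears."""
    return "pos" if "great" in text.lower() else "neg"


def accuracy(model: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(model(text) == label for text, label in examples)
    return correct / len(examples)


# Hypothetical original test instances.
original: List[Example] = [
    ("A great film with a moving story.", "pos"),
    ("Dull plot and wooden acting.", "neg"),
]

# Hypothetical contrast set: each instance is a small manual perturbation
# of the corresponding original instance that flips the gold label.
contrast: List[Example] = [
    ("Hardly a great film, despite a moving story.", "neg"),
    ("Sharp plot and lively acting.", "pos"),
]

orig_acc = accuracy(keyword_rule, original)
contrast_acc = accuracy(keyword_rule, contrast)
print(f"original accuracy: {orig_acc:.2f}, contrast accuracy: {contrast_acc:.2f}")
```

In this toy setting the shortcut rule is fully correct on the original instances and fully wrong on the perturbed ones, which is the kind of local decision-boundary failure the abstract describes.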
This pilot study supports the feasibility of a large trial comparing lower versus higher MAP targets for shock. Further research may help delineate the reasons for vasopressor dosing in excess of prescribed targets and how individual patient characteristics modify the response to vasopressor therapy.
Neural NLP models are increasingly accurate but are imperfect and opaque: they break in counterintuitive ways and leave end users puzzled at their behavior. Model interpretation methods ameliorate this opacity by providing explanations for specific model predictions. Unfortunately, existing interpretation codebases make it difficult to apply these methods to new models and tasks, which hinders adoption for practitioners and burdens interpretability researchers. We introduce AllenNLP Interpret, a flexible framework for interpreting NLP models. The toolkit provides interpretation primitives (e.g., input gradients) for any AllenNLP model and task, a suite of built-in interpretation methods, and a library of front-end visualization components. We demonstrate the toolkit's flexibility and utility by implementing live demos for five interpretation methods (e.g., saliency maps and adversarial attacks) on a variety of models and tasks (e.g., masked language modeling using BERT and reading comprehension using BiDAF). These demos, alongside our code and tutorials, are available at https://allennlp.org/interpret.
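As a rough illustration of what an input-gradient saliency map computes, here is a minimal NumPy sketch. It does not use or reproduce the AllenNLP Interpret API: the toy vocabulary, the random embedding table `E`, the weight vector `w`, and the function names are all hypothetical. The model is assumed to be a mean-of-embeddings linear classifier, so the gradient of the logit with respect to each token embedding can be written in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: mean-pooled token embeddings fed to a linear scorer.
vocab = {"the": 0, "movie": 1, "was": 2, "terrible": 3}
E = rng.normal(size=(len(vocab), 4))  # toy embedding table (assumed)
w = rng.normal(size=4)                # toy classifier weights (assumed)


def logit(embs: np.ndarray) -> float:
    """Score of the toy model for a (num_tokens, dim) embedding matrix."""
    return float(embs.mean(axis=0) @ w)


def input_gradient_saliency(tokens: list) -> np.ndarray:
    """Gradient-times-input saliency, normalized to sum to 1 over tokens."""
    embs = E[[vocab[t] for t in tokens]]
    n = len(tokens)
    # For mean pooling, d(logit)/d(embs[i]) = w / n for every token i.
    grad = np.tile(w / n, (n, 1))
    # Dot each token's gradient with its embedding and take the magnitude.
    scores = np.abs((grad * embs).sum(axis=1))
    return scores / scores.sum()


tokens = ["the", "movie", "was", "terrible"]
saliency = input_gradient_saliency(tokens)
for token, score in zip(tokens, saliency):
    print(f"{token:>10s}  {score:.3f}")
```

For a real neural model the gradient would come from automatic differentiation rather than a closed form; the normalization step is one common convention and is chosen here only for readability.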
Background and Aims: Alagille syndrome (ALGS) is a multisystem developmental disorder characterized by bile duct (BD) paucity, caused primarily by haploinsufficiency of the Notch ligand jagged1. The course of the liver disease is highly variable in ALGS. However, the genetic basis for ALGS phenotypic variability is unknown. Previous studies have reported decreased expression of the transcription factor SOX9 (sex determining region Y‐box 9) in late embryonic and neonatal livers of Jag1‐deficient mice. Here, we investigated the effects of altering the Sox9 gene dosage on the severity of liver disease in an ALGS mouse model. Approach and Results: Conditional removal of one copy of Sox9 in Jag1+/− livers impairs the biliary commitment of cholangiocytes and enhances the inflammatory reaction and liver fibrosis. Loss of both copies of Sox9 in Jag1+/− livers further worsens the phenotypes and results in partial lethality. Ink injection experiments reveal impaired biliary tree formation in the periphery of P30 Jag1+/− livers, which is improved by 5 months of age. Sox9 heterozygosity worsens the P30 biliary tree phenotype and impairs the partial recovery in 5‐month‐old animals. Notably, Sox9 overexpression improves BD paucity and liver phenotypes in Jag1+/− mice without ectopic hepatocyte‐to‐cholangiocyte transdifferentiation or long‐term liver abnormalities. Notch2 expression in the liver is increased following Sox9 overexpression, and SOX9 binds the Notch2 regulatory region in the liver. Histological analysis shows a correlation between the level and pattern of SOX9 expression in the liver and outcome of the liver disease in patients with ALGS. Conclusions: Our results establish Sox9 as a dosage‐sensitive modifier of Jag1+/− liver phenotypes with a permissive role in biliary development. Our data further suggest that liver‐specific increase in SOX9 levels is a potential therapeutic approach for BD paucity in ALGS.