Recent work has presented intriguing results examining the knowledge contained in language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a __ by profession”. These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as “Obama worked as a __” may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts, as well as ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 39.6%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
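The idea of combining answers from several prompts can be illustrated with a minimal sketch. It assumes one simple combination rule, a weighted average of per-prompt log-probabilities over a shared candidate set; the abstract does not specify the exact ensemble formulation, so this should not be read as the released LPAQA implementation. The function name `ensemble_predict` and the toy probabilities are hypothetical and are not LM outputs.

```python
import math


def ensemble_predict(prompt_probs, weights=None):
    """Pick the answer with the highest weighted-average log-probability.

    prompt_probs: list of dicts mapping candidate answer -> probability (> 0),
                  one dict per prompt.
    weights:      optional list of per-prompt weights; defaults to uniform.
    """
    if not prompt_probs:
        raise ValueError("need at least one prompt distribution")
    if weights is None:
        weights = [1.0 / len(prompt_probs)] * len(prompt_probs)
    if len(weights) != len(prompt_probs):
        raise ValueError("one weight per prompt is required")

    # Only score candidates that every prompt assigns a probability to.
    candidates = set(prompt_probs[0])
    for probs in prompt_probs[1:]:
        candidates &= set(probs)

    scores = {
        cand: sum(w * math.log(probs[cand]) for w, probs in zip(weights, prompt_probs))
        for cand in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy distributions (NOT real LM outputs) for two prompts.
prompt_a = {"lawyer": 0.40, "politician": 0.35, "doctor": 0.25}  # "Obama is a __ by profession"
prompt_b = {"politician": 0.60, "lawyer": 0.30, "doctor": 0.10}  # "Obama worked as a __"

best_answer, answer_scores = ensemble_predict([prompt_a, prompt_b])
print(best_answer)
```

In this toy case the first prompt alone would rank "lawyer" first, while the uniform ensemble prefers "politician" because the second prompt supports it more strongly.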
A meta-analysis of prospective cohort studies was conducted to examine the relation between fruit and vegetable (FV) consumption and the risk of cardiovascular disease (CVD). We searched PubMed and EMBASE up to June 2014 for relevant studies. Pooled relative risks (RRs) were calculated and the dose-response relationship was assessed. Thirty-eight studies, consisting of 47 independent cohorts, were eligible for this meta-analysis. There were 1,498,909 participants (44,013 CVD events) with a median follow-up of 10.5 years. The pooled RR (95% confidence interval) of CVD for the highest versus lowest category was 0.83 (0.79-0.86) for FV consumption, 0.84 (0.79-0.88) for fruit consumption, and 0.87 (0.83-0.91) for vegetable consumption. Dose-response analysis showed that those consuming 800 g of FV per day had the lowest risk of CVD. Our results indicate that FV intake is inversely associated with the risk of CVD. This meta-analysis provides strong support for the current recommendations to consume a high amount of FV to reduce CVD risk.
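As a rough illustration of how a pooled RR and its confidence interval can be computed, the sketch below uses fixed-effect inverse-variance pooling on the log scale. The abstract does not say which pooling model was used, so the choice of a fixed-effect model is an assumption, and the study-level values in `hypothetical_studies` are invented for illustration and are not taken from the meta-analysis.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_rr_and_se(rr, lower, upper):
    """Return log(RR) and its standard error recovered from a 95% CI."""
    log_rr = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return log_rr, se


def pool_fixed_effect(studies):
    """Inverse-variance pooling of (rr, lower, upper) tuples on the log scale.

    Returns (pooled_rr, pooled_lower, pooled_upper).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, lower, upper in studies:
        log_rr, se = log_rr_and_se(rr, lower, upper)
        weight = 1.0 / se ** 2  # more precise studies get more weight
        weighted_sum += weight * log_rr
        total_weight += weight

    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical study-level estimates: (RR, 95% CI lower, 95% CI upper).
hypothetical_studies = [
    (0.80, 0.70, 0.92),
    (0.90, 0.80, 1.01),
    (0.85, 0.75, 0.96),
]

pooled_rr, pooled_lower, pooled_upper = pool_fixed_effect(hypothetical_studies)
print(f"pooled RR = {pooled_rr:.2f} ({pooled_lower:.2f}-{pooled_upper:.2f})")
```

A random-effects model would add a between-study variance term to each weight; it is omitted here to keep the sketch short.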
Salinity is an important abiotic stressor that negatively affects plant growth. In this study, we investigated the physiological and molecular mechanisms underlying moderate and high salt tolerance in diploid (2×) and tetraploid (4×) Robinia pseudoacacia L. Our results showed greater H2O2 accumulation and higher levels of important antioxidative enzymes and non-enzymatic antioxidants in 4× plants compared with 2× plants under salt stress. In addition, 4× leaves maintained a relatively intact structure compared with 2× leaves under the same conditions. NaCl treatment did not significantly affect the photosynthetic rate, stomatal conductance or leaf intercellular CO2 concentrations in 4× leaves. Moreover, proteins from control and salt-treated 2× and 4× leaf chloroplast samples were extracted and separated by two-dimensional gel electrophoresis. A total of 61 spots in 2× (24) and 4× (27) leaves exhibited reproducible and significant changes under salt stress. In addition, 10 proteins overlapped between 2× and 4× plants under salt stress. These identified proteins were grouped into the following 7 functional categories: photosynthetic Calvin-Benson Cycle (26), photosynthetic electron transfer (7), regulation/defense (5), chaperone (3), energy and metabolism (12), redox homeostasis (1) and unknown function (8). This study provides important information that may be useful for improving salt tolerance in plants.
Document categorization, which aims to assign a topic label to each document, plays a fundamental role in a wide variety of applications. Despite their success in conventional supervised document classification, existing studies pay less attention to two practical problems: (1) the presence of metadata: in many domains, text is accompanied by various additional information such as authors and tags. Such metadata serve as compelling topic indicators and should be incorporated into the categorization framework; (2) label scarcity: labeled training samples are expensive to obtain in some cases, where categorization needs to be performed using only a small set of annotated data. In recognition of these two challenges, we propose MetaCat, a minimally supervised framework to categorize text with metadata. Specifically, we develop a generative process describing the relationships between words, documents, labels, and metadata. Guided by the generative model, we embed text and metadata into the same semantic space to encode heterogeneous signals. Then, based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity. We conduct a thorough evaluation on a wide range of datasets. Experimental results demonstrate the effectiveness of MetaCat over many competitive baselines.