Open-domain Question Answering models that directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compared with conventional models which retrieve and read from text corpora. QA-pair retrievers also offer interpretable answers, a high degree of control, and are trivial to update at test time with new knowledge. However, these models fall short of the accuracy of retrieve-and-read systems, as substantially less knowledge is covered by the available QA-pairs relative to text corpora like Wikipedia. To facilitate improved QA-pair models, we introduce Probably Asked Questions (PAQ), a very large resource of 65M automatically generated QA-pairs. We introduce a new QA-pair retriever, RePAQ, to complement PAQ. We find that PAQ preempts and caches test questions, enabling RePAQ to match the accuracy of recent retrieve-and-read models, whilst being significantly faster. Using PAQ, we train CBQA models which outperform comparable baselines by 5%, but trail RePAQ by over 15%, indicating the effectiveness of explicit retrieval. RePAQ can be configured for size (under 500MB) or speed (over 1K questions per second) while retaining high accuracy. Lastly, we demonstrate RePAQ’s strength at selective QA, abstaining from answering when it is likely to be incorrect. This enables RePAQ to “back-off” to a more expensive state-of-the-art model, leading to a combined system which is both more accurate and 2x faster than the state-of-the-art model alone.
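The retrieve-then-back-off behaviour described above can be illustrated with a minimal sketch. It is not RePAQ itself: the bag-of-words cosine similarity stands in for a learned retriever, and the names `ToyQAPairRetriever`, `answer_with_backoff`, the `fallback` callable, and the `threshold` value of 0.8 are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter
from typing import Callable, Iterable, List, Optional, Tuple


def _bow(text: str) -> Counter:
    """Lower-cased bag-of-words vector (an assumption; not a learned encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyQAPairRetriever:
    """Answers a question by returning the answer of the most similar stored question."""

    def __init__(self, qa_pairs: Iterable[Tuple[str, str]]) -> None:
        self.index: List[Tuple[Counter, str]] = [(_bow(q), a) for q, a in qa_pairs]

    def add(self, question: str, answer: str) -> None:
        """New knowledge is added by appending a QA pair; no retraining is involved."""
        self.index.append((_bow(question), answer))

    def retrieve(self, question: str) -> Tuple[Optional[str], float]:
        """Return (best answer, similarity score); (None, 0.0) if nothing overlaps."""
        query = _bow(question)
        best_answer, best_score = None, 0.0
        for vec, answer in self.index:
            score = _cosine(query, vec)
            if score > best_score:
                best_answer, best_score = answer, score
        return best_answer, best_score


def answer_with_backoff(
    question: str,
    retriever: ToyQAPairRetriever,
    fallback: Callable[[str], str],
    threshold: float = 0.8,  # hypothetical confidence cut-off
) -> Tuple[str, str]:
    """Use the retriever when it is confident, otherwise defer to the slower model."""
    answer, score = retriever.retrieve(question)
    if answer is not None and score >= threshold:
        return answer, "retriever"
    return fallback(question), "fallback"
```

In this sketch the retrieval score doubles as the confidence estimate for selective QA: questions that closely match a stored QA pair are answered cheaply, and only the remainder are passed to the more expensive `fallback` model.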
Background: Influenza viruses circulating in animals sporadically transmit to humans and pose pandemic threats. Animal models are needed to evaluate the potential public health risk of these viruses. Methodology/Principal Findings: We investigated the guinea pig as a mammalian model for studying the replication and transmission characteristics of selected swine H1N1, H1N2, H3N2 and avian H9N2 influenza viruses, compared to those of pandemic (H1N1) 2009 and seasonal human H1N1 and H3N2 influenza viruses. The swine and avian influenza viruses investigated were restricted to the respiratory system of guinea pigs and were shed at high titers in the nasal tract without prior adaptation, similar to human strains. None of the swine and avian influenza viruses showed transmissibility among guinea pigs; in contrast, pandemic (H1N1) 2009 virus transmitted from infected guinea pigs to all animals, and seasonal human influenza viruses could also transmit horizontally in guinea pigs. Analysis of the receptor distribution in guinea pig respiratory tissues by lectin histochemistry indicated that both SAα2,3-Gal and SAα2,6-Gal receptors were widely present in the nasal tract and the trachea, while the SAα2,3-Gal receptor was the main receptor in the lung. Conclusions/Significance: We propose that the guinea pig could serve as a useful mammalian model to evaluate the potential public health threat of swine and avian influenza viruses.
A core problem of information retrieval (IR) is relevance matching, which is to rank documents by relevance to a user's query. On the other hand, many NLP problems, such as question answering and paraphrase identification, can be considered variants of semantic matching, which is to measure the semantic distance between two short pieces of text. While at a high level both relevance and semantic matching require modeling textual similarity, many existing techniques for one cannot be easily adapted to the other. To bridge this gap, we propose a novel model, HCAN (Hybrid Co-Attention Network), that comprises (1) a hybrid encoder module that includes ConvNet-based and LSTM-based encoders, (2) a relevance matching module that measures soft term matches with importance weighting at multiple granularities, and (3) a semantic matching module with co-attention mechanisms that capture context-aware semantic relatedness. Evaluations on multiple IR and NLP benchmarks demonstrate state-of-the-art effectiveness compared to approaches that do not exploit pretraining on external data. Extensive ablation studies suggest that relevance and semantic matching signals are complementary across many problem settings, regardless of the choice of underlying encoders.
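As a rough illustration of what "soft term matches with importance weighting" can mean, the sketch below scores a document by taking, for each query term, its best cosine similarity to any document term and weighting that value by a per-term importance. This is a simplified, single-granularity toy and not HCAN's architecture: the embeddings are hand-written vectors, and the names `soft_match_score`, `embeddings`, and `weights` are assumptions made for the example.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def soft_match_score(
    query_terms: Sequence[str],
    doc_terms: Sequence[str],
    embeddings: Dict[str, Vector],  # hypothetical term -> vector lookup
    weights: Dict[str, float],      # hypothetical importance weights, e.g. IDF
) -> float:
    """Sum, over query terms, of importance weight times the best soft match."""
    total = 0.0
    for q in query_terms:
        # Best similarity of this query term to any document term.
        best = max(
            (cosine(embeddings[q], embeddings[d]) for d in doc_terms),
            default=0.0,
        )
        # Terms without an explicit weight count with weight 1.0.
        total += weights.get(q, 1.0) * best
    return total
```

Because matching is done in embedding space, a document containing a near-synonym of an important query term can still score highly even with no exact term overlap, which is the property that separates soft matching from exact lexical matching.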