We introduce PubMedQA, a novel biomedical question answering (QA) dataset collected from PubMed abstracts. The task of PubMedQA is to answer research questions with yes/no/maybe (e.g., Do preoperative statins reduce atrial fibrillation after coronary artery bypass grafting?) using the corresponding abstracts. PubMedQA has 1k expert-annotated, 61.2k unlabeled and 211.3k artificially generated QA instances. Each PubMedQA instance is composed of (1) a question which is either an existing research article title or derived from one, (2) a context which is the corresponding abstract without its conclusion, (3) a long answer, which is the conclusion of the abstract and, presumably, answers the research question, and (4) a yes/no/maybe answer which summarizes the conclusion. PubMedQA is the first QA dataset where reasoning over biomedical research texts, especially their quantitative contents, is required to answer the questions. Our best performing model, multi-phase fine-tuning of BioBERT with long answer bag-of-word statistics as additional supervision, achieves 68.1% accuracy, compared to single human performance of 78.0% accuracy and majority-baseline of 55.2% accuracy, leaving much room for improvement. PubMedQA is publicly available at https://pubmedqa.github.io.
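The majority baseline mentioned in this abstract can be illustrated with a minimal sketch: always predict the most frequent label and measure how often that is correct. The function name and the toy labels below are hypothetical and invented for illustration; they are not PubMedQA data and not the authors' evaluation code.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Return the most frequent training label and the accuracy obtained
    by predicting that label for every test instance."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)


# Toy yes/no/maybe labels, invented for illustration only.
toy_train = ["yes", "yes", "no", "maybe", "yes", "no"]
toy_test = ["yes", "no", "yes", "maybe"]

label, accuracy = majority_baseline_accuracy(toy_train, toy_test)
print(label, accuracy)
```

Under this reading, a majority baseline of 55.2% means that the most frequent answer class covers 55.2% of the evaluated instances.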
Many problems in NLP require aggregating information from multiple mentions of the same entity which may be far apart in the text. Existing Recurrent Neural Network (RNN) layers are biased towards short-term dependencies and hence not suited to such tasks. We present a recurrent layer which is instead biased towards coreferent dependencies. The layer uses coreference annotations extracted from an external system to connect entity mentions belonging to the same cluster. Incorporating this layer into a state-of-the-art reading comprehension model improves performance on three datasets (Wikihop, LAMBADA and the bAbi AI tasks), with large gains when training data is scarce.
Contextualized word embeddings derived from pre-trained language models (LMs) show significant improvements on downstream NLP tasks. Pre-training on domain-specific corpora, such as biomedical articles, further improves their performance. In this paper, we conduct probing experiments to determine what additional information is carried intrinsically by the in-domain trained contextualized embeddings. For this we use the pre-trained LMs as fixed feature extractors and restrict the downstream task models to not have additional sequence modeling layers. We compare BERT (Devlin et al., 2018), ELMo (Peters et al., 2018a), BioBERT (Lee et al., 2019) and BioELMo, a biomedical version of ELMo trained on 10M PubMed abstracts. Surprisingly, while fine-tuned BioBERT is better than BioELMo in biomedical NER and NLI tasks, as a fixed feature extractor BioELMo outperforms BioBERT in our probing tasks. We use visualization and nearest neighbor analysis to show that better encoding of entity-type and relational information leads to this superiority.
The addition of chitosan to silicate (Laponite) cross-linked poly(ethylene oxide) (PEO) is used for tuning nanocomposite material properties and tailoring cellular adhesion and bioactivity. By combining the characteristics of chitosan (antimicrobial; promotes cell adhesion and growth) with the properties of PEO (prevents protein and cell adhesion) and those of Laponite (bioactive), the resulting material properties can be used to tune cellular adhesion and control biomineralization. Here, we present the hydration, dissolution, degradation, and mechanical properties of multiphase bio-nanocomposites and relate these to the cell growth of MC3T3-E1 mouse preosteoblast cells. We find that the structural integrity of these bio-nanocomposites is improved by the addition of chitosan, but the release of entrapped proteins is suppressed. Overall, this study shows how chitosan can be used to tune properties in Laponite cross-linked PEO for creating bioactive scaffolds to be considered for bone repair.
The compositions and the multiphase structures of bio-nanocomposite hydrogels made from silicate cross-linked PEO and chitosan are related to some of their physical and biological properties. The gels are injectable and self-healing because the cross-linking is physical and reversible under deformation. The presence of chitosan aggregates affects the viscoelastic properties and reinforces the hydrogel network. The chitosan adds advantageous properties to the hydrogel such as enhanced cell spreading and adhesion. In vitro biocompatibility data indicate that NIH 3T3 fibroblasts grow and proliferate on the bio-nanocomposite hydrogel as well as on hydrogel films.
In several parts of China, there have been a large number of hydropericardium syndrome (HPS) outbreaks caused by serotype 4 fowl adenovirus (FAdV-4) in broiler chickens since 2015. These outbreak-associated FAdV-4 strains were distinct from previously circulating strains, which did not lead to severe HPS outbreaks. To better understand the molecular epidemiology of the currently circulating FAdV strains for effective diagnosis and treatment of HPS, we isolated 12 HPS outbreak-associated FAdV-4 strains from different regions in central China and investigated their molecular characteristics by performing phylogenetic analyses based on the hexon genes. Our results indicated that the strains in this study all belonged to serotype FAdV-4, species FAdV-C. In comparison with the ON1, KR5, MX-SHP95, PK-01 and PJ-06 strains within the cluster where the outbreak-associated FAdV-4 strains were located, the nucleotide sequence divergences were 1.31, 1.10, 1.42, 2.77 and 2.84%, respectively. Phylogenetic analyses revealed that the hexon genes of the 12 outbreak-associated strains clustered on a relatively independent branch of the tree and evolved from the same ancestor. We suggest that these outbreak-associated FAdV-4 strains originated from earlier strains in India.
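The abstract does not say how nucleotide sequence divergence was computed. As a minimal sketch, assuming an uncorrected pairwise distance (p-distance) over two already aligned sequences, a percentage divergence can be obtained as follows. The function name and the toy sequences are hypothetical and are not hexon gene data from the study.

```python
def p_distance_percent(seq_a, seq_b):
    """Uncorrected pairwise divergence (%) between two aligned sequences.

    Assumption: the sequences are pre-aligned and of equal length;
    alignment columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if base_a != base_b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


# Toy aligned sequences, invented for illustration only.
toy_reference = "ATGGCTACGTTA"
toy_isolate = "ATGGCTACGTCA"
print(round(p_distance_percent(toy_reference, toy_isolate), 2))
```

Phylogenetic software often applies a substitution model rather than the raw p-distance, so values from this sketch would not necessarily match the figures reported in the abstract.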
Background: Porcine circovirus type 2 (PCV2) is the pathogen of porcine circovirus associated diseases (PCVAD) and one of the main pathogens in the global pig industry, where it has caused huge economic losses. In recent years, there has been limited research on the prevalence of PCV2 in Henan Province. This study investigated the genotype and evolution of PCV2 in this area. Results: We collected 117 clinical samples from different regions of Henan Province from 2015 to 2018 and found that the PCV2 infection rate was 62.4%. Thirty-seven positive clinical samples were selected to amplify the complete genome of PCV2 and were sequenced. Based on the phylogenetic analysis of PCV2 ORF2 and complete genome, it was found that the 37 newly detected strains belonged to PCV2a (3 of 37), PCV2b (21 of 37) and PCV2d (13 of 37), indicating the predominant prevalence of PCV2b and PCV2d strains. In addition, we compared the amino acid sequences and found several amino acid mutation sites among different genotypes. Furthermore, the results of selective pressure analysis showed that there were 5 positive selection sites. Conclusions: This study indicated the genetic diversity, molecular epidemiology and evolution of PCV2 genotypes in Henan Province during 2015-2018.
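The counts in this abstract can be turned into percentages with a few lines of arithmetic. The sketch below uses only the figures stated above; the number of positive samples is not given in the abstract and is inferred here from the reported rate, so it should be read as an assumption. Variable names are illustrative.

```python
# Figures stated in the abstract.
total_samples = 117
reported_infection_rate = 0.624
genotype_counts = {"PCV2a": 3, "PCV2b": 21, "PCV2d": 13}

# Assumption: the positive count is inferred from the reported rate,
# since the abstract does not state it directly.
inferred_positives = round(total_samples * reported_infection_rate)

# Share of each genotype among the sequenced strains.
sequenced = sum(genotype_counts.values())
genotype_shares = {
    genotype: round(100.0 * count / sequenced, 1)
    for genotype, count in genotype_counts.items()
}

print(inferred_positives)  # 73
print(sequenced)           # 37
print(genotype_shares)     # {'PCV2a': 8.1, 'PCV2b': 56.8, 'PCV2d': 35.1}
```

On these numbers, PCV2b and PCV2d together account for 34 of the 37 sequenced strains, which is consistent with the stated predominance of those two genotypes.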