To identify genetic susceptibility loci for hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) in the Chinese population, we carried out a genome-wide association study (GWAS) in 2,514 chronic HBV carriers (1,161 HCC cases and 1,353 controls) followed by a 2-stage validation among 6 independent populations of chronic HBV carriers (4,319 cases and 4,966 controls). The joint analyses showed that HCC risk was significantly associated with two independent loci: rs7574865 at STAT4, Pmeta = 2.48 × 10⁻¹⁰, odds ratio (OR) = 1.21; and rs9275319 at HLA-DQ, Pmeta = 2.72 × 10⁻¹⁷, OR = 1.49. The risk allele G at rs7574865 was significantly associated with lower mRNA levels of STAT4 in both the HCC tissues and nontumor tissues of 155 individuals with HBV-related HCC (Ptrend = 0.0008 and 0.0002, respectively). We also found significantly lower mRNA expression of STAT4 in HCC tumor tissues compared with paired adjacent nontumor tissues (P = 2.33 × 10⁻¹⁴).
Spelling error correction is an important yet challenging task because a satisfactory solution essentially requires human-level language understanding. Without loss of generality, we consider Chinese spelling error correction (CSC) in this paper. A state-of-the-art method for the task selects a character from a list of candidates for correction (including non-correction) at each position of the sentence on the basis of BERT, the language representation model. The accuracy of the method can be sub-optimal, however, because BERT does not have sufficient capability to detect whether there is an error at each position, apparently due to the way it is pre-trained with masked language modeling. In this work, we propose a novel neural architecture to address this issue. It consists of a network for error detection and a BERT-based network for error correction, with the former connected to the latter by what we call the soft-masking technique. Our method of using 'Soft-Masked BERT' is general, and it may be employed in other language detection-correction problems. Experimental results on two datasets demonstrate that the performance of our proposed method is significantly better than the baselines, including the one based solely on BERT.
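The abstract does not spell out how the detection network feeds the correction network. One common reading of soft-masking, given here as a minimal sketch and not as the authors' implementation, is that each input embedding is blended with the [MASK] embedding in proportion to the detector's error probability at that position. The function name `soft_mask`, its parameters, and the plain-list representation of embeddings are all hypothetical and chosen only for illustration.

```python
from typing import List


def soft_mask(
    embeddings: List[List[float]],
    mask_embedding: List[float],
    error_probs: List[float],
) -> List[List[float]]:
    """Blend each token embedding with the mask embedding.

    Assumed formulation (not stated in the abstract):
        soft_i = p_i * mask + (1 - p_i) * emb_i
    where p_i is the detector's error probability for position i.
    """
    if len(embeddings) != len(error_probs):
        raise ValueError("need one error probability per token embedding")

    blended = []
    for emb, p in zip(embeddings, error_probs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("error probabilities must lie in [0, 1]")
        if len(emb) != len(mask_embedding):
            raise ValueError("embedding sizes must match the mask embedding")
        # p near 1: the position looks like an error, so it is mostly masked.
        # p near 0: the original embedding passes through almost unchanged.
        blended.append(
            [p * m + (1.0 - p) * e for e, m in zip(emb, mask_embedding)]
        )
    return blended
```

Under this reading, a position the detector considers correct is passed to the correction network nearly untouched, while a likely error is replaced by something close to the mask embedding, which is the input pattern BERT saw during pre-training.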
Hepatitis B virus affects more than 2 billion people worldwide, 350 million of whom have developed chronic hepatitis B (CHB). The genetic factors that confer CHB risk are still largely unknown. We sought to identify genetic variants for CHB susceptibility in the Chinese population. We undertook a genome-wide association study (GWAS) in 2,514 CHB cases and 1,130 normal controls from eastern China. We replicated 33 of the most promising signals and eight previously reported CHB risk loci through a two-stage validation totaling 6,600 CHB cases and 8,127 controls in four independent populations, of which two populations were recruited from eastern China, one from northern China and one from southern China. The joint analyses of 9,114 CHB cases and 9,257 controls revealed significant association of CHB risk with five novel loci. Four loci are located in the human leukocyte antigen (HLA) region at 6p21.
In microblogging services, users usually use hashtags to mark keywords or topics. With the rapid growth of social networks, the task of automatically recommending hashtags has received considerable attention in recent years. Previous works focused only on the use of textual information. However, many microblog posts contain not only text but also corresponding images. These images can provide additional information that is not included in the text, which could help improve the accuracy of hashtag recommendation. Motivated by the successful use of the attention mechanism, we propose a co-attention network incorporating textual and visual information to recommend hashtags for multimodal tweets. Experimental results on data collected from Twitter demonstrated that the proposed method can achieve better performance than state-of-the-art methods using textual information only.
In this work, we study the problem of part-of-speech tagging for Tweets. In contrast to newswire articles, Tweets are usually informal and contain numerous out-of-vocabulary words. Moreover, there is a lack of large-scale labeled datasets for this domain. To tackle these challenges, we propose a novel neural network to make use of out-of-domain labeled data, unlabeled in-domain data, and labeled in-domain data. Inspired by adversarial neural networks, the proposed method tries to learn common features through an adversarial discriminator. In addition, we hypothesize that domain-specific features of the target domain should be preserved to some degree. Hence, the proposed method adopts a sequence-to-sequence autoencoder to perform this task. Experimental results on three different datasets show that our method achieves better performance than state-of-the-art methods.