Effective molecular representation learning is of great importance to facilitate molecular property prediction. Recent advances in molecular representation learning have shown great promise in applying graph neural networks to model molecules. Moreover, a few recent studies design self-supervised learning methods for molecular representation to address the scarcity of labelled molecules; however, these self-supervised frameworks treat molecules as topological graphs without fully utilizing molecular geometry information. Molecular geometry, that is, the three-dimensional spatial structure of a molecule, is critical for determining molecular properties. To this end, we propose a novel geometry-enhanced molecular representation learning method (GEM). GEM has a specially designed geometry-based graph neural network architecture as well as several dedicated geometry-level self-supervised learning strategies to learn molecular geometry knowledge. We compare GEM with various state-of-the-art baselines on different benchmarks and show that it considerably outperforms all of them, demonstrating the superiority of the proposed method.
AI-based protein structure prediction pipelines, such as AlphaFold2, have achieved near-experimental accuracy. These advanced pipelines mainly rely on Multiple Sequence Alignments (MSAs) and templates as inputs to learn the co-evolution information from the homologous sequences. Nonetheless, searching MSAs and templates from protein databases is time-consuming, usually taking dozens of minutes. Consequently, we attempt to explore the limits of fast protein structure prediction by using only primary sequences of proteins. HelixFold-Single is proposed to combine a large-scale protein language model with the superior geometric learning capability of AlphaFold2. Our proposed method, HelixFold-Single, first pre-trains a large-scale protein language model (PLM) with thousands of millions of primary sequences utilizing the self-supervised learning paradigm, which will be used as an alternative to MSAs and templates for learning the co-evolution information. Then, by combining the pre-trained PLM and the essential components of AlphaFold2, we obtain an end-to-end differentiable model to predict the 3D coordinates of atoms from only the primary sequence. HelixFold-Single is validated on the CASP14 and CAMEO datasets, achieving accuracy competitive with the MSA-based methods on targets with large homologous families. Furthermore, HelixFold-Single consumes much less time than the mainstream pipelines for protein structure prediction, demonstrating its potential in tasks requiring many predictions. The code of HelixFold-Single is available at https://github.com/PaddlePaddle/PaddleHelix/tree/dev/apps/protein_folding/helixfold-single, and we also provide stable web services on https://paddlehelix.baidu.com/app/drug/protein-single/forecast.
AI-based protein structure prediction pipelines, such as AlphaFold2, have achieved near-experimental accuracy. These advanced pipelines mainly rely on Multiple Sequence Alignments (MSAs) as inputs to learn the co-evolution information from the homologous sequences. Nonetheless, searching MSAs from protein databases is time-consuming, usually taking dozens of minutes. Consequently, we attempt to explore the limits of fast protein structure prediction by using only primary sequences of proteins. HelixFold-Single is proposed to combine a large-scale protein language model with the superior geometric learning capability of AlphaFold2. Our proposed method, HelixFold-Single, first pre-trains a large-scale protein language model (PLM) with thousands of millions of primary sequences utilizing the self-supervised learning paradigm, which will be used as an alternative to MSAs for learning the co-evolution information. Then, by combining the pre-trained PLM and the essential components of AlphaFold2, we obtain an end-to-end differentiable model to predict the 3D coordinates of atoms from only the primary sequence. HelixFold-Single is validated on the CASP14 and CAMEO datasets, achieving accuracy competitive with the MSA-based methods on targets with large homologous families. Furthermore, HelixFold-Single consumes much less time than the mainstream pipelines for protein structure prediction, demonstrating its potential in tasks requiring many predictions. The code of HelixFold-Single is available at https://github.com/PaddlePaddle/PaddleHelix/tree/dev/apps/protein_folding/helixfold-single, and we also provide stable web services on https://paddlehelix.baidu.com/app/drug/protein-single/forecast.
Effective molecular representation learning is of great importance to facilitate molecular property prediction, which is a fundamental task for the drug and material industry. Recent advances in graph neural networks (GNNs) have shown great promise for molecular representation learning. Moreover, a few recent studies have also demonstrated successful applications of self-supervised learning methods to pre-train GNNs and so overcome the problem of insufficient labeled molecules. However, existing GNNs and pre-training strategies usually treat molecules as topological graph data without fully utilizing molecular geometry information, even though the three-dimensional (3D) spatial structure of a molecule, a.k.a. molecular geometry, is one of the most critical factors for determining molecular physical, chemical, and biological properties. To this end, we propose a novel Geometry Enhanced Molecular representation learning method (GEM) for Chemical Representation Learning (ChemRL). First, we design a geometry-based GNN architecture that simultaneously models atoms, bonds, and bond angles in a molecule. Specifically, we devise two graphs for a molecule: the first encodes the atom-bond relations, and the second encodes the bond-angle relations. Moreover, on top of the devised GNN architecture, we propose several novel geometry-level self-supervised learning strategies to learn spatial knowledge by utilizing the local and global molecular 3D structures. We compare ChemRL-GEM with various state-of-the-art (SOTA) baselines on different molecular benchmarks and show that ChemRL-GEM can significantly outperform all baselines in both regression and classification tasks. For example, the experimental results show an overall improvement of 8.8% on average compared to SOTA baselines on the regression tasks, demonstrating the superiority of the proposed method.
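The two-graph idea in the abstract above can be illustrated with a minimal sketch. It assumes a molecule is given as a list of 3D atom coordinates plus a list of bonds (atom index pairs); the function name `bond_angle_graph` and the toy coordinates are hypothetical and are not taken from the GEM code. The bond list itself plays the role of the atom-bond graph, and the function derives a bond-angle graph whose nodes are bonds and whose edges carry the angle between two bonds that share an atom.

```python
import math
from itertools import combinations


def bond_angle_graph(coords, bonds):
    """Build a bond-angle graph from an atom-bond graph.

    coords: list of (x, y, z) atom positions (assumed distinct).
    bonds:  list of (i, j) atom index pairs; this is the atom-bond graph.
    Returns a list of (bond_a, bond_b, angle_in_radians), one entry for
    every pair of bonds that share exactly one atom.
    """
    edges = []
    for a, b in combinations(range(len(bonds)), 2):
        shared = set(bonds[a]) & set(bonds[b])
        if len(shared) != 1:
            continue  # the two bonds do not meet at a single atom
        centre = shared.pop()
        # The endpoint of each bond that is not the shared atom.
        p = bonds[a][1] if bonds[a][0] == centre else bonds[a][0]
        q = bonds[b][1] if bonds[b][0] == centre else bonds[b][0]
        u = [coords[p][k] - coords[centre][k] for k in range(3)]
        v = [coords[q][k] - coords[centre][k] for k in range(3)]
        dot = sum(x * y for x, y in zip(u, v))
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        # Clamp to [-1, 1] to guard against floating-point drift.
        cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
        edges.append((a, b, math.acos(cos_angle)))
    return edges


# Toy three-atom molecule: atom 0 in the centre, two bonds at a right angle.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_bonds = [(0, 1), (0, 2)]
toy_edges = bond_angle_graph(toy_coords, toy_bonds)
```

In a full model, a GNN would pass messages on both graphs; this sketch only shows how the second graph can be derived from geometry.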
This paper addresses the problem of unsupervised domain adaptation for pedestrian detection in crowded scenes. First, we use an iterative algorithm to select and auto-annotate high-confidence positive pedestrian samples as training samples for the target domain. Meanwhile, we also reuse negative samples from the source domain to compensate for the imbalance between the numbers of positive and negative samples. Second, based on the deep network, we design an unsupervised regularizer to mitigate the influence of data noise. More specifically, we transform the last fully connected layer into two sub-layers, an element-wise multiply layer and a sum layer, and add the unsupervised regularizer to further improve the domain adaptation accuracy. In experiments on pedestrian detection, the proposed method boosts recall by nearly 30% while precision stays almost the same. Furthermore, we evaluate our method on standard domain adaptation benchmarks in both supervised and unsupervised settings and also achieve state-of-the-art results.
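The layer split mentioned in the abstract above can be sketched as follows. This is only an illustration of the algebra, under the assumption that the fully connected layer computes a weighted sum plus a bias; the function names and toy numbers are hypothetical, and the regularizer itself is not reproduced because the abstract does not specify it.

```python
def fully_connected(weights, x, bias):
    """Ordinary fully connected layer: y_j = sum_i W[j][i] * x[i] + b[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def elementwise_multiply_layer(weights, x):
    """First sub-layer: keep every product W[j][i] * x[i] separately."""
    return [[w * xi for w, xi in zip(row, x)] for row in weights]


def sum_layer(products, bias):
    """Second sub-layer: add up the products for each output unit."""
    return [sum(row) + b for row, b in zip(products, bias)]


# Hypothetical toy layer with two inputs and two outputs.
toy_weights = [[1.0, 2.0], [3.0, 4.0]]
toy_input = [5.0, 6.0]
toy_bias = [0.5, -0.5]

direct = fully_connected(toy_weights, toy_input, toy_bias)
products = elementwise_multiply_layer(toy_weights, toy_input)
split = sum_layer(products, toy_bias)
```

The two paths give the same output; the split exposes the per-connection products as an intermediate result, which is presumably where a regularizer could act.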
Facing an aging population and a fragmented medical service delivery system, the Chinese Central Government has introduced a series of policies to promote the development of integrated care against the background of the “Healthy China Strategy”. The achievement of integrated care depends on the choice of policy instruments. However, few studies have focused on how policy instruments promote the practice of integrated care in China. This article aims to obtain a deeper understanding of the use of policy instruments in the development of integrated care in China. Policy documents are the carriers of policy instruments. National-level integrated care policy documents from 2009 to 2019 were selected. Using the qualitative document analysis method, this paper analyzes integrated care policy instruments. To view the integrated care policy instruments comprehensively, a three-dimensional analytical framework consisting of the policy instruments dimension, the stakeholders dimension, and the health service supply chains dimension is proposed. The results are as follows. (1) From the perspective of policy instruments, the integrated care policy has adopted supply-side policy instruments, demand-side policy instruments, and environmental policy instruments. Among the three types of policy instruments, environmental policy instruments are used most frequently, supply-side policies are preferred, while demand-side policy instruments are relatively inadequate. (2) As for the stakeholders dimension, the central policy instruments focus on the health service providers, while less attention is paid to the health service demanders. (3) In terms of health service supply chains, the number of policy instruments used in the prevention stage is the highest, followed by the treatment stage, whereas less attention is paid to the rehabilitation stage.
Finally, suggestions are made for developing integrated care by improving policy instruments.
Integrated healthcare has received considerable attention and has developed into the highly important health policy known as Integrated Healthcare in County (IHC) against the background of the Grading Diagnosis and Treatment System (GDTS) in rural China. However, the causal conditions under which different integrated healthcare modes might be selected are poorly understood, particularly in the context of China’s authoritarian regime. This study aims to identify these causal conditions and how they shape the mode selection mechanism for IHC. A theoretical framework consisting of resource heterogeneity, governance structure, and institutional normalization was proposed, and a sample of fifteen IHCs was selected, with data for each IHC collected from news reports, work reports, government documents and field research for fuzzy-set Qualitative Comparative Analysis (fsQCA). This study first shows that strong governmental control and centralization are necessary conditions for the administration-oriented organization mode (MOA). Additionally, this research found three critical configurational paths in the selection of organizational modes. Specifically, we found that the combination of low resource heterogeneity, weak governmental control, centralization, and normalization was sufficient to explain the selection path of the insurance-driven organization mode (MOI); the combination of low resource heterogeneity, strong governmental control, centralization, and normalization was sufficient for selecting MOA; and the combination of weak governmental control, weak centralization, and weak normalization was sufficient for selecting the contractual organization mode (MOC).
Our study highlighted the necessity and feasibility of constructing different IHC modes separately and promoting their development gradually, as a result of the complex relationships among the causal conditions described above, thus helping to optimize the distribution of health resources and integrate the healthcare system.
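The necessity and sufficiency statements in the abstract above rest on fuzzy-set consistency scores. The sketch below uses the standard fsQCA consistency formulas; the membership scores, variable names, and the five-case sample are made up for illustration and are not the study's data, and no particular consistency threshold is assumed.

```python
def fuzzy_and(*sets):
    """Membership in a combination of conditions: the minimum across sets."""
    return [min(values) for values in zip(*sets)]


def fuzzy_not(memberships):
    """Membership in the negation of a condition (e.g. weak control)."""
    return [1.0 - m for m in memberships]


def sufficiency_consistency(condition, outcome):
    """How consistently the condition is a subset of the outcome."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(condition)


def necessity_consistency(condition, outcome):
    """How consistently the outcome is a subset of the condition."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(outcome)


# Hypothetical membership scores for five cases (not data from the study).
strong_control = [0.9, 0.8, 0.7, 0.2, 0.1]
centralization = [0.8, 0.9, 0.6, 0.3, 0.2]
moa_selected = [0.9, 0.7, 0.6, 0.1, 0.1]

path = fuzzy_and(strong_control, centralization)
print("necessity of strong control:",
      round(necessity_consistency(strong_control, moa_selected), 3))
print("sufficiency of the combined path:",
      round(sufficiency_consistency(path, moa_selected), 3))
```

A condition is usually read as necessary or sufficient when the corresponding score exceeds a chosen cut-off, which the analyst sets.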