Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. The best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait–environment relationships, and therefore have to be measured on individual plants in their respective environments. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We therefore conclude that reducing data gaps and biases in the TRY database remains a key challenge that requires a coordinated approach to data mobilization and trait measurements, which can only be achieved in collaboration with other initiatives.
Geosphere-Biosphere Program (IGBP) and DIVERSITAS, the TRY database (TRY is not an acronym but a statement of sentiment; https://www.try-db.org; Kattge et al., 2011) was proposed with the explicit mandate to improve the availability and accessibility of plant trait data for ecology and earth system sciences. The Max Planck Institute for Biogeochemistry (MPI-BGC) offered to host the database, and the different groups joined forces for this community-driven program. Two factors were key to the success of TRY: the support and trust of leaders in the field of functional plant ecology, who submitted large databases, and the long-term funding by the Max Planck Society, the MPI-BGC and the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, which has enabled the continuous development of the TRY database.
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pretrained models will be publicly available at https://github.com/namisan/mt-dnn.
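The knowledge-distillation objective mentioned above can be illustrated with a minimal sketch: the student is trained against the teacher's temperature-softened class distribution, blended with the usual hard-label cross-entropy. The function names, the temperature `T`, and the mixing weight `alpha` are illustrative, not the exact MT-DNN configuration.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax along the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target cross-entropy (teacher -> student) with
    hard-label cross-entropy. T softens both distributions; alpha
    weights the soft term against the hard term."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean()
    p_hard = softmax(student_logits)
    hard = -np.log(p_hard[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * soft + (1 - alpha) * hard
```

A student whose logits match the teacher's minimizes the soft term, so the gradient pulls the compressed model toward the teacher's full output distribution rather than only its argmax label.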
The learning rate warmup heuristic achieves remarkable success in stabilizing training, accelerating convergence and improving generalization for adaptive stochastic optimization algorithms like RMSprop and Adam. Here, we study its mechanism in detail. Pursuing the theory behind warmup, we identify a problem of the adaptive learning rate (i.e., it has problematically large variance in the early stage), suggest warmup works as a variance reduction technique, and provide both empirical and theoretical evidence to verify our hypothesis. We further propose RAdam, a new variant of Adam, by introducing a term to rectify the variance of the adaptive learning rate. Extensive experimental results on image classification, language modeling, and neural machine translation verify our intuition and demonstrate the effectiveness and robustness of our proposed method.
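The rectification term can be sketched concretely. RAdam tracks the length of the approximated simple moving average, rho_t, against its limit rho_inf = 2/(1 - beta2) - 1; while rho_t <= 4 the variance of the adaptive step is intractable and the update falls back to un-adapted momentum, and afterwards the step is scaled by r_t below, which approaches 1 as training progresses. A minimal sketch of that term (function name is ours):

```python
import math

def radam_rect(t, beta2=0.999):
    """RAdam variance-rectification term r_t at step t (1-indexed).

    Returns None while rho_t <= 4, where RAdam falls back to an
    un-adapted momentum (SGD-with-momentum) update.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2**t / (1.0 - beta2**t)
    if rho_t <= 4.0:
        return None  # adaptive-step variance not yet tractable
    return math.sqrt(
        (rho_t - 4.0) * (rho_t - 2.0) * rho_inf
        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
    )
```

Early steps thus behave like a built-in warmup: r_t starts small (or the adaptive term is skipped entirely) and converges to 1, which is exactly the variance-reduction role the abstract attributes to hand-tuned warmup schedules.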
MicroRNAs (miRNAs) participate in many diseases, including cardiovascular disease. Contrary to earlier assumptions, miRNAs exist in circulating blood and are relatively stable because they are bound to other materials. The aim of this translational study was to establish a method for determining the absolute amount of a miRNA in blood and to assess the potential applications of circulating cell-free microRNA-1 (miR-1) in acute myocardial infarction (AMI). The results revealed that miR-1 is the most abundant miRNA in the heart and is also heart- and muscle-specific. In a cardiac cell necrosis model induced by Triton X-100 in vitro, we found that cardiac miR-1 can be released into the culture medium and remains stable for at least 24 h. In a rat model of AMI induced by coronary ligation, serum miR-1 increased quickly after AMI, peaking at 6 h with an over 200-fold increase, and returned to the basal level 3 days after AMI. Moreover, the serum miR-1 level in rats with AMI correlated strongly and positively with myocardial infarct size. To further verify this relationship, an ischemic preconditioning model was applied: ischemic preconditioning significantly reduced both circulating miR-1 and the infarct size induced by ischemia-reperfusion injury. Finally, the levels of circulating cell-free miR-1 were significantly increased in patients with AMI and correlated positively with serum CK-MB levels. These results suggest that serum miR-1 could be a novel, sensitive diagnostic biomarker for AMI.
Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture, DeBERTa (Decoding-enhanced BERT with disentangled attention), that improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the attention weights among words are computed using disentangled matrices on their contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to predict the masked tokens for model pretraining. We show that these two techniques significantly improve the efficiency of model pre-training and the performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% (90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and on RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.
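The disentangled attention score combines three terms: content-to-content, content-to-position, and position-to-content, where positions enter only through the clipped relative distance delta(i, j). A minimal single-head NumPy sketch under that reading (function name and argument layout are ours, not DeBERTa's API):

```python
import numpy as np

def disentangled_scores(Qc, Kc, Qr, Kr, k):
    """Unnormalized single-head disentangled attention scores.

    Qc, Kc: (n, d) content projections of the n tokens.
    Qr, Kr: (2k, d) projections of the relative-position embeddings.
    k: maximum relative distance; delta(i, j) = clip(i - j + k, 0, 2k - 1).
    Returns an (n, n) score matrix summing the content-to-content,
    content-to-position and position-to-content terms, scaled by
    1/sqrt(3d) as three terms contribute variance.
    """
    n, d = Qc.shape
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    delta = np.clip(i - j + k, 0, 2 * k - 1)             # (n, n) distance index
    c2c = Qc @ Kc.T                                       # content -> content
    c2p = np.take_along_axis(Qc @ Kr.T, delta, axis=1)    # Qc_i . Kr_{delta(i,j)}
    # Kc_j . Qr_{delta(j,i)}: gather per row j, then transpose to (i, j) layout.
    p2c = np.take_along_axis(Kc @ Qr.T, delta, axis=1).T
    return (c2c + c2p + p2c) / np.sqrt(3 * d)
```

With the relative-position projections zeroed out, the score collapses to ordinary scaled dot-product attention on content alone, which makes the "disentangled" contribution of the two extra terms easy to isolate.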