Jiang Xu scite author profile

Training language models to follow instructions with human feedback

Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.

Historical and projected emissions of major halocarbons in China

Wan, Zhang, et al. 2009. Atmospheric Environment

Spatial and temporal trends of climate change in Xinjiang, China

et al. 2011

Temperature and precipitation time series datasets from 1961 to 2005 at 65 meteorological stations were used to reveal the spatial and temporal trends of climate change in Xinjiang, China. Annual and seasonal mean air temperature and total precipitation were analyzed using the Mann-Kendall (MK) test, inverse distance weighted (IDW) interpolation, and R/S methods. The results indicate that: (1) both temperature and precipitation increased in the past 45 years, but the increase in temperature is more obvious than that of precipitation; (2) for temperature, the higher the latitude and the higher the elevation, the faster the increase, though latitude has the greater influence. Northern Xinjiang shows faster warming than southern Xinjiang, especially in summer; (3) the increase in precipitation occurs mainly in winter in northern Xinjiang and in summer in southern Xinjiang. Ili, which has the most precipitation in Xinjiang, shows a weak increase in precipitation; (4) although both temperature and precipitation increased in general, the increase differs within Xinjiang; (5) Hurst index (H) analysis indicates that climate change will continue the current trends.
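
For readers unfamiliar with the Mann-Kendall (MK) test named in this abstract, the following is a minimal sketch of the standard two-sided test on a single time series. It is not the authors' code: the function name `mann_kendall` and the sample values are illustrative assumptions, and the variance formula below omits the correction for tied values.

```python
import math


def mann_kendall(series):
    """Return (S, Z, p) for the Mann-Kendall trend test.

    Assumption: no tie correction is applied to the variance,
    so this sketch is only appropriate when ties are rare.
    """
    n = len(series)
    s = 0
    # S counts later-minus-earlier pairs: +1 if increasing, -1 if decreasing.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null hypothesis of no trend (no ties).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


if __name__ == "__main__":
    # Hypothetical annual mean temperatures, for illustration only.
    temps = [7.1, 7.3, 7.2, 7.6, 7.8, 7.7, 8.0, 8.2]
    print(mann_kendall(temps))
```

A positive Z with a small p-value suggests an increasing monotonic trend; a negative Z suggests a decreasing one.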

Valuing the health risks of particulate air pollution in the Pearl River Delta, China

Huang, Zhang. 2012. Environmental Science & Policy

135

WebGPT: Browser-assisted question-answering with human feedback

Nakano, Hilton, Balaji, et al. 2021. Preprint

We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment, which allows the model to search and navigate the web. By setting up the task so that it can be performed by humans, we are able to train models on the task using imitation learning, and then optimize answer quality with human feedback. To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers. We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. Our best model is obtained by fine-tuning GPT-3 using behavior cloning, and then performing rejection sampling against a reward model trained to predict human preferences. This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit.
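
The "rejection sampling against a reward model" step mentioned in this abstract can be read as best-of-n selection: draw several candidate answers and keep the one the reward model scores highest. The sketch below illustrates only that selection step under stated assumptions. The names `best_of_n`, `sample_answer`, and `reward` are hypothetical, and the toy sampler and toy reward are stand-ins, not WebGPT's policy or reward model.

```python
import random
from typing import Callable


def best_of_n(
    question: str,
    sample_answer: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Draw n candidate answers and return the highest-scoring one."""
    candidates = [sample_answer(question) for _ in range(n)]
    return max(candidates, key=lambda answer: reward(question, answer))


if __name__ == "__main__":
    rng = random.Random(0)
    canned = [
        "Short answer.",
        "Answer with one reference [1].",
        "Answer with two references [1] [2].",
    ]

    # Toy policy (assumption): picks a canned answer at random.
    def toy_sampler(question: str) -> str:
        return rng.choice(canned)

    # Toy reward (assumption): counts bracketed reference markers.
    def toy_reward(question: str, answer: str) -> float:
        return float(answer.count("["))

    print(best_of_n("Why is the sky blue?", toy_sampler, toy_reward, n=8))
```

Larger n gives the reward function more candidates to choose from, at the cost of more sampling.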

The Spatial Relationship of Tourist Distribution in Chinese Cities

Yan, Pei‐jun. 2011. Tourism Geographies

Turbiscan Lab® Expert analysis of the biological demulsification of a water-in-oil emulsion by two biodemulsifiers

Liu, Huang, Liu, et al. 2011. Journal of Hazardous Materials

117

Dependence of trends in and sensitivity of drought over China (1961–2013) on potential evaporation model

Jie, Sun, et al. 2016. Geophysical Research Letters

The Palmer Drought Severity Index (PDSI) can lead to controversial results in assessing how droughts respond to global warming. Here we assess recent changes in droughts over China (1961–2013) using the PDSI with two different estimates of potential evaporation, i.e., the Thornthwaite (PDSI_th) and Penman‐Monteith (PDSI_pm) approaches. We found that droughts have become more severe in the PDSI_th estimate but have slightly lessened in the PDSI_pm estimate. To quantify and interpret the different responses of the PDSI_th and PDSI_pm, we designed numerical experiments and found that the drying trend of the PDSI_th in response to warming alone is 3.4 times higher than that of the PDSI_pm, and the latter was further compensated by decreases in wind speed and solar radiation, causing the slight wetting in the PDSI_pm. Interestingly, we found that the interbasin difference between the PDSI_th and PDSI_pm responses to warming alone tends to be larger in warmer basins, depending exponentially on mean temperature.

