Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite InstructGPT having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.
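The second stage of the pipeline described above trains a reward model on human rankings of model outputs, which then supplies the signal for reinforcement learning. A minimal sketch of the pairwise ranking objective commonly used for this step (a Bradley-Terry style log-sigmoid loss over reward differences) is below; the function names are illustrative, and the full method additionally involves PPO fine-tuning against the learned reward model, which is not shown.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def ranking_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss for a reward model trained on human preference
    rankings: -log sigma(r_chosen - r_rejected). Minimizing it pushes
    the model to score the human-preferred output higher."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))

# The loss is log(2) ~ 0.693 when the two outputs are scored equally,
# and shrinks as the chosen output is scored above the rejected one.
```

In practice the rewards come from a scalar head on the language model, and each labeler ranking of K outputs is expanded into all pairwise comparisons before computing this loss.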
What factors constrain whether tool use modulates the user's body representations? To date, studies on representational plasticity following tool use have primarily focused on the act of using the tool. Here, we investigated whether the tool's morphology also serves to constrain plasticity. In 2 experiments, we varied whether the tool was morphologically similar to a target body part (Experiment 1, hand; Experiment 2, arm). Participants judged the tactile distance between pairs of points applied to their tool-using target body surface and forehead (control surface) before and after tool use. We applied touch in 2 orientations, allowing us to quantify how tool use modulates the representation's shape. Significant representational plasticity in hand shape (increase in width, decrease in length) was found when the tool was morphologically similar to a hand (Experiment 1A), but not when the tool was arm-shaped (Experiment 1B). Conversely, significant representational plasticity was found on the arm when the tool was arm-shaped (Experiment 2B), but not when hand-shaped (Experiment 2A). Taken together, our results indicate that morphological similarity between the tool and the effector constrains tool-induced representational plasticity. The embodiment of tools may thus depend on a match-to-template process between tool morphology and representation of the body.
The ability to extend sensory information processing beyond the nervous system has been observed throughout the animal kingdom; for example, when rodents palpate objects using whiskers and spiders localize prey using webs. We investigated whether the ability to sense objects with tools represents an analogous information processing scheme in humans. Here we provide evidence from behavioural psychophysics, structural mechanics and neuronal modelling, which shows that tools are treated by the nervous system as sensory extensions of the body rather than as simple distal links between the hand and the environment. We first demonstrate that tool users can accurately sense where an object contacts a wooden rod, just as is possible on the skin. We next demonstrate that the impact location is encoded by the modal response of the tool upon impact, reflecting a pre-neuronal stage of mechanical information processing akin to sensing with whiskers and webs. Lastly, we use a computational model of tactile afferents to demonstrate that impact location can be rapidly re-encoded into a temporally precise spiking code. This code predicts the behaviour of human participants, providing evidence that the information encoded in these spiking motifs shapes localization. Thus, we show that this sensory capability emerges from the functional coupling between the material, biomechanical and neural levels of information processing.