2021
DOI: 10.48550/arxiv.2112.00861
Preprint

A General Language Assistant as a Laboratory for Alignment

Abstract: Given the broad capabilities of large language models, it should be possible to work towards a general-purpose, text-based assistant that is aligned with human values, meaning that it is helpful, honest, and harmless. As an initial foray in this direction we study simple baseline techniques and evaluations, such as prompting. We find that the benefits from modest interventions increase with model size, generalize to a variety of alignment evaluations, and do not compromise the performance of large models. Next…

Cited by 36 publications (75 citation statements). References 3 publications.
“…Pile-CC: Papers that train models on datasets that include the Pile-CC subset of the Pile include Luo et al [2021], Kharya and Alvi [2021], Askell et al [2021]. While this data has likely been used by other researchers for various purposes, we are unaware of any uses that would be directly comparable.…”
Section: Motivation For Dataset Creation
confidence: 99%
“…While this data has likely been used by other researchers for various purposes, we are unaware of any uses that would be directly comparable. OpenWebText2: Papers that train models on datasets that include the OpenWebText2 subset of the Pile include Luo et al [2021], Kharya and Alvi [2021]. FreeLaw: Papers that train models on datasets that include the FreeLaw subset of the Pile include Askell et al [2021]. While this data has likely been used by other researchers for various purposes, we are unaware of any uses that would be directly comparable.…”
Section: Motivation For Dataset Creation
confidence: 99%