Background: Next-generation RNA sequencing is a rapidly developing technology with complex procedures encompassing different experimental modalities. As the technology evolves and its use expands, so does the need to capture the data provenance from these sequencing studies and the need to create new tools to manage and manipulate these provenance stores.

Results: Here we used the Ontology for Biomedical Investigations (OBI) and many other ontologies from the Open Biological and Biomedical Ontology (OBO) Foundry as a framework from which to create an application ontology (ORNASEQ: Ontology of RNA sequencing) to capture data provenance for next-generation RNA sequencing studies. Additionally, we provide an extensive real-life sample provenance data set for use in developing new provenance tools and additional sequencing data models.

Conclusions: The Ontology of RNA Sequencing (ORNASEQ) provides core terms for use in building data models to capture the provenance from next-generation RNA sequencing studies. The supplied sample provenance data also exemplifies many of the complexities of RNA sequencing studies and underscores the need for potent workflow management systems.

Keywords
Ontology, RNAseq, PROV-XML

Fisher, Kim 3

Background
Until recently, the cost of performing next-generation RNA sequencing (RNAseq) experiments limited the amount of data generated by a single lab, and managing and properly documenting a few experiments was not fundamentally challenging.

Ontology Number Terms BFO: Basic Formal Ontology