We introduce BPG, an easy-to-use framework for generating publication-quality, highly customizable plots in the R statistical environment. This open-source package includes novel methods of displaying high-dimensional datasets and facilitates generation of multi-panel figures, making it ideal for complex datasets. A web-based interactive tool allows online figure customization, from which R code can be downloaded for seamless integration with computational pipelines. BPG is available at http://labs.oicr.on.ca/boutros-lab/software/bpg .
BPG, P'ng et al. bioRxiv preprint first posted online Jun. 26, 2017; doi: http://dx.doi.org/10.1101/156067 . The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license.
Biological experiments are increasingly generating large, multifaceted datasets. Exploring such data and communicating observations is, in turn, becoming more difficult, and the need for robust scientific data visualization is growing rapidly 1,2,3,4. Myriad data visualization tools exist, particularly as web-based interfaces and non-R-based local software packages. Unfortunately, these do not integrate easily into R-based statistical pipelines such as the widely used Bioconductor 5. Within R, visualization packages exist, including base graphics 6, ggplot2 7, lattice 8, Sushi 9, circlize 10, multiDimBio 11, NetBioV 12, GenomeGraphs 13 and ggbio 14. These lack publication-quality defaults, contain limited plot types, provide limited scope for automatic generation of multi-panel figures, are constrained to specific data types and do not allow interactive visualization.

Good visualization software must create a wide variety of chart types in order to match the diversity of data types available. It should provide flexible parametrization for highly customized figures and allow for multiple output formats while employing reasonable, publication-appropriate default settings, such as producing high-resolution output. In addition, it should integrate seamlessly with existing computational pipelines while also providing an intuitive, interactive mode. It should be possible to move between pipeline and interactive modes, allowing cyclical development. Finally, good design principles should be encouraged, such as suggesting appropriate color choices and layouts for specific use-cases. To help users quickly gain proficiency, detailed examples, tutorials and an application programming interface (API) are required. To date, no existing visualization suite fills these needs.

To address this gap, we have created the BPG library, which is implemented in R using the grid graphics system and lattice framework.
It generates a broad suite of chart-types, ranging from common plots such as bar charts and box plots to more specialized plots, such as Manhattan plots (Figure 1; code is in Supplementary File 1).