“…We are now in an age where more health data are collected than ever before (the ‘big data’ era [2]), at an individual person-linked level across the population, to a fine-grained clinical/contextual level of detail (with the spread of integrated electronic medical records, or iEMRs), using a multitude of rapidly expanding data types and sources, many of which are available in near real time (ie, the four defining characteristics of ‘big data’ being volume, variety, veracity and value [3]). A recent article reported that 33 zettabytes (33 trillion gigabytes) of data were created worldwide in 2018, and anticipated that by 2020 there will be 44 zettabytes (44 trillion gigabytes) of health data alone created! [4] Building the core infrastructure for compilation, storage, handling and integration of these health data (eg, data lakes, business vaults, data marts and data warehouses) is occupying vast amounts of the time and resources of health departments, with the demand for data engineers, data architects, data scientists, health informaticians and the like surpassing the market supply.…”