We present the Tesco Grocery 1.0 dataset: a record of 420 M food items purchased by 1.6 M loyalty card owners who shopped at the 411 Tesco stores in Greater London over the entire year of 2015, aggregated at the level of census areas to preserve anonymity. For each area, we report the number of transactions and the nutritional properties of the typical food item bought, including its average calorie content and nutrient composition. The set of Global Trade Item Numbers (GTINs, i.e., barcodes) for each food type is also included. To establish data validity we: i) compare food purchase volumes to census population to assess representativeness, and ii) match nutrient and energy intake to official statistics on food-related illnesses to appraise the extent to which the dataset is ecologically valid. Given its unprecedented scale and geographic granularity, the data can be used to link food purchases to a number of geographically-salient indicators, which enables studies on health outcomes, cultural aspects, and economic factors.
Background & Summary
Tesco is a British multinational grocery and general merchandise retailer. In 2015, it was the 9th highest-grossing retailer in the world, with 81B in global revenue 1 , and the biggest grocery retailer in the UK, with a 28% market share 2 . Tesco operates a loyalty scheme in which customers apply for a Clubcard that is used for both in-store and online purchases to accumulate points that can later be redeemed for prizes or discount vouchers. With the customer's consent, the record of their purchases is archived and anonymously linked to their Clubcard number. In this paper, we focus on the in-store purchases made in the 411 Tesco shops within the boundaries of Greater London during the entire year of 2015. We present aggregated and privacy-preserving data views that combine individual purchases at different spatial granularities, from Lower Super Output Areas (containing around 2,000 residents each, on average) to Boroughs (more than 250k residents, on average).

Despite the importance of studying food consumption at scale, there is little data about what people actually eat over long periods of time. The fine-grained geographical information included in Tesco Grocery 1.0 is the key to linking the food consumption data of an entire city to any attribute that can be measured at the level of statistical census areas.
These include cultural aspects (ethnicity 3 , migration 4,5 ), societal aspects (youth alcohol use 6 ), economic factors (deprivation 7 , inequality 8 ), health determinants (medical prescriptions 9 , health awareness and daily habits 10,11 ), and social media discourse (textual 12 or visual 13 descriptors of geo-referenced posts).

Several studies mined grocery sales data (which has not been made publicly available) to, for example, build recommender systems that are able to suggest what people might like based on their past purchases 14-16 , or establish whether healthy foods tend to be pricey 17 and, ultimately, whether their purchase tends to be mediated by pri...