Generalist medical foundation models, pre-trained on massive medical datasets, have shown great potential as the next generation of medical artificial intelligence (AI). However, collecting millions of medical images is extremely expensive and time-consuming, and raises concerns about leakage of sensitive private patient information. Here, we present a general framework that enables ultra-high data efficiency in building medical foundation models by leveraging expertise-informed generative AI to scale up a limited pre-training dataset. Following this framework, we propose a new foundation model in ophthalmology, DERETFound, which uses only 16.7% (150,786 images) of the real-world colour fundus photography images required by the latest retinal foundation model, RETFound (904,170 images; Y. Zhou et al., Nature 2023). By integrating expert insights into generative AI, we generate approximately one million synthetic images that are consistent with real retinal images in terms of physiological structures and feature distribution. DERETFound achieves comparable or even superior performance to RETFound on nine public datasets across four downstream tasks (diabetic retinopathy grading, glaucoma diagnosis, age-related macular degeneration grading, and multi-disease classification), as well as on challenging external evaluation. In addition, DERETFound demonstrates high label efficiency, saving over 50% of expert-annotated training data compared to RETFound on datasets for diabetic retinopathy grading. Our data-efficient framework challenges the classic view that building medical foundation models requires collecting large amounts of real-world medical data as a prerequisite. It also provides an effective solution for other diseases for which building foundation models was once discouraged by limited data, which has profound significance for medical AI.