Background
Generative artificial intelligence (AI) facilitates the development of digital twins, which enable virtual representations of real patients to explore, predict and simulate patient health trajectories, ultimately aiding treatment selection and clinical trial design, among other applications. Recent advances in forecasting utilizing generative AI, in particular large language models (LLMs), highlight untapped potential to overcome real-world data (RWD) challenges such as missingness, noise and limited sample sizes, thus empowering the next generation of AI algorithms in healthcare.

Methods
We developed the Digital Twin - Generative Pretrained Transformer (DT-GPT) model, which leverages biomedical LLMs using rich electronic health record (EHR) data. Our method eliminates the need for data imputation and normalization, enables forecasting of clinical variables, and supports exploration of predictions via a chatbot interface. We analyzed the method’s performance on RWD from both a long-term US nationwide non-small cell lung cancer (NSCLC) dataset and a short-term intensive care unit (MIMIC-IV) dataset.

Findings
DT-GPT surpassed state-of-the-art machine learning methods in patient trajectory forecasting on mean absolute error (MAE) for both the long-term (3.4% MAE improvement) and the short-term (1.3% MAE improvement) datasets. Additionally, DT-GPT was capable of preserving cross-correlations of clinical variables (average R² of 0.98), and handling data missingness as well as noise. Finally, we discovered the ability of DT-GPT both to provide insights into a forecast’s rationale and to perform zero-shot forecasting on variables not used during fine-tuning, outperforming even fully trained, leading task-specific machine learning models on 14 clinical variables.

Interpretation
DT-GPT demonstrates that LLMs can serve as a robust medical forecasting platform, empowering digital twins that are able to virtually replicate patient characteristics beyond their training data. We envision that LLM-based digital twins will enable a variety of use cases, including clinical trial simulations, treatment selection and adverse event mitigation.
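
As an illustration of the headline metric in the Findings, the following minimal sketch computes MAE over observed values only and a relative MAE improvement between two forecasters. It is not the study's evaluation code: the function names, the toy values, and the definition of "improvement" as a relative reduction against a baseline are all assumptions made for this example.

```python
import math


def mean_absolute_error(observed, predicted):
    """MAE over time points where an observation exists.

    Missing observations (None or NaN) are skipped rather than imputed.
    This scoring convention is an assumption for illustration.
    """
    pairs = [
        (o, p)
        for o, p in zip(observed, predicted)
        if o is not None and not math.isnan(o)
    ]
    if not pairs:
        raise ValueError("no observed values to score")
    return sum(abs(o - p) for o, p in pairs) / len(pairs)


def relative_improvement(mae_baseline, mae_model):
    """Percentage reduction in MAE relative to a baseline (assumed definition)."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


# Toy values, invented for illustration only (not data from the study).
observed = [12.0, float("nan"), 11.0, 13.0]
baseline_forecast = [11.0, 12.0, 12.0, 15.0]
model_forecast = [12.5, 12.0, 11.5, 12.0]

mae_baseline = mean_absolute_error(observed, baseline_forecast)
mae_model = mean_absolute_error(observed, model_forecast)
improvement = relative_improvement(mae_baseline, mae_model)
print(f"baseline MAE={mae_baseline:.3f}, model MAE={mae_model:.3f}, "
      f"improvement={improvement:.1f}%")
```

Skipping missing observations at scoring time mirrors the abstract's point that no imputation is required, though the exact handling in the study may differ.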