“…A typical clinical practice uses a diverse set of information formats contained within the patient electronic health record (EHR) such as tabular data (e.g., age, demographics, procedures, history, billing codes), image data (e.g., photographs, x-rays, computerized tomography scans, magnetic resonance imaging, pathology slides), time-series data (e.g., intermittent pulse oximetry, blood chemistry, respiratory analysis, electrocardiograms, ultrasounds, in-vitro tests, wearable sensors), structured sequence data (e.g., genomics, proteomics, metabolomics) and unstructured sequence data (e.g., notes, forms, written reports, voice recordings, video) among other sources 6. Recently, AI/ML models leveraging multiple data modalities have been demonstrated for the domains of cardiology 7–9, dermatology 10, gastroenterology 11, gynecology 12, hematology 13, immunology 14, nephrology 15, neurology 16,17, oncology 18–20, ophthalmology 21, psychiatry 22, radiology 23–25, public health 26 and healthcare operational analytics (i.e., mortality, length-of-stay, and discharge predictions) 27–30. Furthermore, it has been shown that multimodality in most of these domains can increase the performance of AI/ML systems (accuracy: 1.2–27.7%) compared to single-modality approaches for the same task 2.…”