Real-world evidence from electronic medical record (EMR) data can inform drug development by advancing the understanding of patient care, but there are many technological challenges to unlocking and effectively using that data. EMR data has traditionally been difficult to use because of its inaccessibility, variable quality, breadth, and multimodal nature. Our platform, powered by machine learning and natural language processing, provides continuous access to deep, longitudinal, multimodal EMR data in a privacy-preserving, federated manner. The platform is built on de-identified EMRs of 11+ million patients, obtained through exclusive partnerships with Mayo Clinic and Duke Health. It includes structured data (diagnoses, medications, lab results, procedures, vitals, etc.), unstructured clinical text (physician notes, pathology reports, and radiology reports), imaging (ECGs, PET, CT, etc.), digital pathology, and sequencing data. Sourced from academic medical centers (AMCs) and their affiliates across almost all US states, our data reflects the leading edge of clinical care, spans all therapeutic areas, and is enriched for rare diseases. The platform is “self-serve” and provides tools for both technical and non-technical users. Over the last 3 years, it has been used to provide consulting services to top pharma companies, leading to the publication of 30+ peer-reviewed studies and the development of multiple diagnostic algorithms with FDA “Breakthrough Designation”.