Gilead Presents Findings at ESMO AI 2025, Leveraging Cornerstone AI for Stacking and Harmonizing Two EMR Breast Cancer Databases

We are proud to announce that the team at Gilead recently presented a poster at ESMO AI 2025 showing how Cornerstone’s AI-driven platform harmonized breast cancer datasets from Flatiron Health and ConcertAI into a single, unified cohort for analysis. This collaboration demonstrates how AI-powered data harmonization can create high-quality research assets, helping teams maximize sample sizes, shorten manual data preparation timelines, and accelerate access to analysis-ready real-world data (RWD).

See here for a link to the abstract, and below for a brief recap.

Methodology

The Cornerstone AI platform, purpose-built for clinical data transformation and cleaning, was used to convert ConcertAI (CAI) breast cancer (BC) data into the Flatiron (FHRD) schema. Large language models mapped CAI tables and fields into the FHRD target schema (following Datavant tokenization). Differences in variable names were standardized using custom embedding models, and when necessary, data across multiple CAI tables were combined to form derived tables aligned to FHRD structure. An interactive UI enabled transparent review and adjustments tailored to downstream analytic needs.
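
To make the field-mapping step concrete, here is a minimal sketch of how source field names might be matched to target field names with a human-review threshold. It is an illustration only, not Cornerstone's implementation: plain string similarity from Python's standard library stands in for the custom embedding models described above, and every field name, the `propose_mappings` helper, and the 0.6 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop separators so 'diagnosis_date' and 'DiagnosisDate' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for embedding cosine similarity."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def propose_mappings(source_fields, target_fields, threshold=0.6):
    """Propose the best target field for each source field.

    Returns {source_field: (target_field, score)}; the value is None when no
    target reaches the threshold, which flags the field for manual review.
    """
    proposals = {}
    for src in source_fields:
        best = max(target_fields, key=lambda tgt: similarity(src, tgt))
        score = similarity(src, best)
        proposals[src] = (best, round(score, 2)) if score >= threshold else None
    return proposals


# Hypothetical field names, for illustration only.
source = ["diagnosis_date", "pt_id", "xyz_qq"]
target = ["PatientID", "DiagnosisDate", "StageGroup"]
for field, proposal in propose_mappings(source, target).items():
    print(field, "->", proposal)
```

Returning `None` below the threshold, rather than forcing a best guess, mirrors the reviewable workflow described above: uncertain mappings are surfaced for a person to confirm or adjust.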

Key Results

Increased sample size:
Eligible sample size increased by 45%, resulting in a final combined cohort of 1,935 patients.

De-duplication:
Cornerstone AI identified and de-duplicated 8.7% of overlapping patients across the two datasets.
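
As a rough illustration of what stacking and de-duplicating two sources can look like, the sketch below assumes each patient record carries a shared, privacy-preserving token (such as one produced by tokenization) and keeps the first record seen for each token. The `token` key, the record layout, and the choice to prefer the first source are assumptions made for the example, not details from the poster.

```python
def stack_and_deduplicate(primary, secondary, key="token"):
    """Stack two lists of patient records and drop repeats of the same token.

    Records from `primary` come first, so when a patient appears in both
    sources the primary record is the one that is kept.
    Returns (combined_records, number_of_duplicates_removed).
    """
    seen = set()
    combined = []
    duplicates = 0
    for record in list(primary) + list(secondary):
        token = record[key]
        if token in seen:
            duplicates += 1  # same patient already present in the stack
            continue
        seen.add(token)
        combined.append(record)
    return combined, duplicates


# Hypothetical records, for illustration only.
fhrd = [{"token": "t1", "source": "FHRD"}, {"token": "t2", "source": "FHRD"}]
cai = [{"token": "t2", "source": "CAI"}, {"token": "t3", "source": "CAI"}]
combined, removed = stack_and_deduplicate(fhrd, cai)
print(len(combined), "patients kept;", removed, "duplicate(s) removed")
```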

Table and field harmonization:

  • 78% of ConcertAI tables (28/36) mapped successfully to the Flatiron schema

  • 278 source fields from CAI were aligned to 153 target fields in FHRD

  • Intermediate derived tables were created where needed—for example, FHRD’s cancer diagnosis and staging tables required logic that pulled from three separate CAI tables
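
The derived-table idea in the last bullet can be sketched as a join of several source tables into one target-shaped row per patient. Everything here is hypothetical: the three input tables, their column names, and the output columns are placeholders chosen for the example and do not reflect the actual CAI or FHRD schemas.

```python
def build_derived_diagnosis(diagnoses, staging, histology):
    """Combine three hypothetical source tables into one derived table.

    Each input is a list of dicts keyed by 'patient_id'. The output has one
    row per diagnosis record; values missing from a source table become None.
    """
    stage_by_patient = {row["patient_id"]: row["stage"] for row in staging}
    histology_by_patient = {row["patient_id"]: row["histology"] for row in histology}

    derived = []
    for row in diagnoses:
        pid = row["patient_id"]
        derived.append({
            "patient_id": pid,
            "diagnosis_date": row["diagnosis_date"],
            "stage_group": stage_by_patient.get(pid),   # None if no staging row
            "histology": histology_by_patient.get(pid),  # None if no histology row
        })
    return derived


# Hypothetical rows, for illustration only.
rows = build_derived_diagnosis(
    diagnoses=[{"patient_id": "p1", "diagnosis_date": "2020-01-15"}],
    staging=[{"patient_id": "p1", "stage": "II"}],
    histology=[{"patient_id": "p1", "histology": "ductal"}],
)
print(rows)
```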

Impact: 

AI-enabled harmonization of BC RWD across CAI and FHRD is feasible. It increases patient cohort size, and therefore statistical power, and supports reuse of downstream analytical code, all while maintaining transparency. These are key requirements for clinical research and healthcare applications. Special thanks to our Chief AI Science Officer, Michael Elashoff, for co-authoring this important research.

If you’d like to explore how Cornerstone can support your data harmonization needs, contact us at partnerships@cornerstoneai.com.
