AI-Driven Patient Matching for Clinical Trials: How CROs Are Reducing Enrollment Delays
Clinical trial enrollment has a math problem. For every patient who makes it into a Phase II or III trial, a mid-size CRO has typically screened somewhere between 8 and 25 others who didn’t qualify. That screen-failure burden costs $3,000–$8,000 per failed candidate in coordinator labor and site fees, and it adds months to enrollment timelines that sponsors are already watching closely.
The core issue isn’t that eligible patients don’t exist. They do. The problem is finding them before coordinators spend 45 minutes per candidate manually cross-referencing a protocol’s inclusion and exclusion criteria against paper charts, printed EHR exports, and disconnected lab systems.
Where Manual Screening Breaks Down
Most eligibility screening failures happen at a predictable point: secondary exclusion criteria that could have been checked against structured EHR data before anyone spent time on the patient. A coordinator works through a promising candidate for 40 minutes, then finds a prior diagnosis buried in the chart history that disqualifies them on a criterion that was checkable from an ICD-10 code at the outset.
The structural problem is that protocol criteria are written in clinical language, not database queries. “No prior treatment with any VEGF inhibitor in the last 12 months” needs to be translated into RxNorm medication class lookups, date range filters, and data source cross-references before it can be evaluated algorithmically. That translation work typically doesn’t happen systematically—it happens in the head of each coordinator, inconsistently, across hundreds of candidate reviews.
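To make that translation concrete, here is a rough sketch of what the VEGF-inhibitor exclusion might look like once it is written down as an explicit rule rather than held in a coordinator's head. The RxNorm codes below are placeholders, not the actual codes a real protocol build would use.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Placeholder RxNorm ingredient codes standing in for the VEGF-inhibitor class;
# a real protocol build would resolve this set from a curated drug dictionary.
VEGF_INHIBITOR_RXNORM = {"0000001", "0000002", "0000003"}

@dataclass
class MedicationRecord:
    rxnorm_code: str
    dispensed_on: date

def violates_vegf_exclusion(meds: list[MedicationRecord], as_of: date) -> bool:
    """True if any VEGF-inhibitor exposure falls inside the 12-month lookback window."""
    lookback_start = as_of - timedelta(days=365)
    return any(
        m.rxnorm_code in VEGF_INHIBITOR_RXNORM and m.dispensed_on >= lookback_start
        for m in meds
    )
```

Once the criterion exists in this form, every candidate gets evaluated the same way, and the rule can be reviewed with the medical monitor as a discrete artifact rather than as one coordinator's working interpretation.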
A Two-Pass Approach to Screening
AI-assisted patient matching addresses this by separating the deterministic work from the judgment work. The first pass applies rule-based filters against structured EHR data: ICD-10 diagnosis codes, LOINC lab values, RxNorm medication records, CPT procedure history. These filters run in under two seconds per patient and typically eliminate 80–90% of the screened population before any coordinator time is spent.
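One way to picture that first pass, assuming each patient's structured data has already been normalized into code sets and lab values, is a short sequence of cheap predicate checks that stops at the first hard exclusion. The codes and thresholds here are illustrative, not drawn from any real protocol.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class PatientFacts:
    # Normalized structured data pulled from the EHR ahead of screening.
    icd10_codes: set[str] = field(default_factory=set)
    rxnorm_codes: set[str] = field(default_factory=set)
    labs: dict[str, float] = field(default_factory=dict)  # keyed by LOINC code
    cpt_codes: set[str] = field(default_factory=set)

# Each rule pairs a criterion label with a predicate that returns True when the
# patient is EXCLUDED. Codes and thresholds are examples only.
EXCLUSION_RULES: list[tuple[str, Callable[[PatientFacts], bool]]] = [
    ("Prior disqualifying diagnosis", lambda p: "C78.7" in p.icd10_codes),
    ("Creatinine above protocol ceiling", lambda p: p.labs.get("2160-0", 0.0) > 1.5),
    ("Disallowed concomitant medication", lambda p: bool(p.rxnorm_codes & {"0000001"})),
]

def first_pass(patient: PatientFacts) -> tuple[bool, str | None]:
    """Return (still eligible, first failed criterion) for the deterministic pass."""
    for label, is_excluded in EXCLUSION_RULES:
        if is_excluded(patient):
            return False, label
    return True, None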
What remains after the first pass is a smaller set of candidates who have cleared every hard exclusion criterion checkable from structured data. The second pass then applies NLP models to clinical notes—pathology reports, discharge summaries, progress notes—to surface patients whose eligibility evidence lives in narrative documentation rather than structured fields. This is where rare disease and oncology trials diverge most sharply from simpler protocols: a patient’s prior treatment response or disease severity classification often exists only as attending physician prose, not as a discrete data field.
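The NLP pass could take many forms. A minimal sketch, assuming a general-purpose zero-shot classifier from the Hugging Face transformers library rather than the clinically tuned models a production system would use, scores a note against criteria phrased as plain-language hypotheses:

```python
from transformers import pipeline

# General-purpose NLI model used for zero-shot scoring; a production system would
# likely substitute a clinically fine-tuned model.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

note = (
    "Patient completed six cycles of FOLFOX with partial response; "
    "no evidence of progression on most recent imaging."
)
criteria = [
    "documented response to prior chemotherapy",
    "evidence of disease progression",
]

result = classifier(note, candidate_labels=criteria, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```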
What Changes for Coordinators
The output of a well-implemented patient matching system isn’t just a shorter list. It’s a different kind of list. Each candidate arrives with an eligibility score, a criterion-by-criterion pass/fail breakdown, and the specific data fields that drove each determination. A coordinator reviewing a flagged candidate knows exactly which criteria require manual verification and why—reducing per-candidate review time from 45 minutes to roughly 8 minutes on prioritized candidates.
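A plausible shape for that per-candidate output, with field names chosen here purely for illustration rather than taken from any particular system, looks something like this:

```python
from dataclasses import dataclass

@dataclass
class CriterionResult:
    criterion: str            # protocol criterion text
    status: str               # "pass", "fail", or "needs_verification"
    source_fields: list[str]  # the data elements that drove the determination

@dataclass
class CandidateSummary:
    patient_id: str
    eligibility_score: float          # 0.0-1.0 composite across criteria
    criteria: list[CriterionResult]   # criterion-by-criterion breakdown
    verify_first: list[str]           # criteria the coordinator should check manually

example = CandidateSummary(
    patient_id="site-042-1187",
    eligibility_score=0.91,
    criteria=[
        CriterionResult("ECOG performance status 0-1", "needs_verification",
                        ["progress note, 2024-03-02"]),
        CriterionResult("No prior VEGF inhibitor within 12 months", "pass",
                        ["RxNorm medication history"]),
    ],
    verify_first=["ECOG performance status 0-1"],
)
```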
For mid-size CROs managing 3–25 active trials, that efficiency difference compounds quickly. The bottleneck shifts from screening throughput to site coordinator capacity for the higher-value work: consent conversations, protocol clarifications, and study management tasks that actually require clinical judgment.
Integration Considerations
Effective patient matching requires structured access to EHR data at the site level. FHIR R4 APIs from Epic, Cerner, and MEDITECH provide the data layer—but connection is only the starting point. Eligibility rule sets need to be built from the protocol, tested against historical patient populations, and reviewed with the sponsor’s medical monitor before deployment. Protocol amendments require versioned rule updates, not a rebuild from scratch.
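Testing against historical populations can be as simple as replaying a candidate rule set over patients whose enrollment outcome is already known and checking how many true enrollees the rules would have rejected. A rough sketch, assuming the structured facts and rule pass are represented along the lines of the earlier example:

```python
from typing import Callable

def backtest(rule_pass: Callable[[dict], bool],
             history: list[tuple[dict, bool]]) -> dict[str, float]:
    """Replay a candidate rule set over historical patients with known outcomes.

    rule_pass -- returns True if the rules keep the patient in the pool
    history   -- (structured_facts, actually_enrolled) pairs from past trials
    """
    kept_enrolled = sum(1 for facts, enrolled in history if enrolled and rule_pass(facts))
    total_enrolled = sum(1 for _, enrolled in history if enrolled)
    kept_overall = sum(1 for facts, _ in history if rule_pass(facts))
    return {
        # Fraction of true enrollees the rules would have let through (want ~1.0).
        "enrolled_recall": kept_enrolled / max(total_enrolled, 1),
        # Fraction of the screened population surviving the first pass.
        "survival_rate": kept_overall / max(len(history), 1),
    }
```

A rule set that rejects patients who actually enrolled in a comparable prior trial is a rule set that needs another look before deployment.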
The CROs seeing the strongest results are treating patient matching as a protocol intake workflow, not an IT integration project. The technology infrastructure matters, but the clinical accuracy of the eligibility logic matters more. Getting that right—for each trial, each protocol version, each site’s particular EHR configuration—is what separates a genuinely useful screening tool from one that coordinators stop trusting after the first round of edge cases.
Clinical trial enrollment delays cost the life sciences industry billions of dollars each year, but the most persistent bottleneck rarely gets discussed as a systems problem. In most Phase II and Phase III studies, the primary cause of enrollment slippage isn’t a shortage of potentially eligible patients — it’s the manual process coordinators use to find them.
The numbers that frame this problem are worth sitting with. Mid-size CROs managing oncology and rare disease trials routinely screen 8 to 25 patients for every one who enrolls. A significant portion of those screen failures happen on secondary exclusion criteria that are fully computable from structured EHR data — a prior diagnosis code, a recent lab value outside the protocol’s acceptable range, a medication record that disqualifies participation. The coordinator doing the manual chart review often doesn’t encounter that disqualifying data point until 35 or 40 minutes into the review, after the patient’s record has already been pulled, printed, and annotated.
Where Manual Screening Actually Breaks Down
The challenge with manual eligibility screening isn’t that coordinators are slow or careless — it’s that the process is structured in a way that guarantees inefficiency. A typical paper or spreadsheet-based pre-screening workflow checks eligibility criteria sequentially, in the order they appear in the protocol. Hard exclusion criteria that could be resolved in seconds from a diagnosis code database often appear at positions 28, 33, or 38 in a 40-criterion list, after 20 minutes of softer criteria have already been evaluated.
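To see how much order matters, imagine annotating each criterion with whether it is computable from structured data and roughly how long it takes to check manually, then sorting so the cheap, computable exclusions run first. The criteria and effort estimates below are made up purely to illustrate the reordering.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    label: str
    computable: bool          # resolvable from structured EHR data
    minutes_to_check: float   # rough manual effort if not computable

# Illustrative fragment, listed in the order criteria might appear in the protocol document.
protocol_order = [
    Criterion("Adequate organ function per labs", False, 6.0),
    Criterion("Willing to comply with visit schedule", False, 3.0),
    Criterion("No prior disqualifying diagnosis (ICD-10)", True, 0.0),
    Criterion("Hemoglobin within protocol window (LOINC)", True, 0.0),
]

# Screening order: computable hard checks first, then manual criteria by ascending effort.
screening_order = sorted(protocol_order, key=lambda c: (not c.computable, c.minutes_to_check))
for c in screening_order:
    print(f"{'auto' if c.computable else 'manual':<6} | {c.label}")
```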
Rare disease and oncology protocols compound this problem. Eligibility criteria in these trials tend to be more numerous and more nuanced than in broader-indication studies — specific prior treatment histories, narrow laboratory value windows, time-from-diagnosis requirements that involve date math across multiple records. A coordinator managing three active trials simultaneously, each with its own eligibility checklist, is doing constant cognitive switching just to keep the criteria straight.
The downstream effect is measured in time and money. Screen failures cost mid-size CROs an estimated $3,000 to $8,000 per failure in coordinator labor and site fees. Across a 200-patient trial with a 70% screen-failure rate, roughly 470 candidates fail screening on the way to 200 enrollments, which puts that cost between about $1.4 million and $3.7 million. And that’s before accounting for the enrollment delay itself, which typically runs 4 to 8 months for Phase II-III trials where manual screening is the primary pre-qualification method.
What AI-Driven Patient Matching Actually Does
The phrase “AI patient matching” covers a lot of ground, so it’s worth being specific about what the approach actually involves. The most effective systems — and the design Cohortbridge uses — are built around a two-pass architecture that separates deterministic filtering from probabilistic NLP.
The first pass is entirely rule-based. Every hard exclusion criterion that can be resolved from structured EHR data — ICD-10 diagnosis codes, LOINC lab values, RxNorm medication records, CPT procedure history — is evaluated automatically before any coordinator time is spent. This pass typically runs in under two seconds per patient record and eliminates 80 to 90 percent of the screened population. The patients who remain are those who have already cleared the computable gate.
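Assuming the site exposes FHIR R4 resources, the structured facts this pass consumes can be lifted from standard resource types. The sketch below pulls ICD-10 codes out of Condition resources and LOINC-keyed lab values out of Observation resources; real bundles carry far more detail, but the shape is the same.

```python
ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10-cm"
LOINC_SYSTEM = "http://loinc.org"

def extract_facts(bundle: dict) -> dict:
    """Flatten a FHIR R4 search bundle into the code sets the rule pass consumes."""
    icd10, labs = set(), {}
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        rtype = resource.get("resourceType")
        if rtype == "Condition":
            for coding in resource.get("code", {}).get("coding", []):
                if coding.get("system") == ICD10_SYSTEM:
                    icd10.add(coding["code"])
        elif rtype == "Observation":
            quantity = resource.get("valueQuantity", {})
            for coding in resource.get("code", {}).get("coding", []):
                if coding.get("system") == LOINC_SYSTEM and "value" in quantity:
                    labs[coding["code"]] = quantity["value"]
    return {"icd10_codes": icd10, "labs": labs}
```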
The second pass applies a transformer-based NLP model to the clinical notes of remaining candidates. Oncology and rare disease trials often require eligibility evidence that only exists in narrative form — a pathology report describing disease stage in clinical language, a physician’s note documenting prior treatment response, a discharge summary that records a procedure in text before the CPT code is billed. The NLP model processes discharge summaries, progress notes, and pathology reports, scoring each candidate on likelihood of meeting note-dependent inclusion criteria and surfacing the specific passages that drove the score.
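Surfacing those passages does not require anything exotic. Given any sentence-level scoring function, whether the zero-shot approach sketched earlier or a clinically fine-tuned model, the evidence step is essentially ranking a note's sentences against the criterion and keeping the top few:

```python
import re
from typing import Callable

def top_evidence(note: str, criterion: str,
                 score: Callable[[str, str], float], k: int = 3) -> list[tuple[float, str]]:
    """Return the k note sentences that most strongly support the criterion."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]
    ranked = sorted(((score(s, criterion), s) for s in sentences), reverse=True)
    return ranked[:k]
```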
What Gets Delivered to the Coordinator
The output of this process isn’t a separate tool or a new inbox to check. It’s a ranked candidate list delivered directly into the site coordinator’s existing trial management system — whether that’s Medidata Rave or Veeva Vault CTMS. Each candidate record shows the overall eligibility score, a criterion-by-criterion pass/fail breakdown with supporting data, flags for criteria that require coordinator verification, and a one-page consent-ready clinical summary.
The coordinator’s job shifts from manual chart review to verification and consent preparation. Pre-screening time drops from 45 minutes per patient to approximately 8 minutes per prioritized candidate. Coordinators aren’t taken out of the loop — they’re brought in at the point where their judgment actually matters, rather than at the beginning of a process most of which an automated system can handle.
The Integration Side
Making this work at a mid-size CRO requires clean integration with the clinical sites’ EHR systems. The standard connection path runs through HL7 FHIR R4 APIs, which are now widely supported by Epic, Cerner Millennium, and MEDITECH Expanse. Epic, for example, exposes structured patient data to authorized applications through its FHIR APIs, subject to each site’s app-approval and authorization process. Under a HIPAA Business Associate Agreement executed with each participating site, patient records can be queried and scored ephemerally — PHI isn’t retained after the screening session completes.
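At the wire level, those queries are ordinary FHIR R4 REST searches. A minimal sketch with the requests library, using a placeholder base URL and token, and serum creatinine (LOINC 2160-0) purely as an example observation:

```python
import requests

FHIR_BASE = "https://ehr.example-site.org/fhir/r4"   # placeholder endpoint
TOKEN = "..."                                        # issued under the site's authorization flow

def recent_creatinine(patient_id: str) -> list[dict]:
    """Fetch recent creatinine Observations for one patient via a standard FHIR R4 search."""
    resp = requests.get(
        f"{FHIR_BASE}/Observation",
        params={
            "patient": patient_id,
            "code": "http://loinc.org|2160-0",   # serum creatinine
            "date": "ge2024-01-01",              # example lookback window
            "_sort": "-date",
        },
        headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/fhir+json"},
        timeout=30,
    )
    resp.raise_for_status()
    return [e["resource"] for e in resp.json().get("entry", [])]
```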
Protocol amendments, which are common in Phase II-III oncology trials as sponsors adjust eligibility criteria mid-enrollment, are handled through versioned rule sets. An amendment that changes a laboratory value threshold or adds an exclusion criterion can be incorporated without restarting the screening model from scratch — the rule set is updated, reviewed with the medical monitor, and redeployed against the candidate pool.
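In practice, a versioned rule set can be as lightweight as an immutable record of the thresholds and code lists in force for each protocol version, so an amendment produces a new version alongside the old one rather than an in-place edit. A simplified illustration, with made-up values:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RuleSet:
    protocol_version: str
    creatinine_max: float            # mg/dL, LOINC 2160-0
    excluded_icd10: frozenset[str]

v2_0 = RuleSet("2.0", creatinine_max=1.5, excluded_icd10=frozenset({"C78.7"}))

# Amendment 2.1: sponsor widens the renal-function window and adds one exclusion.
v2_1 = replace(
    v2_0,
    protocol_version="2.1",
    creatinine_max=1.8,
    excluded_icd10=v2_0.excluded_icd10 | {"I50.9"},
)

ACTIVE_RULESETS = {"2.0": v2_0, "2.1": v2_1}   # older versions kept for audit and replay
```

Keeping prior versions around matters for audit: candidates screened under version 2.0 can still be traced back to the exact criteria in effect at the time.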
What This Means for CRO Operations
The operational case for AI-driven patient matching at a mid-size CRO comes down to a simple question: where is coordinator time currently being spent, and is that the highest-value use of it? In most trials, the answer is that coordinators are spending a significant portion of their time on pre-screening work that is at least partially computable from data the site already has. Automating that work doesn’t eliminate the coordinator’s role — it concentrates their attention on the cases where their clinical judgment actually adds value.
For CROs managing 10 or more active trials simultaneously, the compounding effect is significant. Reducing pre-screening time by 80% across a portfolio of trials frees up coordinator capacity that can be directed toward enrollment, patient communication, and protocol compliance — the parts of the job that can’t be automated and where human attention produces the most direct impact on trial quality.