Well-designed randomized trials provide high-quality clinical evidence but are not always feasible or ethical. In their absence, the electronic medical record (EMR) offers a platform for conducting comparative effectiveness research, which is central to the emerging academic learning health system (aLHS) model. A barrier to realizing this vision is the lack of an efficient process for generating a reference comparison group for each patient.
To test a multi-step process for the selection of comparators in the EMR.
We conducted a mixed-methods study within a large aLHS in North Carolina. We (1) created a list of 35 candidate variables; (2) surveyed 270 researchers to assess the importance of the candidate variables; and (3) built consensus rankings of the variables identified by the survey (ie, those with importance scores >7) across two panels of 7–8 clinical research experts. Prioritized algorithm inputs were collected from the EMR and applied using a greedy matching technique. Feasibility was measured as the percentage of patients with 100 matched comparators, and performance was measured via computational time and Euclidean distance.
Nine variables were selected: age, sex, race, ethnicity, body mass index, insurance status, smoking status, Charlson Comorbidity Index, and neighborhood percentage in poverty. The final process successfully generated 100 matched comparators for each of 1.8 million candidate patients, ran in less than 100 min for the majority of strata, and had an average Euclidean distance of 0.043.
EMR-derived matching is feasible to implement across a diverse patient population and can provide a reproducible, efficient source of comparator data for observational studies, although additional testing in clinical research applications is needed.
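The greedy matching step and the two reported measures (feasibility and average Euclidean distance) can be illustrated with a minimal Python sketch. This is not the study's implementation. It assumes that each patient is represented by a numeric feature vector already scaled to a common range, that comparators are selected with replacement (a comparator may serve more than one patient), and that stratification is omitted. The function names (`greedy_comparators`, `match_all`, `feasibility`, `mean_distance`) and the toy data are hypothetical.

```python
import heapq
import math


def greedy_comparators(index_features, pool, k):
    """Pick the k pool members closest to the index patient (Euclidean distance)."""
    nearest = heapq.nsmallest(
        k,
        ((math.dist(index_features, feats), pid) for pid, feats in pool.items()),
    )
    return [(pid, dist) for dist, pid in nearest]


def match_all(patients, k):
    """Match every patient against all other patients (assumption: with replacement)."""
    results = {}
    for pid, feats in patients.items():
        pool = {other: f for other, f in patients.items() if other != pid}
        results[pid] = greedy_comparators(feats, pool, k)
    return results


def feasibility(results, k):
    """Percentage of patients who received the full set of k comparators."""
    return 100.0 * sum(len(m) == k for m in results.values()) / len(results)


def mean_distance(results):
    """Average Euclidean distance over all patient-comparator pairs."""
    dists = [d for matches in results.values() for _, d in matches]
    return sum(dists) / len(dists)


# Toy data: hypothetical, pre-scaled two-feature vectors for four patients.
patients = {
    "a": (0.0, 0.0),
    "b": (0.1, 0.0),
    "c": (0.0, 0.2),
    "d": (1.0, 1.0),
}
results = match_all(patients, k=2)
```

In this sketch, `feasibility(results, k=2)` gives the share of patients with a complete comparator set, and `mean_distance(results)` summarizes how close the matches are. A production version would also need categorical-variable handling and a more scalable neighbor search, neither of which is shown here.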