TriNetX Dataworks-USA: Overview of a Multi-Purpose, De-Identified, Federated Electronic Health Record Real-World Data and Analytics Network and Comparison to the US Census.
Ellen Stein, Matthias Hüser, E Susan Amirian, Matvey B Palchuk, Jeffrey S Brown
Pharmacoepidemiology and Drug Safety, volume 34, issue 9, e70198. Published 2025-09-01. DOI: 10.1002/pds.70198 (https://doi.org/10.1002/pds.70198). Impact factor: 2.4 (JCR Q3, Pharmacology & Pharmacy).
Abstract
Introduction: Many clinical data networks focus on a single use case or disease. By contrast, the TriNetX Dataworks-USA Network contains real-world clinical information that can be applied to multiple research questions and use cases. The purpose of this study is to describe the Network's characteristics, as well as its generalizability to the US population, particularly the healthcare-seeking population.
Methods: Basic demographics were summarized from the Dataworks-USA Network, a large, regularly updated data network containing de-identified patient electronic health record (EHR) information from across the United States. To examine the generalizability of the Network, these demographics were compared to the US Census Bureau International Database (IDB) 2022 data and the National Cancer Institute's version of the Census Bureau's U.S. County Population Data for 2022.
Results: Patients in the Dataworks-USA Network are approximately 5 years older than the Census population, and the Network has a larger proportion of female patients. The Network has a lower proportion of patients identified as Asian or White race, and a higher proportion identified as Other, relative to the Census; the remaining races are similar between the two data sources (< 1% difference). Regionally, Dataworks-USA has a smaller proportion of patients in all race categories compared with the Census, owing to its larger proportion of patients of Unknown or Other race.
Conclusions: TriNetX's Dataworks-USA Network provides a robust data source for many use cases and is broadly generalizable to the US population, particularly the healthcare-seeking population, with differences related to the underlying nature of the data sources.
About the journal:
The aim of Pharmacoepidemiology and Drug Safety is to provide an international forum for the communication and evaluation of data, methods and opinion in the discipline of pharmacoepidemiology. The Journal publishes peer-reviewed reports of original research, invited reviews and a variety of guest editorials and commentaries embracing scientific, medical, statistical, legal and economic aspects of pharmacoepidemiology and post-marketing surveillance of drug safety. Appropriate material in these categories may also be considered for publication as a Brief Report.
Particular areas of interest include:
design, analysis, results, and interpretation of studies looking at the benefit or safety of specific pharmaceuticals, biologics, or medical devices, including studies in pharmacovigilance, postmarketing surveillance, pharmacoeconomics, patient safety, molecular pharmacoepidemiology, or any other study within the broad field of pharmacoepidemiology;
comparative effectiveness research relating to pharmaceuticals, biologics, and medical devices. Comparative effectiveness research is the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition, as these methods are truly used in the real world;
methodologic contributions of relevance to pharmacoepidemiology, whether original contributions, reviews of existing methods, or tutorials for how to apply the methods of pharmacoepidemiology;
assessments of harm versus benefit in drug therapy;
patterns of drug utilization;
relationships between pharmacoepidemiology and the formulation and interpretation of regulatory guidelines;
evaluations of risk management plans and programmes relating to pharmaceuticals, biologics and medical devices.