Linking The Cancer Imaging Archive and GenBank to the National Clinical Cohort Collaborative

IF 2.6 Q2 HEALTH POLICY & SERVICES
Ahmad Baghal, Joel Saltz, Tahsin Kurc, Prateek Prasanna, Samantha Baghal, Janos Hajagos, Erich Bremer, Shaymaa Al-Shukri, Joshua L. Kennedy, Michael Rutherford, Tracy Nolan, Kirk Smith, Christopher G. Chute, Fred Prior
Learning Health Systems, volume 9, issue 1. Published 2024-09-12. DOI: 10.1002/lrh2.10457
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11733468/pdf/

Abstract

Objective

This project demonstrates the feasibility of connecting medical imaging data and features, together with SARS-CoV-2 genome variants, to clinical data in the National Clinical Cohort Collaborative (N3C) repository, in order to accelerate integrative research on the detection, diagnosis, and treatment of COVID-19-related morbidities. N3C has curated a rich collection of aggregated and de-identified electronic health record (EHR) data on over 18 million patients, including 7.5 million COVID-positive patients, seen at hospitals across the United States. Medical imaging data and variant samples are important data modalities in the study of COVID-19.

Materials and Methods

Imaging data and features are hosted on The Cancer Imaging Archive (TCIA), and sequenced variant samples are analyzed and stored at the NIH GenBank. The University of Arkansas for Medical Sciences (UAMS) published the first COVID-19 data set of 105 patients on TCIA and 37 patients on GenBank. We developed a process to link imaging data and genomic variants with N3C EHR data through Privacy Preserving Record Linkage (PPRL), which uses de-identified cryptographic hashes to match records associated with the same individual without exposing patient identifiers.
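The hash-based matching described above can be illustrated with a minimal sketch. The abstract does not specify the hashing scheme, the identifying fields, or the software used, so everything here is an assumption made for illustration: the keyed HMAC-SHA256 construction, the choice of name and date of birth as inputs, the shared secret, and the helper names `normalize`, `make_token`, and `link_by_token` are all hypothetical and are not the authors' implementation.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Lowercase, trim, and collapse whitespace so formatting differences do not change the hash."""
    return " ".join(value.strip().lower().split())


def make_token(secret: bytes, first: str, last: str, dob: str) -> str:
    """Derive a de-identified token from identifying fields (assumed fields, not from the paper)."""
    message = "|".join(normalize(part) for part in (first, last, dob))
    # Keyed hash: without the secret, tokens cannot be recomputed from guessed identities.
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_by_token(left: dict, right: dict) -> dict:
    """Pair up payloads whose tokens appear in both data sets."""
    return {token: (left[token], right[token]) for token in left.keys() & right.keys()}


# Hypothetical secret that each contributing site would hold; never shared with the linker.
SECRET = b"hypothetical-shared-secret"

# Each site tokenizes locally and ships only tokens plus de-identified payloads.
clinical = {
    make_token(SECRET, "Ada", "Lovelace", "1815-12-10"): {"covid_positive": True},
    make_token(SECRET, "Alan", "Turing", "1912-06-23"): {"covid_positive": False},
}
imaging = {
    make_token(SECRET, " ada ", "LOVELACE", "1815-12-10"): {"modality": "CT"},
}

linked = link_by_token(clinical, imaging)
for token, (ehr_row, image_row) in linked.items():
    print(token[:12], ehr_row, image_row)
```

In this sketch the party performing the linkage sees only tokens and de-identified payloads. Exact-match tokens of this kind do not tolerate typos or name changes, so a production PPRL system would typically need additional handling for such cases.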

Results

The PPRL techniques were piloted using clinical and imaging data sets provided by UAMS. The software components and processes we developed executed properly, and the linked data were returned and processed for visualization.

Conclusion

Linking clinical data sources at the patient level provides opportunities to gain insights that might otherwise remain unknown. The PPRL prototype and the pilot serve as a model for linking disparate and diverse data repositories to enhance clinical research.
