Systematic curation and analysis of ovarian cancer data across multiple electronic record systems held within the UK National Health Service: a tertiary referral centre experience

A. Samani, G. Giannone, L. Mercuri, R. Jiang, Y. Nadkarni, A. Chadha, E. Xing, S. Ghaem-Maghami, J. Krell, D. Lyons, C. Fotopoulou, D. Papadimitriou, B. Glampson, E. Mayer, I. McNeish, L. Tookman
ESMO Real World Data and Digital Oncology, volume 9, Article 100150. Published 2025-06-19. Journal article.
DOI: 10.1016/j.esmorw.2025.100150
https://www.sciencedirect.com/science/article/pii/S2949820125000396

Abstract

Background

Manual curation of real-world data (RWD) for patients with ovarian cancer is complex and costly. We set up a novel collaboration between informatics and clinical teams to automate data curation at scale. This enabled integrated and timely access to RWD for all patients with ovarian cancer treated within a tertiary gynaecological cancer centre of the UK National Health Service, laying the basis for research and operational use.

Materials and methods

The collaboration defined high-yield, accessible data, which were pulled into tables representing various clinical domains and then systematically integrated, cleaned and analysed within the iCARE Secure Data Environment.
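
As an illustration only, the integration and cleaning step described above might look like the following minimal Python sketch. It is not the authors' pipeline: the two domain tables, every field name (`patient_id`, `ethnicity`, `diagnosis_date`, `line`, `start_date`) and the helper functions are hypothetical, and the real source systems and the iCARE environment are not modelled.

```python
from collections import defaultdict
from datetime import date

# Hypothetical extracts, one per clinical domain; all field names are illustrative.
demographics = [
    {"patient_id": "P001", "ethnicity": "", "diagnosis_date": "2015-03-02"},
    {"patient_id": "P002", "ethnicity": "White British", "diagnosis_date": "2020-11-17"},
    {"patient_id": "P002", "ethnicity": "White British", "diagnosis_date": "2020-11-17"},  # duplicate row
]
treatments = [
    {"patient_id": "P001", "line": 2, "start_date": "2017-09-12"},
    {"patient_id": "P001", "line": 1, "start_date": "2015-04-01"},
    {"patient_id": "P002", "line": 1, "start_date": "2020-12-05"},
]


def clean_demographics(rows):
    """Drop duplicate patients, parse dates, and map empty strings to None."""
    cleaned = {}
    for row in rows:
        pid = row["patient_id"]
        if pid in cleaned:
            continue  # keep the first record seen for each patient
        cleaned[pid] = {
            "patient_id": pid,
            "ethnicity": row["ethnicity"].strip() or None,
            "diagnosis_date": date.fromisoformat(row["diagnosis_date"]),
        }
    return cleaned


def integrate(demo_rows, treatment_rows):
    """Attach each patient's treatment lines to their cleaned demographic record."""
    patients = clean_demographics(demo_rows)
    lines_by_patient = defaultdict(list)
    for row in treatment_rows:
        lines_by_patient[row["patient_id"]].append(
            {"line": row["line"], "start_date": date.fromisoformat(row["start_date"])}
        )
    for pid, record in patients.items():
        record["treatment_lines"] = sorted(
            lines_by_patient.get(pid, []), key=lambda t: t["line"]
        )
    return patients


def completeness(patients, field):
    """Fraction of patients with a non-missing value for `field`."""
    if not patients:
        return 0.0
    present = sum(1 for p in patients.values() if p[field] is not None)
    return present / len(patients)


cohort = integrate(demographics, treatments)
```

Calling `completeness(cohort, "ethnicity")` on the integrated cohort gives a simple per-variable completeness figure, the kind of check that would surface sparsely recorded variables.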

Results

We curated data for 1581 patients diagnosed between 1 January 2014 and 31 December 2022. Referrals to the specialist tumour board increased consistently over time, while baseline characteristics did not change significantly. The number of patients receiving a new line of therapy decreased in 2020, the first year of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak. Data robustness was supported by multivariate survival modelling, which demonstrated the expected impact of known prognostic factors. Available data were sparse for some variables (e.g. ethnicity), while others (genomic data) lacked a consistent storage mechanism within source systems.

Conclusions

Automated curation and analysis of RWD are possible at scale and in real time. Analysis yielded clinical findings consistent with the prevalent literature and showed the evolution of treatment practice. While not all unstructured data could be explored, we demonstrate that automated curation of clinically important real-world variables is feasible and can yield robust data for both research and operational purposes.