Assessment of BRCA1 and BRCA2 Germline Variant Data From Patients With Breast Cancer in a Real-World Data Registry.

Impact factor: 3.3 (JCR Q2, Oncology)

JCO Clinical Cancer Informatics. Pub Date: 2024-05-01. DOI: 10.1200/CCI.23.00251

Thales C Nepomuceno, Paulo Lyra, Jianbin Zhu, Fanchao Yi, Rachael H Martin, Daniel Lupu, Luke Peterson, Lauren C Peres, Anna Berry, Edwin S Iversen, Fergus J Couch, Qianxing Mo, Alvaro N Monteiro

Volume 8, article e2300251. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11161245/pdf/

Abstract

Purpose: The emergence of large real-world clinical databases and tools to mine electronic medical records has allowed for an unprecedented look at large data sets with clinical and epidemiologic correlates. In clinical cancer genetics, real-world databases allow for the investigation of prevalence and effectiveness of prevention strategies and targeted treatments and for the identification of barriers to better outcomes. However, real-world data sets have inherent biases and problems (eg, selection bias, incomplete data, measurement error) that may hamper adequate analysis and affect statistical power.

Methods: Here, we leverage a real-world clinical data set from a large health network for patients with breast cancer tested for variants in BRCA1 and BRCA2 (N = 12,423). We conducted data cleaning and harmonization, cross-referenced the data with publicly available databases, performed variant reassessment and functional assays, and used the functional data to inform each variant's clinical significance by applying the American College of Medical Genetics and Genomics and Association for Molecular Pathology (ACMG/AMP) guidelines.
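The abstract does not describe the cleaning rules themselves, so the following is only a minimal sketch of what harmonizing free-text variant designations might look like. The function name `harmonize_designation`, the simplified regular expression, and the example strings are all assumptions made for illustration; the pattern covers a small subset of HGVS coding-DNA syntax and is not the authors' pipeline.

```python
import re
from typing import Optional

# Simplified pattern for HGVS-style coding-DNA designations such as
# "c.68_69delAG" or "c.181T>G". Illustrative only: real HGVS syntax is
# much richer than this subset.
HGVS_CDNA = re.compile(
    r"^c\.\d+(?:[+-]\d+)?(?:_\d+(?:[+-]\d+)?)?"
    r"(?:[ACGT]>[ACGT]|del(?:[ACGT]+)?|dup(?:[ACGT]+)?|ins[ACGT]+|delins[ACGT]+)$"
)


def harmonize_designation(raw: Optional[str]) -> Optional[str]:
    """Return a normalized designation, or None if it cannot be recognized.

    Hypothetical helper: strips whitespace, restores the "c." prefix,
    fixes letter case, and validates against the simplified pattern above.
    """
    if raw is None:
        return None
    text = re.sub(r"\s+", "", raw)  # drop all whitespace
    if not text:
        return None
    # Remove an existing "c." prefix in any letter case.
    body = text[2:] if text[:2].lower() == "c." else text
    # Bases are upper case; HGVS keywords are lower case.
    body = body.upper()
    for keyword in ("DELINS", "DEL", "DUP", "INS"):
        body = body.replace(keyword, keyword.lower())
    candidate = "c." + body
    return candidate if HGVS_CDNA.match(candidate) else None
```

Entries that return `None` would be the ones routed to manual curation in a workflow like the one described; anything the pattern cannot confirm is deliberately left unresolved rather than guessed.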

Results: In the cohort, White and Black patients were over-represented, whereas non-White Hispanic and Asian patients were under-represented. Incorrect or missing variant designations were the most significant contributor to data loss. While manual curation corrected many incorrect designations, a sizable fraction of patient carriers remained with incorrect or missing variant designations. Despite the large number of patients with clinical significance not reported, original reported clinical significance assessments were accurate. Reassessment of variants in which clinical significance was not reported led to a marked improvement in data quality.
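To make the notion of data loss concrete, the sketch below tallies records by the reason they would be unusable. The record layout (keys `designation` and `significance`), the category labels, and the function names are hypothetical and chosen for illustration; the abstract reports no per-category counts, so none are reproduced here.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Optional


def classify_record(designation: Optional[str], significance: Optional[str]) -> str:
    """Assign one hypothetical data-quality category to a carrier record."""
    if designation is None or not designation.strip():
        return "missing_designation"
    if significance is None or not significance.strip():
        return "significance_not_reported"
    return "usable"


def tally_data_loss(records: Iterable[Mapping[str, Optional[str]]]) -> Dict[str, int]:
    """Count records per category so the main source of loss is visible."""
    counts = Counter(
        classify_record(r.get("designation"), r.get("significance"))
        for r in records
    )
    return dict(counts)
```

A tally like this, run before and after manual curation, is one simple way to see how much each correction step recovers.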

Conclusion: We identify the most common issues with BRCA1 and BRCA2 testing data entry and suggest approaches to minimize data loss and keep interpretation of clinical significance of variants up to date.

Source journal

JCO Clinical Cancer Informatics (category: Oncology)

CiteScore: 6.20

Self-citation rate: 4.80%

Articles published: 190