The impact of misclassification errors on the performance of biomarkers based on next-generation sequencing, a simulation study.

IF 1.2 | Zone 4, Medicine | JCR Q4, PHARMACOLOGY & PHARMACY

Journal of Biopharmaceutical Statistics Pub Date : 2024-08-01 Epub Date: 2023-10-11 DOI:10.1080/10543406.2023.2269251

Dong Wang, Sue-Jane Wang, Samir Lababidi

Pages: 700-718

Citations: 0

Abstract


The development of next-generation sequencing (NGS) opens opportunities for new applications such as liquid biopsy, in which tumor mutation genotypes can be determined by sequencing circulating tumor DNA after blood draws. However, with highly diluted samples like those obtained with liquid biopsy, NGS invariably introduces a certain level of misclassification, even with improved technology. Recently, there has been a high demand to use mutation genotypes as biomarkers for predicting prognosis and treatment selection. Many methods have also been proposed to build classifiers based on multiple loci with machine learning algorithms as biomarkers. How the higher misclassification rate introduced by liquid biopsy will affect the performance of these biomarkers has not been thoroughly investigated. In this paper, we report the results from a simulation study focused on the clinical utility of biomarkers when misclassification is present due to the current technological limit of NGS in the liquid biopsy setting. The simulation covers a range of performance profiles for current NGS platforms with different machine learning algorithms and uses actual patient genotypes. Our results show that, at the high end of the performance spectrum, the misclassification introduced by NGS had very little effect on the clinical utility of the biomarker. However, in more challenging applications with lower accuracy, misclassification could have a notable effect on clinical utility. The pattern of this effect can be complex, especially for machine learning-based classifiers. Our results show that simulation can be an effective tool for assessing different scenarios of misclassification.
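The kind of question the abstract raises can be illustrated with a toy Monte Carlo sketch. This is not the authors' simulation: the study uses actual patient genotypes, multiple loci, and machine learning classifiers, whereas the sketch below assumes a single binary mutation locus used directly as the biomarker. The function name `simulate_accuracy`, the prevalence, the response rates, and the sensitivity/specificity profiles are all hypothetical values chosen only for illustration.

```python
import random


def simulate_accuracy(sensitivity, specificity, n=20000, prevalence=0.3,
                      p_resp_mut=0.8, p_resp_wt=0.2, seed=0):
    """Fraction of simulated patients whose outcome matches the called genotype.

    All default parameters are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        # True genotype at a single hypothetical locus.
        true_mut = rng.random() < prevalence
        # Clinical outcome depends only on the true genotype.
        responds = rng.random() < (p_resp_mut if true_mut else p_resp_wt)
        # The assay call is the true genotype plus misclassification.
        if true_mut:
            called_mut = rng.random() < sensitivity   # a miss is a false negative
        else:
            called_mut = rng.random() >= specificity  # a hit is a false positive
        # The biomarker predicts response exactly when a mutation is called.
        correct += (called_mut == responds)
    return correct / n


if __name__ == "__main__":
    # Hypothetical assay profiles, from error-free to a more challenging setting.
    profiles = {
        "error-free": (1.0, 1.0),
        "high-end": (0.99, 0.999),
        "challenging": (0.70, 0.95),
    }
    for name, (sens, spec) in profiles.items():
        print(f"{name}: accuracy = {simulate_accuracy(sens, spec):.3f}")
```

Every iteration draws exactly three random numbers, so with a fixed seed each profile is evaluated on the same simulated cohort and the differences between profiles reflect only the misclassification step. Under these assumed parameters the near-perfect profile should stay close to the error-free accuracy while the lower-accuracy profile falls noticeably below it, which is the qualitative pattern the abstract describes for a simple biomarker.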


Source journal

Journal of Biopharmaceutical Statistics (Medicine: Statistics & Probability)

CiteScore

2.50

Self-citation rate

18.20%

Articles published

Review time

6-12 weeks

About the journal: The Journal of Biopharmaceutical Statistics, a rapid publication journal, discusses quality applications of statistics in biopharmaceutical research and development. Now publishing six times per year, it includes expositions of statistical methodology with immediate applicability to biopharmaceutical research in the form of full-length and short manuscripts, review articles, selected/invited conference papers, short articles, and letters to the editor. Addressing timely and provocative topics important to the biostatistical profession, the journal covers: Drug, device, and biological research and development; Drug screening and drug design; Assessment of pharmacological activity; Pharmaceutical formulation and scale-up; Preclinical safety assessment; Bioavailability, bioequivalence, and pharmacokinetics; Phase I, II, and III clinical development including complex innovative designs; Premarket approval assessment of clinical safety; Postmarketing surveillance; Big data and artificial intelligence and applications.