Genotyping from targeted NGS data based on a small set of SNPs correctly matches patient samples.

Impact factor: 1.7 · JCR quartile: Q2 (Multidisciplinary Sciences)

BMC Research Notes · Published: 2025-07-02 · DOI: 10.1186/s13104-025-07348-3

Deyan Yordanov Yosifov, Christof Schneider, Stephan Stilgenbauer, Daniel Mertens, Eugen Tausch

BMC Research Notes, volume 18, issue 1, page 270. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12225085/pdf/

Citations: 0

Abstract

Objective: Mislabelling and swapping of laboratory samples are handling errors that can lead to erroneous interpretation of data and/or patient harm. Sequenced samples can be traced back to the respective donors by matching of single nucleotide polymorphisms (SNPs). Frameworks and software to do this have been developed for use with whole genome/exome sequencing data but not for targeted next-generation sequencing (tNGS), possibly due to the limited genomic coverage with tNGS and the need for individualization of the set of interrogated SNPs. We decided to adapt a popular tool for use with tNGS data, to demonstrate the possibility of selecting informative SNPs from a typical tNGS panel and to create an automated workflow for detection of sample handling errors.

Results: We compiled a custom list of 28 SNPs and used it to demonstrate that tNGS data alone are sufficient to detect mislabelled samples cost-effectively. In two cohorts totalling 1441 patients with sequential samples, we identified 3 sample swaps, 7 mislabelled samples (3 mislabelled externally and 4 internally) and 1 mistake of unknown origin. We provide an R function for automated detection of sample swaps and mislabelling to the community as a free and open-source tool.
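The matching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R function or the tool they adapted. The genotype encoding, the function names `concordance` and `flag_handling_errors`, the 0.9 concordance threshold and the `min_shared` cut-off are all assumptions made for illustration. The sketch compares genotype calls pairwise and flags two situations: samples labelled as the same patient whose genotypes disagree, and samples labelled as different patients whose genotypes agree.

```python
from itertools import combinations

def concordance(g1, g2, min_shared=10):
    """Fraction of identical genotype calls among SNPs called in both samples.

    Genotypes are assumed to be normalised strings such as "AA" or "AG";
    None marks a SNP without a call. Returns None if too few SNPs are shared.
    """
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if len(shared) < min_shared:
        return None
    return sum(a == b for a, b in shared) / len(shared)

def flag_handling_errors(samples, threshold=0.9, min_shared=10):
    """samples maps sample_id -> (patient_label, list of genotype calls).

    Returns (sample_id_1, sample_id_2, reason, concordance) for suspicious pairs.
    The threshold is a hypothetical value, not one taken from the paper.
    """
    flags = []
    for (id1, (p1, g1)), (id2, (p2, g2)) in combinations(sorted(samples.items()), 2):
        c = concordance(g1, g2, min_shared)
        if c is None:
            continue  # not enough shared calls to judge this pair
        if p1 == p2 and c < threshold:
            flags.append((id1, id2, "same label, genotypes differ", c))
        elif p1 != p2 and c >= threshold:
            flags.append((id1, id2, "different labels, genotypes match", c))
    return flags

# Toy data with 4 SNPs instead of 28; "P2_t2" carries P1's genotype.
demo = {
    "P1_t1": ("P1", ["AA", "AG", "GG", "CT"]),
    "P1_t2": ("P1", ["AA", "AG", "GG", "CT"]),
    "P2_t1": ("P2", ["AG", "GG", "AA", "CC"]),
    "P2_t2": ("P2", ["AA", "AG", "GG", None]),
}
flags = flag_handling_errors(demo, min_shared=3)
for flag in flags:
    print(flag)
```

In the toy data, the sample labelled `P2_t2` disagrees with the other P2 sample and agrees with both P1 samples, which is the pattern one would expect from a mislabelled tube. A real workflow would also need to handle sequencing errors, low coverage and related donors, which this sketch ignores.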

Source journal

BMC Research Notes. Subject area: Biochemistry, Genetics and Molecular Biology (all)

CiteScore: 3.60

Self-citation rate: 0.00%

Articles published: 363

Review time: 15 weeks

Journal description: BMC Research Notes publishes scientifically valid research outputs that cannot be considered as full research or methodology articles. We support the research community across all scientific and clinical disciplines by providing an open access forum for sharing data and useful information; this includes, but is not limited to, updates to previous work, additions to established methods, short publications, null results, research proposals and data management plans.