Yue Zhai, Claire Bardel, Maxime Vallée, Jean Iwaz, Pascal Roy
Place of concordance-discordance model in evaluating NGS performance.
Human Heredity. Published 2024-03-16. DOI: 10.1159/000538401 (https://doi.org/10.1159/000538401)
Abstract
Introduction: Ideally, evaluating NGS performance requires a gold standard; in its absence, concordance between replicates is often used as a substitute standard. However, the appropriateness of the concordance-discordance criterion has rarely been evaluated. This study analyzes the relationship between the probability of discordance and the probability of error under different conditions.
Methods: This study uses a conditional probability approach, first under conditional dependence and then under conditional independence between two sequencing results, and compares the probabilities of discordance and error under different theoretical conditions of sensitivity, specificity, and correlation between replicates, and then on real sequencing results for genome NA12878. The study also examines covariate effects on discordance and error using generalized additive models with smooth functions.
Results: With 99% sensitivity and 99.9% specificity under conditional independence, the probability of error for a positive concordant pair of calls is 0.1%. With the additional hypotheses of 0.1% prevalence and 0.9 correlation between replicates, the probability of error for a positive concordant pair is 47.4%. With real data, the estimated sensitivity, specificity, and correlation between tests for variants are around 98.98%, 99.996%, and 93%, respectively, and the error rate for positive concordant calls is approximately 2.5%. In covariate effect analyses, the functional forms of the effects are similar in the discordance and error models, though the proportion of deviance explained by the covariates differs between the two.
Conclusion: With conditional independence of two sequencing results, the concordance-discordance criterion seems acceptable as a substitute standard. However, with high correlation, the criterion becomes questionable because a high percentage of falsely concordant results appears among concordant results.
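The contrast between the independent and the correlated case can be illustrated with a small Python sketch. It is not the authors' code, and the abstract does not give their dependence model. The sketch assumes two exchangeable replicate calls that share the same sensitivity and specificity, one correlation `rho` applied equally to true variants and true non-variants, and a prevalence of 0.1% in both scenarios. All function names are hypothetical.

```python
def p_both_positive(q, rho):
    """P(both replicates are positive) for two exchangeable binary calls
    with marginal probability q and correlation rho (rho = 0: independence)."""
    return q * q + rho * q * (1.0 - q)


def p_discordant(q, rho):
    """P(the two replicates disagree) for the same pair of calls."""
    return 2.0 * q * (1.0 - q) * (1.0 - rho)


def error_given_concordant_positive(sensitivity, specificity, prevalence, rho=0.0):
    """P(no true variant | both replicates call a variant), by Bayes' rule.
    Assumption: the same rho holds for true variants and true non-variants."""
    true_pair = prevalence * p_both_positive(sensitivity, rho)
    false_pair = (1.0 - prevalence) * p_both_positive(1.0 - specificity, rho)
    return false_pair / (true_pair + false_pair)


def discordance_probability(sensitivity, specificity, prevalence, rho=0.0):
    """Marginal P(the two replicates disagree), averaged over true status."""
    return (prevalence * p_discordant(sensitivity, rho)
            + (1.0 - prevalence) * p_discordant(1.0 - specificity, rho))


if __name__ == "__main__":
    se, sp, prev = 0.99, 0.999, 0.001  # theoretical values from the abstract
    for rho in (0.0, 0.9):
        err = error_given_concordant_positive(se, sp, prev, rho)
        disc = discordance_probability(se, sp, prev, rho)
        print(f"rho={rho}: P(error | ++) = {err:.4%}, P(discordance) = {disc:.4%}")
```

Under these assumptions the independent case gives an error probability of about 0.1% for a positive concordant pair, in line with the abstract. With `rho = 0.9` the error probability rises to the high forties in percent while the discordance probability falls. Because this parametrization is only a guess at the paper's dependence model, it should not be expected to reproduce the reported 47.4% exactly.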