Variant Classification Discordance: Contributing Factors and Predictive Models
Authors: Hamid Ghaedi, Scott K. Davey, Harriet Feilotter
Journal: Journal of Molecular Diagnostics (impact factor 3.4; JCR Q1, Pathology)
Published: 2023-11-24
DOI: 10.1016/j.jmoldx.2023.11.002
URL: https://www.sciencedirect.com/science/article/pii/S1525157823002738
An ever-growing catalog of human variants is hosted in the ClinVar database. In this database, submissions on a variant are combined into a multisubmitter record; when submitters disagree on the variant's classification, the record is labeled as conflicting. The current study used ClinVar data to identify characteristics that make a variant more likely to fall into the conflicting class. Furthermore, the Extreme Gradient Boosting algorithm was used to train classifier models that predict classification discordance for single-submitter variants in the ClinVar database. Population allele frequency, the gene harboring the variant, variant type, consequence on protein, variant deleteriousness score, first submitter identity, and submission count were associated with conflict in variant classification. Using these features, the optimized classifier reached 88% accuracy on the test set, with weighted-average precision, recall, and F1-score of 0.84, 0.88, and 0.85, respectively. There were pronounced associations between variant classification discordance and allele frequency, gene type, and the identity of the first submitter. The study provides the predicted discordance status for single-submitter variants deposited in ClinVar. This approach can be used to assess whether single-submitter variants are likely to be supported by, or in conflict with, future entries; this knowledge may help laboratories with clinical variant assessment.
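The abstract reports accuracy together with weighted-average precision, recall, and F1-score. The following is a minimal sketch of how such support-weighted averages are commonly computed, written in plain Python. The function name `weighted_metrics` and the toy labels are hypothetical illustrations, not the authors' code or data. In a support-weighted average, recall is mathematically identical to accuracy, which is consistent with the reported 88% accuracy and 0.88 recall.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)    # number of true instances per class
    predicted = Counter(y_pred)  # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = true_pos[label] / predicted[label] if predicted[label] else 0.0
        r = true_pos[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        # Weight each class by its share of the true labels.
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(true_pos.values()) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (NOT data from the study):
# 1 = conflicting classification, 0 = no conflict.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

metrics = weighted_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the conflicting class is usually the minority, weighted averages are dominated by the majority class; per-class scores are worth inspecting alongside them when judging how well conflicts themselves are detected.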
About the journal:
The Journal of Molecular Diagnostics, the official publication of the Association for Molecular Pathology (AMP), co-owned by the American Society for Investigative Pathology (ASIP), seeks to publish high-quality original papers on scientific advances in the translation and validation of molecular discoveries in medicine into the clinical diagnostic setting, and on the description and application of technological advances in the field of molecular diagnostic medicine. The editors welcome for review articles that contain: novel discoveries or clinicopathologic correlations, including studies in oncology, infectious diseases, inherited diseases, predisposition to disease, clinical informatics, or the description of polymorphisms linked to disease states or normal variations; the application of diagnostic methodologies in clinical trials; or the development of new or improved molecular methods that may be applied to the diagnosis or monitoring of disease or disease predisposition.