Ron B Schifman, Keri Donaldson, Daniel Luevano, Raul Benavides, Jeffery A Hunt
Archives of Pathology & Laboratory Medicine. Published 2025-04-28. doi: 10.5858/arpa.2023-0262-OA (https://doi.org/10.5858/arpa.2023-0262-OA)
Machine Learning Classifier Using Blood Count Parameters and Erythropoietin to Predict JAK2 Mutations in Patients With Erythrocytosis.
Context: Differentiating polycythemia vera from other causes of erythrocytosis is a diagnostic challenge. Although most patients with polycythemia vera have Janus kinase 2 (JAK2) mutations, extensive testing is impractical because polycythemia vera is an uncommon cause of erythrocytosis. Identifying polycythemic patients most likely to benefit from JAK2 testing would improve use of this test.
Objective: To develop an artificial intelligence analysis/machine learning classifier using blood count parameters and erythropoietin to predict JAK2 results in patients with erythrocytosis.
Design: Results from the Veterans Affairs data warehouse were used for training and validation. Cases with JAK2 results and hemoglobin values of 15 g/dL or higher in females or 17 g/dL or higher in males were included. An erythropoietin result was optional. The highest performing model was evaluated with an out-of-sample data set.
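The inclusion rule in the Design paragraph can be sketched as a small filter. This is a minimal illustration, not the authors' code: the `Case` record, its field names, and the `"F"`/`"M"` encoding are hypothetical, and only the two hemoglobin thresholds and the optional erythropoietin value come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional

# Sex-specific hemoglobin thresholds (g/dL) stated in the Design paragraph.
HGB_THRESHOLD_G_DL = {"F": 15.0, "M": 17.0}


@dataclass
class Case:
    """Hypothetical record layout; field names are assumptions for illustration."""
    sex: str                                # "F" or "M" (assumed encoding)
    hemoglobin_g_dl: float
    jak2_positive: Optional[bool]           # None when no JAK2 result exists
    erythropoietin: Optional[float] = None  # optional; units not given in the source


def is_eligible(case: Case) -> bool:
    """True when the case has a JAK2 result and meets its hemoglobin threshold."""
    if case.jak2_positive is None:
        return False
    threshold = HGB_THRESHOLD_G_DL.get(case.sex)
    if threshold is None:
        return False
    return case.hemoglobin_g_dl >= threshold
```

A missing erythropoietin value does not affect eligibility here, which mirrors the statement that erythropoietin was optional.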
Results: Among 31 models trained on data from 8479 individuals, including 540 (6.4%) positive for JAK2, Light Gradient Boosted Trees Classifier performed best. When applied to 330 out-of-sample cases with 9 (2.7%) positive for JAK2, the classifier's sensitivity, specificity, positive predictive value, and negative predictive value were 100%, 92.8%, 28.1%, and 100%, respectively. Among a subset of 183 out-of-sample cases, the model would have potentially reduced JAK2 testing by 89%, compared with a 50% to 62% reduction using previously reported rule-based systems that similarly used blood count parameters. Platelet count had the greatest impact on the model, followed by relative distribution width and erythropoietin.
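The four out-of-sample metrics can be checked against a single confusion matrix. The abstract reports only the totals (330 cases, 9 positive) and the percentages; the individual counts below are back-calculated assumptions, chosen because they are the whole-number counts that reproduce the reported figures. A sensitivity of 100% implies no false negatives, and a positive predictive value of 28.1% with 9 true positives implies 32 positive calls, so 23 false positives.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Totals reported in the abstract.
TOTAL_CASES = 330
JAK2_POSITIVE = 9

# Back-calculated counts (an inference, not figures reported in the abstract).
tp, fn = 9, 0                            # sensitivity 100% -> no missed positives
fp = 23                                  # 9 / (9 + 23) = 28.1% PPV
tn = TOTAL_CASES - JAK2_POSITIVE - fp    # 298 remaining negatives

metrics = confusion_metrics(tp, fp, tn, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, specificity is 298/321 and the positive predictive value is 9/32, which round to the reported 92.8% and 28.1%. With only 9 positive cases, the 100% sensitivity estimate rests on a small sample.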
Conclusions: These results show that a machine learning classifier may be beneficial as a decision support aid for JAK2 testing in polycythemic patients.