Ron B Schifman, Keri Donaldson, Daniel Luevano, Raul Benavides, Jeffery A Hunt
Machine Learning Classifier Using Blood Count Parameters and Erythropoietin to Predict JAK2 Mutations in Patients With Erythrocytosis
Archives of Pathology & Laboratory Medicine. Journal Article. Published 2025-04-28. DOI: 10.5858/arpa.2023-0262-OA (https://doi.org/10.5858/arpa.2023-0262-OA)
Abstract
Context: Differentiating polycythemia vera from other causes of erythrocytosis is a diagnostic challenge. Although most patients with polycythemia vera have Janus kinase 2 (JAK2) mutations, extensive testing is impractical because polycythemia vera is an uncommon cause of erythrocytosis. Identifying polycythemic patients most likely to benefit from JAK2 testing would improve use of this test.
Objective: To develop an artificial intelligence analysis/machine learning classifier using blood count parameters and erythropoietin to predict JAK2 results in patients with erythrocytosis.
Design: Results from the Veterans Affairs data warehouse were used for training and validation. Cases with JAK2 results and hemoglobin values of 15 g/dL or higher in females and 17 g/dL or higher in males were included. Erythropoietin was optional. The highest-performing model was evaluated with an out-of-sample data set.
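The inclusion rule described in the Design paragraph can be sketched as a small filter. This is a minimal illustration, not the authors' code: the `Case` record, its field names, and the `is_eligible` function are hypothetical. Only the sex-specific hemoglobin thresholds, the requirement for a JAK2 result, and the optional erythropoietin value come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional

# Sex-specific hemoglobin thresholds in g/dL, as stated in the abstract.
HGB_THRESHOLD_G_DL = {"F": 15.0, "M": 17.0}


@dataclass
class Case:
    """Hypothetical record layout; field names are assumptions."""
    sex: str                                 # "F" or "M"
    hemoglobin_g_dl: float
    jak2_positive: Optional[bool]            # None when no JAK2 result exists
    erythropoietin: Optional[float] = None   # optional per the abstract


def is_eligible(case: Case) -> bool:
    """Return True when a case meets the stated inclusion criteria."""
    if case.jak2_positive is None:
        return False  # a JAK2 result is required
    threshold = HGB_THRESHOLD_G_DL.get(case.sex)
    if threshold is None:
        return False  # sex not covered by the stated thresholds
    return case.hemoglobin_g_dl >= threshold


cases = [
    Case("F", 15.2, False),
    Case("M", 16.8, True),        # below the 17 g/dL threshold for males
    Case("M", 17.0, True, 3.1),   # threshold is inclusive ("or higher")
    Case("F", 16.0, None),        # no JAK2 result
]
eligible = [c for c in cases if is_eligible(c)]
```

Erythropoietin does not affect eligibility here because the abstract describes it as optional.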
Results: Among 31 models trained on data from 8479 individuals, including 540 (6.4%) positive for JAK2, the Light Gradient Boosted Trees Classifier performed best. When applied to 330 out-of-sample cases, 9 (2.7%) of which were positive for JAK2, the classifier's sensitivity, specificity, positive predictive value, and negative predictive value were 100%, 92.8%, 28.1%, and 100%, respectively. Among a subset of 183 out-of-sample cases, the model would have potentially reduced JAK2 testing by 89%, compared with a 50% to 62% reduction using previously reported rule-based systems that similarly used blood count parameters. Platelet count had the greatest impact on the model, followed by relative distribution width and erythropoietin.
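As a consistency check, the out-of-sample figures can be reproduced from a 2x2 confusion matrix. The cell counts below are not given in the abstract. They are inferred from the 330 cases, the 9 JAK2-positive cases, and the reported percentages (9 true positives, 0 false negatives, 23 false positives, 298 true negatives), so treat them as an assumption. The function name is illustrative.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Inferred counts (assumption): 9 positives all detected, 321 negatives
# split into 298 true negatives and 23 false positives.
metrics = diagnostic_metrics(tp=9, fp=23, tn=298, fn=0)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these inferred counts the low positive predictive value follows directly from the low prevalence: 23 false positives outnumber the 9 true positives even at high specificity.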
Conclusions: These results show that a machine learning classifier may be beneficial as a decision support aid for JAK2 testing in polycythemic patients.