Sándor Baráth, Parvind Singh, Zsuzsanna Hevessy, Anikó Ujfalusi, Zoltán Mezei, Mária Balogh, Marianna Száraz Széles, János Kappelmayer
Enhancing HLA-B27 antigen detection: Leveraging machine learning algorithms for flow cytometric analysis
DOI: 10.1002/cyto.b.22164 (https://doi.org/10.1002/cyto.b.22164)
Published: 2024-02-12
Abstract
As the association of human leukocyte antigen B27 (HLA-B27) with spondylarthropathies is widely known, HLA-B27 antigen expression is frequently determined by flow cytometry or other techniques. Because of possible cross-reaction with off-target antigens such as HLA-B7, each flow cytometric technique applies a "gray zone" reserved for equivocal findings. Our aim was to use machine learning (ML) methods to classify such equivocal data as positive or negative. Equivocal samples (n = 99) were selected from samples submitted to our institution for clinical HLA-B27 antigen testing. Samples were analyzed by flow cytometry and polymerase chain reaction. Features of the histograms generated by flow cytometry were used to train and validate four ML classification methods: logistic regression (LR), decision tree (DT), random forest (RF), and light gradient boosting machine (light GBM). All evaluated ML algorithms performed well, with high accuracy, sensitivity, and specificity, as well as high negative and positive predictive values. Although gradient boosting approaches are proposed as high-performance methods, their effectiveness may be lower for smaller sample sizes. On our relatively small sample set, the random forest algorithm performed best (AUC: 0.92), but there was no statistically significant difference between the ML algorithms used. AUC values for light GBM, DT, and LR were 0.88, 0.89, and 0.89, respectively. Integrating these algorithms into HLA-B27 testing can reduce the number of uncertain, false-negative, or false-positive cases, especially in laboratories where no genetic testing is available.
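The performance measures named in the abstract (accuracy, sensitivity, specificity, predictive values, and AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline: the labels and scores below are made up, the 0.5 decision threshold is arbitrary, and treating the PCR result as the reference label is an assumption made for illustration. The function names `confusion_metrics` and `roc_auc` are hypothetical helpers.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV and NPV for binary labels (1 = positive).

    Assumes every denominator is non-zero, i.e. both classes occur in
    y_true and both calls occur in y_pred.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data (assumption): reference label per sample and a classifier score.
pcr_positive = [1, 1, 1, 1, 0, 0, 0, 0]
ml_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
ml_call = [1 if s >= 0.5 else 0 for s in ml_score]  # arbitrary 0.5 threshold

metrics = confusion_metrics(pcr_positive, ml_call)
auc = roc_auc(pcr_positive, ml_score)
print(metrics)
print(f"AUC = {auc:.4f}")
```

The pairwise definition of AUC used here does not depend on the threshold, which is one reason it is convenient for comparing classifiers; the other measures change whenever the threshold moves.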