A deep learning approach to prediction of blood group antigens from genomic data
Camous Moslemi, Susanne Sækmose, Rune Larsen, Thorsten Brodersen, Jakob T. Bay, Maria Didriksen, Kaspar R. Nielsen, Mie T. Bruun, Joseph Dowsett, Khoa M. Dinh, Christina Mikkelsen, Kati Hyvärinen, Jarmo Ritari, Jukka Partanen, Henrik Ullum, Christian Erikstrup, Sisse R. Ostrowski, Martin L. Olsson, Ole B. Pedersen
Transfusion, published 2024-09-13. DOI: 10.1111/trf.18013 (https://doi.org/10.1111/trf.18013)
Abstract
Background: Deep learning methods are revolutionizing natural science. In this study, we aim to apply such techniques to develop blood type prediction models based on screening array genotyping platforms, which are inexpensive to analyze and easily scalable.

Methods: Combining existing blood types from blood banks with imputed screening array genotypes for ~111,000 Danish and 1,168 Finnish blood donors, we used deep learning techniques to train and validate blood type prediction models for 36 antigens in 15 blood group systems. To account for missing genotypes, a denoising autoencoder was used as an initial step, followed by a convolutional neural network blood type classifier.

Results: Two thirds of the trained blood type prediction models demonstrated an F1-accuracy above 99%. Models for antigens with very low or very high frequencies (for example, Cʷ), small training cohorts (for example, Coᵇ), or very complicated genetic underpinnings (for example, RhD) proved more challenging for high-accuracy (>99%) deep learning (DL) modeling. However, in the Danish cohort only 4 of 36 models (Coᵇ, Cʷ, D-weak, Kpᵃ) failed to achieve a prediction F1-accuracy above 97%. This high predictive performance was replicated in the Finnish cohort.

Discussion: High accuracy across a variety of blood groups proves the viability of deep learning-based blood type prediction using array chip genotypes, even in blood groups with nontrivial genetic underpinnings. These techniques are suitable for helping to identify blood donors with rare blood types by greatly narrowing the pool of candidate donors before clinical-grade confirmation.
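The results above are reported as F1-accuracy rather than plain accuracy, and the abstract notes that antigens with extreme frequencies are harder to model. The following is a minimal sketch of why the metric matters in that setting, assuming "F1-accuracy" refers to the standard binary F1 score (the abstract does not define it, and the paper may use a different averaging scheme). The toy cohort, the 2% antigen frequency, and the function names `f1_score` and `accuracy` are hypothetical and are not taken from the study.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention chosen for this sketch: F1 is 0.0 when no positives
    # are present or predicted.
    return 2 * tp / denom if denom else 0.0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of donors whose predicted label matches the recorded one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy cohort: 1000 donors, 20 of them antigen-positive (2%).
y_true = [1] * 20 + [0] * 980

# A degenerate model that calls every donor antigen-negative.
always_negative = [0] * 1000

# A model that finds 15 of the 20 positives and raises 5 false alarms.
partial = [1] * 15 + [0] * 5 + [1] * 5 + [0] * 975

print(accuracy(y_true, always_negative), f1_score(y_true, always_negative))
print(accuracy(y_true, partial), f1_score(y_true, partial))
```

In this toy case the always-negative model reaches 98% accuracy but an F1 of 0, while the second model reaches 99% accuracy with an F1 of 0.75. A metric built on true positives, false positives, and false negatives therefore penalizes missed rare-antigen donors, which plain accuracy largely hides.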
About the journal:
TRANSFUSION is the foremost publication in the world for new information regarding transfusion medicine. Written by and for members of AABB and other health-care workers, TRANSFUSION reports on the latest technical advances, discusses opposing viewpoints regarding controversial issues, and presents key conference proceedings. In addition to blood banking and transfusion medicine topics, TRANSFUSION presents submissions concerning patient blood management, tissue transplantation and hematopoietic, cellular, and gene therapies.