A deep learning approach to prediction of blood group antigens from genomic data

IF 2.5, Zone 3 (Medicine), Q2 HEMATOLOGY

Transfusion Pub Date : 2024-09-13 DOI:10.1111/trf.18013

Camous Moslemi, Susanne Sækmose, Rune Larsen, Thorsten Brodersen, Jakob T. Bay, Maria Didriksen, Kaspar R. Nielsen, Mie T. Bruun, Joseph Dowsett, Khoa M. Dinh, Christina Mikkelsen, Kati Hyvärinen, Jarmo Ritari, Jukka Partanen, Henrik Ullum, Christian Erikstrup, Sisse R. Ostrowski, Martin L. Olsson, Ole B. Pedersen



Abstract

Background: Deep learning methods are revolutionizing natural science. In this study, we aim to apply such techniques to develop blood type prediction models based on screening array genotyping platforms that are cheap to analyze and easily scalable.

Methods: Combining existing blood types from blood banks with imputed screening array genotypes for ~111,000 Danish and 1168 Finnish blood donors, we used deep learning techniques to train and validate blood type prediction models for 36 antigens in 15 blood group systems. To account for missing genotypes, a denoising autoencoder was used as an initial step, followed by a convolutional neural network blood type classifier.

Results: Two thirds of the trained blood type prediction models demonstrated an F1‐accuracy above 99%. Models for antigens with low or high frequencies (for example, Cw), small training cohorts (for example, Cob), or very complicated genetic underpinnings (for example, RhD) proved more challenging for high-accuracy (>99%) DL modeling. However, in the Danish cohort only 4 out of 36 models (Cob, Cw, D‐weak, Kpa) failed to achieve a prediction F1‐accuracy above 97%. This high predictive performance was replicated in the Finnish cohort.

Discussion: High accuracy in a variety of blood groups proves the viability of deep learning‐based blood type prediction using array chip genotypes, even in blood groups with nontrivial genetic underpinnings. These techniques are suitable for aiding in identifying blood donors with rare blood types by greatly narrowing down the potential pool of candidate donors before clinical grade confirmation.
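The results above are reported as F1 scores, and antigens with very low or very high frequencies are named as the harder cases. The sketch below illustrates one plausible reason: on an imbalanced antigen, plain accuracy can look excellent while F1 for the rare class is much lower. It assumes the standard single-class F1 definition (harmonic mean of precision and recall); the abstract does not state how "F1‐accuracy" was computed or averaged. The donor counts, labels, and function names are hypothetical toy values, not data from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Standard F1 for one class: harmonic mean of precision and recall.

    Assumption: this is the textbook definition, not necessarily the
    exact metric used in the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of donors whose predicted label matches the known label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy cohort: 1000 donors, 20 antigen-positive (2% frequency).
y_true = [1] * 20 + [0] * 980
# Hypothetical model output: 5 positives missed, 5 false positives.
y_pred = [1] * 15 + [0] * 5 + [1] * 5 + [0] * 975

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.990
print(f"F1       = {f1_score(y_true, y_pred):.3f}")  # F1       = 0.750
```

In this toy case ten errors out of 1000 donors leave accuracy at 99%, yet F1 for the rare positive class falls to 75%, which is why a rare-antigen model is harder to push past a high F1 threshold.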


Source journal

Transfusion (Medicine: Hematology)

CiteScore: 4.70

Self-citation rate: 20.70%

Articles published: 426

Review time: 1 month

Journal description: TRANSFUSION is the foremost publication in the world for new information regarding transfusion medicine. Written by and for members of AABB and other health-care workers, TRANSFUSION reports on the latest technical advances, discusses opposing viewpoints regarding controversial issues, and presents key conference proceedings. In addition to blood banking and transfusion medicine topics, TRANSFUSION presents submissions concerning patient blood management, tissue transplantation and hematopoietic, cellular, and gene therapies.