Multimodal Convolutional Neural Networks for Sperm Motility and Concentration Predictions

IF 0.8 Q3 MULTIDISCIPLINARY SCIENCES
Voon Hueh Goh, Muhammad Asraf Mansor, M. A. As’ari, L. Ismail
Malaysian Journal of Fundamental and Applied Sciences, published 2024-04-24
DOI: 10.11113/mjfas.v20n2.3263 (https://doi.org/10.11113/mjfas.v20n2.3263)
Citations: 0

Abstract

Semen analysis is central to the primary investigation of male infertility, and manual semen analysis is the conventional way to perform it. Manual analysis, however, has been shown to have limited accuracy and precision because of noncompliance with guidelines and procedures. Sperm motility and concentration are the main indicators of pregnancy and conception rates, so they were selected as the parameters to predict. Convolutional neural networks (CNNs) have benefited the computer vision industry in recent years and are widely applied in computer vision research. In this paper, a three-dimensional CNN (3DCNN) was designed to extract the motion and temporal features that are vital for sperm motility prediction. For sperm concentration, because two-dimensional CNNs (2DCNNs) are efficient at recognizing and extracting spatial features, the well-established Residual Network (ResNet) architecture was adopted and customized. Multimodal learning aggregates the features learnt by different deep learning architectures that operate on different modalities, which can give a deep learning model better insight into its task. Hence, a multimodal deep learning architecture was designed to receive both image-based input (frames extracted from video samples) and video-based input (stacked frames pre-processed from video samples), providing well-extracted spatial and temporal features for sperm parameter prediction. The results obtained with the proposed methodology surpass those of similar research works that used deep learning approaches. For sperm motility, the best average mean absolute error (MAE) achieved was 8.048, and for sperm concentration the model obtained a Pearson’s correlation coefficient (RP) of 0.853.
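The two figures reported above, MAE and Pearson’s correlation coefficient, can be illustrated with a minimal sketch. The function names and the sample values below are assumptions made up for illustration; they are not the authors' code or data, and the abstract does not say how the "average" MAE was aggregated.

```python
import math
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |truth - prediction| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation: covariance divided by the product of spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ground truth and predictions (illustrative values only).
motility_true = [60.0, 45.0, 30.0, 75.0]
motility_pred = [52.0, 50.0, 41.0, 70.0]
concentration_true = [20.0, 55.0, 80.0, 110.0]
concentration_pred = [25.0, 50.0, 90.0, 100.0]

mae = mean_absolute_error(motility_true, motility_pred)
r = pearson_r(concentration_true, concentration_pred)
print(f"MAE = {mae:.3f}, Pearson r = {r:.3f}")
```

A lower MAE means predictions sit closer to the reference values, while a Pearson coefficient near 1 means predictions rise and fall with the reference values even if they are offset from them.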
Source journal: CiteScore 1.40; self-citation rate 0.00%; articles published 45