Multimodal Convolutional Neural Networks for Sperm Motility and Concentration Predictions

Pub Date: 2024-04-24 DOI: 10.11113/mjfas.v20n2.3263

Voon Hueh Goh, Muhammad Asraf Mansor, M. A. As’ari, L. Ismail



Abstract

Semen analysis is central to the primary investigation of male infertility, and manual semen analysis is the conventional way to perform it. Manual analysis, however, has been shown to have limited accuracy and precision because guidelines and procedures are not always followed. Sperm motility and concentration are the main indicators of pregnancy and conception rates, so they were selected as the parameters to predict. Convolutional neural networks (CNNs) have benefited the computer vision industry in recent years and are widely applied in computer vision research. In this paper, a three-dimensional CNN (3DCNN) was designed to extract motion and temporal features, which are vital for sperm motility prediction. For sperm concentration, because two-dimensional CNNs (2DCNNs) are efficient at recognizing and extracting spatial features, the well-established Residual Network (ResNet) architecture was adopted and customized. Multimodal learning aggregates the features learnt by different deep learning architectures that operate on different modalities, which can give a deep learning model better insight into its task. Hence, a multimodal deep learning architecture was designed to receive both image-based input (frames extracted from video samples) and video-based input (stacked frames pre-processed from video samples), providing well-extracted spatial and temporal features for sperm parameter prediction. The results obtained with the proposed methodology surpass those of similar research works that used deep learning. For sperm motility, the best average mean absolute error (MAE) achieved was 8.048, and for sperm concentration the model obtained a Pearson’s correlation coefficient (RP) of 0.853.
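The two evaluation metrics named in the abstract, MAE and Pearson’s correlation coefficient, can be illustrated with a minimal sketch. It uses the standard textbook definitions in plain Python and is not the authors' evaluation code. The function names and the sample values are hypothetical, and the abstract does not state what the "average" MAE is averaged over, so the sketch computes a single MAE only.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (the 1/n factors cancel in the ratio).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not data from the paper.
    measured = [40.0, 55.0, 62.0, 30.0]
    predicted = [45.0, 50.0, 60.0, 38.0]
    print("MAE:", mean_absolute_error(measured, predicted))
    print("Pearson r:", pearson_r(measured, predicted))
```

A lower MAE means predictions are closer to the measured values in the units of the target, while a Pearson coefficient closer to 1 means predictions rise and fall with the measurements, regardless of any constant offset or scale.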


