P-154 Comparison of conventional embryo evaluation and AI models for predicting viability and competence of vitrified-warmed blastocysts

Human Reproduction. Pub Date: 2025-06-28. DOI: 10.1093/humrep/deaf097.463

S Perez Albala, L Conversa, P Vercet, T Carrión, A Cobo, M Meseguer

Abstract

Study question: Can AI-based models outperform conventional embryo assessment methods in predicting embryo viability and implantation outcomes?

Summary answer: Conventional evaluation surpassed AI models before vitrification, while one AI algorithm outperformed the other overall, highlighting the potential of combining traditional and AI approaches.

What is known already: Vitrification is the safest method for cryopreserving human embryos, but abrupt temperature and osmolarity changes can damage blastocysts and reduce their quality. The increasing number of frozen cycle transfers therefore makes it necessary to optimize how embryos undergoing vitrification are evaluated. Previous studies demonstrate the usefulness of AI in scoring vitrified-warmed embryos by providing objective and reproducible assessments. This study estimated the prediction errors of, and the concordance between, a conventional embryo evaluation method and two AI-based models in order to compare their predictive ability and find the most accurate approach for predicting embryo viability and implantation.

Study design, size, duration: This single-center retrospective study included 846 blastocysts, 815 of them with known implantation data. They were vitrified and warmed by the Cryotop method (Kitazato, Japan) and placed in EmbryoScope (Vitrolife, Denmark) time-lapse incubators in the period between warming and transfer. Embryos were assessed before and after vitrification by experienced embryologists using ASEBIR morphological criteria (categories A, B, and C) and by two different image-based AI models: Embryo Predict by Alife Health and Life Whisperer Viability (LWV).

Participants/materials, setting, methods: ASEBIR criteria and the AI models (Alife and LWV) were compared by estimating prediction errors and the concordance between them. Percentage errors in the prediction of warming survival and implantation were calculated for ASEBIR categories A and C, and for the upper and lower limits of the AI algorithms. These limits were defined by adjusting the percentiles of the AI scores to the frequencies of the ASEBIR categories. Concordance was analysed using the Kappa index in SPSS (IBM®).

Main results and the role of chance: In predicting implantation outcome, the ASEBIR assessment had total errors of 37.5% before vitrification and 35.3% after warming. With LWV score limits of <2.5 (where an error means the embryo implanted) and >9.2 (where an error means it did not implant) for pre-vitrification images, the error was 45.9%; for post-warming images, with score limits of <1.8 and >8.5, it was 43.1%. For the Alife algorithm, the score limits were <1.7 and >5.8 for pre-vitrification images, with 37.7% error, and <1.5 and >4.3 for post-warming images, with 38.6% error. In predicting survival after warming, pre-vitrification ASEBIR scoring had an error of 48.6%. After warming, embryologists classified surviving embryos as A, B or C and non-viable embryos as D, so calculating a post-warming ASEBIR error for survival is not meaningful. The LWV model had an error of 49.0% for the pre-vitrification score and 50.9% after warming. Alife, on the other hand, had errors of 50.7% and 46.5% for pre-vitrification and post-warming images, respectively. For the pre-vitrification assessment, the Kappa index was 0.18 between ASEBIR and LWV and 0.19 between ASEBIR and Alife (p < 0.001), where 0 means no concordance and 1 means total agreement. For the post-warming assessment, the Kappa index was 0.09 between ASEBIR and LWV and 0.17 between ASEBIR and Alife (p < 0.001).

Limitations, reasons for caution: The AI algorithms were developed externally using fresh embryo images taken at 120 hours post-ICSI, not post-thaw images from days 5 and 6. The developers had no access to our clinic's data or labels.

Wider implications of the findings: The three predictive models, conventional and AI-based, have similar error rates both before and after vitrification, although concordance between them is low. Combining traditional and AI approaches could enhance embryo selection accuracy and optimize frozen embryo transfer outcomes.

Trial registration number: No
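
The three calculations named in the methods (percentile-matched score limits, percentage error, and the Kappa index) can be illustrated with a minimal Python sketch. This is not the authors' code: the study used SPSS, and the abstract does not give exact formulas. The sketch assumes that the lower and upper AI limits are chosen so that the share of embryos below and above them equals the share of ASEBIR grades C and A, that the total error counts low-band embryos that implanted plus high-band embryos that did not, divided by all embryos in those two bands, and that the Kappa index is the unweighted Cohen's kappa. All function names and the toy data are hypothetical.

```python
from collections import Counter


def percentile_limits(scores, grades):
    """Pick (lower, upper) so that as many scores fall below `lower` as there
    are grade-C embryos and as many fall above `upper` as there are grade-A
    embryos (assumed reading of "adjusting the percentiles"; ties ignored)."""
    n = len(scores)
    counts = Counter(grades)
    n_low, n_high = counts["C"], counts["A"]
    if n_low + n_high >= n:
        raise ValueError("expected at least one embryo outside grades A and C")
    ranked = sorted(scores)
    return ranked[n_low], ranked[n - n_high - 1]


def band(score, lower, upper):
    """Map an AI score to 'low', 'high', or None (middle band, not judged)."""
    if score < lower:
        return "low"
    if score > upper:
        return "high"
    return None


def total_error(bands, outcomes):
    """Percentage of judged embryos predicted wrongly: 'low' but the outcome
    occurred, or 'high' but it did not (assumed definition of total error)."""
    judged = [(b, o) for b, o in zip(bands, outcomes) if b is not None]
    wrong = sum((b == "high" and not o) or (b == "low" and o) for b, o in judged)
    return 100.0 * wrong / len(judged)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa between two lists of category labels."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    p_exp = sum(count_a[l] * count_b[l] for l in labels) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


# Toy data (invented for illustration, not taken from the study).
toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
toy_grades = ["C", "C", "B", "B", "B", "B", "B", "A", "A", "A"]
toy_implanted = [False, True, False, True, False, True, False, True, True, False]

lower, upper = percentile_limits(toy_scores, toy_grades)
toy_bands = [band(s, lower, upper) for s in toy_scores]
print("limits:", lower, upper)
print("total error (%):", total_error(toy_bands, toy_implanted))
print("kappa:", cohen_kappa(["A", "A", "B", "B"], ["A", "B", "B", "B"]))
```

Only the extreme bands enter the error, mirroring the abstract's restriction to ASEBIR categories A and C; embryos in the middle band are left unjudged.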
