P-154 Comparison of conventional embryo evaluation and AI models for predicting viability and competence of vitrified-warmed blastocysts
S Perez Albala, L Conversa, P Vercet, T Carrión, A Cobo, M Meseguer
Human Reproduction. Published 2025-06-28. DOI: 10.1093/humrep/deaf097.463 (https://doi.org/10.1093/humrep/deaf097.463)
Abstract
Study question
Can AI-based models outperform conventional embryo assessment methods in predicting embryo viability and implantation outcomes?

Summary answer
Conventional evaluation surpassed AI models before vitrification, while one AI algorithm outperformed the other overall, highlighting the potential of combining traditional and AI approaches.

What is known already
Vitrification is the safest method for cryopreserving human embryos, but abrupt temperature and osmolarity changes can damage blastocysts and impair their quality. Given the increasing number of frozen cycle transfers, it is necessary to optimize how embryos undergoing vitrification are evaluated. Previous studies demonstrate the usefulness of AI in scoring vitrified-warmed embryos by providing objective and reproducible assessments. This study estimated the prediction errors of, and the concordance between, a conventional embryo evaluation method and two AI-based models, in order to compare their predictive ability and to find the most accurate approach for predicting embryo viability and implantation.

Study design, size, duration
This single-center retrospective study included 846 blastocysts, 815 of them with known implantation data. They were vitrified and warmed by the Cryotop method (Kitazato, Japan) and placed in EmbryoScope (Vitrolife, Denmark) time-lapse incubators in the period between warming and transfer. Embryos were assessed before and after vitrification by experienced embryologists using ASEBIR morphological criteria (categories A, B, and C) and by two different image-based AI models: Embryo Predict by Alife Health and Life Whisperer Viability (LWV).

Participants/materials, setting, methods
ASEBIR criteria and the AI models (Alife and LWV) were compared by estimating their prediction errors and the concordance between them. Percentage errors in the prediction of warming survival and implantation were calculated for ASEBIR categories A and C, and for the upper and lower limits of the AI algorithms. These limits were defined by adjusting the percentiles of the AI scores to the frequencies of the ASEBIR categories. Concordance was analysed using the Kappa index in SPSS (IBM®).

Main results and the role of chance
In predicting implantation outcome, the ASEBIR assessment had total errors of 37.5% and 35.3% before vitrification and after warming, respectively. With LWV score limits of <2.5 (an error means the embryo implanted) and >9.2 (an error means the embryo did not implant) for pre-vitrification images, the error was 45.9%; for post-warming images, with score limits of <1.8 and >8.5, the error was 43.1%. For the Alife algorithm, the score limits were <1.7 and >5.8, with an error of 37.7%, for pre-vitrification images, and <1.5 and >4.3, with an error of 38.6%, for post-warming images.

In the prediction of survival to warming, pre-vitrification ASEBIR scoring had an error of 48.6%. After warming, embryologists classified surviving embryos as A, B or C and non-viable embryos as D, so calculating the error is not meaningful in this case. The LWV model had an error of 49.0% for the pre-vitrification score and 50.9% after warming. Alife, on the other hand, had errors of 50.7% and 46.5% for pre-vitrification and post-warming images, respectively.

For the pre-vitrification assessment, the Kappa index was 0.18 between ASEBIR and LWV, and 0.19 between ASEBIR and Alife (p < 0.001), where 0 means no concordance and 1 means total agreement. For the post-warming evaluation, the Kappa index was 0.09 between ASEBIR and LWV, and 0.17 between ASEBIR and Alife (p < 0.001).

Limitations, reasons for caution
The AI algorithms were developed externally using fresh embryo images at 120 hours post-ICSI, but not post-thaw images from days 5 and 6. The developers had no access to our clinic's data or labels.

Wider implications of the findings
The three predictive models, conventional and AI-based alike, have similar error rates when applied before and after vitrification, although they have low concordance with each other. Combining traditional and AI approaches could enhance embryo selection accuracy and optimize frozen embryo transfer outcomes.

Trial registration number
No
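
The methods above (score limits matched to ASEBIR category frequencies, percentage errors in the extreme categories, and the Kappa index) can be illustrated with a minimal Python sketch. This is not the authors' SPSS workflow. The function names, the linear-interpolation quantile, the definition of "total error" as wrong calls divided by all embryos in the top and bottom groups, and the toy data are all assumptions made for illustration; the abstract does not state the exact formulas.

```python
from collections import Counter


def quantile(sorted_vals, q):
    """Quantile of an already sorted list by linear interpolation (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def score_limits(scores, asebir_grades):
    """Assumed rule: the share of embryos below the lower limit matches the
    share graded C, and the share above the upper limit matches the share graded A."""
    freq = Counter(asebir_grades)
    n = len(asebir_grades)
    ordered = sorted(scores)
    lower = quantile(ordered, freq["C"] / n)
    upper = quantile(ordered, 1 - freq["A"] / n)
    return lower, upper


def to_category(score, lower, upper):
    """Map a continuous AI score onto A/B/C using the two limits."""
    if score < lower:
        return "C"
    if score > upper:
        return "A"
    return "B"


def total_error(grades, implanted):
    """Assumed definition: percentage of wrong calls among embryos in the
    extreme groups (A that did not implant, C that did implant)."""
    wrong = judged = 0
    for grade, outcome in zip(grades, implanted):
        if grade == "A":
            judged += 1
            wrong += int(not outcome)
        elif grade == "C":
            judged += 1
            wrong += int(outcome)
    if judged == 0:
        raise ValueError("no embryos in categories A or C")
    return 100.0 * wrong / judged


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical, not from the study).
asebir = ["A", "A", "B", "B", "B", "B", "C", "C"]
ai_scores = [9.5, 3.0, 8.0, 6.0, 5.0, 9.0, 1.0, 4.0]
implanted = [True, False, True, False, False, False, False, True]

lower, upper = score_limits(ai_scores, asebir)
ai_grades = [to_category(s, lower, upper) for s in ai_scores]

print("AI score limits:", lower, upper)
print("ASEBIR error (%):", total_error(asebir, implanted))
print("AI error (%):", total_error(ai_grades, implanted))
print("Kappa ASEBIR vs AI:", cohen_kappa(asebir, ai_grades))
```

Matching the limits to the ASEBIR frequencies puts the same proportion of embryos into the top and bottom groups for every method, so the error rates are compared on groups of comparable size instead of on thresholds chosen separately for each AI score scale.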
About the journal:
Human Reproduction features full-length, peer-reviewed papers reporting original research, concise clinical case reports, as well as opinions and debates on topical issues.
Papers published cover the clinical science and medical aspects of reproductive physiology, pathology and endocrinology, including andrology, gonad function, gametogenesis, fertilization, embryo development, implantation, early pregnancy, genetics, genetic diagnosis, oncology, infectious disease, surgery, contraception, infertility treatment, psychology, ethics and social issues.