Generation of Fundus Fluorescein Angiography Videos for Health Care Data Sharing.

Impact factor 7.8 · Zone 1 (Medicine) · JCR Q1 (Ophthalmology)
Xinyuan Wu, Lili Wang, Ruoyu Chen, Bowen Liu, Weiyi Zhang, Xi Yang, Yifan Feng, Mingguang He, Danli Shi
Journal: JAMA Ophthalmology
Publication date: 2025-06-26
Publication type: Journal Article
DOI: 10.1001/jamaophthalmol.2025.1419
URL: https://doi.org/10.1001/jamaophthalmol.2025.1419
Citations: 0

Abstract

Importance
Medical data sharing faces strict restrictions. Text-to-video generation shows potential for creating realistic medical data while preserving privacy, offering a solution for cross-center data sharing and medical education.

Objective
To develop and evaluate a text-to-video generative artificial intelligence (AI)-driven model that converts the text of reports into dynamic fundus fluorescein angiography (FFA) videos, enabling visualization of retinal vascular and structural abnormalities.

Design, Setting, and Participants
This study retrospectively collected anonymized FFA data from a tertiary hospital in China. The dataset included both the medical records and FFA examinations of patients assessed between November 2016 and December 2019. A text-to-video model was developed and evaluated. The AI-driven model integrated the wavelet-flow variational autoencoder and the diffusion transformer.

Main Outcomes and Measures
The AI-driven model's performance was assessed through objective metrics (Fréchet video distance, learned perceptual image patch similarity score, and visual question answering score [VQAScore]). The domain-specific evaluation for the generated FFA videos was measured by the bidirectional encoder representations from transformers score (BERTScore). Image retrieval was evaluated using a Recall@K score. Each video was rated for quality by 3 ophthalmologists on a scale of 1 (excellent) to 5 (very poor).

Results
A total of 3625 FFA videos were included (2851 videos [78.6%] for training, 387 videos [10.7%] for validation, and 387 videos [10.7%] for testing). The AI-generated FFA videos demonstrated retinal abnormalities from the input text (Fréchet video distance of 2273, a mean learned perceptual image patch similarity score of 0.48 [SD, 0.04], and a mean VQAScore of 0.61 [SD, 0.08]). The domain-specific evaluations showed alignment between the generated videos and textual prompts (mean BERTScore, 0.35 [SD, 0.09]). The Recall@K scores were 0.02 for K = 5, 0.04 for K = 10, and 0.16 for K = 50, yielding a mean score of 0.073, reflecting disparities between AI-generated and real clinical videos and demonstrating privacy-preserving effectiveness. For assessment of visual quality of the FFA videos by the 3 ophthalmologists, the mean score was 1.57 (SD, 0.44).

Conclusions and Relevance
This study demonstrated that an AI-driven text-to-video model generated FFA videos from textual descriptions, potentially improving visualization for clinical and educational purposes. The privacy-preserving nature of the model may address key challenges in data sharing while trying to ensure compliance with confidentiality standards.
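
The Recall@K figures and dataset split reported in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not describe how retrieval was implemented, so `recall_at_k` is a generic textbook definition (the share of queries whose true match appears among the top K retrieved items), and its name, arguments, and the toy data are assumptions rather than the authors' code. Under this definition, a low Recall@K means generated videos are rarely matched back to their real counterparts, which is the sense in which the abstract reads the scores as privacy preserving.

```python
from typing import Sequence


def recall_at_k(ranked_ids: Sequence[Sequence[str]],
                target_ids: Sequence[str],
                k: int) -> float:
    """Generic Recall@K: fraction of queries whose target is in the top-k results.

    Hypothetical helper for illustration; not taken from the paper.
    """
    hits = sum(1 for ranked, target in zip(ranked_ids, target_ids)
               if target in ranked[:k])
    return hits / len(target_ids)


# Values reported in the abstract (K -> Recall@K).
reported_recall = {5: 0.02, 10: 0.04, 50: 0.16}
mean_recall = sum(reported_recall.values()) / len(reported_recall)

# Dataset split reported in the abstract (number of FFA videos).
split = {"training": 2851, "validation": 387, "testing": 387}
total = sum(split.values())
shares = {name: round(100 * n / total, 1) for name, n in split.items()}

# Toy retrieval example (made-up identifiers) to show how the metric behaves.
toy_ranked = [["a", "b", "c"], ["d", "e", "f"]]
toy_targets = ["b", "f"]
toy_recall_at_2 = recall_at_k(toy_ranked, toy_targets, k=2)

print(f"mean Recall@K = {mean_recall:.3f}")
print(f"total videos = {total}, shares (%) = {shares}")
print(f"toy Recall@2 = {toy_recall_at_2}")
```

The mean of the three reported values rounds to 0.073, and the split sums to 3625 videos with shares of 78.6%, 10.7%, and 10.7%, consistent with the figures in the abstract.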
Source journal: JAMA Ophthalmology (subject category: Ophthalmology)
CiteScore: 13.20
Self-citation rate: 3.70%
Articles published: 340
Journal description: JAMA Ophthalmology, with a rich history of continuous publication since 1869, stands as a distinguished international, peer-reviewed journal dedicated to ophthalmology and visual science. In 2019, the journal proudly commemorated 150 years of uninterrupted service to the field. As a member of the esteemed JAMA Network, a consortium renowned for its peer-reviewed general medical and specialty publications, JAMA Ophthalmology upholds the highest standards of excellence in disseminating cutting-edge research and insights. Join us in celebrating our legacy and advancing the frontiers of ophthalmology and visual science.