New Metrics and Dataset for Biological Development Video Generation
P. Celard, E. L. Iglesias, J. M. Sorribes-Fdez, L. Borrajo, A. Seara Vieira
ACM Transactions on Multimedia Computing, Communications, and Applications (published 2024-03-20). DOI: 10.1145/3653456
Citations: 0
Abstract
Image generative models have advanced in many areas and can now produce synthetic images of high resolution and detail. This success has enabled their use in the biomedical field, paving the way for the generation of videos that show the biological evolution of their content. Despite the power of generative video models, their use has not yet extended to time-based development; they focus almost exclusively on generating motion in space. This situation is largely due to the lack of specific datasets and metrics for measuring the individual quality of videos, particularly when no ground truth is available for comparison. We propose a new dataset, called GoldenDOT, which tracks the evolution of apples cut in parallel over 10 days, allowing their progress to be observed over time while the scene remains spatially static. In addition, four new metrics are proposed that provide different analyses of the generated videos, both as a whole and individually. In this paper, the proposed dataset and measures are used to study three state-of-the-art video generative models and their feasibility for video generation with biological development: TemporalGAN (TGANv2), Low Dimensional Video Discriminator GAN (LDVDGAN), and Video Diffusion Model (VDM). Among them, TGANv2 obtains the best results on the vast majority of metrics, including those already established in the state of the art, demonstrating the viability of the newly proposed metrics and their congruence with these standard measures.
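The four metrics themselves are defined in the paper rather than in this abstract. As a rough illustration of the kind of no-reference measurement the abstract describes (scoring temporal development of a single generated clip without a ground-truth video), the sketch below computes a generic "development" score from cumulative frame change. The function name, score definition, and array layout are assumptions for illustration only and are not the paper's actual metrics.

```python
# Illustrative sketch only: not the paper's proposed metrics.
# Assumes a video given as a (T, H, W, C) float array in [0, 1] and scores how
# consistently its content changes away from the first frame over time,
# with no ground-truth video required.
import numpy as np


def development_score(video: np.ndarray) -> float:
    """Return the fraction of time steps at which cumulative change grows.

    1.0 means strictly progressive development from the first frame;
    0.0 means no temporal progression at all.
    """
    first = video[0].astype(np.float64)
    # Mean absolute distance of every frame from the first frame.
    drift = np.array([np.abs(frame - first).mean() for frame in video])
    # Count how often that distance increases between consecutive frames.
    increases = np.diff(drift) > 0
    return float(increases.mean()) if increases.size else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    static_video = np.repeat(rng.random((1, 64, 64, 3)), 10, axis=0)
    developing = np.clip(
        static_video + np.linspace(0.0, 0.3, 10)[:, None, None, None], 0.0, 1.0
    )
    print(development_score(static_video))  # ~0.0: no temporal development
    print(development_score(developing))    # ~1.0: monotonic development
```

A real evaluation along these lines would operate on perceptual features rather than raw pixels and would combine per-video and whole-set analyses, as the paper's four metrics are described as doing.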
Journal description:
The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome.
TOMM is a peer-reviewed, archival journal, available in both print and digital form. The journal is published quarterly, with roughly seven 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure timely publication. The transactions consist primarily of research papers. This is an archival journal, and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.