Expanding Domain-Specific Datasets with Stable Diffusion Generative Models for Simulating Myocardial Infarction.

IF 6.4
International journal of neural systems Pub Date : 2025-10-01 Epub Date: 2025-08-04 DOI:10.1142/S0129065725500522
Gabriel Rojas-Albarracín, António Pereira, Antonio Fernández-Caballero, María T López
Citations: 0

Abstract

Fields such as human activity recognition have advanced rapidly thanks to developments in artificial intelligence (AI). However, the lack of data remains a major obstacle to even faster progress. This is particularly true in computer vision, where training a model typically requires at least tens of thousands of images. Moreover, when the activity of interest is uncommon, such as a fall, it is difficult to collect a sufficiently large dataset. One example is the identification of people suffering a heart attack. This work proposes a novel approach that uses generative models to extend image datasets, adapting the models to generate more domain-relevant images. To this end, Stable Diffusion models were fine-tuned using low-rank adaptation (LoRA). A dataset of 100 images of individuals simulating infarction situations and neutral poses was created, annotated, and used. The images generated with the adapted models were evaluated using learned perceptual image patch similarity (LPIPS) to test their closeness to the target scenario. The results demonstrate the potential of synthetic datasets, and in particular of the strategy proposed here, to overcome data scarcity in AI-based applications. This approach can not only be more cost-effective than building a dataset in the traditional way, but can also reduce ethical concerns about its use in smart environments, health monitoring, and anomaly detection. Because all data are owned by the researcher, they can be added to and modified at any time without requiring additional permissions, which streamlines the research.
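
The abstract names low-rank adaptation but does not give the adapter configuration. As a rough illustration of the general idea only (not the authors' implementation), the sketch below shows how a frozen weight matrix W is combined with a trainable low-rank update, W + (alpha / r) · B · A. The matrix sizes, the rank, the scaling factor `alpha`, and all function names are assumptions chosen for the example; in practice such adapters are usually attached to a diffusion model's attention layers through a dedicated library rather than written by hand.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def count_params(m):
    """Count the entries of a matrix stored as a list of rows."""
    return sum(len(row) for row in m)


random.seed(0)
d_out, d_in, rank, alpha = 64, 64, 4, 8.0  # illustrative values, not from the paper

# Frozen pretrained weight (random stand-in values here).
w = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
# Trainable factors: A starts random, B starts at zero so the update is initially zero.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha)

full_params = count_params(w)                     # entries a full fine-tune would update
lora_params = count_params(a) + count_params(b)   # entries the adapter trains instead
print(full_params, lora_params)
```

Because B is initialised to zero, the adapted layer starts out identical to the pretrained one, and only the two small factors are trained. This is one reason the technique is considered suitable for small datasets such as the 100-image set described above.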
