Experts fail to reliably detect AI-generated histological data

bioRxiv - Scientific Communication and Education, Pub Date: 2024-01-25, DOI: 10.1101/2024.01.23.576647

Jan Hartung, Stefanie Reuter, Vera Anna Kulow, Michael Fahling, Cord Spreckelsen, Ralf Mrowka

Citations: 0

Abstract

AI-based image generation methods have advanced at an unprecedented pace in recent years, challenging both image forensics and human perceptual capabilities. Accordingly, they are expected to play an increasingly important role in the fraudulent fabrication of data. This includes images with complicated intrinsic structures, such as histological tissue samples, which are harder to forge manually. We use Stable Diffusion, one of the most recent generative algorithms, to create a set of artificial histological samples, and in a large study with over 800 participants we examine the ability of human subjects to discriminate between these artificial images and genuine histological images. Although experts perform better than naive participants, we find that even they fail to reliably identify fabricated data. While participant performance depends on the amount of training data used, even low quantities result in convincing images, necessitating methods to detect fabricated data and technical standards such as C2PA to secure data integrity.
