Human performance in detecting deepfakes: A systematic review and meta-analysis of 56 papers

Impact factor: 4.9 | JCR: Q1 (Psychology, Experimental)
Alexander Diel, Tania Lalgi, Isabel Carolin Schröter, Karl F. MacDorman, Martin Teufel, Alexander Bäuerle
Journal: Computers in Human Behavior Reports, vol. 16, Article 100538
Publication date: 2024-12-01
DOI: 10.1016/j.chbr.2024.100538
URL: https://www.sciencedirect.com/science/article/pii/S2451958824001714
Citations: 0

Abstract

Deepfakes are AI-generated media designed to look real, often with the intent to deceive. Deepfakes threaten public and personal safety by facilitating disinformation, propaganda, and identity theft. Though research has been conducted on human performance in deepfake detection, the results have not yet been synthesized. This systematic review and meta-analysis investigates human deepfake detection accuracy. Searches in PubMed, ScienceGov, JSTOR, Google Scholar, and paper references, conducted in June and October 2024, identified empirical studies measuring human detection of high-quality deepfakes. After pooling accuracy, odds-ratio, and sensitivity (d') effect sizes (k = 137 effects) from 56 papers involving 86,155 participants, we analyzed 1) overall deepfake detection performance, 2) performance across stimulus types (audio, image, text, and video), and 3) the effects of detection-improvement strategies. Overall deepfake detection rates (sensitivity) were not significantly above chance because 95% confidence intervals crossed 50%. Total deepfake detection accuracy was 55.54% (95% CI [48.87, 62.10], k = 67). For audio, accuracy was 62.08% [38.23, 83.18], k = 8; for images, 53.16% [42.12, 64.64], k = 18; for text, 52.00% [37.42, 65.88], k = 15; and for video, 57.31% [47.80, 66.57], k = 26. Odds ratios were 0.64 [0.52, 0.79], k = 62, indicating 39% detection accuracy, below chance (audio 45%, image 35%, text 40%, video 40%). Moreover, d' values show no significant difference from chance. However, strategies like feedback training, AI support, and deepfake caricaturization improved detection performance above chance levels (65.14% [55.21, 74.46], k = 15), especially for video stimuli.
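
The step from an odds ratio of 0.64 to "39% detection accuracy" can be reproduced with the standard odds-to-probability conversion, p = odds / (1 + odds). The sketch below assumes the reported ratio can be read as the odds of a correct response, which is consistent with the abstract's own figure but is not stated explicitly there. The function names are hypothetical and chosen for illustration only. The second helper applies the abstract's criterion that performance differs from chance only when the 95% confidence interval does not cross 50%.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds of a correct response into a probability: p = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def differs_from_chance(ci_low: float, ci_high: float, chance: float = 50.0) -> bool:
    """Return True when the interval lies entirely above or entirely below chance."""
    return ci_low > chance or ci_high < chance


# Pooled odds ratio and its 95% CI bounds, as reported in the abstract.
# Reading them as odds of a correct response is an assumption (see lead-in).
for label, odds in [("estimate", 0.64), ("CI lower", 0.52), ("CI upper", 0.79)]:
    print(f"{label}: {odds_to_probability(odds):.1%}")

# Accuracy intervals in percent, as reported in the abstract.
print(differs_from_chance(48.87, 62.10))  # overall accuracy
print(differs_from_chance(55.21, 74.46))  # with detection-improvement strategies
```

Under that assumption the point estimate converts to about 39%, matching the abstract. The overall accuracy interval crosses 50%, while the interval for detection-improvement strategies lies entirely above it.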
Source journal metrics: CiteScore 7.80; self-citation rate 0.00%; articles published 0