Examining embedded lies through computational text analysis.

IF 3.9 · Zone 2 (multidisciplinary journals) · JCR Q1 (MULTIDISCIPLINARY SCIENCES)

Scientific Reports · Pub Date: 2025-07-21 · DOI: 10.1038/s41598-025-11327-w

Riccardo Loconte, Bennett Kleinberg

Scientific Reports 15(1), 26482. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12280142/pdf/

Citations: 0

Abstract

Verbal deception detection research relies on narratives and commonly assumes statements as truthful or deceptive. A more realistic perspective acknowledges that the veracity of statements exists on a continuum, with truthful and deceptive parts being embedded within the same statement. However, research on embedded lies has been lagging behind. We collected a novel dataset of 2,088 truthful and deceptive statements with annotated embedded lies. Using a counterbalanced within-subjects design, participants provided two versions of an autobiographical event. One was described truthfully, and the other one deceptively by including embedded lies. Participants later highlighted those embedded lies and judged them on lie centrality, deceptiveness, and source. We show that a fine-tuned language model (Llama-3-8B) can classify truthful statements and those containing embedded lies significantly above the chance level (64% accuracy). Individual differences, linguistic properties, and explainability analysis suggest that the challenge of moving the dial towards embedded lies stems from their resemblance to truthful statements. Typical deceptive statements consisted of 2/3 truthful information and 1/3 embedded lies, largely derived from past personal experiences and with minimal linguistic differences from their truthful counterparts. We present this dataset as a novel resource to address this challenge and foster research on embedded lies in verbal deception detection.
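
The abstract reports 64% accuracy as significantly above the chance level. As a rough illustration of what such a claim involves, the sketch below runs a one-sided exact binomial test against a 50% chance rate. The test-set size (200) and the number of correct classifications (128) are hypothetical values chosen only to give 64%; they are not taken from the paper. The 50% chance rate assumes balanced classes, and the authors' own statistical procedure may differ.

```python
from math import comb


def p_value_above_chance(correct: int, total: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if the true rate were `chance`."""
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical numbers (assumption, not from the paper): 128 of 200 correct = 64%.
n_total = 200
n_correct = 128
p = p_value_above_chance(n_correct, n_total)
print(f"accuracy = {n_correct / n_total:.2f}, one-sided p = {p:.2e}")
```

With a smaller hypothetical test set the same 64% would be weaker evidence, which is why the accuracy figure alone does not establish significance.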

Source journal

Scientific Reports (Natural Science Disciplines)

CiteScore: 7.50

Self-citation rate: 4.30%

Articles published: 19567

Review time: 3.9 months

About the journal: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor of 4.380 (2021) and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).

• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.

• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.

• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.

• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms, from microorganisms and animals to plants.

• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.