Michael Schlee, Gillian Kant, Christoph Ehrling, Benjamin Säfken, Thomas Kneib
Artificial Intelligence Review, volume 58, issue 10. Published 2025-07-07. DOI: 10.1007/s10462-025-11188-9. https://link.springer.com/article/10.1007/s10462-025-11188-9
Decoding synthetic news: an interpretable multimodal framework for the classification of news articles in a novel news corpus
Recent advancements in Artificial Intelligence (AI), notably the development of Large Language Models (LLMs) and text-to-image diffusion models, have facilitated the creation of realistic textual content and images. Specifically, platforms like ChatGPT and Midjourney have simplified the creation of high-quality text and visuals with minimal expertise and cost. The increasing sophistication of Generative AI presents challenges in ensuring the integrity of news, media, and information quality, making it increasingly difficult to distinguish between real and artificially generated textual and visual content. Our work addressed this problem in two ways. First, by means of ChatGPT and Midjourney, we created a comprehensive novel multimodal news corpus named SyN24News based on the N24News corpus, on which we evaluated our model. Second, we developed a novel explainable synthetic news detector for discriminating between real and synthetic news articles. We leveraged a Neural Additive Model (NAM)-like network structure that ensures effect separation by handling input data in separate subnetworks. Deep features capturing complex structures and patterns are extracted from unstructured data, i.e., images and texts, using fine-tuned VGG and DistilBERT subnetworks. We ensured further explainability by individually processing carefully chosen handcrafted text and image features in simple Multilayer Perceptrons (MLPs), allowing for graphical interpretation of the corresponding structured effects. Our findings indicate that textual information is the main driver of the decision-making process. Structured textual effects, particularly Flesch-Kincaid reading ease and sentiment, have a much higher influence on the classification outcome than visual features such as dissimilarity and homogeneity.
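The additive structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the deep VGG and DistilBERT subnetworks are omitted, the class names `FeatureNet` and `AdditiveClassifier` are hypothetical, the weights are random and untrained, and the feature values are made up. It only shows how handling each handcrafted feature in its own small MLP and summing the outputs keeps the per-feature effects separable.

```python
import math
import random


class FeatureNet:
    """Tiny one-hidden-layer MLP mapping one scalar feature to one scalar effect."""

    def __init__(self, hidden, rng):
        # Random, untrained weights: for illustration only.
        self.w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]

    def __call__(self, x):
        # ReLU hidden layer followed by a linear read-out.
        return sum(w2 * max(0.0, w1 * x + b1)
                   for w1, b1, w2 in zip(self.w1, self.b1, self.w2))


class AdditiveClassifier:
    """Sums per-feature subnetwork outputs into a single logit (NAM-style)."""

    def __init__(self, feature_names, hidden=8, seed=0):
        rng = random.Random(seed)
        self.nets = {name: FeatureNet(hidden, rng) for name in feature_names}
        self.bias = 0.0

    def contributions(self, features):
        # Each feature only ever passes through its own subnetwork.
        return {name: net(features[name]) for name, net in self.nets.items()}

    def predict_proba(self, features):
        logit = self.bias + sum(self.contributions(features).values())
        return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical, pre-scaled feature values for a single article.
names = ["reading_ease", "sentiment", "dissimilarity", "homogeneity"]
article = {"reading_ease": 0.62, "sentiment": 0.10,
           "dissimilarity": 0.35, "homogeneity": 0.80}

model = AdditiveClassifier(names)
effects = model.contributions(article)
prob_synthetic = model.predict_proba(article)
```

Because the logit is a plain sum, each entry of `effects` can be plotted against its input feature on its own, which is the kind of graphical interpretation of structured effects that the abstract refers to.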
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.