Class-weighted Dempster-Shafer in dual-level fusion for multimodal fake real estate listings detection.

IF 3.5 Zone 4 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
PeerJ Computer Science Pub Date : 2025-05-27 eCollection Date: 2025-01-01 DOI:10.7717/peerj-cs.2797
Maifuza Mohd Amin, Nor Samsiah Sani, Mohammad Faidzul Nasrudin
PeerJ Computer Science, volume 11, e2797. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12190670/pdf/
Citations: 0

Abstract

Background: Detecting fake multimodal property listings is a significant challenge for online real estate platforms because fraudulent activity is increasingly sophisticated. Existing multimodal data fusion methods have both strengths and limitations in identifying fraudulent listings. Single-level fusion models, whether at the feature, decision, or intermediate level, struggle to balance the contributions of different modalities, which leads to suboptimal decisions. To address these problems, a dual-level fusion of multimodal data is proposed for detecting fake real estate listings. Detailed features from text and image data are integrated at an early stage, and metadata is then fused at the decision stage to obtain a more comprehensive final classification. Furthermore, a new weighting scheme is introduced to optimize Dempster-Shafer in decision fusion, because Dempster-Shafer without class weights lacks the flexibility to adapt to varying levels of uncertainty or importance across classes. With this weighting, the method improves classification performance.
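
As an illustration of the decision-stage fusion described above, the following minimal sketch applies the standard, unweighted Dempster's rule of combination to two sources. It assumes the decision stage combines one mass function from the text-and-image branch with one from the metadata branch; the names `m_text_image` and `m_metadata` and all mass values are hypothetical and are not taken from the paper.

```python
from itertools import product

# Frame of discernment with two hypotheses; THETA is the whole frame
# and carries the mass a source leaves undecided.
FAKE = frozenset({"fake"})
REAL = frozenset({"not_fake"})
THETA = FAKE | REAL


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses within one function sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            fused[intersection] = fused.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is set aside.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {h: mass / (1.0 - conflict) for h, mass in fused.items()}


# Hypothetical mass functions (assumed values, for illustration only).
m_text_image = {FAKE: 0.7, REAL: 0.2, THETA: 0.1}
m_metadata = {FAKE: 0.5, REAL: 0.3, THETA: 0.2}

fused = combine(m_text_image, m_metadata)
print({"/".join(sorted(h)): round(mass, 3) for h, mass in fused.items()})
```

Because both sources lean towards "fake", the fused belief in "fake" ends up higher than either source gives on its own. The rule treats both classes symmetrically, which is the rigidity that the class weighting is meant to address.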

Methods: In Class Weighted Dempster-Shafer in Dual Level Fusion (CWDS-DLF), we employ advanced models (XLNet for text and ResNet101 for images) for feature extraction and use Dempster-Shafer theory for decision fusion. A new weighting scheme, based on Bayesian optimization, assigns optimal weights to the 'fake' and 'not fake' classes, thereby enhancing Dempster-Shafer theory in the decision fusion process.
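
The abstract does not state how the class weights enter the fusion, so the sketch below rests on two loudly labelled assumptions: the weights scale the fused 'fake' and 'not fake' masses before the final decision, and a seeded random search stands in for Bayesian optimization to keep the example dependency-free. The validation samples, the weight range, and every function name are hypothetical.

```python
import random


def combine(m1, m2):
    """Dempster's rule for masses over 'fake', 'real' and 'unknown'."""
    conflict = m1["fake"] * m2["real"] + m1["real"] * m2["fake"]
    k = 1.0 - conflict  # non-conflicting mass used for normalisation
    fake = (m1["fake"] * m2["fake"] + m1["fake"] * m2["unknown"]
            + m1["unknown"] * m2["fake"])
    real = (m1["real"] * m2["real"] + m1["real"] * m2["unknown"]
            + m1["unknown"] * m2["real"])
    return {"fake": fake / k, "real": real / k,
            "unknown": m1["unknown"] * m2["unknown"] / k}


def predict_fake(m1, m2, w_fake, w_real):
    """Assumed weighting: scale each fused class mass, then compare."""
    fused = combine(m1, m2)
    return w_fake * fused["fake"] >= w_real * fused["real"]


def f1_for_weights(samples, labels, w_fake, w_real):
    """F1 score with 'fake' as the positive class."""
    preds = [predict_fake(m1, m2, w_fake, w_real) for m1, m2 in samples]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum(y and not p for p, y in zip(preds, labels))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def search_weights(samples, labels, trials=200, seed=0):
    """Random search (a stand-in for Bayesian optimization) over weights."""
    rng = random.Random(seed)
    best = (1.0, 1.0, f1_for_weights(samples, labels, 1.0, 1.0))
    for _ in range(trials):
        w_fake, w_real = rng.uniform(0.1, 2.0), rng.uniform(0.1, 2.0)
        score = f1_for_weights(samples, labels, w_fake, w_real)
        if score > best[2]:
            best = (w_fake, w_real, score)
    return best


# Hypothetical validation set: (branch 1 masses, branch 2 masses), is_fake.
samples = [
    ({"fake": 0.6, "real": 0.3, "unknown": 0.1},
     {"fake": 0.5, "real": 0.4, "unknown": 0.1}),
    ({"fake": 0.4, "real": 0.5, "unknown": 0.1},
     {"fake": 0.45, "real": 0.35, "unknown": 0.2}),
    ({"fake": 0.2, "real": 0.7, "unknown": 0.1},
     {"fake": 0.3, "real": 0.5, "unknown": 0.2}),
    ({"fake": 0.8, "real": 0.1, "unknown": 0.1},
     {"fake": 0.6, "real": 0.2, "unknown": 0.2}),
]
labels = [True, True, False, True]

baseline_f1 = f1_for_weights(samples, labels, 1.0, 1.0)
best_w_fake, best_w_real, best_f1 = search_weights(samples, labels)
print(f"unweighted F1={baseline_f1:.2f}, weighted F1={best_f1:.2f}")
```

In this toy set the second sample is a fake listing that unweighted fusion narrowly labels 'not fake'; a modest preference for the 'fake' class recovers it without flipping the genuine listing. A Bayesian optimizer would explore the same two-dimensional weight space, but would choose each trial based on earlier scores instead of sampling uniformly.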

Results: CWDS-DLF was evaluated on the property listing website dataset and achieved an F1 score of 96% and an accuracy of 93%. A t-test confirms the significance of these improvements (p < 0.05), demonstrating the effectiveness of our method in detecting fake property listings. Compared to other models, including a 2D convolutional neural network (CNN), XGBoost, and various multimodal approaches, our model consistently achieves higher precision, recall, and F1 score. This underscores the potential of integrating multimodal analysis with sophisticated fusion techniques to enhance the detection of fake property listings, ultimately improving consumer protection and operational efficiency in online real estate platforms.
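
To make the reported metrics concrete, the sketch below computes accuracy, precision, recall, and F1 from a confusion matrix. The abstract does not give the underlying counts, so the counts here are hypothetical, and 'fake' is assumed to be the positive class. They were chosen only to show how an F1 score near 96% can coexist with an accuracy of 93% when the positive class dominates.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall; it ignores
    # true negatives, so it can exceed accuracy when positives dominate.
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (not from the paper): 92 fake and 8 genuine listings.
metrics = classification_metrics(tp=90, fp=5, fn=2, tn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, accuracy is 0.93 and F1 is about 0.963. Most errors fall on the small negative class, which lowers accuracy more than it lowers F1.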

Source journal
PeerJ Computer Science (Computer Science: General Computer Science)
CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks
About the journal: PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.