A graph-based model for semantic textual similarity measurement

Impact factor: 2.7 | Zone 3, Computer Science | JCR Q3: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Van-Tan Bui, Quang-Minh Nguyen, Van-Vinh Nguyen, Duc-Toan Nguyen
Journal: Data & Knowledge Engineering, Volume 161, Article 102509
Publication type: Journal Article
Published: 2025-09-12
DOI: 10.1016/j.datak.2025.102509
URL: https://www.sciencedirect.com/science/article/pii/S0169023X25001041
Citation count: 0

Abstract

Measuring the semantic similarity of sentence pairs is a fundamental problem in Natural Language Processing, with applications including machine translation, speech recognition, automatic question answering, and text summarization. Despite its significance, accurately assessing semantic similarity remains challenging, particularly for underrepresented languages such as Vietnamese. Existing methods do not yet fully exploit the unique linguistic characteristics of Vietnamese when measuring semantic similarity. To address this limitation, we propose GBNet-STS (Graph-Based Network for Semantic Textual Similarity), a novel framework for measuring the semantic similarity of Vietnamese sentence pairs. GBNet-STS integrates lexical-grammatical similarity scores and distributional semantic similarity scores within a multi-layered graph-based model. By capturing different semantic perspectives through multiple interconnected layers, our approach yields a more comprehensive and robust similarity estimate. Experimental results demonstrate that GBNet-STS outperforms traditional methods, achieving state-of-the-art performance on Vietnamese semantic similarity tasks.
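
To make the idea of combining two kinds of similarity scores concrete, the following is a minimal sketch, not the GBNet-STS model itself. The abstract does not specify how the scores are computed or how the graph layers are built, so everything here is an assumption for illustration: token-set Jaccard overlap stands in for the lexical score, cosine similarity of averaged word vectors stands in for the distributional score, and the two are mixed by a hypothetical weight `alpha`. The tiny embedding table and all function names are invented for the example.

```python
import math

def jaccard(a_tokens, b_tokens):
    """Lexical overlap: shared token types divided by all token types."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def combined_similarity(a_tokens, b_tokens, embeddings, dim, alpha=0.5):
    """Weighted mix of a lexical and a distributional score (assumed scheme)."""
    lexical = jaccard(a_tokens, b_tokens)
    distributional = cosine(
        mean_vector(a_tokens, embeddings, dim),
        mean_vector(b_tokens, embeddings, dim),
    )
    return alpha * lexical + (1.0 - alpha) * distributional

# Hypothetical 2-dimensional embeddings, chosen by hand for the example.
TOY_EMBEDDINGS = {
    "tôi": [1.0, 0.0],
    "thích": [0.0, 1.0],
    "yêu": [0.1, 0.9],
    "ghét": [0.0, -1.0],
    "mèo": [0.5, 0.5],
}

like_cats = ["tôi", "thích", "mèo"]
love_cats = ["tôi", "yêu", "mèo"]
hate_cats = ["tôi", "ghét", "mèo"]

print(combined_similarity(like_cats, love_cats, TOY_EMBEDDINGS, dim=2))
print(combined_similarity(like_cats, hate_cats, TOY_EMBEDDINGS, dim=2))
```

In this toy setup the second and third sentences each share two of three tokens with the first, so the lexical score cannot tell them apart; only the distributional score separates "thích"/"yêu" from "thích"/"ghét". That is the general motivation for combining both views, although the paper does so within a multi-layered graph rather than a single weighted sum.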

Source journal
Data & Knowledge Engineering (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 66
Review time: 6 months
About the journal: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.