A study of using syntactic cues in short-text similarity measure

IF 0.9, Zone 4 (Computer Science), JCR Q4 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Po-Sen Huang, Po-Sheng Chiu, Jia-Wei Chang, Yueh-Min Huang, Ming-Che Lee
Journal: Journal of Internet Technology
Volume: 20 1
Pages: 839-850
Publication date: 2019-05-01
Publication type: Journal Article
DOI: 10.3966/160792642019052003017
URL: https://doi.org/10.3966/160792642019052003017
Open access: no
Source platform: Semanticscholar
Citations: 6

Abstract

A study of using syntactic cues in short-text similarity measure
Short-text semantic similarity is an essential technique for natural language search and is widely used in social network analysis and opinion mining to discover previously unknown knowledge. Such similarity measures are typically applied to short texts of 10-20 words. Like spoken utterances, short texts do not necessarily follow formal grammatical rules. The limited information they contain, together with their syntactic and semantic flexibility, makes similarity measurement difficult. This study therefore designed and tested a part-of-speech-based short-text similarity algorithm to address these problems. The effects of evaluating different parts of speech are discussed in detail. The proposed algorithm achieved its best performance when using word measures corresponding to different parts of speech.
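
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of applying a different word measure to each part of speech. It is not the authors' method. The toy lexicon `TOY_POS`, the two word measures, the `WORD_MEASURES` mapping, and the best-match averaging scheme are all assumptions made for illustration; a real system would use a trained POS tagger and semantic word measures.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon standing in for a real POS tagger.
TOY_POS: Dict[str, str] = {
    "dog": "NOUN", "cat": "NOUN", "ball": "NOUN",
    "chases": "VERB", "chased": "VERB", "sleeps": "VERB",
    "quick": "ADJ", "lazy": "ADJ",
}

Tagged = List[Tuple[str, str]]


def tag(tokens: List[str]) -> Tagged:
    """Attach a POS label to each token; unknown words become OTHER."""
    return [(t, TOY_POS.get(t, "OTHER")) for t in tokens]


def exact(a: str, b: str) -> float:
    """Word measure: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def shared_prefix(a: str, b: str) -> float:
    """Word measure: length of the common prefix over the longer word length."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n / max(len(a), len(b))


# Assumed mapping: each part of speech gets its own word measure.
WORD_MEASURES: Dict[str, Callable[[str, str], float]] = {
    "NOUN": exact,
    "VERB": shared_prefix,
    "ADJ": exact,
}


def directed(src: Tagged, dst: Tagged) -> float:
    """Average, over scored words in src, of the best same-POS match in dst."""
    scores = []
    for word, pos in src:
        measure = WORD_MEASURES.get(pos)
        if measure is None:
            continue  # words without a configured measure are ignored
        candidates = [measure(word, w) for w, p in dst if p == pos]
        scores.append(max(candidates, default=0.0))
    return sum(scores) / len(scores) if scores else 0.0


def similarity(a: str, b: str) -> float:
    """Symmetric similarity: mean of the two directed scores."""
    ta, tb = tag(a.lower().split()), tag(b.lower().split())
    return 0.5 * (directed(ta, tb) + directed(tb, ta))
```

Calling `similarity("the dog chases the ball", "the dog chased the ball")` scores the two nouns with `exact` and the verb pair with `shared_prefix`, so the verbs contribute a partial match rather than a miss. Swapping the entries of `WORD_MEASURES` is one way to explore how the choice of measure per part of speech changes the result.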
Source journal
Journal of Internet Technology (COMPUTER SCIENCE, INFORMATION SYSTEMS-TELECOMMUNICATIONS)
CiteScore
3.20
Self-citation rate
18.80%
Articles published
112
Review time
13.8 months
Journal description: The Journal of Internet Technology accepts original technical articles in all disciplines of Internet Technology & Applications. Manuscripts are submitted for review with the understanding that they have not been published elsewhere. Topics of interest to JIT include, but are not limited to: Broadband Networks; Electronic service systems (Internet, Intranet, Extranet, E-Commerce, E-Business); Network Management; Network Operating System (NOS); Intelligent systems engineering; Government or Staff Jobs Computerization; National Information Policy; Multimedia systems; Network Behavior Modeling; Wireless/Satellite Communication; Digital Library; Distance Learning; Internet/WWW Applications; Telecommunication Networks; Security in Networks and Systems; Cloud Computing; Internet of Things (IoT). IPv6 related topics are especially welcome.