Web query classification using improved visiting probability algorithm and BabelNet semantic graph

Haniyeh Rashidghalam, F. Mahmoudi
Published in: 2015 AI & Robotics (IRANOPEN)
Publication date: 2015-04-12
DOI: 10.1109/RIOS.2015.7270748 (https://doi.org/10.1109/RIOS.2015.7270748)
Citations: 0

Abstract

This paper presents an unsupervised method for web query classification that does not use log data. The approach maps every component of the problem to BabelNet concepts and solves the problem in terms of those concepts. The solution has three phases: offline, online, and classification. In the offline phase, all categories are mapped to BabelNet concepts using a disambiguation system. In the online phase, the query is first enriched and then preprocessed, after which a disambiguation system maps all of its components to BabelNet concepts. In the classification phase, an improved version of the visiting probability algorithm performs the classification. For evaluation we used the KDD2005 test set, which is the one most widely used in this line of work. Among unsupervised approaches that do not use log data, the proposed approach achieves acceptable performance in terms of the F1 measure: it improves on the best such approach by 2%, but compared with the best unsupervised approach that uses log data, its F1 is 11% lower.
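The abstract does not spell out the visiting probability algorithm or the authors' improvement to it, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that visiting probability means the chance that a uniform random walk starting at one concept reaches another concept within a fixed number of steps, and that a category is scored by averaging this value over all query-concept and category-concept pairs. The toy graph, the concept names, the helper names `visiting_probability` and `classify`, and the averaging rule are all hypothetical; they are not taken from the paper or from the BabelNet API.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]

# Hypothetical toy concept graph (directed adjacency lists), NOT real BabelNet data.
TOY_GRAPH: Graph = {
    "jaguar": ["car", "cat"],
    "car": ["jaguar", "vehicle"],
    "cat": ["jaguar", "animal"],
    "vehicle": ["car"],
    "animal": ["cat"],
    "price": ["car"],
}


def visiting_probability(graph: Graph, source: str, target: str, max_steps: int) -> float:
    """Probability that a uniform random walk from `source` reaches `target`
    within `max_steps` steps. The target is treated as absorbing."""
    if source == target:
        return 1.0
    mass = {source: 1.0}  # probability mass of walks that have not hit the target yet
    hit = 0.0
    for _ in range(max_steps):
        next_mass: Dict[str, float] = {}
        for node, p in mass.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue  # dead end: this walk stops and its mass is dropped
            share = p / len(neighbors)
            for nb in neighbors:
                if nb == target:
                    hit += share  # walk absorbed at the target
                else:
                    next_mass[nb] = next_mass.get(nb, 0.0) + share
        mass = next_mass
    return hit


def classify(
    graph: Graph,
    query_concepts: List[str],
    category_concepts: Dict[str, List[str]],
    max_steps: int = 3,
) -> List[Tuple[str, float]]:
    """Rank categories by the mean visiting probability from query concepts
    to category concepts (an assumed scoring rule, for illustration only)."""
    scores: Dict[str, float] = {}
    for category, concepts in category_concepts.items():
        pairs = len(query_concepts) * len(concepts)
        if pairs == 0:
            scores[category] = 0.0
            continue
        total = sum(
            visiting_probability(graph, q, c, max_steps)
            for q in query_concepts
            for c in concepts
        )
        scores[category] = total / pairs
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    ranking = classify(
        TOY_GRAPH,
        query_concepts=["jaguar", "price"],
        category_concepts={"Autos": ["vehicle"], "Pets": ["animal"]},
    )
    print(ranking)
```

In this toy setup the ambiguous concept "jaguar" is equally close to both categories, and the extra query concept "price" tips the ranking toward "Autos". This illustrates why the abstract's online phase enriches and disambiguates the query before classification.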