Web Forum Retrieval and Text Analytics: A Survey

Impact factor: 8.3 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Information Systems)
D. Hoogeveen, Li Wang, Timothy Baldwin, Karin M. Verspoor
Foundations and Trends in Information Retrieval, pp. 1-163. Published 2018-01-02. DOI: 10.1561/1500000062 (https://doi.org/10.1561/1500000062)
Citations: 34

Abstract

This survey presents an overview of information retrieval, natural language processing and machine learning research that makes use of forum data, including both discussion forums and community question-answering (cQA) archives. The focus is on automated analysis, with the goal of gaining a better understanding of the data and its users. We discuss the different strategies used for both retrieval tasks (post retrieval, question retrieval, and answer retrieval) and classification tasks (post type classification, question classification, post quality assessment, subjectivity, and viewpoint classification) at the post level, as well as at the thread level (thread retrieval, solvedness and task orientation, discourse structure recovery and dialogue act tagging, QA-pair extraction, and thread summarisation). We also review work on forum users, including user satisfaction, expert finding, question recommendation and routing, and community analysis. The survey includes a brief history of forums, an overview of the different kinds of forums, a summary of publicly available datasets for forum research, and a short discussion on the evaluation of retrieval tasks using forum data. The aim is to give a broad overview of the different kinds of forum research, a summary of the methods that have been applied, some insights into successful strategies, and potential areas for future research.
Source journal
Foundations and Trends in Information Retrieval (Computer Science, Information Systems)
CiteScore: 39.10
Self-citation rate: 0.00%
Articles published: 3
About the journal: The surge in research across all domains in the past decade has produced a plethora of new publications and exponential growth in published research. Navigating this extensive literature and staying current has become a time-consuming challenge. Electronic publishing provides instant access to more articles than ever, but discerning the ones essential to a comprehensive understanding of a topic remains difficult. Foundations and Trends® in Information Retrieval (FnTIR) addresses this problem by publishing high-quality survey and tutorial monographs in the field. Each issue features a 50-100 page monograph authored by research leaders, covering tutorial subjects, research retrospectives, and survey papers that provide state-of-the-art reviews within the scope of the journal.