Towards Answering How do I Questions Using Classification

Kelvin Wu, Lei Yu, M. Cutler
21st International Conference on Advanced Information Networking and Applications Workshops (AINAW'07), published 2007-05-21
DOI: 10.1109/AINAW.2007.356 (https://doi.org/10.1109/AINAW.2007.356)
Citations: 1

Abstract

Interest is growing in open-domain question answering systems that leverage the massive amount of knowledge available on the Web. In this investigation, we address the problem of answering "How do I" questions. Our goal is to use the top results obtained from a search engine to extract and present correct answers. Identifying correct answers to such questions is a hard problem that seems to require deep natural language understanding. Fortunately, answers to "How do I" questions are often procedural, typically containing a sequence of successive actions. Learning to label text as procedural or non-procedural is an easier problem, which we attempted to solve by extracting 12 informative features and training classifiers on them. However, the corpus built from the top documents retrieved for a set of "How do I"-equivalent queries turned out to be highly imbalanced. To tackle this issue, sampling techniques were used with a variety of classification methods, yielding reasonable recall and precision for the minority class of procedural texts.
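The abstract does not say which sampling techniques were used or how the 12 features were defined. As a minimal sketch only, the following Python illustrates one common option, random oversampling of the minority class, together with the precision and recall computation for that class. The function names, the label strings "proc" and "non", and the two-class setup are all assumptions made for illustration and are not taken from the paper.

```python
import random


def random_oversample(rows, labels, minority, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Assumes a two-class problem: every label other than `minority`
    is treated as the majority class. (Illustrative, not the paper's method.)
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    if not minority_idx:
        # Nothing to duplicate; return unchanged copies.
        return list(rows), list(labels)
    majority_count = len(labels) - len(minority_idx)
    shortfall = max(0, majority_count - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: 8 non-procedural rows and 2 procedural rows.
rows = [[0.0]] * 8 + [[1.0]] * 2
labels = ["non"] * 8 + ["proc"] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels, "proc")
```

Resampling of this kind would normally be applied to the training split only, so that precision and recall for the procedural class are measured on data with the original class distribution.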