An Efficient Framework for Constructing Generalized Locally-Induced Text Metrics

IJCAI: proceedings of the conference | Pub Date: 2011-07-16 | DOI: 10.5591/978-1-57735-516-8/IJCAI11-198

S. Amizadeh, Shuguang Wang, M. Hauskrecht

Citations: 5

Abstract

In this paper, we propose a new framework for constructing text metrics that can be used to compare terms and sets of terms and to support inferences among them. Our metric is derived from data-driven kernels on graphs, which let us capture global relations among terms and sets of terms regardless of their complexity and size. To compute the metric efficiently for any two subsets of terms, we develop an approximation technique that relies on precompiled term-term similarities. To scale up the approach to problems with a huge number of terms, we develop and experiment with a solution that sub-samples the term space. We demonstrate the benefits of the whole framework on two text inference tasks: prediction of the terms in an article from its abstract, and query expansion in information retrieval.
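
The abstract does not spell out the construction, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a heat (diffusion) kernel on a small term co-occurrence graph as the "data-driven kernel", and it assumes that a set-to-set distance can be approximated from precomputed term-term kernel values by comparing the mean embeddings of the two sets. The function names, the diffusion time `t`, and the toy graph are all hypothetical.

```python
import numpy as np

def heat_kernel(adjacency, t=1.0):
    """Heat kernel exp(-t * L) of a symmetric weighted graph (assumed kernel choice)."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A           # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)     # L is symmetric, so eigh applies
    return (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T

def set_similarity(K, set_a, set_b):
    """Mean of the precomputed term-term kernel values between two term sets."""
    return K[np.ix_(set_a, set_b)].mean()

def set_distance(K, set_a, set_b):
    """Kernel-induced distance between the mean embeddings of two term sets."""
    d2 = (set_similarity(K, set_a, set_a)
          + set_similarity(K, set_b, set_b)
          - 2.0 * set_similarity(K, set_a, set_b))
    return float(np.sqrt(max(d2, 0.0)))      # clamp tiny negative rounding error

# Toy graph (hypothetical): five terms linked in a chain 0-1-2-3-4.
adjacency = np.zeros((5, 5))
for i in range(4):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

K = heat_kernel(adjacency, t=1.0)            # term-term similarities, computed once
near = set_distance(K, [0], [1])             # adjacent terms
far = set_distance(K, [0], [4])              # terms at opposite ends of the chain
```

Because `K` is computed once, every later set-to-set query only reads entries of the stored matrix, which is the kind of reuse of precompiled term-term similarities the abstract refers to.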
