QuIET: A Text Classification Technique Using Automatically Generated Span Queries

2014 IEEE International Conference on Semantic Computing. Pub Date: 2014-06-16. DOI: 10.1109/ICSC.2014.18

Vassilis Polychronopoulos, N. Pendar, S. Jeffery

Citations: 2

Abstract

We propose a novel algorithm, QuIET, for binary classification of texts. The method automatically generates a set of span queries from a set of annotated documents and uses the query set to categorize unlabeled texts. The models QuIET generates are human-understandable. We describe the method and evaluate it empirically against Support Vector Machines, demonstrating comparable performance on a known curated dataset and superior performance on some categories of noisy local business data. We also describe an active learning approach that is applicable to QuIET and can boost its performance.
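To make the idea of classifying with a query set concrete, here is a minimal sketch of how a set of span queries could label a text. It is not the QuIET algorithm: the abstract does not say how queries are generated or combined, so the query representation (ordered terms plus a "slop" of extra tokens allowed in the span), the "any query matches" decision rule, and the names `tokenize`, `span_match`, and `classify` are all assumptions made for illustration, and the example queries are hand-written rather than learned.

```python
import re
from typing import List, Sequence, Tuple

# Hypothetical representation: (ordered terms, slop), where slop is the
# number of extra tokens allowed inside the matched span.
SpanQuery = Tuple[Tuple[str, ...], int]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def span_match(tokens: Sequence[str], terms: Sequence[str], slop: int) -> bool:
    """Return True if `terms` occur in order within a span that contains
    at most `slop` tokens beyond the terms themselves."""
    for start, tok in enumerate(tokens):
        if tok != terms[0]:
            continue
        pos = start
        found = True
        for term in terms[1:]:
            # Greedily take the nearest later occurrence; for a fixed start
            # this gives the shortest possible ordered span.
            try:
                pos = list(tokens).index(term, pos + 1)
            except ValueError:
                found = False
                break
        if found and (pos - start + 1) - len(terms) <= slop:
            return True
    return False


def classify(text: str, queries: Sequence[SpanQuery]) -> bool:
    """Assumed decision rule: positive if any query in the set matches."""
    tokens = tokenize(text)
    return any(span_match(tokens, terms, slop) for terms, slop in queries)


# Hand-written queries for an illustrative "pizzeria" category.
queries: List[SpanQuery] = [(("pizza", "delivery"), 2), (("wood", "oven"), 1)]
positive = classify("We offer fast pizza and pasta delivery downtown", queries)
negative = classify("Delivery of auto parts; pizza not sold", queries)
```

Because each query is just a short list of terms and a distance bound, a person can read the query set directly, which is the sense in which such a model is human-understandable.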



