Multi-concept Document Classification Using a Perceptron-Like Algorithm

Clay Woolam, L. Khan
Venue: 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
Publication date: 2008-12-09
DOI: 10.1109/WIIAT.2008.410
URL: https://doi.org/10.1109/WIIAT.2008.410
Citations: 7

Abstract

Previous work in hierarchical categorization focuses on the hierarchical perceptron (Hieron) algorithm. The hierarchical perceptron works on the principles of the perceptron: each class label in the hierarchy has an associated weight vector. To account for the hierarchy, we begin at the root of the tree and sum all weights along the path to the target label. We make a prediction by choosing the label that yields the maximum inner product of the instance's feature vector with the label's path-summed weights. Learning is done by adjusting the weights along the path from the predicted node to the correct node, using a specific loss function that adheres to the large-margin principle. There are several problems with applying this approach to problems in which an instance carries multiple class labels. In many cases we could end up punishing weights that gave a correct prediction, because the algorithm can handle only a single case at a time. In this paper we present an extended hierarchical perceptron algorithm capable of solving the multiple categorization problem (MultiHieron). We introduce a new aggregate loss function for multiple-label learning, and we make weight updates simultaneously instead of serially. We demonstrate significant improvement over the basic Hieron algorithm on the Aviation Safety Reporting System (ASRS) flight anomaly database and the OntoNews corpus, using both flat and hierarchical categorization metrics.
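The prediction rule described in the abstract (sum the weight vectors from the root to a label, then take the label with the largest inner product) can be illustrated with a small Python sketch. Everything named here is hypothetical and chosen only for illustration: the toy hierarchy `PARENT`, the helpers `path`, `score`, `predict`, `nodes_of` and `update`, and the parameter `step`. The update uses a fixed step size in place of the large-margin, loss-scaled step, and the rule of sparing nodes that lie on the path of any correct label is only one possible reading of "simultaneous" updates. It is not the paper's MultiHieron algorithm or its aggregate loss function, neither of which is specified in the abstract. Predicting as many labels as the example has true labels is likewise an assumption.

```python
# Hypothetical toy hierarchy (child -> parent); the names are illustrative only.
PARENT = {
    "root": None,
    "weather": "root",
    "equipment": "root",
    "icing": "weather",
    "turbulence": "weather",
    "engine": "equipment",
}
LABELS = [node for node, parent in PARENT.items() if parent is not None]


def path(label):
    """Return the set of nodes on the path from the root to `label`."""
    nodes = set()
    while label is not None:
        nodes.add(label)
        label = PARENT[label]
    return nodes


def score(W, x, label):
    """Inner product of x with the weights summed along the root-to-label path."""
    return sum(sum(w * xi for w, xi in zip(W[node], x)) for node in path(label))


def predict(W, x, k=1):
    """Return the k highest-scoring labels (ties keep the order of LABELS)."""
    return sorted(LABELS, key=lambda label: score(W, x, label), reverse=True)[:k]


def nodes_of(labels):
    """Union of the root-to-label paths of several labels."""
    return set().union(*(path(label) for label in labels))


def update(W, x, true_labels, step=1.0):
    """One simultaneous update for a multi-label example.

    Assumption: a fixed step size stands in for the loss-scaled step.
    """
    true_labels = set(true_labels)
    predicted = set(predict(W, x, k=len(true_labels)))
    missed = true_labels - predicted
    wrong = predicted - true_labels
    # Decide every change from the same pre-update prediction, then apply all.
    promote = nodes_of(missed) - nodes_of(wrong)
    # Assumption: never punish a node that supports some correct label.
    demote = nodes_of(wrong) - nodes_of(true_labels)
    for node in promote:
        W[node] = [w + step * xi for w, xi in zip(W[node], x)]
    for node in demote:
        W[node] = [w - step * xi for w, xi in zip(W[node], x)]
    return bool(missed or wrong)  # True if the prediction was imperfect


# Usage: one document with two correct labels in different subtrees.
W = {node: [0.0, 0.0, 0.0] for node in PARENT}
x = [1.0, 0.0, 0.0]
made_mistake = update(W, x, {"icing", "engine"})
print(made_mistake, predict(W, x, k=2))
```

In this toy run the shared ancestors `weather` and `equipment` are left untouched even though they were the initial (wrong) predictions, because each lies on the path to a correct label. A serial, one-label-at-a-time update would have demoted them.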