A hidden treasure? Evaluating and extending latent methods for link-based classification

Aaron Fleming, Luke K. McDowell, Zane Markel
Published in: Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), 2014-08-01
DOI: 10.1109/IRI.2014.7051954 (https://doi.org/10.1109/IRI.2014.7051954)
Citations: 4

Abstract

Many information tasks involve objects that are explicitly or implicitly connected in a network, such as webpages connected by hyperlinks or people linked by "friendships" in a social network. Research on link-based classification (LBC) has studied how to leverage these connections to improve classification accuracy. This research broadly falls into two groups. First, there are methods that use the original attributes and/or links of the network, via a link-aware supervised classifier or via a non-learning method based on label propagation or random walks. Second, there are recent methods that first compute a set of latent features or links that summarize the network, then use a (hopefully simpler) supervised classifier or label propagation method. Some work has claimed that the latent methods can improve accuracy, but has not adequately compared them with the best non-latent methods. In response, this paper provides the first substantial comparison between these two groups. We find that certain non-latent methods typically provide the best overall accuracy, but that latent methods can be competitive when a network is densely labeled or when the attributes are not very informative. Moreover, we introduce two novel combinations of these methods that in some cases substantially increase accuracy.
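
To make the first group of methods more concrete, the following is a minimal sketch of a generic label propagation scheme on a toy network. It is an illustration only, not the authors' method or any of the specific algorithms compared in the paper. The graph, the labels "A" and "B", and the names propagate_labels, EDGES, and KNOWN are all hypothetical and chosen for this example. Nodes with known labels stay fixed, and every other node repeatedly takes the average of its neighbors' class scores.

```python
def propagate_labels(edges, known, classes, iterations=50):
    """Generic label propagation sketch (hypothetical helper, not from the paper).

    edges:   list of undirected (u, v) pairs
    known:   dict mapping a node to its given class label
    classes: list of all class labels
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Labeled nodes start with a one-hot score; the rest start uniform.
    uniform = 1.0 / len(classes)
    scores = {}
    for node in neighbors:
        if node in known:
            scores[node] = {c: 1.0 if c == known[node] else 0.0 for c in classes}
        else:
            scores[node] = {c: uniform for c in classes}

    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in known:
                # Clamp: known labels are never overwritten.
                updated[node] = scores[node]
            else:
                # Average the neighbors' scores from the previous round.
                updated[node] = {
                    c: sum(scores[n][c] for n in nbrs) / len(nbrs) for c in classes
                }
        scores = updated

    # Predict the highest-scoring class for every node.
    return {node: max(classes, key=lambda c: scores[node][c]) for node in neighbors}


# Toy network (assumed for illustration): two triangles joined by the edge 2-3.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
KNOWN = {0: "A", 5: "B"}  # only two of the six nodes are labeled

predictions = propagate_labels(EDGES, KNOWN, classes=["A", "B"])
for node in sorted(predictions):
    print(node, predictions[node])
```

In this toy network, each unlabeled node ends up with the label of the labeled node in its own triangle. The sketch uses only links and known labels; it ignores node attributes, which the link-aware supervised classifiers mentioned in the abstract would also use.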