A Text Classification Framework with a Local Feature Ranking for Learning Social Networks

M. Makrehchi, M. Kamel
Seventh IEEE International Conference on Data Mining (ICDM 2007), published 2007-10-28
DOI: 10.1109/ICDM.2007.26 (https://doi.org/10.1109/ICDM.2007.26)
Citations: 6

Abstract

In this paper, a text classification framework with a feature ranking scheme is proposed to extract social structures from text data. It is assumed that only a small subset of the relations between individuals in a community is known. Under this assumption, social network extraction becomes a classification problem. The relation between two individuals is represented by merging their document vectors, and the known relations serve as labels for the training data. With this transformation, a text classifier such as Rocchio can be used to learn the unknown relations. We show that there is a link between the intrinsic sparsity of social networks and class imbalance. Furthermore, we show that feature ranking methods usually fail on problems with unbalanced data. To address this deficiency and re-balance the unbalanced social data, a local feature ranking method, called reverse discrimination, is proposed.
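The transformation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the names, the choice of element-wise summation as the "merge" rule, and the simplified Rocchio-style nearest-centroid classifier with cosine similarity are all assumptions made for illustration. The reverse discrimination ranking is not shown, since the abstract does not define it.

```python
import math
from itertools import combinations

# Hypothetical toy corpus: one short document per individual.
docs = {
    "alice": "graph mining social network clustering",
    "bob": "social network link prediction graph",
    "erin": "graph social network community detection",
    "carol": "protein folding molecular dynamics",
    "dave": "molecular simulation protein structure",
}

def term_counts(text):
    """Bag-of-words term counts for one document."""
    counts = {}
    for term in text.split():
        counts[term] = counts.get(term, 0.0) + 1.0
    return counts

def pair_vector(a, b):
    """Represent a pair by summing both term vectors (assumed merge rule),
    then scaling the result to unit length."""
    merged = term_counts(docs[a])
    for term, w in term_counts(docs[b]).items():
        merged[term] = merged.get(term, 0.0) + w
    norm = math.sqrt(sum(w * w for w in merged.values()))
    return {t: w / norm for t, w in merged.items()}

def centroid(vectors):
    """Average a list of sparse vectors."""
    total = {}
    for vec in vectors:
        for t, w in vec.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# The small subset of known relations acts as training labels
# (1 = related, 0 = not related). These labels are invented.
known = {
    ("alice", "bob"): 1,
    ("alice", "carol"): 0,
    ("bob", "dave"): 0,
}

# One prototype (centroid) per class, as in a nearest-centroid classifier.
prototypes = {
    label: centroid([pair_vector(a, b) for (a, b), y in known.items() if y == label])
    for label in (0, 1)
}

def predict(a, b):
    """Assign the pair to the class whose prototype is most similar."""
    vec = pair_vector(a, b)
    return max(prototypes, key=lambda label: cosine(vec, prototypes[label]))

all_pairs = list(combinations(sorted(docs), 2))
unknown = [p for p in all_pairs if p not in known]
predictions = {pair: predict(*pair) for pair in unknown}
```

Even in this toy setting, five individuals yield ten candidate pairs, and only one of the labeled pairs is positive. In a real, sparse social network the unrelated class would dominate far more strongly, which is the class imbalance the abstract refers to.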