Latent feature encoding using dyadic and relational data

Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management Pub Date : 2011-10-24 DOI:10.1145/2063576.2063926

S. Ando

Pages: 2201-2204

Abstract

Learning from dyadic and relational data is a fundamental problem for IR and KDD applications in the web and social media domains. Basic behaviors and characteristics of users and documents are typically described by a collection of dyads, i.e., pairs of entities. Discriminative features extracted from such data are essential in exploratory and discriminatory analyses. Relational properties of the entities reflect pairwise similarities and their collective community structure, which are also valuable for discriminative learning. A challenging aspect of learning from relational data in many domains is that the generative process of relational links appears noisy and is not well described by a stochastic model.

In this paper, we present a principled approach for learning discriminative features from heterogeneous sources of dyadic and relational data. We propose an information-theoretic framework called Latent Feature Encoding (LFE), which projects the entities and the links to a latent feature space in the analogy of -encoding. Projection is formalized as a maximization of the mutual information preserved in the latent features, regularized by the compression rate of the encoding. The regularization is emphasized over more probable links to account for the noisiness of the observations. An empirical evaluation of the proposed method using text and social media datasets is presented. Performance in supervised and unsupervised learning tasks is compared with that of conventional latent feature extraction methods.
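To make the phrase "maximization of the mutual information preserved in the latent features, regularized by the compression rate" more concrete, the following is a minimal sketch of an information-bottleneck-style objective over a dyad count matrix. It is not the LFE objective from the paper, whose exact form the abstract does not give. The function names, the soft encoder p(z|x), the trade-off weight `beta`, and the toy counts are all assumptions made for illustration, and the link-dependent weighting of the regularizer mentioned in the abstract is not modeled.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information (in nats) of a 2-D joint probability table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # row marginal, shape (n, 1)
    py = joint.sum(axis=0, keepdims=True)   # column marginal, shape (1, m)
    mask = joint > 0                        # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

def encoding_objective(counts, encoder, beta=0.1):
    """Hypothetical objective: preserved information minus beta * rate.

    counts  : (n_x, n_y) dyad co-occurrence counts, e.g. user-by-document
    encoder : (n_x, n_z) row-stochastic matrix, the assumed p(z | x)
    beta    : assumed weight on the compression-rate penalty
    """
    p_xy = np.asarray(counts, dtype=float)
    p_xy = p_xy / p_xy.sum()
    p_x = p_xy.sum(axis=1)
    p_zy = encoder.T @ p_xy                 # p(z, y) = sum_x p(z|x) p(x, y)
    p_xz = p_x[:, None] * encoder           # p(x, z) = p(x) p(z|x)
    preserved = mutual_information(p_zy)    # information kept about y
    rate = mutual_information(p_xz)         # compression rate I(X; Z)
    return preserved - beta * rate, preserved, rate

# Toy dyads: two entities, each linked to its own context only.
toy_counts = np.array([[4, 0], [0, 4]])
objective, preserved, rate = encoding_objective(toy_counts, np.eye(2), beta=0.1)
```

With the identity encoder, each entity keeps its own latent code, so all of the information about the second element of the dyad is preserved at the maximum rate. A uniform encoder would drive both terms to zero. An actual method would search over encoders between these extremes.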
