Multi-transfer:多视角、多源的学习迁移

IF 2.1 4区数学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Statistical Analysis and Data Mining Pub Date : 2014-01-01 DOI:10.1137/1.9781611972832.27

Ben Tan, Erheng Zhong, E. Xiang, Qiang Yang

{"title":"Multi-transfer:多视角、多源的学习迁移","authors":"Ben Tan, Erheng Zhong, E. Xiang, Qiang Yang","doi":"10.1137/1.9781611972832.27","DOIUrl":null,"url":null,"abstract":"Transfer learning, which aims to help the learning task in a target domain by leveraging knowledge from auxiliary domains, has been demonstrated to be effective in different applications, e.g., text mining, sentiment analysis, etc. In addition, in many real-world applications, auxiliary data are described from multiple perspectives and usually carried by multiple sources. For example, to help classify videos on Youtube, which include three views/perspectives: image, voice and subtitles, one may borrow data from Flickr, Last.FM and Google News. Although any single instance in these domains can only cover a part of the views available on Youtube, actually the piece of information carried by them may compensate with each other. In this paper, we define this transfer learning problem as Transfer Learning with Multiple Views and Multiple Sources. As different sources may have different probability distributions and different views may be compensate or inconsistent with each other, merging all data in a simplistic manner will not give optimal result. Thus, we propose a novel algorithm to leverage knowledge from different views and sources collaboratively, by letting different views from different sources complement each other through a co-training style framework, while revise the distribution differences in different domains. We conduct empirical studies on several real-world datasets to show that the proposed approach can improve the classification accuracy by up to 8% against different state-of-the-art baselines.","PeriodicalId":48684,"journal":{"name":"Statistical Analysis and Data Mining","volume":"34 1","pages":"282-293"},"PeriodicalIF":2.1000,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":"{\"title\":\"Multi-transfer: Transfer learning with multiple views and multiple sources\",\"authors\":\"Ben Tan, Erheng Zhong, E. Xiang, Qiang Yang\",\"doi\":\"10.1137/1.9781611972832.27\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Transfer learning, which aims to help the learning task in a target domain by leveraging knowledge from auxiliary domains, has been demonstrated to be effective in different applications, e.g., text mining, sentiment analysis, etc. In addition, in many real-world applications, auxiliary data are described from multiple perspectives and usually carried by multiple sources. For example, to help classify videos on Youtube, which include three views/perspectives: image, voice and subtitles, one may borrow data from Flickr, Last.FM and Google News. Although any single instance in these domains can only cover a part of the views available on Youtube, actually the piece of information carried by them may compensate with each other. In this paper, we define this transfer learning problem as Transfer Learning with Multiple Views and Multiple Sources. As different sources may have different probability distributions and different views may be compensate or inconsistent with each other, merging all data in a simplistic manner will not give optimal result. Thus, we propose a novel algorithm to leverage knowledge from different views and sources collaboratively, by letting different views from different sources complement each other through a co-training style framework, while revise the distribution differences in different domains. We conduct empirical studies on several real-world datasets to show that the proposed approach can improve the classification accuracy by up to 8% against different state-of-the-art baselines.\",\"PeriodicalId\":48684,\"journal\":{\"name\":\"Statistical Analysis and Data Mining\",\"volume\":\"34 1\",\"pages\":\"282-293\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2014-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"46\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Statistical Analysis and Data Mining\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1137/1.9781611972832.27\",\"RegionNum\":4,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistical Analysis and Data Mining","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1137/1.9781611972832.27","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 46

摘要

迁移学习旨在通过利用辅助领域的知识来帮助目标领域的学习任务，已被证明在不同的应用中是有效的，例如文本挖掘，情感分析等。此外，在许多实际应用程序中，辅助数据是从多个角度描述的，通常由多个源携带。例如，为了帮助对Youtube上的视频进行分类，它包括三种视图/视角:图像、声音和字幕，人们可以从Flickr、Last中借用数据。FM和谷歌新闻。尽管这些域中的任何单个实例只能覆盖Youtube上可用视图的一部分，但实际上它们所携带的信息可以相互补偿。本文将这种迁移学习问题定义为多视图多源迁移学习。由于不同的数据源可能具有不同的概率分布，不同的视图可能相互补偿或不一致，以简单的方式合并所有数据将无法获得最佳结果。因此，我们提出了一种新的算法，通过共同训练风格的框架，让来自不同来源的不同观点相互补充，同时修正不同领域的分布差异，从而协同利用来自不同观点和来源的知识。我们对几个真实世界的数据集进行了实证研究，表明所提出的方法可以在不同的最先进的基线下将分类精度提高8%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Multi-transfer: Transfer learning with multiple views and multiple sources

Transfer learning, which aims to help the learning task in a target domain by leveraging knowledge from auxiliary domains, has been demonstrated to be effective in different applications, e.g., text mining, sentiment analysis, etc. In addition, in many real-world applications, auxiliary data are described from multiple perspectives and usually carried by multiple sources. For example, to help classify videos on Youtube, which include three views/perspectives: image, voice and subtitles, one may borrow data from Flickr, Last.FM and Google News. Although any single instance in these domains can only cover a part of the views available on Youtube, actually the piece of information carried by them may compensate with each other. In this paper, we define this transfer learning problem as Transfer Learning with Multiple Views and Multiple Sources. As different sources may have different probability distributions and different views may be compensate or inconsistent with each other, merging all data in a simplistic manner will not give optimal result. Thus, we propose a novel algorithm to leverage knowledge from different views and sources collaboratively, by letting different views from different sources complement each other through a co-training style framework, while revise the distribution differences in different domains. We conduct empirical studies on several real-world datasets to show that the proposed approach can improve the classification accuracy by up to 8% against different state-of-the-art baselines.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Statistical Analysis and Data Mining COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCEC-COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

CiteScore

3.20

自引率

7.70%

发文量

期刊介绍： Statistical Analysis and Data Mining addresses the broad area of data analysis, including statistical approaches, machine learning, data mining, and applications. Topics include statistical and computational approaches for analyzing massive and complex datasets, novel statistical and/or machine learning methods and theory, and state-of-the-art applications with high impact. Of special interest are articles that describe innovative analytical techniques, and discuss their application to real problems, in such a way that they are accessible and beneficial to domain experts across science, engineering, and commerce. The focus of the journal is on papers which satisfy one or more of the following criteria: Solve data analysis problems associated with massive, complex datasets Develop innovative statistical approaches, machine learning algorithms, or methods integrating ideas across disciplines, e.g., statistics, computer science, electrical engineering, operation research. Formulate and solve high-impact real-world problems which challenge existing paradigms via new statistical and/or computational models Provide survey to prominent research topics.