iClass: Combining Multiple Multi-label Classification with Expert Knowledge

2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). Pub Date: 2015-12-01. DOI: 10.1109/ICMLA.2015.179

Marmar Moussa, Marc Maynard



Abstract




The Roper Center is one of the largest public opinion data archives in the world. It collects data sets of polled survey questions from numerous media outlets and organizations. The volume of data complicates searching over survey questions and poses challenges when analyzing search trends. The Roper Center's question-level retrieval applications have relied on human metadata experts to assign topics to content. This has been insufficient to reach the required levels of consistency and provides an inadequate base for creating an advanced search experience. The objective of this work is to combine the human expert teams' knowledge of the nature of the survey questions, and of the concepts and topics these questions express, with the ability of multi-label classifiers to learn this knowledge and apply it in an automated, fast and accurate classification mechanism. This approach significantly cuts down question analysis and tagging time and provides enhanced consistency and scalability for topic descriptions. At the same time, creating an ensemble of machine learning classifiers combined with expert knowledge is expected to enhance the search experience and provide much-needed analytic capabilities for the survey question databases. In our design, we use classifications from several machine learning algorithms, such as SVM and Decision Trees, combined with expert knowledge in the form of handcrafted rules, data analysis and result review. We consolidate the different techniques into a Multipath Classifier with a Confidence point system that decides upon the relevance of topics assigned to survey questions with nearly perfect accuracy.
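The abstract does not say how the Confidence point system scores topics, so the following is only a minimal sketch of one plausible reading: each classification path proposes topics for a question, each proposal earns that path's points, and a topic is kept when its total reaches a threshold. The function names, the point weights, the threshold and the keyword-based stand-ins for the trained classifiers are all assumptions made for illustration, not the authors' design.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

# A "path" maps a survey question to the set of topics it proposes.
Path = Callable[[str], Set[str]]


def keyword_path(keywords: Dict[str, List[str]]) -> Path:
    """Build a stand-in path that proposes a topic when any of its keywords occurs."""
    def propose(question: str) -> Set[str]:
        text = question.lower()
        return {topic for topic, words in keywords.items()
                if any(word in text for word in words)}
    return propose


def assign_topics(question: str,
                  paths: List[Tuple[Path, int]],
                  threshold: int) -> Dict[str, int]:
    """Sum each path's points per proposed topic; keep topics at or above threshold."""
    points: Dict[str, int] = defaultdict(int)
    for path, weight in paths:
        for topic in path(question):
            points[topic] += weight
    return {topic: score for topic, score in points.items() if score >= threshold}


# Hypothetical stand-ins: in a real system two of these would be trained
# classifiers (e.g. an SVM and a decision tree) and one a set of expert rules.
svm_like = keyword_path({"economy": ["unemployment", "inflation"],
                         "elections": ["vote", "candidate"]})
tree_like = keyword_path({"economy": ["jobs", "unemployment"],
                          "health": ["insurance"]})
expert_rules = keyword_path({"elections": ["president"],
                             "economy": ["unemployment"]})

# Assumed point weights: expert rules count most, the tree-like path least.
paths = [(svm_like, 2), (tree_like, 1), (expert_rules, 3)]
question = "Do you approve of how the president is handling unemployment?"
result = assign_topics(question, paths, threshold=3)
print(result)
```

Under these assumed weights, a topic proposed only by the expert rules still passes the threshold, while a topic proposed only by one of the learned paths does not. Raising the threshold requires agreement between more paths, which trades recall for precision.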
