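Joint machine learning and human learning design with sequential active learning and outlier detection for linear regression problems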

Xiaohua Li, Jian Zheng
Published in: 2016 Annual Conference on Information Science and Systems (CISS)
Publication date: 2016-03-16
DOI: 10.1109/CISS.2016.7460537 (https://doi.org/10.1109/CISS.2016.7460537)
Citations: 4

Abstract

In this paper, we propose a joint machine learning and human learning design approach that makes the training data labeling task in linear regression problems more efficient and more robust to noise, modeling mismatch, and human labeling errors. Starting from a sequential active learning scheme that relies on human learning to enlarge the training data set, we integrate it with sparse outlier detection algorithms to mitigate the inevitable human errors made during training data labeling. First, we assume the human errors are sparse and formulate outlier detection as a sparse optimization problem within the sequential active learning procedure. Then, for non-sparse human errors, we use item response theory (IRT) to model the distribution of human errors and select appropriate data to reconstruct a training data set whose human errors are sparse. Simulations are conducted to verify the performance of the proposed approach.
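The abstract does not state the optimization problem itself, so the following is only a minimal sketch of one common way to pose outlier detection as a sparse optimization in linear regression, not the authors' algorithm. It assumes a scalar feature and the model y_i = a*x_i + b + o_i + noise, where o_i is a per-label error term that is zero for most samples, and it minimizes sum((y_i - a*x_i - b - o_i)^2) + lam * sum(|o_i|) by alternating a least-squares fit with soft-thresholding of the residuals. The names `robust_fit`, `fit_line`, `soft_threshold`, and `lam`, and the simulated data, are hypothetical and chosen for illustration.

```python
import random


def soft_threshold(r, t):
    """Shrink r toward zero by t (proximal operator of the L1 penalty)."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def robust_fit(xs, ys, lam=1.0, iters=50):
    """Alternately minimize sum((y - a*x - b - o)^2) + lam * sum(|o|).

    Assumed formulation for illustration; o[i] != 0 marks label i as a
    suspected outlier (for example, a human labeling error).
    """
    o = [0.0] * len(xs)
    for _ in range(iters):
        # Step 1: refit the line after subtracting the current outlier estimates.
        a, b = fit_line(xs, [y - oi for y, oi in zip(ys, o)])
        # Step 2: for fixed (a, b), each o[i] has the closed-form minimizer
        # soft_threshold(residual_i, lam / 2).
        o = [soft_threshold(y - (a * x + b), lam / 2) for x, y in zip(xs, ys)]
    return a, b, o


# Simulated labels: true line y = 2x + 1 with small noise and two gross errors.
random.seed(0)
xs = [i / 10 for i in range(30)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.05) for x in xs]
for i in (5, 20):  # hypothetical positions of the mislabeled samples
    ys[i] += 5.0

a, b, o = robust_fit(xs, ys, lam=1.0)
flagged = [i for i, oi in enumerate(o) if oi != 0.0]
print("slope:", a, "intercept:", b, "flagged labels:", flagged)
```

In a sequential active learning loop, a fit of this kind could be rerun each time newly labeled samples arrive, with the flagged labels discarded or sent back for relabeling. The penalty weight `lam` controls how large a residual must be before a label is treated as an outlier, and would need tuning to the noise level in practice.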