Authors: Xiaohua Li, Jian Zheng
Venue: 2016 Annual Conference on Information Science and Systems (CISS)
Published: 2016-03-16
DOI: https://doi.org/10.1109/CISS.2016.7460537
Joint machine learning and human learning design with sequential active learning and outlier detection for linear regression problems
In this paper, we propose a joint machine learning and human learning design approach that makes the training data labeling task in linear regression problems more efficient and more robust to noise, modeling mismatch, and human labeling errors. We consider a sequential active learning scheme that relies on human learning to enlarge the training data set, and we integrate it with sparse outlier detection algorithms to mitigate the inevitable human errors made during training data labeling. First, we assume sparse human errors and formulate outlier detection as a sparse optimization problem within the sequential active learning procedure. Then, for non-sparse human errors, we use item response theory (IRT) to model the distribution of human errors and select appropriate data to reconstruct a training data set with sparse human errors. Simulations are conducted to verify the performance of the proposed approach.
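The abstract does not state the exact optimization problem, so the following is only a sketch of one common way to pose sparse outlier detection in linear regression, not the authors' algorithm. It assumes labels of the form y = Xw + o + noise, where o is a sparse vector of labeling errors, and minimizes ||y - Xw - o||^2 + lam * ||o||_1 by alternating between a least-squares step and a soft-thresholding step. The names `robust_linear_fit`, `soft_threshold`, `lam`, and `n_iter`, as well as the synthetic data, are illustrative assumptions.

```python
import numpy as np


def soft_threshold(r, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(r) * np.maximum(np.abs(r) - t, 0.0)


def robust_linear_fit(X, y, lam=1.0, n_iter=200):
    """Alternating minimization of ||y - X w - o||^2 + lam * ||o||_1.

    w is the regression weight vector. o holds one outlier term per label
    and stays at zero for labels whose residual is below lam / 2.
    (Hypothetical helper; an assumed formulation, not taken from the paper.)
    """
    o = np.zeros_like(y, dtype=float)
    for _ in range(n_iter):
        # With o fixed, w is an ordinary least-squares fit to the corrected labels.
        w, *_ = np.linalg.lstsq(X, y - o, rcond=None)
        # With w fixed, the optimal o is the soft-thresholded residual.
        o = soft_threshold(y - X @ w, lam / 2.0)
    return w, o


# Synthetic example: 60 labels, 3 of them corrupted by a large labeling error.
rng = np.random.default_rng(0)
n, d = 60, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.05 * rng.normal(size=n)
bad = [3, 17, 42]
y[bad] += 10.0

w_hat, o_hat = robust_linear_fit(X, y, lam=1.0)
flagged = np.flatnonzero(np.abs(o_hat) > 0)  # indices with a nonzero outlier term
print("estimated weights:", w_hat)
print("flagged labels:", flagged)
```

In a sequential active learning loop, a fit like this could be rerun each time a newly labeled sample is appended to `X` and `y`, so that flagged labels are discounted before the next query is chosen. How the paper actually selects queries and sets the sparsity penalty is not described in the abstract.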