Evaluating machine learning algorithms for applications with humans in the loop

A. K. Gopalakrishna, T. Ozcelebi, J. Lukkien, A. Liotta
2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC)
Published: 2017-05-16
DOI: 10.1109/ICNSC.2017.8000136 (https://doi.org/10.1109/ICNSC.2017.8000136)
Citations: 5

Abstract

Applications that employ data classification and involve human factors such as perception, for example smart lighting, lead to non-deterministic input-output relationships in which more than one output may be acceptable for a given input. For these so-called non-deterministic multiple output classification (nDMOC) problems, the relationship between input and output may change over time, making it difficult for machine learning (ML) algorithms in a batch setting to make predictions for a given context. In this paper, we describe the nature of nDMOC problems and discuss the Relevance Score (RS), a performance metric suited to this context. RS determines the extent to which a predicted output is relevant to the user's context and behaviors, taking into account the inconsistencies that come with human (perception) factors. We tailor the RS metric so that it can be used to evaluate ML algorithms in an online setting at run-time. We assess the performance of a number of ML algorithms using a smart lighting dataset with non-deterministic one-to-many input-output relationships. The results indicate that RS, rather than classification accuracy (CA), is suitable for analyzing the performance of conventional ML algorithms applied to nDMOC problems. Instance-based online ML gives the best RS performance. An interesting finding is that the RS keeps increasing as the number of samples increases, even after the CA performance converges.
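
The gap between CA and a relevance-aware metric in an online setting can be illustrated with a small sketch. The abstract does not give the RS formula, so the `soft` score below is a hypothetical stand-in and not the paper's RS: it credits a prediction in proportion to how often users chose that output for the same context so far. The learner `MajorityPerContext`, the function `evaluate_online`, and the toy stream are likewise assumptions made for illustration.

```python
from collections import Counter, defaultdict


class MajorityPerContext:
    """Toy memory-based online learner (an assumption, not from the paper)."""

    def __init__(self):
        self.memory = defaultdict(Counter)  # context -> counts of chosen outputs

    def predict(self, context):
        seen = self.memory.get(context)
        # Predict the most frequently chosen output, or None for an unseen context.
        return seen.most_common(1)[0][0] if seen else None

    def update(self, context, chosen):
        self.memory[context][chosen] += 1


def evaluate_online(stream, learner):
    """Test-then-train evaluation: predict first, then learn from the sample."""
    history = defaultdict(Counter)  # what users actually chose, per context
    hits, soft_total, n = 0, 0.0, 0
    for context, chosen in stream:
        predicted = learner.predict(context)
        hits += int(predicted == chosen)  # strict CA: exact match only
        history[context][chosen] += 1
        counts = history[context]
        # Hypothetical soft score (NOT the paper's RS): frequency of the
        # predicted output relative to the most frequent output so far.
        soft_total += counts[predicted] / max(counts.values())
        learner.update(context, chosen)
        n += 1
    return hits / n, soft_total / n


# Toy stream: the same context, for which both "warm" and "bright" are chosen.
stream = [
    ("evening", "warm"), ("evening", "warm"), ("evening", "bright"),
    ("evening", "warm"), ("evening", "bright"), ("evening", "warm"),
]
ca, soft = evaluate_online(stream, MajorityPerContext())
print(f"CA={ca:.2f}  soft score={soft:.2f}")
```

On this stream the strict metric penalizes every "bright" sample even though the predicted "warm" is an output users regularly accept for that context, while the soft score does not. This mirrors, in a simplified form, the abstract's argument that CA alone can understate performance when inputs map to several acceptable outputs.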