Evaluating machine learning algorithms for applications with humans in the loop

2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC). Pub Date: 2017-05-16. DOI: 10.1109/ICNSC.2017.8000136

A. K. Gopalakrishna, T. Ozcelebi, J. Lukkien, A. Liotta


Citations: 5

Abstract

Applications that employ data classification and involve human factors such as perception, for example smart lighting, give rise to non-deterministic input-output relationships in which more than one output may be acceptable for a given input. In these so-called non-deterministic multiple output classification (nDMOC) problems, the relationship between input and output may change over time, making it difficult for machine learning (ML) algorithms in a batch setting to make predictions for a given context. In this paper, we describe the nature of nDMOC problems and discuss the Relevance Score (RS), a performance metric suited to this context. RS measures the extent to which a predicted output is relevant to the user's context and behaviors, taking into account the inconsistencies that come with human (perception) factors. We tailor the RS metric so that it can be used to evaluate ML algorithms in an online setting at run-time. We assess the performance of a number of ML algorithms using a smart lighting dataset with non-deterministic one-to-many input-output relationships. The results indicate that RS, rather than classification accuracy (CA), is suitable for analyzing the performance of conventional ML algorithms applied to nDMOC problems. Instance-based online ML gives the best RS performance. An interesting finding is that RS keeps increasing as the number of samples increases, even after CA performance converges.
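To illustrate why classification accuracy can understate performance when several outputs are acceptable for the same input, the sketch below compares CA with a simplified relevance-style score. The abstract does not give the RS formula, so `toy_relevance` is a hypothetical stand-in and not the paper's metric: it assumes a prediction's relevance is the frequency with which users chose that output in the same context, relative to the most frequently chosen output. The context name, output labels, and counts are invented for illustration.

```python
from collections import Counter


def classification_accuracy(predictions, observed):
    """Fraction of predictions that exactly match the single observed output."""
    hits = sum(p == o for p, o in zip(predictions, observed))
    return hits / len(observed)


def toy_relevance(predictions, contexts, history):
    """Hypothetical relevance-style score (an assumption, NOT the paper's RS).

    Each prediction earns credit equal to how often users chose that output
    in the same context, divided by the count of the most popular output.
    """
    total = 0.0
    for p, c in zip(predictions, contexts):
        counts = history[c]
        total += counts.get(p, 0) / max(counts.values())
    return total / len(predictions)


# Invented history: in the "evening" context users chose "warm" 6 times
# and "cool" 4 times, so both outputs are acceptable to some degree.
history = {"evening": Counter({"warm": 6, "cool": 4})}
contexts = ["evening"] * 4
observed = ["warm", "cool", "warm", "cool"]

always_warm = ["warm"] * 4
always_cool = ["cool"] * 4
always_off = ["off"] * 4  # an output users never chose in this context

ca_warm = classification_accuracy(always_warm, observed)
rel_warm = toy_relevance(always_warm, contexts, history)
rel_cool = toy_relevance(always_cool, contexts, history)
rel_off = toy_relevance(always_off, contexts, history)

print("CA, always warm:", ca_warm)
print("relevance, always warm:", rel_warm)
print("relevance, always cool:", rel_cool)
print("relevance, always off:", rel_off)
```

In this toy setting, a predictor that always outputs the most common choice is penalized by CA whenever the user happens to pick the other acceptable output, while the relevance-style score still separates an acceptable prediction from one the user never chose.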
