Combinatorial Testing Metrics for Machine Learning

2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). Pub Date: 2021-04-01. DOI: 10.1109/ICSTW52544.2021.00025

Erin Lanus, Laura J. Freeman, D. R. Kuhn, R. Kacker

Citations: 15

Abstract

This paper defines a set difference metric for comparing machine learning (ML) datasets and proposes that the difference between datasets be expressed as a function of combinatorial coverage. We illustrate its utility for evaluating and predicting the performance of ML models. Identifying and measuring differences between datasets is of significant value for ML problems, where model accuracy depends heavily on how well the training data represent the data encountered in application. The method is illustrated for transfer learning without retraining, that is, the problem of predicting the performance of a model trained on one dataset and applied to another.
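
The abstract does not spell out the metric, so the following is only a minimal sketch of how a set difference over combinatorial coverage could be computed. It assumes categorical (already discretized) features, treats a t-way interaction as a set of t (column index, value) pairs, and scores the fraction of the target dataset's t-way interactions that never occur in the source dataset. The function names `t_way_interactions` and `set_difference_coverage` and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def t_way_interactions(rows, t):
    """Collect every t-way interaction that appears in `rows`.

    An interaction is a tuple of t (column_index, value) pairs
    (an assumed representation, chosen for illustration).
    """
    seen = set()
    for row in rows:
        indexed = tuple(enumerate(row))
        for combo in combinations(indexed, t):
            seen.add(combo)
    return seen


def set_difference_coverage(source, target, t):
    """Fraction of t-way interactions in `target` that are absent from `source`.

    0.0 means every interaction in the target was seen in the source;
    1.0 means none were. The measure is directional, not symmetric.
    """
    src = t_way_interactions(source, t)
    tgt = t_way_interactions(target, t)
    if not tgt:
        return 0.0
    return len(tgt - src) / len(tgt)


# Toy data: columns are (color, size, shape).
train = [("red", "small", "round"), ("blue", "large", "round")]
deploy = [("red", "small", "round"), ("red", "large", "square")]

print(set_difference_coverage(train, deploy, t=1))
print(set_difference_coverage(train, deploy, t=2))
```

Under this sketch, a larger value for the (training, deployment) pair means more of the deployment data's feature-value combinations were never seen in training, which is the kind of gap the abstract links to degraded performance when a model is applied to a new dataset without retraining.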

