Metric Learning and Adaptive Boundary for Out-of-Domain Detection

International Conference on Applications of Natural Language to Data Bases Pub Date : 2022-04-22 DOI:10.48550/arXiv.2204.10849

Petr Lorenc, Tommaso Gargiani, Jan Pichl, Jakub Konrád, Petro Marek, Ondrej Kobza, J. Sedivý

{"title":"Metric Learning and Adaptive Boundary for Out-of-Domain Detection","authors":"Petr Lorenc, Tommaso Gargiani, Jan Pichl, Jakub Konrád, Petro Marek, Ondrej Kobza, J. Sedivý","doi":"10.48550/arXiv.2204.10849","DOIUrl":null,"url":null,"abstract":"Conversational agents are usually designed for closed-world environments. Unfortunately, users can behave unexpectedly. Based on the open-world environment, we often encounter the situation that the training and test data are sampled from different distributions. Then, data from different distributions are called out-of-domain (OOD). A robust conversational agent needs to react to these OOD utterances adequately. Thus, the importance of robust OOD detection is emphasized. Unfortunately, collecting OOD data is a challenging task. We have designed an OOD detection algorithm independent of OOD data that outperforms a wide range of current state-of-the-art algorithms on publicly available datasets. Our algorithm is based on a simple but efficient approach of combining metric learning with adaptive decision boundary. Furthermore, compared to other algorithms, we have found that our proposed algorithm has significantly improved OOD performance in a scenario with a lower number of classes while preserving the accuracy for in-domain (IND) classes.","PeriodicalId":136374,"journal":{"name":"International Conference on Applications of Natural Language to Data Bases","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Applications of Natural Language to Data Bases","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2204.10849","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Conversational agents are usually designed for closed-world environments. Unfortunately, users can behave unexpectedly. Based on the open-world environment, we often encounter the situation that the training and test data are sampled from different distributions. Then, data from different distributions are called out-of-domain (OOD). A robust conversational agent needs to react to these OOD utterances adequately. Thus, the importance of robust OOD detection is emphasized. Unfortunately, collecting OOD data is a challenging task. We have designed an OOD detection algorithm independent of OOD data that outperforms a wide range of current state-of-the-art algorithms on publicly available datasets. Our algorithm is based on a simple but efficient approach of combining metric learning with adaptive decision boundary. Furthermore, compared to other algorithms, we have found that our proposed algorithm has significantly improved OOD performance in a scenario with a lower number of classes while preserving the accuracy for in-domain (IND) classes.

查看原文本刊更多论文

域外检测的度量学习和自适应边界

会话代理通常是为封闭环境设计的。不幸的是，用户的行为可能出乎意料。基于开放世界环境，我们经常会遇到训练数据和测试数据来自不同分布的情况。然后，来自不同分布的数据被称为域外(OOD)。一个健壮的会话代理需要对这些OOD话语做出充分的反应。因此，强调了鲁棒OOD检测的重要性。不幸的是，收集OOD数据是一项具有挑战性的任务。我们设计了一种独立于OOD数据的OOD检测算法，该算法在公开可用的数据集上优于当前各种最先进的算法。我们的算法基于一种简单而有效的方法，将度量学习与自适应决策边界相结合。此外，与其他算法相比，我们发现我们提出的算法在类数量较少的场景下显著提高了OOD性能，同时保持了域内(IND)类的准确性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

International Conference on Applications of Natural Language to Data Bases

自引率

0.00%

发文量