Improving Prediction Backward-Compatibility in NLP Model Upgrade with Gated Fusion

Yi-An Lai, Elman Mansimov, Yuqing Xie, Yan Zhang
{"title":"利用门控融合提高NLP模型升级中预测的后向兼容性","authors":"Yi-An Lai, Elman Mansimov, Yuqing Xie, Yan Zhang","doi":"10.48550/arXiv.2302.02080","DOIUrl":null,"url":null,"abstract":"When upgrading neural models to a newer version, new errors that were not encountered in the legacy version can be introduced, known as regression errors. This inconsistent behavior during model upgrade often outweighs the benefits of accuracy gain and hinders the adoption of new models. To mitigate regression errors from model upgrade, distillation and ensemble have proven to be viable solutions without significant compromise in performance. Despite the progress, these approaches attained an incremental reduction in regression which is still far from achieving backward-compatible model upgrade. In this work, we propose a novel method, Gated Fusion, that promotes backward compatibility via learning to mix predictions between old and new models. Empirical results on two distinct model upgrade scenarios show that our method reduces the number of regression errors by 62% on average, outperforming the strongest baseline by an average of 25%.","PeriodicalId":73025,"journal":{"name":"Findings (Sydney (N.S.W.)","volume":"1 1","pages":"980-992"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Improving Prediction Backward-Compatiblility in NLP Model Upgrade with Gated Fusion\",\"authors\":\"Yi-An Lai, Elman Mansimov, Yuqing Xie, Yan Zhang\",\"doi\":\"10.48550/arXiv.2302.02080\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When upgrading neural models to a newer version, new errors that were not encountered in the legacy version can be introduced, known as regression errors. This inconsistent behavior during model upgrade often outweighs the benefits of accuracy gain and hinders the adoption of new models. To mitigate regression errors from model upgrade, distillation and ensemble have proven to be viable solutions without significant compromise in performance. Despite the progress, these approaches attained an incremental reduction in regression which is still far from achieving backward-compatible model upgrade. In this work, we propose a novel method, Gated Fusion, that promotes backward compatibility via learning to mix predictions between old and new models. 
Empirical results on two distinct model upgrade scenarios show that our method reduces the number of regression errors by 62% on average, outperforming the strongest baseline by an average of 25%.\",\"PeriodicalId\":73025,\"journal\":{\"name\":\"Findings (Sydney (N.S.W.)\",\"volume\":\"1 1\",\"pages\":\"980-992\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-02-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Findings (Sydney (N.S.W.)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2302.02080\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Findings (Sydney (N.S.W.)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2302.02080","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

When upgrading neural models to a newer version, new errors that were not encountered in the legacy version can be introduced, known as regression errors. This inconsistent behavior during model upgrade often outweighs the benefits of accuracy gain and hinders the adoption of new models. To mitigate regression errors from model upgrade, distillation and ensemble have proven to be viable solutions without significant compromise in performance. Despite the progress, these approaches attained an incremental reduction in regression which is still far from achieving backward-compatible model upgrade. In this work, we propose a novel method, Gated Fusion, that promotes backward compatibility via learning to mix predictions between old and new models. Empirical results on two distinct model upgrade scenarios show that our method reduces the number of regression errors by 62% on average, outperforming the strongest baseline by an average of 25%.
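The abstract describes Gated Fusion only at a high level: a learned gate mixes the predictions of the old and new models. Below is a minimal PyTorch sketch of that idea for a classification setting. The `encode`/`classify` methods, the sigmoid gate conditioned on the new model's features, and freezing the old model are illustrative assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Mix a frozen legacy model's predictions with a new model's via a learned gate.

    A sketch of the idea in the abstract; the gate design below is an assumption.
    """

    def __init__(self, old_model: nn.Module, new_model: nn.Module, hidden_dim: int):
        super().__init__()
        self.old_model = old_model
        self.new_model = new_model
        # Freeze the legacy model so only the new model and the gate are trained.
        for p in self.old_model.parameters():
            p.requires_grad = False
        # Scalar gate in [0, 1], here conditioned on the new model's features.
        self.gate = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            old_logits = self.old_model(inputs)           # (batch, num_classes)
        # `encode` / `classify` are hypothetical methods of the new model.
        features = self.new_model.encode(inputs)          # (batch, hidden_dim)
        new_logits = self.new_model.classify(features)    # (batch, num_classes)
        g = self.gate(features)                           # (batch, 1)
        # Convex combination: g -> 1 defers to the old model's prediction,
        # which is what suppresses regressions on previously correct inputs.
        return g * old_logits + (1.0 - g) * new_logits
```

Trained end-to-end on the task loss, such a gate can learn to fall back to the legacy model on inputs where the new model would otherwise flip a previously correct prediction, which is the backward-compatible behavior the abstract targets.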