Metamorphic Robustness Testing of Google Translate
Dickson T. S. Lee, Z. Zhou, T. H. Tse
Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, 2020-06-27
DOI: 10.1145/3387940.3391484 (https://doi.org/10.1145/3387940.3391484)
Citations: 6
Abstract
Current research on the testing of machine translation software mainly focuses on functional correctness for valid, well-formed inputs. By contrast, robustness testing, which involves the ability of the software to handle erroneous or unanticipated inputs, is often overlooked. In this paper, we propose to address this important shortcoming. Using the metamorphic robustness testing approach, we compare the translations of original inputs with those of follow-up inputs having different categories of minor typos. Our empirical results reveal a lack of robustness in Google Translate, thereby opening a new research direction for the quality assurance of neural machine translators.
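
The comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the three typo categories (swap, drop, repeat), the character-level similarity measure, the 0.8 threshold, and all function names are hypothetical, and the translator is passed in as a plain callable so that no particular translation API is assumed.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List


def swap_adjacent(word: str, i: int) -> str:
    """Typo: transpose the characters at positions i and i + 1."""
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def drop_char(word: str, i: int) -> str:
    """Typo: omit the character at position i."""
    return word[:i] + word[i + 1:]


def repeat_char(word: str, i: int) -> str:
    """Typo: type the character at position i twice."""
    return word[:i + 1] + word[i] + word[i + 1:]


# Hypothetical typo categories; the paper's own categories may differ.
TYPO_CATEGORIES: Dict[str, Callable[[str, int], str]] = {
    "swap": swap_adjacent,
    "drop": drop_char,
    "repeat": repeat_char,
}


def follow_up_inputs(sentence: str, word_index: int, char_index: int) -> Dict[str, str]:
    """Build one follow-up sentence per typo category by mutating a single word."""
    words = sentence.split()
    result = {}
    for name, make_typo in TYPO_CATEGORIES.items():
        mutated = list(words)
        mutated[word_index] = make_typo(words[word_index], char_index)
        result[name] = " ".join(mutated)
    return result


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in metric."""
    return SequenceMatcher(None, a, b).ratio()


def check_robustness(
    translate: Callable[[str], str],
    sentence: str,
    word_index: int,
    char_index: int,
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Return the typo categories whose translation drifts from the original's.

    Metamorphic relation (assumed): a minor typo in the source sentence
    should leave the translation largely unchanged.
    """
    baseline = translate(sentence)
    violations = []
    for name, follow_up in follow_up_inputs(sentence, word_index, char_index).items():
        if similarity(baseline, translate(follow_up)) < threshold:
            violations.append(name)
    return violations
```

In use, `translate` would be any function that wraps the system under test and maps a source sentence to its translation. An empty list from `check_robustness` means the assumed relation held for every typo category; a non-empty list names the categories that changed the output beyond the threshold.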