Semantic Structure Invariance-Based Metamorphic Testing for Machine Translation Systems
Chang-ai Sun; Jian Mu; Mingjun Xiao; Huai Liu; Pinjia He
IEEE Transactions on Reliability, vol. 74, no. 3, pp. 3251-3265. Published 2025-01-07. DOI: 10.1109/TR.2024.3521029
https://ieeexplore.ieee.org/document/10830582/
Impact factor: 5.7 (JCR Q1, Computer Science, Hardware & Architecture)
Citations: 0
Abstract
In recent years, deep neural networks have been applied in machine translation systems, resulting in neural machine translation (NMT) models that improve translation quality significantly. However, because deep neural networks are brittle, machine translation systems can return erroneous translations that lead to misunderstandings or even serious losses. Various testing techniques have been proposed to detect translation errors. Metamorphic testing, a widely used technique, mainly relies on the text or syntactic structure of translations while ignoring the meaning of sentences (i.e., semantic information). Compared with textual and syntactic information, the semantic information of sentences is more stable when dealing with languages that have rich vocabulary and flexible word order. Motivated by this observation, we propose semantic structure invariance-based metamorphic testing (SSIMT) for machine translation systems. The key insight is that contextually similar sentences should typically have translations with similar semantic structures. SSIMT was evaluated on two widely used machine translation systems, Microsoft Bing Translator and Google Translate, with 600 seed sentences crawled from well-known news websites covering six different corpus topics. The experimental results show that SSIMT is able to find thousands of erroneous translations in both translation systems with high accuracy (over 70%). The translation errors reported by SSIMT cover a wide variety of common error types.
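The key insight above can be illustrated with a minimal sketch. This is not the authors' SSIMT implementation: the abstract does not say how semantic structures are extracted or compared, so the code assumes a caller-supplied `translate` function and `extract_structure` function (both hypothetical), represents a semantic structure as a plain sequence of role labels, and uses a `difflib` similarity ratio with an arbitrary threshold as a stand-in distance.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Sequence, Tuple


def structure_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Return 0.0 for identical label sequences, rising toward 1.0 as they diverge."""
    return 1.0 - SequenceMatcher(None, list(a), list(b)).ratio()


def find_suspicious(
    seed: str,
    followups: List[str],
    translate: Callable[[str], str],
    extract_structure: Callable[[str], Sequence[str]],
    threshold: float = 0.3,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, float]]:
    """Flag follow-up sentences whose translation structure drifts from the seed's."""
    seed_structure = extract_structure(translate(seed))
    flagged = []
    for sentence in followups:
        structure = extract_structure(translate(sentence))
        distance = structure_distance(seed_structure, structure)
        if distance > threshold:
            flagged.append((sentence, distance))
    return flagged


# Toy stand-ins (hypothetical): a real setup would call a translation system
# and a semantic parser here. Translations are opaque placeholder strings.
TOY_TRANSLATIONS = {
    "The boy ate an apple.": "t_apple",
    "The boy ate a pear.": "t_pear",
    "The boy ate a sandwich.": "t_sandwich",
}
TOY_STRUCTURES = {
    "t_apple": ["ARG0", "PRED", "ARG1"],
    "t_pear": ["ARG0", "PRED", "ARG1"],
    "t_sandwich": ["PRED"],  # simulated error: agent and patient were lost
}

suspicious = find_suspicious(
    seed="The boy ate an apple.",
    followups=["The boy ate a pear.", "The boy ate a sandwich."],
    translate=TOY_TRANSLATIONS.__getitem__,
    extract_structure=TOY_STRUCTURES.__getitem__,
)
print(suspicious)
```

In this toy run only the third sentence is flagged, because its simulated translation keeps the predicate but drops both arguments. Passing `translate` and `extract_structure` as parameters keeps the check independent of any particular translation service or parser.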
About the journal:
IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.