Distinguishing the language of ciphered texts
Boris Ryabko, Andrey Gruzin, V. Monarev
2008 IEEE Region 8 International Conference on Computational Technologies in Electrical and Electronics Engineering, 2008-07-21
DOI: 10.1109/SIBIRCON.2008.4602584
Abstract
We address the problem of identifying the language of an encrypted file. We consider texts in Russian, English, German and French encrypted with block ciphers in ECB (electronic codebook) mode, with block lengths of 64, 96 and 128 bits. We show that the language of the encrypted text can be determined effectively. For instance, Russian and English texts can be distinguished with an error rate of about 5% when the block length is 64 bits and the file is at least 800 kbytes long, and with an error rate of about 25% when the block length is 128 bits and the file is at least 2500 kbytes long. In addition, English and German texts can be distinguished with an error rate of about 13% when the block length is 64 bits and the file is at least 800 kbytes long. We also present the results of experiments on distinguishing texts in natural and artificial languages (e.g., source code in C++ or Java).
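
The abstract does not say which statistic the authors use, so the following is only an illustrative sketch of why ECB mode leaves something to measure: each block is encrypted independently and deterministically, so repeated plaintext blocks produce repeated ciphertext blocks. The function names are hypothetical, and `toy_ecb_encrypt` is a stand-in built on truncated HMAC-SHA256 rather than a real (invertible) block cipher.

```python
import hashlib
import hmac
from collections import Counter

BLOCK_BYTES = 8  # 64-bit blocks, the smallest block length in the abstract


def toy_ecb_encrypt(data: bytes, key: bytes, block_bytes: int = BLOCK_BYTES) -> bytes:
    """Assumed stand-in for ECB: every block is mapped independently and
    deterministically under the key. Not a real cipher (it cannot be decrypted)."""
    data += b"\x00" * ((-len(data)) % block_bytes)  # zero-pad to a whole block
    out = bytearray()
    for i in range(0, len(data), block_bytes):
        block = data[i:i + block_bytes]
        out += hmac.new(key, block, hashlib.sha256).digest()[:block_bytes]
    return bytes(out)


def repeat_profile(data: bytes, block_bytes: int = BLOCK_BYTES) -> list:
    """Occurrence counts of the distinct blocks, most frequent first."""
    counts = Counter(data[i:i + block_bytes] for i in range(0, len(data), block_bytes))
    return sorted(counts.values(), reverse=True)


if __name__ == "__main__":
    text = b"the cat " * 5 + b"and dog " * 3 + b"unique!!"
    cipher = toy_ecb_encrypt(text, key=b"demo key")
    # The block contents change, but the repetition pattern survives.
    print(repeat_profile(text))
    print(repeat_profile(cipher))
```

Under these assumptions, the two printed profiles match: the block values are hidden, but how often each distinct block recurs is not. A profile of this kind depends on the redundancy of the underlying text, which is one plausible reason longer blocks (fewer repeats) would call for larger files.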