Web based machine learning for language identification and translation
Ş. Sağiroğlu, U. Yavanoglu, Esra Nergis Guven
Sixth International Conference on Machine Learning and Applications (ICMLA 2007), 2007-12-13
DOI: 10.1109/ICMLA.2007.27 (https://doi.org/10.1109/ICMLA.2007.27)
Citations: 17
Abstract
Language identification is an important task for Web information retrieval services. This paper presents the implementation of a platform for language identification in multilingual documents on the Web. The platform consists of five modules that carry out the tasks automatically. Artificial neural networks (ANNs) are used to identify the languages in multilingual documents. Results are presented for six languages, including Turkish, French, Italian, Danish and German. The major benefit of the approach is that the ANN-based language identification system, with the help of the developed platform, can meet the accuracy expectations for real-time language identification. Experiments show that the system achieves high accuracy in discriminating between languages and in translating them into other languages on Web pages.
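The abstract does not say which input features or network architecture the platform uses. To make the idea of ANN-based language identification concrete, here is a minimal sketch under assumptions that are not taken from the paper: character-bigram features, a single softmax layer trained by stochastic gradient descent, and a handful of made-up toy sentences. The names `bigram_features`, `TinyLanguageNet` and `SAMPLES` are hypothetical.

```python
import math
from collections import Counter


def bigram_features(text):
    """Map text to L2-normalised character-bigram counts (assumed feature set)."""
    cleaned = " ".join(text.lower().split())
    counts = Counter(cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {gram: v / norm for gram, v in counts.items()}


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLanguageNet:
    """Single softmax layer over sparse bigram features (illustrative only)."""

    def __init__(self, labels):
        self.labels = list(labels)
        # One sparse weight vector per language, all weights start at zero.
        self.weights = [dict() for _ in self.labels]

    def _scores(self, feats):
        return [sum(w.get(g, 0.0) * v for g, v in feats.items())
                for w in self.weights]

    def train(self, samples, epochs=300, lr=0.5):
        data = [(bigram_features(text), self.labels.index(label))
                for text, label in samples]
        for _ in range(epochs):
            for feats, target in data:
                probs = softmax(self._scores(feats))
                for k, w in enumerate(self.weights):
                    # Cross-entropy gradient step: (one-hot target - probability).
                    err = (1.0 if k == target else 0.0) - probs[k]
                    for g, v in feats.items():
                        w[g] = w.get(g, 0.0) + lr * err * v

    def predict(self, text):
        scores = self._scores(bigram_features(text))
        return self.labels[scores.index(max(scores))]


# Toy training sentences invented for this sketch, not data from the paper.
SAMPLES = [
    ("bu bir deneme cümlesidir", "Turkish"),
    ("bugün hava çok güzel", "Turkish"),
    ("ceci est une phrase de test", "French"),
    ("il fait très beau aujourd'hui", "French"),
    ("questa è una frase di prova", "Italian"),
    ("oggi il tempo è molto bello", "Italian"),
    ("dette er en testsætning", "Danish"),
    ("vejret er meget godt i dag", "Danish"),
    ("dies ist ein Testsatz", "German"),
    ("das Wetter ist heute sehr schön", "German"),
]

model = TinyLanguageNet(["Turkish", "French", "Italian", "Danish", "German"])
model.train(SAMPLES)
print(model.predict("il fait très beau aujourd'hui"))
```

With only two sentences per language, the sketch memorises its training data and should not be expected to generalise; a usable system would need far more text per language and, most likely, a richer network than this single layer.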