Salah Al-Hagree, Maher Al-Sanabani, Khaled M. Alalayah, Mohammed Hadwan
2019 First International Conference of Intelligent Computing and Engineering (ICOICE), published 2019-12-01. DOI: 10.1109/ICOICE48418.2019.9035184 (https://doi.org/10.1109/ICOICE48418.2019.9035184)
Designing an Accurate and Efficient Algorithm for Matching Arabic Names
A great deal of research has gone into finding an accurate algorithm for name matching that can play a major role in the application process. Researchers have developed several algorithms to measure string similarity, but most are designed mainly for Latin-based languages. Dealing with the Arabic context is challenging, owing to the nature and unique features of the Arabic language, which may explain why name-matching algorithms for Arabic are rare. This paper therefore aims to design an accurate and efficient algorithm for matching Arabic names. A framework for matching Arabic names is designed to provide a platform for current and future investigations involving Arabic name matching. The framework deals with specific characteristics of the Arabic language and the various levels of similarity between Arabic letters, mainly keyboard similarities, letter forms and phonetic similarities. Moreover, the proposed algorithm accounts for the transposition operation and for enhanced states of the substitution, deletion and insertion operations. As a result, the proposed algorithm reduces the storage space required, shortens processing time and reduces the time complexity from O(N^3) to O(N^2). In addition, the experiments show that the proposed algorithm is more efficient and more accurate than the other algorithms compared.
Keywords: Matching Arabic names, String matching, Character N-gram, Levenshtein distance.
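To make the idea of a Levenshtein-style distance with transposition and similarity-aware substitution concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard optimal string alignment recurrence, which runs in O(N^2) time for two strings of length N. The letter groups in `SIMILAR_GROUPS`, the cost `SIMILAR_COST`, and the function names are assumptions made for illustration; the abstract does not give the paper's keyboard, letter-form, or phonetic tables.

```python
# Hypothetical groups of Arabic letters treated as "similar" (illustration only).
# Letters are written as Unicode escapes to avoid bidirectional display issues.
SIMILAR_GROUPS = [
    {"\u0627", "\u0623", "\u0625", "\u0622"},  # alef and its hamza/madda variants
    {"\u0647", "\u0629"},                      # heh / teh marbuta
    {"\u064A", "\u0649"},                      # yeh / alef maksura
    {"\u0633", "\u0635"},                      # seen / sad (phonetically close)
]
SIMILAR_COST = 0.5  # assumed cost for substituting similar letters


def sub_cost(a: str, b: str) -> float:
    """Substitution cost: 0 if equal, reduced if similar, 1 otherwise."""
    if a == b:
        return 0.0
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return SIMILAR_COST
    return 1.0


def weighted_osa_distance(s: str, t: str) -> float:
    """Optimal string alignment distance with weighted substitution."""
    n, m = len(s), len(t)
    # d[i][j] holds the cost of turning s[:i] into t[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,                               # deletion
                d[i][j - 1] + 1.0,                               # insertion
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),  # substitution
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)
    return d[n][m]


def similarity(s: str, t: str) -> float:
    """Normalize the distance to a score in [0, 1]; 1 means identical."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_osa_distance(s, t) / longest


# Two spellings of the name Ahmad: alef with hamza above vs. bare alef.
ahmad_hamza = "\u0623\u062D\u0645\u062F"
ahmad_plain = "\u0627\u062D\u0645\u062F"
print(similarity(ahmad_hamza, ahmad_plain))
```

Under these assumed costs, the two spellings differ by a single half-cost substitution, so they score higher than two names that differ by an unrelated letter. A production table would need costs derived from real keyboard layouts and phonetic data.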