Authors: Badrya Dahy, M. Farouk, Khaled Fathy
Venue: 2022 20th International Conference on Language Engineering (ESOLEC)
Publication date: 2022-10-12
DOI: 10.1109/ESOLEC54569.2022.10009099
URL: https://doi.org/10.1109/ESOLEC54569.2022.10009099
Arabic Sentences Semantic Similarity Based on Word Embedding
Semantic textual similarity receives significant attention in natural language processing (NLP). It is useful in a variety of NLP applications, including information retrieval, plagiarism detection, data extraction, and machine translation. Sentence similarity in Arabic has not been investigated deeply because of the lack of Arabic language resources, yet it is critical to calculate the degree of similarity between Arabic sentences accurately. This research proposes a method for determining the semantic similarity of Arabic sentences. The proposed strategy uses word embeddings to measure the similarity between words and combines more than one similarity measure to calculate the final similarity. Furthermore, because of the lack of Arabic resources, a new dataset for evaluating similarity techniques has been constructed and made available for public use. Experiments have been conducted to show the efficiency of the proposed strategy, and two datasets are used to compare it with other approaches. The experiments reveal that the proposed method outperforms alternative approaches to measuring sentence similarity in Arabic.
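The abstract does not say which similarity measures are combined or how they are weighted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes two common embedding-based measures (cosine similarity of averaged word vectors, and a greedy best-match word alignment) merged by a weighted average. The toy vectors, the function names, and the `weight` parameter are all hypothetical; a real system would load pretrained Arabic word embeddings.

```python
import math

# Toy 3-dimensional vectors invented for illustration only (assumption);
# a real system would load pretrained Arabic word embeddings instead.
TOY_EMBEDDINGS = {
    "قطة": [0.9, 0.1, 0.0],    # "cat"
    "هرة": [0.8, 0.2, 0.1],    # "cat" (synonym)
    "نامت": [0.1, 0.9, 0.2],   # "slept"
    "رقدت": [0.2, 0.8, 0.3],   # "lay down"
    "سيارة": [0.0, 0.1, 0.9],  # "car"
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(tokens, emb):
    """Average the vectors of all in-vocabulary tokens, or None if there are none."""
    vectors = [emb[t] for t in tokens if t in emb]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def centroid_similarity(s1, s2, emb):
    """Measure 1 (assumed): cosine similarity of the averaged sentence vectors."""
    m1, m2 = mean_vector(s1, emb), mean_vector(s2, emb)
    if m1 is None or m2 is None:
        return 0.0
    return cosine(m1, m2)


def alignment_similarity(s1, s2, emb):
    """Measure 2 (assumed): each word is matched to its most similar word
    in the other sentence, and the best-match scores are averaged in
    both directions."""
    def directed(source, target):
        scores = []
        for word in source:
            if word not in emb:
                continue  # out-of-vocabulary words are skipped
            best = max(
                (cosine(emb[word], emb[t]) for t in target if t in emb),
                default=0.0,
            )
            scores.append(best)
        return sum(scores) / len(scores) if scores else 0.0

    return 0.5 * (directed(s1, s2) + directed(s2, s1))


def combined_similarity(s1, s2, emb, weight=0.5):
    """Weighted average of the two measures; `weight` is a hypothetical knob."""
    return (weight * centroid_similarity(s1, s2, emb)
            + (1.0 - weight) * alignment_similarity(s1, s2, emb))


if __name__ == "__main__":
    paraphrase = combined_similarity(["قطة", "نامت"], ["هرة", "رقدت"], TOY_EMBEDDINGS)
    unrelated = combined_similarity(["قطة", "نامت"], ["سيارة"], TOY_EMBEDDINGS)
    print(f"paraphrase pair: {paraphrase:.3f}")
    print(f"unrelated pair:  {unrelated:.3f}")
```

With these toy vectors the paraphrase pair scores higher than the unrelated pair. In practice, the weight would need to be tuned on a labeled sentence-pair dataset, such as the one the abstract describes.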