Authors: Fayçal Nouar, H. Belhadef
Venue: International Conference on Natural Language and Speech Processing
DOI: 10.1109/ICNLSP.2018.8374389 (https://doi.org/10.1109/ICNLSP.2018.8374389)
Evaluation of topic segmentation algorithms on Arabic texts
This paper addresses the topic segmentation of Arabic texts. To this end, we evaluate two algorithms based on lexical cohesion, MinCutSeg and BayesSeg, using the Pk and WindowDiff metrics. To assess how well each algorithm works, each was applied to three datasets of longer texts from two different domains: transcribed multi-party conversations and written texts. After adaptation to the Arabic language, the test results show significant differences in performance depending on the type of document.
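For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how Pk and WindowDiff are commonly computed. It is not the authors' evaluation code. The function names (`labels_from_lengths`, `pk`, `window_diff`), the representation of a segmentation as a list of segment lengths, and the default window size (half the mean reference segment length) are assumptions made for illustration. Both metrics are error rates, so lower is better.

```python
from typing import List, Optional


def labels_from_lengths(lengths: List[int]) -> List[int]:
    """Turn segment lengths, e.g. [4, 4], into per-unit segment ids."""
    labels: List[int] = []
    for seg_id, n in enumerate(lengths):
        labels.extend([seg_id] * n)
    return labels


def _window(ref: List[int], hyp: List[int], k: Optional[int]) -> int:
    """Validate inputs and pick the window size k."""
    if len(ref) != len(hyp):
        raise ValueError("reference and hypothesis must have equal length")
    if k is None:
        # Assumed default: half the mean reference segment length.
        n_segments = ref[-1] + 1
        k = max(1, round(len(ref) / (2 * n_segments)))
    if len(ref) - k <= 0:
        raise ValueError("text is too short for the chosen window size")
    return k


def pk(ref: List[int], hyp: List[int], k: Optional[int] = None) -> float:
    """Fraction of windows where ref and hyp disagree on whether
    units i and i+k belong to the same segment."""
    k = _window(ref, hyp, k)
    n = len(ref) - k
    errors = sum(
        (ref[i] == ref[i + k]) != (hyp[i] == hyp[i + k]) for i in range(n)
    )
    return errors / n


def window_diff(ref: List[int], hyp: List[int], k: Optional[int] = None) -> float:
    """Fraction of windows where ref and hyp contain a different
    number of boundaries between units i and i+k."""
    k = _window(ref, hyp, k)
    n = len(ref) - k
    # Segment ids increase by 1 at each boundary, so the id difference
    # across a window equals the number of boundaries inside it.
    errors = sum(
        (ref[i + k] - ref[i]) != (hyp[i + k] - hyp[i]) for i in range(n)
    )
    return errors / n


if __name__ == "__main__":
    reference = labels_from_lengths([4, 4])        # one true boundary
    hypothesis = labels_from_lengths([3, 1, 4])    # one spurious extra boundary
    print("Pk        :", pk(reference, hypothesis))
    print("WindowDiff:", window_diff(reference, hypothesis))
```

In this toy example the hypothesis places a spurious boundary right next to the true one. WindowDiff penalizes it in more windows than Pk does, which illustrates why the two metrics are often reported together.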