Essam Said Hanandeh, Aref Abu Awwad, Yazan Khassawneh
Classify Arabic Text using Vector Space Models
2021 22nd International Arab Conference on Information Technology (ACIT), published 2021-12-21
DOI: 10.1109/acit53391.2021.9677134 (https://doi.org/10.1109/acit53391.2021.9677134)
Abstract
The researchers selected 242 Arabic abstracts, all of which mention computer science and information systems. They built an automatic information retrieval system specific to Arabic; the system was written in C# (.NET) and is compatible with IBM PCs and other microcomputers. The corpus was indexed automatically. The system is based on the Vector Space Model (VSM), within which the researchers computed the Cosine, Dice, Jaccard, and Inner Product similarity measures and compared the retrieval results obtained with each. For Arabic documents, the researchers found that the cosine measure gave better retrieval results than the other measures.
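To make the four measures concrete, the following is a minimal Python sketch of how a query and a document can be compared in a vector space model. It is not the authors' C# implementation. The abstract does not state the term-weighting scheme, the tokenizer, or the exact form of each formula, so the sketch assumes raw term frequencies, whitespace tokenization, and the commonly used weighted-vector forms of the Dice and Jaccard coefficients. All function names (`term_vector`, `inner_product`, `cosine`, `dice`, `jaccard`, `rank`) are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (assumption: whitespace tokens)."""
    return Counter(text.split())


def inner_product(q, d):
    """Sum of products of the weights of terms shared by q and d."""
    return sum(w * d[t] for t, w in q.items() if t in d)


def _sq_norm(v):
    """Sum of squared weights of a vector."""
    return sum(w * w for w in v.values())


def cosine(q, d):
    """Inner product divided by the product of the vector lengths."""
    denom = math.sqrt(_sq_norm(q)) * math.sqrt(_sq_norm(d))
    return inner_product(q, d) / denom if denom else 0.0


def dice(q, d):
    """Twice the inner product over the sum of the squared norms."""
    denom = _sq_norm(q) + _sq_norm(d)
    return 2 * inner_product(q, d) / denom if denom else 0.0


def jaccard(q, d):
    """Inner product over (sum of squared norms minus inner product)."""
    ip = inner_product(q, d)
    denom = _sq_norm(q) + _sq_norm(d) - ip
    return ip / denom if denom else 0.0


def rank(query, docs, measure):
    """Return (score, doc_index) pairs, best score first."""
    qv = term_vector(query)
    scores = [(measure(qv, term_vector(doc)), i) for i, doc in enumerate(docs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling `rank(query, docs, cosine)` and then repeating the call with `dice`, `jaccard`, or `inner_product` yields one ranking per measure, which is the kind of side-by-side comparison the abstract describes. Real Arabic text would likely need normalization or stemming before vectorization; that step is outside this sketch.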