{"title":"一种结合词袋和概念袋方法的文本自动分类表示方案","authors":"A. Alahmadi, Arash Joorabchi, A. Mahdi","doi":"10.1109/IEEEGCC.2013.6705759","DOIUrl":null,"url":null,"abstract":"This paper introduces a new approach to creating text representations and apply it to a standard text classification collections. The approach is based on supplementing the well-known Bag-of-Words (BOW) representational scheme with a concept-based representation that utilises Wikipedia as a knowledge base. The proposed representations are used to generate a Vector Space Model, which in turn is fed into a Support Vector Machine classifier to categorise a collection of textual documents from two publically available datasets. Experimental results for evaluating the performance of our model in comparison to using a standard BOW scheme and a concept-based scheme, as well as recently reported similar text representations that are based on augmenting the standard BOW approach with concept-based representations.","PeriodicalId":316751,"journal":{"name":"2013 7th IEEE GCC Conference and Exhibition (GCC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":"{\"title\":\"A new text representation scheme combining Bag-of-Words and Bag-of-Concepts approaches for automatic text classification\",\"authors\":\"A. Alahmadi, Arash Joorabchi, A. Mahdi\",\"doi\":\"10.1109/IEEEGCC.2013.6705759\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper introduces a new approach to creating text representations and apply it to a standard text classification collections. The approach is based on supplementing the well-known Bag-of-Words (BOW) representational scheme with a concept-based representation that utilises Wikipedia as a knowledge base. The proposed representations are used to generate a Vector Space Model, which in turn is fed into a Support Vector Machine classifier to categorise a collection of textual documents from two publically available datasets. Experimental results for evaluating the performance of our model in comparison to using a standard BOW scheme and a concept-based scheme, as well as recently reported similar text representations that are based on augmenting the standard BOW approach with concept-based representations.\",\"PeriodicalId\":316751,\"journal\":{\"name\":\"2013 7th IEEE GCC Conference and Exhibition (GCC)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"23\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 7th IEEE GCC Conference and Exhibition (GCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IEEEGCC.2013.6705759\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 7th IEEE GCC Conference and Exhibition (GCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IEEEGCC.2013.6705759","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A new text representation scheme combining Bag-of-Words and Bag-of-Concepts approaches for automatic text classification
This paper introduces a new approach to creating text representations and apply it to a standard text classification collections. The approach is based on supplementing the well-known Bag-of-Words (BOW) representational scheme with a concept-based representation that utilises Wikipedia as a knowledge base. The proposed representations are used to generate a Vector Space Model, which in turn is fed into a Support Vector Machine classifier to categorise a collection of textual documents from two publically available datasets. Experimental results for evaluating the performance of our model in comparison to using a standard BOW scheme and a concept-based scheme, as well as recently reported similar text representations that are based on augmenting the standard BOW approach with concept-based representations.