B. Barufaldi, E. F. Santana, José Rogério B. B. Filho, J. V. D. Poel, Milton Marques Júnior, L. Batista
2009 Seventh Brazilian Symposium in Information and Human Language Technology, 2009-09-08. DOI: 10.1109/STIL.2009.39 (https://doi.org/10.1109/STIL.2009.39)
Text Classification by Literary Period Using PPM-C Data Compression
Data compression methods have been applied to pattern recognition tasks, including automatic text classification. Many studies have demonstrated the performance of Prediction by Partial Matching (PPM) as a text classifier, including authorship attribution for Portuguese texts. The classes used in classification need not be restricted to a single author: grouping two or more authors into one class defines a literary style. This work presents a literary style classifier for Brazilian literature texts based on the PPM-C statistical model.
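As a rough illustration of how compression-based classification of this kind can work, the Python sketch below trains one character-level model per class on texts from several sources and assigns a new text to the class whose model encodes it in the fewest bits per character. It is not the authors' implementation. It is a simplified PPM variant that uses the method C escape estimate (escape probability d/(n+d) for a context with n observations and d distinct symbols) but omits symbol exclusion and keeps the models static during classification. The names `SimplePPMC` and `classify`, the values `max_order=3` and `alphabet_size=256`, and the toy training strings are all assumptions made for the example.

```python
import math
from collections import defaultdict


class SimplePPMC:
    """Simplified character-level PPM with escape method C (no exclusion)."""

    def __init__(self, max_order=3, alphabet_size=256):
        self.max_order = max_order          # assumed context length
        self.alphabet_size = alphabet_size  # assumed size for the order -1 model
        # counts[context][symbol] = times `symbol` followed `context`
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, text):
        """Accumulate counts for every context of order 0..max_order."""
        for i, sym in enumerate(text):
            for order in range(self.max_order + 1):
                if i - order < 0:
                    break
                self.counts[text[i - order:i]][sym] += 1

    def _symbol_bits(self, history, sym):
        """Bits needed to code `sym` after `history`, escaping to shorter contexts."""
        bits = 0.0
        for order in range(min(self.max_order, len(history)), -1, -1):
            ctx = history[len(history) - order:]
            table = self.counts.get(ctx)
            if not table:
                continue  # context never seen in training: drop to a shorter one
            total = sum(table.values())
            distinct = len(table)
            if sym in table:
                return bits - math.log2(table[sym] / (total + distinct))
            # Method C escape: probability distinct / (total + distinct)
            bits -= math.log2(distinct / (total + distinct))
        # Order -1 fallback: uniform distribution over the assumed alphabet
        return bits + math.log2(self.alphabet_size)

    def cross_entropy(self, text):
        """Average bits per character when coding `text` with this model."""
        total_bits = sum(
            self._symbol_bits(text[max(0, i - self.max_order):i], ch)
            for i, ch in enumerate(text)
        )
        return total_bits / max(len(text), 1)


def classify(text, models):
    """Return the label of the model that codes `text` most compactly."""
    return min(models, key=lambda label: models[label].cross_entropy(text))


# Toy data only: each class pools texts from two hypothetical "authors".
training = {
    "style_1": ["abab abab abab abab", "baba baba baba"],
    "style_2": ["xyzx yzxy zxyz", "zyx zyx zyx zyx"],
}

models = {}
for label, texts in training.items():
    model = SimplePPMC(max_order=3)
    for t in texts:
        model.train(t)
    models[label] = model

print(classify("abab baba", models))
```

Pooling several authors into one model is what turns an authorship classifier into a style classifier in this sketch: the class model only ever sees the combined statistics of its members.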