B. Barufaldi, E. F. Santana, José Rogério B. B. Filho, J. V. D. Poel, Milton Marques Júnior, L. Batista
{"title":"Text Classification by Literary Period Using PPM-C Data Compression","authors":"B. Barufaldi, E. F. Santana, José Rogério B. B. Filho, J. V. D. Poel, Milton Marques Júnior, L. Batista","doi":"10.1109/STIL.2009.39","DOIUrl":null,"url":null,"abstract":"Methods and techniques for data compression have been used for pattern recognition, including automatic text classification. The performance of the Prediction by Partial Matching (PPM) as a text classifier has already been proofed by many works, including authorship attribution for Portuguese texts. Classes involved in classification process may not be restricted by only one author. By including two or more authors in one class, one can create a literature style. This work presents a literature style classifier for texts from Brazilian literature by using the PPM-C statistical model.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/STIL.2009.39","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Methods and techniques for data compression have been used for pattern recognition, including automatic text classification. The performance of the Prediction by Partial Matching (PPM) as a text classifier has already been proofed by many works, including authorship attribution for Portuguese texts. Classes involved in classification process may not be restricted by only one author. By including two or more authors in one class, one can create a literature style. This work presents a literature style classifier for texts from Brazilian literature by using the PPM-C statistical model.