A novel feature selection based on Tibetan grammar for Tibetan text classification
T. Jiang, Hongzhi Yu
2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS), published 2015-11-30
DOI: 10.1109/ICSESS.2015.7339093 (https://doi.org/10.1109/ICSESS.2015.7339093)
Citations: 3
Abstract
Feature selection aims to make text classifiers more efficient and accurate. In this paper, we propose a novel feature selection method based on Tibetan grammar for Tibetan text classification. Tibetan expresses grammatical meaning through function words and word order, and function words account for a large proportion of the words in a text. By analyzing Tibetan grammar and the distribution of parts of speech, we propose a feature selection method based on Tibetan notional words. The method tags the parts of speech in a Tibetan text, keeps the notional words as candidate features, and then applies information gain (IG) to select among them. Experimental results show that this method significantly improves classification efficiency and accuracy compared with traditional feature selection methods.
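The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: drop function words by part-of-speech tag, then rank the remaining notional words by information gain. The tag set `NOTIONAL_TAGS`, the function names, and the toy documents (English stand-ins for POS-tagged Tibetan tokens) are all hypothetical, and the binary presence/absence form of IG used here is an assumption, not necessarily the variant used in the paper.

```python
import math
from collections import Counter

# Hypothetical tag set: POS tags treated as notional (content) words.
# The paper's actual Tibetan tag inventory is not given in the abstract.
NOTIONAL_TAGS = {"n", "v", "a"}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def notional_terms(tagged_doc):
    """Return the set of words whose POS tag is in NOTIONAL_TAGS."""
    return {word for word, tag in tagged_doc if tag in NOTIONAL_TAGS}


def information_gain(term, docs, labels):
    """IG of a term, using presence/absence of the term in each document."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_features(tagged_docs, labels, k):
    """Keep notional words only, then return the k terms with highest IG."""
    docs = [notional_terms(d) for d in tagged_docs]
    vocabulary = set().union(*docs)
    # Sort by descending IG; break ties alphabetically for determinism.
    ranked = sorted(vocabulary,
                    key=lambda t: (-information_gain(t, docs, labels), t))
    return ranked[:k]


# Toy corpus: (word, tag) pairs; "p" marks a function word (particle).
tagged_docs = [
    [("match", "n"), ("win", "v"), ("of", "p")],
    [("team", "n"), ("win", "v"), ("of", "p")],
    [("market", "n"), ("rise", "v"), ("of", "p")],
    [("market", "n"), ("fall", "v"), ("of", "p")],
]
labels = ["sports", "sports", "economy", "economy"]

top_terms = select_features(tagged_docs, labels, 2)
```

In this toy corpus the function word `of` appears in every document and would carry no class information anyway; filtering by tag removes it before any IG score is computed, which is where the efficiency gain in such a pipeline would come from.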