{"title":"深度学习用于句子分类","authors":"Abdalraouf Hassan, A. Mahmood","doi":"10.1109/LISAT.2017.8001979","DOIUrl":null,"url":null,"abstract":"Most of the machine learning algorithms requires the input to be denoted as a fixed-length feature vector. In text classifications (bag-of-words) is a popular fixed-length features. Despite their simplicity, they are limited in many tasks; they ignore semantics of words and loss ordering of words. In this paper, we propose a simple and efficient neural language model for sentence-level classification task. Our model employs Recurrent Neural Network Language Model (RNN-LM). Particularly, Long Short-Term Memory (LSTM) over pre-trained word vectors obtained from unsupervised neural language model to capture semantics and syntactic information in a short sentence. We achieved outstanding empirical results on multiple benchmark datasets, IMDB Sentiment analysis dataset, and Stanford Sentiment Treebank (SSTb) dataset. The empirical results show that our model is comparable with neural methods and outperforms traditional methods in sentiment analysis task.","PeriodicalId":370931,"journal":{"name":"2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"40","resultStr":"{\"title\":\"Deep learning for sentence classification\",\"authors\":\"Abdalraouf Hassan, A. Mahmood\",\"doi\":\"10.1109/LISAT.2017.8001979\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most of the machine learning algorithms requires the input to be denoted as a fixed-length feature vector. In text classifications (bag-of-words) is a popular fixed-length features. Despite their simplicity, they are limited in many tasks; they ignore semantics of words and loss ordering of words. In this paper, we propose a simple and efficient neural language model for sentence-level classification task. Our model employs Recurrent Neural Network Language Model (RNN-LM). Particularly, Long Short-Term Memory (LSTM) over pre-trained word vectors obtained from unsupervised neural language model to capture semantics and syntactic information in a short sentence. We achieved outstanding empirical results on multiple benchmark datasets, IMDB Sentiment analysis dataset, and Stanford Sentiment Treebank (SSTb) dataset. The empirical results show that our model is comparable with neural methods and outperforms traditional methods in sentiment analysis task.\",\"PeriodicalId\":370931,\"journal\":{\"name\":\"2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT)\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"40\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/LISAT.2017.8001979\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Long Island Systems, Applications and Technology Conference (LISAT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/LISAT.2017.8001979","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Most of the machine learning algorithms requires the input to be denoted as a fixed-length feature vector. In text classifications (bag-of-words) is a popular fixed-length features. Despite their simplicity, they are limited in many tasks; they ignore semantics of words and loss ordering of words. In this paper, we propose a simple and efficient neural language model for sentence-level classification task. Our model employs Recurrent Neural Network Language Model (RNN-LM). Particularly, Long Short-Term Memory (LSTM) over pre-trained word vectors obtained from unsupervised neural language model to capture semantics and syntactic information in a short sentence. We achieved outstanding empirical results on multiple benchmark datasets, IMDB Sentiment analysis dataset, and Stanford Sentiment Treebank (SSTb) dataset. The empirical results show that our model is comparable with neural methods and outperforms traditional methods in sentiment analysis task.