Masoumeh Rajabi, Mohammad Ehsan Basiri, Shahla Nemati
Identifying High-Quality User Replies Using Deep Neural Networks
2021 7th International Conference on Web Research (ICWR), published 2021-05-19
DOI: 10.1109/ICWR51868.2021.9443143 (https://doi.org/10.1109/ICWR51868.2021.9443143)
Citations: 2
Abstract
With the significant expansion of Q&A forums and the increasing need for users to access useful information, identifying quality content in text forums is of particular importance. Previous studies have focused on extracting several types of quality features from text, which can be a time-consuming and labor-intensive task. To address this problem, this paper proposes a long short-term memory (LSTM) deep neural network model that identifies high-quality user responses in text forums using only the raw text of user replies. In the proposed model, embeddings from language models (ELMo) are used to represent words as vectors. The model is evaluated on two datasets: the TripAdvisor New York City (NYC) and the Ubuntu Linux distribution online forums. A comparison with support vector machines (SVM), linear regression (LR), artificial neural networks (ANN), and naïve Bayes (NB) showed that, using only textual features, the accuracy of the proposed model was 43% and 28% higher than the highest accuracy obtained by these four traditional machine learning (ML) algorithms on the NYC and Ubuntu datasets, respectively. Compared to the best results obtained by the ML algorithms using both textual and quality dimension features, the improvement was about 17% and 16%.
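The pipeline the abstract describes (per-word embedding vectors fed through an LSTM, followed by a high-quality versus low-quality decision) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class name `TinyLSTMClassifier`, the dimensions, the random untrained weights, and the single sigmoid output unit are all assumptions made for illustration. The input vectors are toy stand-ins for ELMo embeddings, which this sketch does not compute.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain-Python matrix-vector product.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyLSTMClassifier:
    """Hypothetical single-layer LSTM followed by a sigmoid output unit."""

    def __init__(self, embed_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        in_dim = embed_dim + hidden_dim
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell state.
        # Weights are random here; a real model would learn them from labels.
        self.W = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
            for gate in "ifog"
        }
        self.b = {gate: [0.0] * hidden_dim for gate in "ifog"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.b_out = 0.0

    def _gate(self, name, z, activation):
        pre = matvec(self.W[name], z)
        return [activation(p + b) for p, b in zip(pre, self.b[name])]

    def step(self, x, h, c):
        z = list(x) + h  # concatenate current word vector and previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        o = self._gate("o", z, sigmoid)
        g = self._gate("g", z, math.tanh)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, embeddings):
        # embeddings: one vector per word of a reply, in reading order.
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in embeddings:
            h, c = self.step(x, h, c)
        logit = sum(w * hk for w, hk in zip(self.w_out, h)) + self.b_out
        return sigmoid(logit)  # assumed meaning: probability the reply is high quality


# Toy 4-dimensional vectors for a three-word reply (assumed values, not ELMo output).
reply = [[0.2, -0.1, 0.4, 0.0], [0.0, 0.3, -0.2, 0.1], [0.5, 0.1, 0.0, -0.3]]
model = TinyLSTMClassifier(embed_dim=4, hidden_dim=3)
probability = model.predict_proba(reply)
```

Because the weights are untrained, the value of `probability` carries no meaning here; the sketch only shows how a reply's word vectors flow through the recurrent cell to a single score, with no hand-crafted quality features involved.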