CEW-DTW: A new time series model for text mining
Guandong Zhang, Hao Yu, Lu Xiao
2018 International Conference on Information and Communications Technology (ICOIACT), pp. 158-162
Published: 2018-04-26
DOI: 10.1109/ICOIACT.2018.8350694 (https://doi.org/10.1109/ICOIACT.2018.8350694)
Citations: 1
Abstract
Keyword information is commonly used to describe answers. Most previous studies rank answers by keyword retrieval, which ignores the order in which keywords appear in an answer. In this paper, we propose CEW-DTW, a new time series model for answer ranking. The model takes into account both the time sequence of keywords and the number of keywords. CEW-DTW is derived from Dynamic Time Warping-Delta (DTW-D), a carefully designed model. We use Amazon question/answer data as our evaluation dataset and apply entropy to remove noise from the answer vectors. In the experiments, we use normalized discounted cumulative gain (nDCG) as the evaluation metric. CEW-DTW outperforms both Dynamic Time Warping (DTW) and DTW-D in answer ranking, and an extensive set of evaluation results demonstrates its effectiveness.
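For readers unfamiliar with the two standard building blocks named in the abstract, the following is a minimal sketch of classic DTW and of nDCG. It is not the CEW-DTW or DTW-D model, whose details the abstract does not give. The function names, the absolute-difference cost, the encoding of keyword sequences as lists of numbers, and the idea of ranking answers by distance to a reference sequence are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference of two values.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def ndcg(relevances, k=None):
    """nDCG of a ranked list of graded relevance labels (linear gain)."""
    if k is None:
        k = len(relevances)

    def dcg(rels):
        # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), ...
        return sum(r / math.log2(pos + 2) for pos, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def rank_answers(reference_seq, answer_seqs):
    """Hypothetical helper: order answer indices by DTW distance, closest first."""
    return sorted(
        range(len(answer_seqs)),
        key=lambda i: dtw_distance(reference_seq, answer_seqs[i]),
    )
```

In this sketch a smaller DTW distance means a closer match in keyword order, so a ranking produced by `rank_answers` could be scored by passing the relevance labels of the ranked answers, in ranked order, to `ndcg`.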