Extracting welcome news from travel reviews

Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services. Pub Date: 2016-11-28. DOI: 10.1145/3011141.3011149

Keigo Sakai, Akiyo Nadamoto


Citations: 1

Abstract



Travel-related information of many kinds is now available on the Internet. Before traveling, people use the Internet to plan their trips and to learn about sightseeing spots. Basic information about a sightseeing spot is easy to obtain from official pages, but other useful information exists on user-generated travel sites, which abound on the Internet and offer large amounts of diverse information about travel and destinations. This study addresses four types of travel information found on user-generated travel sites: basic, useful-buzz, useful-unexpected, and garbage information. Useful-unexpected information benefits users, but extracting it from user-generated content is difficult because that content includes so much useful-buzz information and garbage information. We designate important useful-unexpected information as "Welcome-news" and propose a means of extracting it from user-generated travel content. Welcome-news is travel-related information that is both "useful" and "unexpected". We first extract useful information based on Welcome-news keywords, which comprise general keywords and unique keywords. General keywords often appear in Welcome-news; we determine them through a user experiment. Unique keywords depend on the sightseeing spot; we determine them using an SVM. Next, we extract unexpected information based on clustering. Unexpected information consists of topics that are important for a sightseeing spot and contents that are not often stated. We extract the important topics using topic-based clustering, and then extract unexpected contents from each cluster based on their distance from the cluster center. We conducted three types of experiments: to extract correct answers, to assess the feasibility of using Welcome-news keywords, and to assess the feasibility of extracting Welcome-news.
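The final step described in the abstract, selecting contents from a topic cluster by their distance from the cluster center, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the term-frequency vectors, the Euclidean distance, the sample reviews, and the reading that contents far from the center are the "unexpected" ones are all assumptions made here for illustration. The abstract specifies none of them.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens as a term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Mean term-frequency vector of one cluster."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: count / len(vectors) for term, count in total.items()}


def distance(vec, center):
    """Euclidean distance between a sparse vector and the cluster center."""
    terms = set(vec) | set(center)
    return math.sqrt(sum((vec.get(t, 0) - center.get(t, 0)) ** 2 for t in terms))


def rank_by_distance(reviews):
    """Return (distance, review) pairs for one cluster, farthest from the center first."""
    vectors = [bag_of_words(r) for r in reviews]
    center = centroid(vectors)
    scored = [(distance(v, center), r) for v, r in zip(vectors, reviews)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical reviews, assumed to already belong to one topic cluster.
cluster = [
    "the temple garden is beautiful",
    "the temple garden is very beautiful",
    "beautiful temple garden",
    "free tea is served behind the temple on rainy days",
]
ranked = rank_by_distance(cluster)
farthest = ranked[0][1]  # candidate "unexpected" content under the assumption above
```

In this toy cluster, the three near-identical reviews sit close to the center, while the review about free tea shares few terms with the others and ranks farthest. A real pipeline would first form the clusters themselves and would likely use weighted vectors rather than raw term counts.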

