Extracting welcome news from travel reviews
Keigo Sakai, Akiyo Nadamoto
Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services, 2016-11-28
https://doi.org/10.1145/3011141.3011149
Travel-related information of many kinds is now available on the Internet. People plan their trips and look up sightseeing spots online before they travel. Basic information about a sightseeing spot is easy to obtain from official pages, but other useful information exists on user-generated travel sites, which abound on the Internet and offer large amounts of diverse information about travel and destinations. This study addresses four types of travel information found on user-generated travel sites: basic, useful-buzz, useful-unexpected, and garbage information. Useful-unexpected information benefits users, but extracting it is difficult because user-generated content contains so much useful-buzz and garbage information. We designate important useful-unexpected information as "Welcome-news" and propose a means of extracting it from user-generated travel content. Welcome-news is travel-related information that is both useful and unexpected. We first extract useful information based on Welcome-news keywords, which comprise general keywords and unique keywords. General keywords appear often in Welcome-news; we determine them through a user experiment. Unique keywords depend on the sightseeing spot; we determine them using an SVM. Next, we extract unexpected information based on clustering. Unexpected information has two aspects: topics that are important for a sightseeing spot, and contents that are often not stated. We extract the important topics by topic-based clustering, then extract unexpected contents from each cluster based on their distance from the cluster center. We conducted three types of experiments: to extract correct answers, to assess the feasibility of using Welcome-news keywords, and to assess the feasibility of extracting Welcome-news.
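
The last extraction step, selecting contents by their distance from the cluster center, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a topic cluster has already been formed, that sentences are represented as plain bag-of-words count vectors, that distance is Euclidean, and that sentences farther from the center are the better candidates for "often not stated" content. The abstract does not specify any of these choices. The function names and the example sentences are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def bag_of_words(text, vocabulary):
    """Turn a sentence into a count vector over a fixed vocabulary (assumed representation)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def centroid(vectors):
    """Component-wise mean of the vectors, used here as the cluster center."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_by_distance(sentences):
    """Return (distance, sentence) pairs for one cluster, farthest from the center first."""
    vocabulary = sorted({word for s in sentences for word in s.lower().split()})
    vectors = [bag_of_words(s, vocabulary) for s in sentences]
    center = centroid(vectors)
    scored = [(math.dist(v, center), s) for v, s in zip(vectors, sentences)]
    # Assumption: a large distance marks content that is rarely stated in this topic.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical sentences assumed to belong to a single topic cluster.
cluster = [
    "the garden is beautiful in autumn",
    "the garden is beautiful in spring",
    "the garden is beautiful",
    "lockers near the east gate are free before nine",
]

ranking = rank_by_distance(cluster)
for distance, sentence in ranking:
    print(f"{distance:.3f}  {sentence}")
```

In this toy cluster the sentence about lockers shares almost no vocabulary with the others, so it ranks first, while the most typical sentence ranks last. A real pipeline would replace the count vectors with whatever representation the topic-based clustering uses.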