Liyi Zheng, Tian Nie, Ichiro Moriya, Yusuke Inoue, Takakazu Imada, T. Utsuro, Yasuhide Kawada, N. Kando
Published in: 2014 28th International Conference on Advanced Information Networking and Applications Workshops
Publication date: 2014-05-13
DOI: https://doi.org/10.1109/WAINA.2014.107
Citations: 3
Comparative Topic Analysis of Japanese and Chinese Bloggers
This paper first studies how to apply a topic model to blog posts collected from a few hundred Chinese and Japanese bloggers, and then how to classify the bloggers into topics. The estimated topics are used to give an overview of the Chinese and Japanese bloggers' concerns, opinions, and cultures. They are also helpful for comparing the two groups, in order to discover differences in the concerns, opinions, and cultures of Chinese-language and Japanese-language bloggers. In the evaluation, we collect a few hundred bloggers from the blogger categories of the well-known Sina blog host in China, and from Nihon Blog Mura, a likewise well-known blogger community service in Japan. As case studies, we focus on the "health", "military", and "nursing care" categories in both services, generate topics with a topic model, and then overview and compare them between Chinese and Japanese. We find certain differences between the topics of Chinese and Japanese bloggers.
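The abstract does not name the topic model or the rule used to classify bloggers. As a minimal sketch, assuming latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling, and assuming each blogger is assigned to the topic that covers most of their tokens, the pipeline might look like the following. The blogger names, tokens, and hyperparameters are hypothetical, and real Chinese or Japanese posts would first need word segmentation, which is skipped here.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed model, not the paper's).

    docs is a list of token lists. Returns (doc_topic, topic_word):
    per-document topic counts and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def dominant_topic(counts):
    """Return the index of the topic with the largest token count."""
    return max(range(len(counts)), key=counts.__getitem__)


# Hypothetical, pre-segmented posts: one token list per blogger.
bloggers = {
    "blogger_a": ["diet", "exercise", "sleep", "diet", "vitamin"],
    "blogger_b": ["exercise", "vitamin", "sleep", "diet"],
    "blogger_c": ["navy", "aircraft", "defense", "navy", "missile"],
    "blogger_d": ["defense", "missile", "aircraft", "navy"],
}

doc_topic, topic_word = fit_lda(list(bloggers.values()), num_topics=2)
labels = {name: dominant_topic(doc_topic[d]) for d, name in enumerate(bloggers)}

for name, topic in labels.items():
    print(name, "-> topic", topic)
for k, words in enumerate(topic_word):
    print("topic", k, [w for w, _ in words.most_common(3)])
```

Comparing the top words of each topic across a Chinese corpus and a Japanese corpus would then be a manual step, as in the overview-and-compare procedure the abstract describes. Topic indices are arbitrary, so topics from two separately fitted models have to be matched by inspecting their words rather than by number.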