Title: Hashtags: an essential aspect of topic modeling of city events through social media.
Authors: Mikhail V. Kovalchuk, D. Nasonov
Published in: 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1594-1599
Publication date: 2021-12-01
DOI: 10.1109/ICMLA52953.2021.00255 (https://doi.org/10.1109/ICMLA52953.2021.00255)
Citations: 1
Abstract
Today's cities generate large amounts of digital information that can be useful in many applications. Instagram, Facebook, VKontakte, and other popular social networks contain a vast amount of valuable data. This information reflects people's individual stories as well as the background of the city: its events and current activities in different areas and places of attraction. City events have essential attributes such as time of occurrence, geographical coverage, audience, and, often, expressed interests or topics. Knowing the topic of an event makes it possible to solve a whole range of tasks, from individual recommendation systems for leisure activities for citizens and tourists to providing services in food (food trucks) and transport (taxis). To determine the topic (subject) of events, two crucial tasks must be solved: identifying the events themselves among a variety of city posts, and developing an approach, based on modern natural language processing methods, for identifying event topics. To detect events, we propose an improved version of our previously developed algorithm, which integrates time-window and area-coverage strategies. The main focus of the work, however, is the analysis of different approaches to identifying topics, considering the heterogeneity of posts in semantic meaning, size, and structure. In particular, the paper emphasizes the importance of using post hashtags, in several variations, to build more accurate models. In addition, features for different language groups are analyzed.
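
The abstract does not say which hashtag variations were compared. As a minimal sketch of what such variations might look like at the preprocessing stage, the Python snippet below builds three text variants of a single post: hashtags only, text with the hashtags removed, and text with the hashtags kept as ordinary words. The function name `hashtag_variants`, the regular expression, and the choice of these three variants are assumptions made for illustration, not the authors' pipeline.

```python
import re

# Assumed hashtag pattern: '#' followed by word characters.
# For str patterns, \w is Unicode-aware, so Cyrillic tags also match.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_variants(post: str) -> dict:
    """Build three illustrative text variants of one post (hypothetical helper)."""
    # Lower-cased hashtags without the leading '#'.
    tags = [tag.lower() for tag in HASHTAG_RE.findall(post)]
    # Variant 1: the post text with every hashtag removed.
    text_only = " ".join(HASHTAG_RE.sub(" ", post).split())
    # Variant 2: the post text with '#' stripped, so tags act as plain words.
    text_with_tags = " ".join(HASHTAG_RE.sub(r"\1", post).split())
    return {
        "hashtags_only": " ".join(tags),
        "text_only": text_only,
        "text_with_tags": text_with_tags,
    }


if __name__ == "__main__":
    example = "Great concert tonight #Music #live at the park"
    for name, text in hashtag_variants(example).items():
        print(f"{name}: {text}")
```

Each variant could then be passed to the same topic model, so that any difference in topic quality can be attributed to how hashtags are handled rather than to the model itself.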