DisKeyword: Tweet Corpora Exploration for Keyword Selection
Sacha Lévy, Reihaneh Rabbany
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023-02-27
DOI: https://doi.org/10.1145/3539597.3573033
Abstract
How can the search for relevant topical keywords within a tweet corpus be accelerated? Computational social scientists conducting topical studies rely on large self-collected or crowdsourced social media datasets such as tweet corpora. Comprehensive sets of relevant keywords are often needed to sample or analyze these data sources, but naively skimming through thousands of candidate keywords quickly becomes daunting. In this study, we present DisKeyword, a web-based application that simplifies the search for relevant topical hashtags in a tweet corpus. DisKeyword lets users grasp high-level trends in their dataset while iteratively labeling keywords that are recommended based on their links to previously labeled hashtags. We open-source our code under the MIT license.
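The iterative recommendation loop described in the abstract can be illustrated with a minimal sketch. It is not DisKeyword's actual code: it assumes that the "links" between hashtags are plain co-occurrence counts within the same tweet, and the function names (`extract_hashtags`, `build_cooccurrence`, `recommend`) and the sample tweets are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(tweet):
    """Return the set of lowercase hashtags found in one tweet."""
    return {tag.lower() for tag in HASHTAG_RE.findall(tweet)}


def build_cooccurrence(tweets):
    """Count how often each unordered pair of hashtags shares a tweet.

    Assumption: co-occurrence stands in for the 'links' between hashtags.
    """
    pairs = Counter()
    for tweet in tweets:
        for a, b in combinations(sorted(extract_hashtags(tweet)), 2):
            pairs[(a, b)] += 1
    return pairs


def recommend(pairs, relevant, seen, k=5):
    """Rank not-yet-labeled hashtags by co-occurrence with relevant ones.

    `relevant` holds hashtags labeled as on-topic; `seen` holds every
    hashtag labeled so far (relevant or not), so none is suggested twice.
    """
    scores = Counter()
    for (a, b), count in pairs.items():
        if a in relevant and b not in seen:
            scores[b] += count
        if b in relevant and a not in seen:
            scores[a] += count
    return [tag for tag, _ in scores.most_common(k)]


# Hypothetical toy corpus, for illustration only.
tweets = [
    "Get your shot today #vaccine #covid19",
    "#covid19 #lockdown extended again",
    "#vaccine #booster appointments open",
    "Great match tonight #football #covid19",
    "#booster #vaccine side effects thread",
]

pairs = build_cooccurrence(tweets)
relevant = {"vaccine"}  # seed hashtag chosen by the user
seen = set(relevant)    # everything labeled so far
print(recommend(pairs, relevant, seen))
```

In an interactive session, the user would label each suggestion as relevant or not, add it to `seen` (and to `relevant` when on-topic), and call `recommend` again, so the candidate list follows the growing set of labeled hashtags.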