Facilitating topic modeling in tourism research: Comprehensive comparison of new AI technologies

Impact factor 10.9 · Zone 1 (Management) · JCR Q1 (Environmental Studies)
Andrei P. Kirilenko, Svetlana Stepchenkova
Tourism Management, Volume 106, Article 105007. Published 2024-07-31. DOI: 10.1016/j.tourman.2024.105007
https://www.sciencedirect.com/science/article/pii/S0261517724001262

Abstract

In the past few years, a new crop of transformer-based language models, such as Google's BERT and OpenAI's ChatGPT, has become increasingly popular in text analysis, owing its success to the ability to capture the context of an entire document. These new methods, however, have yet to percolate into the tourism academic literature. This paper aims to fill this gap by comparing these instruments against the commonly used Latent Dirichlet Allocation (LDA) for topic extraction from contrasting tourism-related data: coherent vs. noisy, short vs. long, and small vs. large corpus size. The data are typical of the tourism literature and include comments from followers of a popular blogger, TripAdvisor reviews, and review titles. We recommend data domains in which the reviewed methods perform best, consider dimensions of success, and discuss each method's strengths and weaknesses. In general, GPT tends to return topics that are comprehensive, highly interpretable, and relevant to the real world for all datasets, including the noisy ones, and at all scales. Meanwhile, ChatGPT is the most vulnerable to the trust issue common to “black box” models, which we explore in detail.

Source journal: Tourism Management
CiteScore: 24.10
Self-citation rate: 7.90%
Articles published: 190
Review time: 45 days
Journal description: Tourism Management, the preeminent scholarly journal, concentrates on the comprehensive management aspects, encompassing planning and policy, within the realm of travel and tourism. Adopting an interdisciplinary perspective, the journal delves into international, national, and regional tourism, addressing various management challenges. Its content mirrors this integrative approach, featuring primary research articles, progress in tourism research, case studies, research notes, discussions on current issues, and book reviews. Emphasizing scholarly rigor, all published papers are expected to contribute to theoretical and/or methodological advancements while offering specific insights relevant to tourism management and policy.