Why are you traveling? Inferring trip profiles from online reviews and domain-knowledge

Q1 Social Sciences

Online Social Networks and Media Pub Date : 2025-01-01 DOI:10.1016/j.osnem.2024.100296

Lucas G.S. Félix, Washington Cunha, Claudio M.V. de Andrade, Marcos André Gonçalves, Jussara M. Almeida

Online Social Networks and Media, volume 45, Article 100296. Journal Article. https://www.sciencedirect.com/science/article/pii/S2468696424000211


Abstract

This paper addresses the task of inferring trip profiles (TPs), which consists of determining the profile of travelers engaged in a particular trip given a set of possible categories. TPs may include working trips, leisure journeys with friends, or family vacations. Travelers with different TPs typically have varied plans regarding destinations and timing. TP inference may provide significant insights for numerous tourism-related services, such as geo-recommender systems and tour planning. We focus on TP inference using TripAdvisor, a prominent tourism-centric social media platform, as our data source. Our goal is to evaluate how effectively we can automatically discern the TP from a user review on this platform. A user review encompasses both textual feedback and domain-specific data (such as a user’s previous visits to the location), which are crucial for accurately characterizing the trip. To achieve this, we assess various feature sets (including text and domain-specific) and implement advanced machine learning models, such as neural Transformers and open-source Large Language Models (Llama 2, Bloom). We examine two variants of the TP inference task—binary and multi-class. Surprisingly, our findings reveal that combining domain-specific features with TF-IDF-based representation in an LGBM model performs as well as more complex Transformer and LLM models, while being much more efficient and interpretable.
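The feature combination described in the abstract (a TF-IDF text representation joined with domain-specific features) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the field names `previous_visits` and `rating`, the tokenizer, and the two toy reviews are hypothetical, and the smoothed IDF formula is the one scikit-learn uses by default rather than anything stated in the paper. The resulting vectors are the kind of input one might pass to a gradient-boosted tree classifier such as LightGBM; the classifier itself is left out so that the example runs without third-party packages.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately simple tokenizer: lowercase, keep letters, split on the rest.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def fit_idf(docs):
    """Return (vocabulary, idf) with smoothed IDF: ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    vocab = sorted(doc_freq)
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(doc, vocab, idf):
    """L2-normalized TF-IDF vector of `doc` over a fixed vocabulary."""
    counts = Counter(tokenize(doc))
    raw = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw] if norm else raw


def build_features(review, vocab, idf):
    """Concatenate text features with (hypothetical) domain-specific features."""
    text_part = tfidf_vector(review["text"], vocab, idf)
    # Assumed domain features; the paper's actual feature set may differ.
    domain_part = [float(review["previous_visits"]), float(review["rating"])]
    return text_part + domain_part


# Toy data, invented for illustration only.
reviews = [
    {"text": "Great hotel for our business conference", "previous_visits": 3, "rating": 4},
    {"text": "Kids loved the pool, perfect family vacation", "previous_visits": 0, "rating": 5},
]

vocab, idf = fit_idf([r["text"] for r in reviews])
X = [build_features(r, vocab, idf) for r in reviews]
```

Each row of `X` holds one review: the first `len(vocab)` entries are its TF-IDF weights and the last two are the domain features. Keeping the domain features as explicit, named columns is one reason a tree-based model over such vectors can be easier to inspect than an end-to-end Transformer.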


