Information Processing & Management — Latest Articles

Injecting new insights: How do review sentiment and rating inconsistency shape the helpfulness of airline reviews?
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-02-07 DOI: 10.1016/j.ipm.2025.104088
Yang Liu, Lihua Ma, Yue Dou, Zhen Zhu, Lili Ma, Zhuoxin Liu
Abstract: Evaluating review helpfulness is pivotal in assessing the caliber of airline reviews, instigating lively debates in both academic and practical spheres. This study constructs a comprehensive conceptual framework grounded in signaling theory, recognizing two factors as signals that shape the perceived helpfulness of reviews. Empirical analysis was conducted on 82,539 reviews of nine airlines on TripAdvisor. The study first examines the combined impact of review sentiment and consumer rating, then explores the influence of review inconsistency on review helpfulness. Most variables were significant at the 0.001 level. We also shed light on the moderating effects of several heuristic cues in the model, including text length, seat class, and region; these findings underscore how heuristic cues collectively influence the helpfulness of reviews. The outcomes of this research can help airlines identify the most helpful reviews, thereby reducing consumer search costs and encouraging reviewers to contribute more valuable insights.
Information Processing & Management, 62(4), Article 104088.
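The interplay of sentiment and rating inconsistency described above can be sketched as a regression with an interaction term. This is a minimal illustration, not the paper's actual model; the feature construction, toy numbers, and plain OLS estimator are all assumptions.

```python
import numpy as np

def helpfulness_features(sentiment, rating, airline_mean_rating, text_length):
    """One feature row: sentiment score, rating inconsistency (deviation
    from the airline's average rating), their interaction, and a heuristic
    cue (text length) as a candidate moderator."""
    inconsistency = abs(rating - airline_mean_rating)
    return [sentiment, inconsistency, sentiment * inconsistency, text_length]

def fit_ols(features, helpful_votes):
    """Ordinary least squares with an intercept column."""
    X = np.column_stack([np.ones(len(features)), features])
    beta, *_ = np.linalg.lstsq(X, helpful_votes, rcond=None)
    return beta

# Toy data: four reviews of one airline whose average rating is 3.0.
rows = np.array([helpfulness_features(s, r, 3.0, n)
                 for s, r, n in [(-0.8, 1, 120), (0.9, 5, 40),
                                 (-0.2, 3, 60), (0.7, 2, 200)]])
beta = fit_ols(rows, np.array([14.0, 3.0, 1.0, 9.0]))
print(beta.shape)  # intercept plus the four feature coefficients
```

A significant interaction coefficient is what would indicate that sentiment and inconsistency jointly, not just separately, shape helpfulness.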
Citations: 0
Exploring hate speech dynamics: The emotional, linguistic, and thematic impact on social media users
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-02-05 DOI: 10.1016/j.ipm.2025.104079
Amira Ghenai, Zeinab Noorian, Hadiseh Moradisani, Parya Abadeh, Caroline Erentzen, Fattane Zarrinkalam
Abstract: Online hate speech has become a critical issue, particularly during the COVID-19 pandemic, when anti-Asian sentiment surged across social media platforms. However, the causal mechanisms driving emotional and behavioral shifts in users posting hateful content remain understudied. This study investigates the causal relationship between engaging in hateful content and changes in linguistic and emotional expression on social media. Using a dataset of 6,002 Twitter/X users, we employ causal inference techniques, including propensity score matching, and advanced topic modeling to compare users posting hateful content with a matched group of non-hateful users. Our main findings can be summarized as follows: (a) users who post hateful content show significantly higher levels of anger, anxiety, and negative emotions, along with increased third-person pronoun usage; (b) moral outrage and profanity levels peak during hateful posts but decline over time, while remaining elevated compared to non-hateful posts; (c) hateful posts are more interconnected, cover more diverse topics, and are more similar to one another, revealing lower cohesion within individual posts but higher cohesion across posts. These findings contribute to understanding the causal effects of online hate speech on user behavior, offering actionable insights for social media platforms to mitigate the spread of hateful content and its broader societal impact.
Information Processing & Management, 62(3), Article 104079 (open access).
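Propensity score matching, the causal-inference technique named above, can be illustrated with a greedy nearest-neighbour matcher. This is a generic sketch assuming precomputed propensity scores, not the study's pipeline; real implementations add calipers or optimal matching.

```python
def match_nearest(treated_ps, control_ps):
    """Greedy 1:1 nearest-neighbour matching on propensity scores,
    without replacement: each treated user is paired with the closest
    still-unused control user, so the two groups become comparable on
    the covariates the scores summarize."""
    available = list(range(len(control_ps)))
    pairs = []
    for t, ps in enumerate(treated_ps):
        j = min(available, key=lambda c: abs(control_ps[c] - ps))
        available.remove(j)
        pairs.append((t, j))
    return pairs

treated = [0.8, 0.3, 0.6]        # toy scores for users posting hateful content
control = [0.25, 0.55, 0.9, 0.35]  # toy scores for the candidate comparison pool
print(match_nearest(treated, control))  # (treated index, matched control index)
```

Outcome differences are then computed between the matched pairs rather than the raw groups.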
Citations: 0
Leveraging LLMs for action item identification in Urdu meetings: Dataset creation and comparative analysis
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-02-05 DOI: 10.1016/j.ipm.2025.104071
Bareera Sadia, Farah Adeeba, Sana Shams, Sarmad Hussain
Abstract: In response to the increasing number of online meetings, automating the identification of action items in online Urdu meetings has become crucial. To this end, this research presents the first dataset and annotation guidelines for action items in code-mixed Urdu-English. The collected dataset comprises 240 recorded meetings, 600 fabricated action items, and 250 real meeting action items, totaling 2,948 action items. We evaluated the efficiency and accuracy of various deep learning and machine learning models through a comparative analysis on a balanced dataset (discussed in Section 4.2 of the paper). Additionally, three Large Language Models (LLMs), BLOOMZ, LLaMA, and GPT-3.5, were tested in zero-shot and few-shot configurations. BLOOMZ and LLaMA were then fine-tuned to enhance their performance in recognizing Urdu meeting action items. The fine-tuned model, ur_BLOOMZ-1b1, achieved the highest average F1 score of 0.94, surpassing all other models. This study lays a solid foundation for future research in multilingual environments and advances our understanding of action item identification in Urdu meetings.
Information Processing & Management, 62(3), Article 104071.
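A few-shot configuration like the ones tested above amounts to packing labelled examples into the prompt before the utterance to classify. A minimal sketch; the instruction wording and example utterances are invented for illustration and are not the paper's annotation guidelines.

```python
def build_fewshot_prompt(examples, utterance):
    """Assemble a few-shot binary-classification prompt: a task
    instruction, labelled demonstrations, then the query utterance
    with an open answer slot for the LLM to complete."""
    lines = ["Decide whether the meeting utterance contains an action item.",
             "Answer with 'yes' or 'no'.", ""]
    for text, label in examples:
        lines += [f"Utterance: {text}", f"Action item: {label}", ""]
    lines += [f"Utterance: {utterance}", "Action item:"]
    return "\n".join(lines)

demo = [("Ali, please share the minutes by Friday.", "yes"),
        ("The weather was fine yesterday.", "no")]
prompt = build_fewshot_prompt(demo, "Sana will update the budget sheet.")
print(prompt.endswith("Action item:"))  # the model completes the answer slot
```

Zero-shot is the same prompt with an empty `examples` list.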
Citations: 0
Deep multi-view subspace clustering via hierarchical diversity optimization of consensus learning
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-02-04 DOI: 10.1016/j.ipm.2025.104081
Siyu Chen, Lifan Peng, Xiaoqian Zhang, Yufeng Chen, Er Wang, Zhenwen Ren
Abstract: Deep multi-view subspace clustering outperforms classic multi-view clustering methods due to its powerful nonlinear feature extraction capabilities. Nevertheless, current deep multi-view clustering approaches face several challenges: (1) a lack of multi-level feature expression during consensus feature learning; (2) some nonlinear geometric structures in the data are not fully utilized, leading to incomplete graph information representation; and (3) the neglect of robust supervision from the original feature matrix. To address these issues, we propose Deep Multi-view Subspace Clustering via Hierarchical Diversity Optimization of Consensus Learning, termed DMSC-HDOC. Our framework integrates three key modules: a hierarchical self-weighted fusion (HSF) module that resamples the original features and learns more diverse features; a dual Laplacian constraint (DLC) module that mines the geometric structure of the data samples; and a self-alignment contrast (SaC) module that supervises the consensus features using the original features. Extensive experiments on several widely used datasets show the superiority of the proposed DMSC-HDOC over existing state-of-the-art methods.
Information Processing & Management, 62(3), Article 104081.
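The self-expression step that underlies subspace clustering (each sample reconstructed as a combination of the others, with the coefficient matrix serving as an affinity graph) has a simple ridge-regularized closed form. A classical sketch only; DMSC-HDOC's deep, hierarchical, multi-view machinery is not reproduced here.

```python
import numpy as np

def self_expression(X, lam=0.1):
    """Closed-form ridge self-expression:
    C = argmin ||X - XC||_F^2 + lam * ||C||_F^2
      = (X^T X + lam I)^{-1} X^T X.
    |C| + |C^T| is then used as an affinity matrix for spectral
    clustering in classic subspace-clustering pipelines."""
    G = X.T @ X
    C = np.linalg.solve(G + lam * np.eye(G.shape[0]), G)
    return C

X = np.random.default_rng(1).normal(size=(5, 6))  # 5 features x 6 samples
C = self_expression(X)
print(C.shape)  # one reconstruction coefficient per sample pair
```

Deep variants learn the features feeding this step with an autoencoder instead of using raw data.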
Citations: 0
TK-RNSP: Efficient Top-K Repetitive Negative Sequential Pattern mining
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-02-01 DOI: 10.1016/j.ipm.2025.104077
Dun Lan, Chuanhou Sun, Xiangjun Dong, Ping Qiu, Yongshun Gong, Xinwang Liu, Philippe Fournier-Viger, Chengqi Zhang
Abstract: Repetitive Negative Sequential Patterns (RNSPs) can provide critical insights into the importance of sequences. However, most current RNSP mining methods require users to set an appropriate support threshold to obtain the expected number of patterns, which is very difficult for users without prior experience. To address this issue, we propose a new algorithm, TK-RNSP, to mine the Top-K RNSPs with the highest support, without the need to set a support threshold. Specifically, we achieve a significant breakthrough by proposing a series of definitions that enable RNSP mining to satisfy anti-monotonicity. We then propose a bitmap-based Depth-First Backtracking Search (DFBS) strategy that reduces the heavy computational burden by speeding up support calculation. Finally, the TK-RNSP algorithm operates in a one-stage process, which effectively reduces the generation of unnecessary patterns and improves computational efficiency compared with two-stage algorithms. To the best of our knowledge, TK-RNSP is the first algorithm to mine Top-K RNSPs. Extensive experiments on eight datasets show that TK-RNSP offers better flexibility and efficiency in mining Top-K RNSPs.
Information Processing & Management, 62(3), Article 104077.
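The Top-K idea, returning the K patterns of highest support instead of asking the user for a threshold, can be illustrated on ordinary contiguous subsequences. This brute-force sketch only counts support and keeps the K best; TK-RNSP's bitmap-based DFBS search over repetitive negative patterns is considerably more involved.

```python
import heapq
from collections import Counter

def topk_subsequences(sequences, k):
    """Count each contiguous subsequence once per sequence, then keep
    the k patterns of highest support. Brute-force enumeration; a real
    Top-K miner prunes the search using anti-monotonicity (a pattern's
    extensions can never be more frequent than the pattern itself)."""
    support = Counter()
    for seq in sequences:
        patterns = {tuple(seq[i:j])
                    for i in range(len(seq))
                    for j in range(i + 1, len(seq) + 1)}
        support.update(patterns)  # each pattern counted once per sequence
    return heapq.nlargest(k, support.items(), key=lambda kv: (kv[1], kv[0]))

seqs = [list("abc"), list("abd"), list("ab")]
top = topk_subsequences(seqs, 2)
print(top)  # both returned patterns occur in all three sequences
```

The threshold-free interface is the point: the user supplies k, and the k-th best support plays the role of a self-adjusting threshold.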
Citations: 0
Incorporating Forgetting Curve and Memory Replay for Evolving Socially-aware Recommendation
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-01-28 DOI: 10.1016/j.ipm.2025.104070
Hongqi Chen, Zhiyong Feng, Shizhan Chen, Hongyue Wu, Yingchao Sun, Jingyu Li, Qinghang Gao, Lu Zhang, Xiao Xue
Abstract: Social recommendations play a crucial role in helping users filter information and discover potential requirements. However, existing works often ignore the effects of memory patterns and social inconsistency, which hinder recommenders from capturing evolving user interests. To overcome these problems, a model incorporating the Forgetting curve and Memory Replay for Evolving Socially-aware recommendation (FMRES) is proposed to track users' fresh interests. Specifically, a cognition-inspired Ebbinghaus forgetting curve is integrated with item attributes to model users' personalized interest forgetting and retention. A memory replay mechanism then revives forgotten yet valuable items, fostering user engagement and enhancing the relevance of recommendations. By aggregating neighbors' social characteristics, consistent friends are sampled to identify meaningful and impactful relationships. Finally, temporal representations of users and items are fed to gated recurrent units to track the evolution of users' interests. Extensive experiments on three datasets demonstrate that the proposed model consistently outperforms advanced baselines across various metrics.
Information Processing & Management, 62(3), Article 104070.
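The Ebbinghaus forgetting curve the model builds on is commonly written R = exp(-t/S). Below is a toy sketch of decay-weighted interest scoring; the per-item stability parameter standing in for FMRES's item-attribute integration is an assumption, not the paper's formulation.

```python
import math

def retention(elapsed_days, stability):
    """Ebbinghaus-style exponential forgetting curve R = exp(-t / S):
    an interaction's weight decays with elapsed time t, more slowly
    for items with higher memory stability S."""
    return math.exp(-elapsed_days / stability)

def decayed_interest(interactions, now):
    """Score each item by decay-weighted interaction strength, so
    recent (or strongly retained) interactions dominate the profile."""
    scores = {}
    for item, day, strength, stability in interactions:
        scores[item] = scores.get(item, 0.0) + strength * retention(now - day, stability)
    return scores

log = [("item_a", 0, 1.0, 5.0),   # old interaction, mostly forgotten
       ("item_b", 9, 1.0, 5.0)]   # recent interaction, well retained
scores = decayed_interest(log, now=10)
print(scores["item_b"] > scores["item_a"])
```

Memory replay would then re-surface items like `item_a` whose decayed score has dropped despite past value.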
Citations: 0
Bridging in-task emotional responses with post-task evaluations in digital library search interface user studies
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-01-28 DOI: 10.1016/j.ipm.2025.104069
Abbas Pirmoradi, Orland Hoeber
Abstract: Interactive information retrieval (IIR) interfaces are commonly evaluated using questionnaires that collect post-task subjective measures such as satisfaction, ease of use, usefulness, and user engagement. Although the importance of measuring emotional responses during the search process has been recognized, incorporating this aspect into IIR user studies has been challenging. We have developed a novel method to capture real-time emotional responses based on advances in facial emotion classification. We use consumer-grade front-facing cameras to collect emotional responses, synchronized with the user's interactions with the search interface. In a controlled laboratory study, the relevance of search results was manipulated to validate the approach's effectiveness and to explore how result relevance affects users' emotional responses, post-task evaluations of the search interface, and interactions with interface features. This enabled us to examine whether emotional responses could be detected, whether recency effects were observed in post-task evaluations, and whether feature use correlated with emotional responses. The study was conducted in the context of exploratory search within an academic digital library. The results demonstrate that both positive and negative emotional responses can be reliably detected during the search process, that there is evidence of recency effects in post-task measures, and that specific interactive features are used during the experience of positive and negative emotional responses. This serves as a foundation for using emotional responses to supplement post-task survey data when evaluating search interfaces.
Information Processing & Management, 62(3), Article 104069 (open access).
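Synchronizing a camera-derived emotion stream with interaction logs reduces to aligning two timestamped sequences. A minimal sketch that attaches the most recent emotion sample to each interface event; the field names and sampling scheme are illustrative, not the study's instrumentation.

```python
import bisect

def align_emotions(events, samples):
    """Attach to each (timestamp, action) event the most recent emotion
    sample at or before its timestamp. Assumes samples are sorted by
    time; events before the first sample get None."""
    times = [t for t, _ in samples]
    out = []
    for t_event, action in events:
        i = bisect.bisect_right(times, t_event) - 1
        label = samples[i][1] if i >= 0 else None
        out.append((action, label))
    return out

samples = [(0.0, "neutral"), (2.5, "frustration"), (5.0, "joy")]   # emotion stream
events = [(1.0, "query"), (3.0, "open_result"), (6.0, "save")]     # interaction log
print(align_emotions(events, samples))
```

With this pairing in hand, feature use can be cross-tabulated against concurrent emotional state.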
Citations: 0
A Local context enhanced Consistency-aware Mamba-based Sequential Recommendation model
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-01-28 DOI: 10.1016/j.ipm.2025.104076
Zhu Zhang, Bo Yang, Yimeng Lu
Abstract: Sequential recommendation (SR) focuses on capturing users' interests from their historical behaviors. Transformer-based SR models have demonstrated promising performance by leveraging self-attention for sequential modeling. Recently, Mamba, a novel sequential model, has shown competitive performance compared to Transformers. In SR tasks, item representation learning involves both global and local context information. While several existing SR models attempt to integrate the two, they suffer from inferior performance or computational inefficiency, and the existing Mamba-based SR model appears to capture only global context. Given Mamba's merits in model performance and efficiency, there is substantial potential to integrate global and local context more effectively within a Mamba-based framework. Additionally, consistency training, which is pivotal for enhancing model performance, remains underexplored in existing SR models.

To tackle these challenges, we propose a Local Context Enhanced Consistency-aware Mamba-based Sequential Recommendation Model (LC-Mamba). LC-Mamba captures both global and local context information to improve recommendation performance. Specifically, it leverages a GNN-based sequence encoder to extract information from each item's local neighbors (local context) in a graph view, while a Mamba-based sequence encoder captures dependencies between items in the sequence (global context) in a sequential view. Furthermore, we introduce consistency training at both the model and representation levels: R-Drop regularization is incorporated into the Mamba-based sequence encoder to mitigate the inconsistency between training and inference caused by random dropout (model-level consistency), and contrastive learning enhances consistency between the item representations learned from the sequential and graph views (representation-level consistency). Extensive experiments on three widely used datasets show that LC-Mamba outperforms baseline models in HR and NDCG, achieving up to a 31.03% improvement in NDCG. LC-Mamba can be applied to real-world applications such as e-commerce and content platforms.
Information Processing & Management, 62(3), Article 104076.
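R-Drop, the model-level consistency term mentioned above, penalizes the divergence between two dropout-perturbed forward passes of the same input. A NumPy sketch of the symmetric KL penalty; the logits here are toy values rather than encoder outputs.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def rdrop_loss(logits_a, logits_b):
    """Symmetric KL divergence between two stochastic forward passes of
    the same input; in R-Drop the passes differ only through dropout
    noise, and this penalty (added to the task loss) pushes them to agree."""
    p, q = softmax(logits_a), softmax(logits_b)
    kl_pq = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    kl_qp = np.sum(q * (np.log(q) - np.log(p)), axis=-1)
    return 0.5 * float((kl_pq + kl_qp).mean())

pass_a = np.array([[2.0, 0.5, 0.1]])
pass_b = pass_a + np.array([0.3, 0.0, -0.2])  # stand-in for dropout noise
print(rdrop_loss(pass_a, pass_a))      # identical passes incur no penalty
print(rdrop_loss(pass_a, pass_b) > 0)  # disagreement is penalized
```

Because dropout is disabled at inference, shrinking this gap during training reduces the train/inference mismatch.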
Citations: 0
Overcoming language barriers via machine translation with sparse Mixture-of-Experts fusion of large language models
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-01-28 DOI: 10.1016/j.ipm.2025.104078
Shaolin Zhu, Leiyu Pan, Dong Jian, Deyi Xiong
Abstract: Large language models (LLMs) hold great promise for cross-lingual applications that power machine translation (MT) systems. However, directly fine-tuning LLMs on parallel data risks catastrophic forgetting and lacks explainability in cross-lingual knowledge transfer. In this paper, we introduce MoE-LLM, a novel fusion framework that enhances the multilingual translation abilities of LLMs by incorporating sparse Mixture-of-Experts (MoE) components via hybrid transfer learning. MoE-LLM freezes the LLM parameters, mitigating forgetting, and introduces specialized translation experts within the MoE modules. Our hybrid initialization strategy further bridges the representation gap by warm-starting MoE parameters from LLM representations. We evaluated MoE-LLM on 10 translation directions across 6 languages on the WMT benchmark. Compared with directly fine-tuning LLMs, MoE-LLM significantly improved translation quality, achieving gains of up to 2.5 BLEU points, with at least some improvement in zero-shot translation scenarios, and surpassed other strong baselines such as Adapter and LoRA-F. Ablation studies highlight the effectiveness of the cascaded fusion strategy and the mixed initialization approach. MoE-LLM offers an effective and explainable solution for adapting pre-trained LLMs to multilingual machine translation, with particular benefits in low-resource and zero-shot scenarios.
Information Processing & Management, 62(3), Article 104078.
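The sparse MoE routing at the heart of such a design can be sketched with top-1 gating: a learned gate scores the experts per token and only the winning expert runs. The gate, expert weights, and shapes below are toy stand-ins; MoE-LLM's frozen-backbone wiring and hybrid initialization are not reproduced.

```python
import numpy as np

def moe_forward(x, gate_w, experts):
    """Top-1 sparse Mixture-of-Experts routing: score experts per token,
    pick the argmax, and run each expert only on the tokens routed to it,
    so compute stays constant as experts are added."""
    scores = x @ gate_w               # (tokens, n_experts) gating scores
    choice = scores.argmax(axis=-1)   # winning expert index per token
    out = np.empty_like(x)
    for e, w in enumerate(experts):
        mask = choice == e
        out[mask] = x[mask] @ w       # only routed tokens touch expert e
    return out, choice

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                       # 4 tokens, hidden size 8
gate = rng.normal(size=(8, 2))                    # gate over 2 experts
experts = [rng.normal(size=(8, 8)) for _ in range(2)]
y, routed = moe_forward(x, gate, experts)
print(y.shape, routed.shape)
```

Freezing the backbone and training only `gate` and `experts` is the forgetting-mitigation move the abstract describes.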
Citations: 0
KRongBERT: Enhanced factorization-based morphological approach for the Korean pretrained language model
IF 7.4 · Q1 (Management)
Information Processing & Management Pub Date : 2025-01-27 DOI: 10.1016/j.ipm.2025.104072
Hyunwook Yu, Yejin Cho, Geunchul Park, Mucheol Kim
Abstract: The bidirectional encoder representations from transformers (BERT) model has achieved remarkable success in various natural language processing tasks for Latin-script languages. However, Korean presents unique challenges, with limited data resources and complex linguistic structures. In this paper, we present KRongBERT, a language model designed around a morphological approach to address the linguistic complexities of Korean. KRongBERT mitigates the out-of-vocabulary issues that arise with byte-pair-encoding tokenizers in Korean and incorporates language-specific embedding layers to enhance understanding. Our model demonstrates up to a 1.56% improvement on specific natural language understanding tasks compared to traditional BERT implementations. Notably, KRongBERT outperforms existing state-of-the-art Korean BERT models while using only 11.42% of the data those models require. The code is publicly available at https://github.com/Splo2t/KRongBERT.
Information Processing & Management, 62(3), Article 104072 (open access).
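The out-of-vocabulary pressure that motivates the morphological approach can be quantified as the fraction of tokens a vocabulary fails to cover. A toy sketch with an invented four-token vocabulary; it illustrates the metric only, not KRongBERT's tokenizer or factorization.

```python
def unk_rate(tokens, vocab):
    """Fraction of tokens outside the vocabulary — the OOV pressure that
    subword tokenizers face on morphologically rich languages, and that
    a morpheme-aware factorization aims to reduce."""
    misses = sum(1 for t in tokens if t not in vocab)
    return misses / len(tokens)

vocab = {"학교", "에", "가", "다"}            # toy vocabulary
rate = unk_rate(["학교", "에", "갔다"], vocab)  # "갔다" (inflected) is unseen
print(rate)
```

Inflection is the crux: surface forms like "갔다" multiply faster than a fixed subword vocabulary can cover them.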
Citations: 0