{"title":"Spatial network disintegration based on ranking aggregation","authors":"Zhigang Wang , Ye Deng , Yu Dong , Jürgen Kurths , Jun Wu","doi":"10.1016/j.ipm.2024.103955","DOIUrl":"10.1016/j.ipm.2024.103955","url":null,"abstract":"<div><div>Disintegrating harmful networks presents a significant challenge, especially in spatial networks where both topological and geospatial features must be considered. Existing methods that rely on a single metric often fail to capture the full complexity of such networks. To address these limitations, we propose a novel ranking aggregation-based algorithm for spatial network disintegration. Our approach integrates multiple region centrality metrics, providing a comprehensive evaluation of region importance. The algorithm operates in two stages: first, multiple rankings based on different centrality metrics are aggregated into a composite ranking to refine the candidate regions for disintegration. In the second stage, an exact target enumeration method is applied within this candidate set to determine the optimal combination of regions that maximizes disintegration impact. This interconnected approach effectively combines ranking aggregation with targeted enumeration to ensure both efficiency and accuracy. Extensive experiments are conducted on synthetic and real-world spatial networks of different network configurations. The results demonstrate that our method consistently achieves superior disintegration performance compared to traditional approaches, effectively addressing the challenges associated with spatial network disintegration. 
This study provides a contribution to understanding and improving spatial network disintegration strategies by leveraging a comprehensive, multi-criteria approach.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103955"},"PeriodicalIF":7.4,"publicationDate":"2024-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embracing the power of ensemble forecasting: A novel hybrid approach for advanced predictive modeling","authors":"Isha Malhotra, Nidhi Goel","doi":"10.1016/j.ipm.2024.103954","DOIUrl":"10.1016/j.ipm.2024.103954","url":null,"abstract":"<div><div>Amidst the persistent threat of epidemics, effectively managing their complexities requires accurate forecasting to anticipate their trajectory, thus enabling the preparation and implementation of effective mitigation strategies. With a special emphasis on COVID-19, the present work focuses on the Omicron variant, recognizing its significance in the global context of infectious diseases. The proposed research evaluates the effectiveness of both univariate and multivariate frameworks utilizing statistical and deep learning approaches to forecast the spread of the epidemic. Forecasting robustness is boosted by effectively correlating linear and non-linear components with the original series. To improve the performance, correlation is facilitated using correlation-driven weights within the statistically enforced deep learning model (WD-ensemble framework). The modeling process utilizes 493 data points and multivariate time-series records, including infected cases, vaccinated cases, and stringency index. The training dataset spans from November 1, 2021, to January 17, 2023, while the testing dataset covers the period from January 18, 2023, to March 8, 2023. The proposed WD-ensemble framework, incorporating stochasticity, outperforms all other state-of-the-art models, yielding highly reliable forecasts with remarkably low RMSE of 907.54, MAPE of 0.0008, and MAE of 670.78. It demonstrates a reduction in error percentages compared to the top-performing existing model, with decreases of 30.0267% in RMSE, 20% in MAPE, and 24.9411% in MAE. 
A pivotal revelation in this research is the robust negative correlation (-0.86) between vaccinated and confirmed cases as compared to the stringency index, implying that widespread vaccination could warrant the relaxation of stringent measures, including business and school closures.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103954"},"PeriodicalIF":7.4,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retrieve–Revise–Refine: A novel framework for retrieval of concise entailing legal article set","authors":"Chau Nguyen, Phuong Nguyen, Le-Minh Nguyen","doi":"10.1016/j.ipm.2024.103949","DOIUrl":"10.1016/j.ipm.2024.103949","url":null,"abstract":"<div><div>The retrieval of entailing legal article sets aims to identify a concise set of legal articles that holds an entailment relationship with a legal query or its negation. Unlike traditional information retrieval that focuses on relevance ranking, this task demands conciseness. However, prior research has inadequately addressed this need by employing traditional methods. To bridge this gap, we propose a three-stage Retrieve–Revise–Refine framework which explicitly addresses the need for conciseness by utilizing both small and large language models (LMs) in distinct yet complementary roles. Empirical evaluations on the COLIEE 2022 and 2023 datasets demonstrate that our framework significantly enhances performance, achieving absolute increases in the macro F2 score by 3.17% and 4.24% over previous state-of-the-art methods, respectively. Specifically, our Retrieve stage, employing various tailored fine-tuning strategies for small LMs, achieved a recall rate exceeding 0.90 in the top-5 results alone—ensuring comprehensive coverage of entailing articles. In the subsequent Revise stage, large LMs narrow this set, improving precision while sacrificing minimal coverage. The Refine stage further enhances precision by leveraging specialized insights from small LMs, resulting in a relative improvement of up to 19.15% in the number of concise article sets retrieved compared to previous methods. 
Our framework offers a promising direction for further research on specialized methods for retrieving concise sets of entailing legal articles, thereby more effectively meeting the task’s demands.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103949"},"PeriodicalIF":7.4,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Upper bound on the predictability of rating prediction in recommender systems","authors":"En Xu , Kai Zhao , Zhiwen Yu , Hui Wang , Siyuan Ren , Helei Cui , Yunji Liang , Bin Guo","doi":"10.1016/j.ipm.2024.103950","DOIUrl":"10.1016/j.ipm.2024.103950","url":null,"abstract":"<div><div>The task of rating prediction has undergone extensive scrutiny, employing diverse modeling approaches to enhance accuracy. However, it remains uncertain whether a maximum accuracy, synonymous with predictability, exists for a given dataset, guiding the quest for optimal algorithms. While existing theories quantify predictability in one-dimensional symbol sequences, extending this to multidimensional and heterogeneous data poses challenges, rendering it unsuitable for rating prediction tasks. Our approach initially employs conditional entropy to quantify rating entropy, overcoming its inherent complexity by transforming it into two easily calculable entropies. Unlike conventional entropy measures, we utilize sample entropy to account for the numerical impact of rating sequences. Furthermore, novel metrics for quantifying entropy in numerical sequences are integrated to enhance predictability scaling. 
Experiments across datasets of varying sizes and domains demonstrate the effectiveness of our method and show that current leading rating prediction algorithms achieve approximately 80% predictability.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103950"},"PeriodicalIF":7.4,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing pre-trained language models with Chinese character morphological knowledge","authors":"Zhenzhong Zheng , Xiaoming Wu , Xiangzhi Liu","doi":"10.1016/j.ipm.2024.103945","DOIUrl":"10.1016/j.ipm.2024.103945","url":null,"abstract":"<div><div>Pre-trained language models (PLMs) have demonstrated success in Chinese natural language processing (NLP) tasks by acquiring high-quality representations through contextual learning. However, these models tend to neglect the glyph features of Chinese characters, which contain valuable semantic knowledge. To address this issue, this paper introduces a self-supervised learning strategy, named SGBERT, aiming to learn high-quality semantic knowledge from Chinese Character morphology to enhance PLMs’ understanding of natural language. Specifically, the learning process of SGBERT can be divided into two stages. In the first stage, we preheat the glyph encoder by constructing contrastive learning between glyphs, enabling it to obtain preliminary glyph coding capabilities. In the second stage, we transform the glyph features captured by the glyph encoder into context-sensitive representations through a glyph-aware window. These representations are then contrasted with the character representations generated by the PLMs, leveraging the powerful representation capabilities of the PLMs to guide glyph learning. Finally, the glyph knowledge is fused with the pre-trained model representations to obtain semantically richer representations. We conduct experiments on ten datasets covering six Chinese NLP tasks, and the results demonstrate that SGBERT significantly enhances commonly used Chinese PLMs. 
On average, the introduction of SGBERT resulted in a performance improvement of 1.36% for BERT and 1.09% for RoBERTa.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103945"},"PeriodicalIF":7.4,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical multi-label text classification of tourism resources using a label-aware dual graph attention network","authors":"Quan Cheng, Wenwan Shi","doi":"10.1016/j.ipm.2024.103952","DOIUrl":"10.1016/j.ipm.2024.103952","url":null,"abstract":"<div><div>In the era of big data, classifying online tourism resource information can facilitate the matching of user needs with tourism resources and enhance the efficiency of tourism resource integration. However, most research in this field has concentrated on a simple classification problem with a single level of single labelling. In this paper, a Hierarchical Label-Aware Tourism-Informed Dual Graph Attention Network (HLT-DGAT) is proposed for the complex multi-level and multi-label classification presented by online textual information about Chinese tourism resources. This model integrates domain knowledge into a pre-trained language model and employs attention mechanisms to transform the text representation into the label-based representation. Subsequently, the model utilizes dual Graph Attention Network (GAT), with one component capturing vertical information and the other capturing horizontal information within the label hierarchy. The model's performance is validated on two commonly used public datasets as well as on a manually curated Chinese tourism resource dataset, which consists of online textual overviews of Chinese tourism resources above 3A level. Experimental results indicate that HLT-DGAT demonstrates superiority in threshold-based and area-under-curve evaluation metrics. Specifically, the <span><math><mrow><mrow><mtext>AU</mtext><mo>(</mo></mrow><mover><mrow><mtext>PRC</mtext></mrow><mo>‾</mo></mover><mrow><mo>)</mo></mrow></mrow></math></span> reaches 64.5 % on the Chinese tourism resource dataset with enforced leaf nodes, which is 3 % higher than the optimal corresponding metric of the baseline model. 
Furthermore, ablation studies show that (1) integrating domain knowledge, (2) combining local information, (3) considering label dependencies within the same level of label hierarchy, and (4) merging dynamic reconstruction can enhance overall model performance.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103952"},"PeriodicalIF":7.4,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142587301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ME3A: A Multimodal Entity Entailment framework for multimodal Entity Alignment","authors":"Yu Zhao, Ying Zhang, Xuhui Sui, Xiangrui Cai","doi":"10.1016/j.ipm.2024.103951","DOIUrl":"10.1016/j.ipm.2024.103951","url":null,"abstract":"<div><div>Current methods for multimodal entity alignment (MEA) primarily rely on entity representation learning, which undermines entity alignment performance because of cross-KG interaction deficiency and multimodal heterogeneity. In this paper, we propose a <strong>M</strong>ultimodal <strong>E</strong>ntity <strong>E</strong>ntailment framework for the multimodal <strong>E</strong>ntity <strong>A</strong>lignment task, <strong>ME<sup>3</sup>A</strong>, and recast the MEA task as an entailment problem about entities in the two KGs. In this way, cross-KG modality information interacts directly in the unified textual space. Specifically, we construct the multimodal information in the unified textual space as textual sequences: for relational and attribute modalities, we combine the neighbors and attribute values of entities as sentences; for visual modality, we map the entity image as trainable prefixes and insert them into sequences. Then, we input the concatenated sequences of two entities into the pre-trained language model (PLM) as an entailment reasoner to capture the unified fine-grained correlation pattern of the multimodal tokens between entities. Two types of entity aligners are proposed to model the bi-directional entailment probability as the entity similarity. 
Extensive experiments conducted on nine MEA datasets with various modality combination settings demonstrate that our ME<span><math><msup><mrow></mrow><mrow><mn>3</mn></mrow></msup></math></span>A effectively incorporates multimodal information and surpasses the performance of the state-of-the-art MEA methods by 16.5% at most.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103951"},"PeriodicalIF":7.4,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142587300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of economic and socio-political risk factors on sovereign credit ratings","authors":"Abhinav Goel, Archana Singh","doi":"10.1016/j.ipm.2024.103943","DOIUrl":"10.1016/j.ipm.2024.103943","url":null,"abstract":"<div><div>Sovereign Credit Ratings (SCRs) help international investors price the risk of lending to sovereigns or entities domiciled within that sovereign, thereby impacting cost and availability of capital flows into an economy. The international credit rating agencies (CRAs - Moody's, S&P and Fitch) consider both quantitative (economic) and qualitative (socio-political) factors while determining the SCR of a country. However, research in the field of SCR has focussed largely on quantitative factors, giving less importance to qualitative factors. The present work analyses the linkage of banking sector risks and SCR, the bias in rating process towards high-income nations, and the impact of both quantitative and qualitative factors to provide a more holistic picture of the determinants of SCR.</div><div>To attain these objectives, the present work develops two datasets covering 55 countries and compiles the data for 10 years (2011–2020) in terms of SCR obtained from Moody's and Fitch, and the values for various quantitative and qualitative factors. The dataset comprises 18,700 data points obtained from 32 independent variables; 17 are quantitative and 15 qualitative. Some qualitative factors are also introduced which were not used earlier in SCR literature. The data has been collated from World Bank, International Monetary Fund, United Nations etc. Correlation analysis has been performed on these two datasets followed by the application of Extra Tree Classifier for predicting SCR. Thorough result analysis indicates that qualitative factors, individually and as a group, are more important in determining SCR than quantitative factors. 
The results also indicate the presence of bias towards high-income nations and moderate importance of banking parameters in determination of SCR. Further, the use of Extra Tree Classifier gives a prediction accuracy of 97 % and 98 % for dataset 1 and dataset 2, respectively. Comparative analysis with existing work proves the efficacy of the present work.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103943"},"PeriodicalIF":7.4,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142587302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CKEMI: Concept knowledge enhanced metaphor identification framework","authors":"Dian Wang , Yang Li , Suge Wang , Xin Chen , Jian Liao , Deyu Li , Xiaoli Li","doi":"10.1016/j.ipm.2024.103946","DOIUrl":"10.1016/j.ipm.2024.103946","url":null,"abstract":"<div><div>Metaphor is pervasive in daily life: on average, roughly one metaphor occurs every three sentences in our daily conversations. Previous metaphor identification research in NLP has rarely focused on similarity between concepts from different domains. In this paper, we propose a Concept Knowledge Enhanced Metaphor Identification Framework (CKEMI) to model similarity between concepts from different domains. First, we construct the descriptive concept word set and the inter-word relation concept word set by selecting knowledge from the ConceptNet knowledge base. Then, we devise two hierarchical relation concept graph networks to refine inter-word relation concept knowledge. Next, we design the concept consistency mapping function to constrain the representation of inter-word relation concept and learn similarity information between concepts. Finally, we construct the target domain semantic scene by integrating the representation of inter-word relation concept knowledge for metaphor identification. 
Specifically, the F1 score of CKEMI is superior to the state-of-the-art (SOTA) methods, achieving improvements of over 0.5%, 1.0%, and 1.2% on the VUA-18(10k), VUA-20(16k), and MOH-X(0.6k) datasets, respectively.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103946"},"PeriodicalIF":7.4,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Membership inference attacks via spatial projection-based relative information loss in MLaaS","authors":"Zehua Ding , Youliang Tian , Guorong Wang , Jinbo Xiong , Jinchuan Tang , Jianfeng Ma","doi":"10.1016/j.ipm.2024.103947","DOIUrl":"10.1016/j.ipm.2024.103947","url":null,"abstract":"<div><div>Machine Learning as a Service (MLaaS) has significantly advanced data-driven decision-making and the development of intelligent applications. However, the privacy risks posed by membership inference attacks (MIAs) remain a critical concern. MIAs are primarily classified into score-based and perturbation-based attacks. The former relies on shadow data and models, which are difficult to obtain in practical applications, while the latter depends solely on perturbation distance, resulting in insufficient identification performance. To this end, we propose a Spatial Projection-based Relative Information Loss (SPRIL) MIA to ascertain the sample membership by flexibly controlling the size of perturbations in the noise space and integrating relative information loss. Firstly, we analyze the alterations in predicted probability distributions induced by adversarial perturbations and leverage these changes as pivotal features for membership identification. Secondly, we introduce a spatial projection technique that flexibly modulates the perturbation amplitude to accentuate the difference in probability distributions between member and non-member data. Thirdly, we quantify the distribution difference by calculating relative information loss based on KL divergence to identify membership. SPRIL provides a solid method to assess the potential risks of DNN models in MLaaS and demonstrates its efficacy and precision in black-box and white-box settings. Finally, experimental results demonstrate the effectiveness of SPRIL across various datasets and model architectures. 
Notably, on the CIFAR-100 dataset, SPRIL achieves the highest attack accuracy and AUC, reaching 99.27% and 99.73%, respectively.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103947"},"PeriodicalIF":7.4,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}