Intelligent Data Analysis: Latest Articles
A multi-layer multi-view stacking model for credit risk assessment
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-08-01, DOI: 10.3233/ida-220403
Wenfang Han, Xiao Gu, Ling Jian
Abstract: Credit risk assessment plays a key role in determining the banking policies and commercial strategies of financial institutions. Ensemble learning approaches have been shown to be more competitive than individual classifiers and statistical techniques for default prediction. However, most research has focused on improving overall prediction accuracy rather than on identifying loans that actually default. In addition, model interpretability has received too little attention in previous studies. To fill these gaps, we propose a Multi-layer Multi-view Stacking Integration (MLMVS) approach to predict default risk in the P2P lending scenario. As its main innovation, our proposal combines multi-view learning and soft probability outputs to produce multi-layer integration based on stacking. The interpretable artificial intelligence tool LIME is embedded to interpret the prediction results. We perform a comprehensive analysis of MLMVS on the Lending Club dataset and compare it with a number of well-known individual classifiers and ensemble classification methods; the experiments demonstrate the superiority of MLMVS.
Citations: 0
GeoNLPlify: A spatial data augmentation enhancing text classification for crisis monitoring
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-07-06, DOI: 10.3233/ida-230040
R. Découpes, M. Roche, M. Teisseire
Abstract: Crises such as natural disasters and public health emergencies generate vast amounts of text data, making it challenging to classify the information into relevant categories. Acquiring expert-labeled data for such scenarios can be difficult, which limits the training datasets available for text classification by fine-tuning BERT-like models. Unfortunately, traditional data augmentation techniques only slightly improve F1-scores. How can data augmentation be used to obtain better results in this applied domain? In this paper, using neural network explainability methods, we show that BERT-like models fine-tuned on crisis corpora give too much importance to spatial information when making their predictions. This overfitting to spatial information limits their ability to generalize, especially when the event occurring in a place has evolved since the training dataset was built. To reduce this bias, we propose GeoNLPlify, a novel data augmentation technique that leverages spatial information to generate new labeled data for crisis-related text classification. Our approach addresses overfitting without requiring modifications to the underlying model architecture, which distinguishes it from other prevalent methods used to combat overfitting. Our results show that GeoNLPlify significantly improves F1-scores, demonstrating the potential of spatial information for data augmentation in crisis-related text classification tasks. To evaluate the contribution of our method, GeoNLPlify is applied to three public datasets (PADI-web, CrisisNLP and SST2) and compared with classical natural language processing data augmentations.
Citations: 0
Asymmetric multilevel interactive attention network integrating reviews for item recommendation
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-29, DOI: 10.3233/ida-230128
Peilin Yang, Wenguang Zheng, Yingyuan Xiao, Xu Jiao
Abstract: Recently, most studies in the field have focused on integrating the reviews behind ratings to improve recommendation performance. However, two main problems remain: (1) Most works use a unified data form and the same processing method for user reviews and item reviews, regardless of their essential differences. (2) Most works adopt only a simple concatenation operation when constructing the user-item interaction, thus ignoring the multilevel relationship between the user and the item, which may lead to suboptimal recommendation performance. In this paper, we propose a novel Asymmetric Multi-Level Interactive Attention Network (AMLIAN) integrating reviews for item recommendation. AMLIAN can predict precise ratings to help the user make better and faster decisions. Specifically, to address the essential difference between user reviews and item reviews, AMLIAN uses an asymmetric network to construct user and item features from different data forms (document-level and review-level). To learn more personalized user-item interaction, the user ID, the item ID, and some processed features of the user reviews and item reviews are used to model multilevel relationships. Experiments on five real-world datasets show that AMLIAN significantly outperforms state-of-the-art methods.
Citations: 0
Enhancing link prediction efficiency with shortest path and structural attributes
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-29, DOI: 10.3233/ida-230030
Muhammad Wasim, F. Al-Obeidat, Adnan Amin, Haji Gul, Fernando Moreira
Abstract: Link prediction is one of the most essential tasks in complex network research, since it seeks to forecast missing links in a network based on the existing ones. The problem has applications in a variety of scientific disciplines, including social network research, recommendation systems, and biological networks. In previous work, link prediction has been addressed with different kinds of methods: path-based, social-theory-based, topology-based, and similarity-based. The main issue is that path-based methods ignore topological features, while structure-based methods also fail to combine path-based and structure-based features. As a result, a new technique based on the shortest path and topological features has been developed. The method uses both local and global similarity indices to measure similarity. Extensive experiments on real-world datasets from a variety of domains are used to empirically test the proposed framework and compare it with many state-of-the-art prediction techniques. Over 100 iterations, the collected data showed that the proposed method improved on the other methods in terms of accuracy. Among the existing state-of-the-art algorithms, SI and AA fared best with an AUC value of 82%, while the proposed method has an AUC value of 84%.
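As a rough illustration of the two ingredients named in this abstract, the sketch below computes a local similarity index and a shortest path length on a toy undirected graph. It assumes that AA abbreviates the Adamic-Adar index, as is common in the link prediction literature; the abstract itself does not expand the abbreviation. The adjacency dictionary `adj`, the function names, and the toy graph are hypothetical, and this is not the authors' combined method, which the abstract does not describe in detail.

```python
import math
from collections import deque

def adamic_adar(adj, u, v):
    """Adamic-Adar index: sum of 1 / log(degree) over common neighbors of u and v."""
    common = adj[u] & adj[v]
    # A common neighbor of two distinct nodes has degree >= 2; the guard is defensive.
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)

def shortest_path_length(adj, source, target):
    """Breadth-first search; returns the hop count, or None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Hypothetical toy graph: the pair (a, d) is not linked but shares neighbors b and c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
    "e": set(),  # isolated node, unreachable from the others
}

aa_score = adamic_adar(adj, "a", "d")
hops = shortest_path_length(adj, "a", "d")
```

A higher Adamic-Adar score and a shorter path both suggest that a missing link is more plausible; how such signals would be combined is left open here.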
Citations: 0
Comparative analysis of epidemic public opinion and policies in two regions of China based on big data
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-28, DOI: 10.3233/ida-230025
Dong Qiu, Lin Huang
Abstract: Since the outbreak of COVID-19 (Corona Virus Disease 2019), the Chinese government has taken strict measures to prevent and control the epidemic. Although the spread of the virus has been controlled, people's daily life and work have been affected and restricted to varying degrees. As a result, people hold different sentiments, which may affect their implementation of and compliance with the policies, and thereby the effectiveness of epidemic prevention and control. At present, little of the literature has analyzed the relationships between people's feelings, policies, and epidemic trends. The objective of this paper is to analyze text content on social media in order to identify the impact of the epidemic lockdown policy on the public mood, the concerns expressed by the public about policy changes, and the interaction between policies and epidemic states at different stages of the epidemic. We collected posts from two cities where the epidemic occurred at the same time for analysis and comparative study. On the one hand, we reveal the changes in public attention and attitudes in the two regions during the epidemic; on the other hand, the study reflects the differences in public sentiment between the two regions, as well as the correlation between emotions, policies, and epidemic trends when different policies are adopted under different circumstances. The results offer some guidance for public health departments in formulating reasonable epidemic prevention policies.
Citations: 0
A new Chinese text clustering algorithm based on WRD and improved K-means
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-01, DOI: 10.3233/ida-226652
Zicai Cui, Bocheng Zhong, Chen Bai
Abstract: Text clustering has been widely used in data mining, document management, search engines, and other fields. The K-means algorithm is a representative text clustering algorithm. However, the traditional K-means algorithm often uses Euclidean distance or cosine distance to measure the similarity between texts, which is not effective in the face of high-dimensional data and cannot retain enough semantic information. In response to these problems, we combine word rotator's distance with the K-means algorithm and propose the WRDK-means algorithm, which uses word rotator's distance to calculate the similarity between texts and preserve more text features. Furthermore, we define a new cluster center initialization method that reduces the cluster instability caused by random selection of initial cluster centers. To solve the problem of inconsistent length between texts, we also propose a new iterative approximation method for cluster centers. We selected three suitable datasets and five evaluation indicators to verify the feasibility of the proposed algorithm. The RI value of our algorithm exceeds 90%, and for Macro_F1 our scheme was about 37.77%, 23.2%, 13.06% and 20.12% better than the other four methods, respectively.
Pages: 1205-1220
Citations: 0
Exploiting the implicit independence assumption for learning directed graphical models
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-01, DOI: 10.3233/ida-226806
Limin Wang, Junyang Wei, Kuo Li, Jiaping Zhou
Abstract: Bayesian network classifiers (BNCs) provide a sound formalism for representing probabilistic knowledge and reasoning with uncertainty. Explicit independence assumptions can effectively and efficiently reduce the size of the search space for solving the NP-complete problem of structure learning. Strong conditional dependencies, when added to the network topology of a BNC, can relax the independence assumptions, whereas weak ones may result in biased estimates of conditional probability and degraded generalization performance. In this paper, we propose an extension to the k-dependence Bayesian classifier (KDB) that achieves the bias/variance trade-off by verifying the rationality of the implicit independence assumptions. The informational and probabilistic dependency relationships represented in the learned robust topologies will be more appropriate for fitting labeled and unlabeled data, respectively. Comprehensive experimental results on 40 UCI datasets show that our proposed algorithm achieves competitive classification performance compared with state-of-the-art BNC learners and their efficient variants in terms of zero-one loss, root mean square error (RMSE), bias and variance.
Pages: 1143-1165
Citations: 0
FairAW - Additive weighting without discrimination
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-01, DOI: 10.3233/ida-226898
S. Radovanović, A. Petrović, Zorica Dodevska, Boris Delibasic
Abstract: With growing awareness of the societal impact of decision-making, fairness has become an important issue. More specifically, in many real-world situations, decision-makers can unintentionally discriminate against a certain group of individuals based on either inherited or appropriated attributes, such as gender, age, race, or religion. In this paper, we introduce a post-processing technique called fair additive weighting (FairAW) for achieving group and individual fairness in multi-criteria decision-making methods. The methodology is based on changing the score of an alternative by imposing fair criteria weights. This is achieved by minimizing the differences in the scores of individuals subject to a fairness constraint. The proposed methodology can be used in multi-criteria decision-making methods where additive weighting is used to evaluate the scores of individuals. Moreover, we tested the method on both synthetic and real-world data and compared it with the Disparate Impact Remover and FA*IR methods, which are commonly used to achieve fair scoring of individuals. The results showed that FairAW achieves group fairness in terms of statistical parity while also retaining individual fairness. Additionally, our approach obtained the best equality in scoring between discriminated and privileged groups.
Pages: 1023-1045
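To make the terms in this abstract concrete, here is a minimal sketch of plain additive weighting (a score is the weighted sum of criterion values) together with a statistical parity gap between two groups. It does not implement FairAW's constrained weight optimization, which the abstract does not spell out. The function names, the decision threshold, the group labels, and all numbers are hypothetical.

```python
def additive_weighting_scores(alternatives, weights):
    """Score each alternative as the weighted sum of its criterion values."""
    return [sum(w * x for w, x in zip(weights, row)) for row in alternatives]

def selection_rate(scores, groups, group, threshold):
    """Share of a group's members whose score reaches the threshold."""
    members = [s for s, g in zip(scores, groups) if g == group]
    return sum(s >= threshold for s in members) / len(members)

def statistical_parity_gap(scores, groups, threshold, privileged, discriminated):
    """Difference in selection rates; 0 means statistical parity at this threshold."""
    return (selection_rate(scores, groups, privileged, threshold)
            - selection_rate(scores, groups, discriminated, threshold))

# Hypothetical data: four individuals, two normalized criteria, equal weights.
alternatives = [[0.9, 0.2], [0.4, 0.8], [0.7, 0.7], [0.1, 0.3]]
weights = [0.5, 0.5]
groups = ["privileged", "privileged", "discriminated", "discriminated"]

scores = additive_weighting_scores(alternatives, weights)
gap = statistical_parity_gap(scores, groups, 0.5, "privileged", "discriminated")
```

A post-processing method in the spirit of the abstract would adjust `weights` so that a gap like this shrinks while individual scores change as little as possible.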
Citations: 0
An efficient intrusion detection method using federated transfer learning and support vector machine with privacy-preserving
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-01, DOI: 10.3233/ida-226617
Weifei Wu, Yanhui Zhang
Abstract: In recent decades, network security has become more and more important for organizations and individuals, and intrusion detection systems play a key role in protecting it. To improve intrusion detection, different machine learning techniques have been widely applied and have achieved exciting results. However, these methods achieve reliable results only if enough well-labeled training data are available and the training and test data come from the same distribution. In real life, the limited labeled data generated by a single organization is not enough to train a reliable learning model, and the data collected by different organizations rarely follow the same distribution. In addition, organizations protect their privacy and data security through data islands. Therefore, this paper proposes an efficient intrusion detection method using transfer learning and support vector machine with privacy-preserving (FETLSVMP). FETLSVMP aggregates data distributed across organizations through federated learning, then uses transfer learning and support vector machines to build personalized models for each organization. Specifically, FETLSVMP first builds a transfer support vector machine model to address the differences in data distribution among organizations; then, under the federated learning mechanism, the model is trained without sharing each organization's training data, which protects data privacy; finally, the intrusion detection model is obtained while data privacy is preserved. Experiments are carried out on NSL-KDD, KDD CUP99 and ISCX2012. The results verify that the proposed method achieves better detection results and robust performance, especially for small samples and emerging intrusion behaviors, and is able to protect data privacy.
Pages: 1121-1141
Citations: 0
Machine learning techniques for received signal strength indicator prediction
IF 1.7, Zone 4, Computer Science
Intelligent Data Analysis, Pub Date: 2023-06-01, DOI: 10.3233/ida-226750
Rina Azoulay-Schwartz, Eliya Edery, Yoram Haddad, Orit Rozenblit
Abstract: Advances in wireless communication technology have led to efforts to improve the quality of reception, prevent poor connections, and avoid disconnections between wireless and cellular devices. One of the most important steps toward preventing communication failures is to correctly estimate the received signal strength indicator (RSSI) of a wireless device. RSSI prediction is important for addressing various challenges such as localization, power control, link quality estimation, terminal connectivity estimation, and handover decisions. In this study, we compare different machine learning (ML) techniques that can be used to predict the received signal strength values of a device, given the received signal strength values of other devices in the region. We consider various ML methods for this prediction task, such as multi-layer ANN, K nearest neighbors, decision trees, random forest, and a K-means based method. We checked the accuracy of the learning process using a real dataset provided by a major national cellular operator. Our results show that the weighted K nearest neighbors algorithm with K = 3 neighbors achieved, on average, the most accurate RSSI predictions. We conclude that in environments where the dataset is relatively small and data from nearby geographical points is available, a method that predicts the coverage of a point from the coverage of nearby geographical points can be more successful and more accurate than other ML methods.
Pages: 1167-1184
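The weighted K nearest neighbors idea with K = 3 can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how neighbors are weighted, so inverse-distance weighting over planar coordinates is assumed here, and the function name, the small `eps` constant, and the measurement values are all hypothetical.

```python
import math

def weighted_knn_rssi(measurements, query, k=3, eps=1e-9):
    """Predict RSSI at `query` from the k nearest measured points.

    measurements: list of ((x, y), rssi_dbm) pairs.
    Weights are 1 / (distance + eps), an assumed scheme; eps avoids
    division by zero when the query coincides with a measured point.
    """
    nearest = sorted(measurements, key=lambda m: math.dist(m[0], query))[:k]
    weights = [1.0 / (math.dist(point, query) + eps) for point, _ in nearest]
    weighted_sum = sum(w * rssi for w, (_, rssi) in zip(weights, nearest))
    return weighted_sum / sum(weights)

# Hypothetical measurements: planar coordinates and RSSI in dBm.
measurements = [
    ((0.0, 0.0), -60.0),
    ((2.0, 0.0), -80.0),
    ((0.0, 2.0), -70.0),
    ((10.0, 10.0), -100.0),
]

prediction = weighted_knn_rssi(measurements, (1.0, 0.0))
```

With K = 3 the distant point at (10, 10) is ignored, and closer points contribute more to the estimate, which matches the abstract's conclusion that nearby geographical points carry most of the useful information.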
Citations: 0