Big Data Research: Latest Articles, Page 3

Big data analytics for smart home energy management system based on IOMT using AHP and WASPAS

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-05-10, DOI: 10.1016/j.bdr.2025.100534

Jingze Zhou, Salem Alkhalaf, S. Abdel-Khalek, Shah Nazir

Abstract: The convergence of edge computing and 5G network speed provides an innovative way to address the energy efficiency and low latency requirements of medical data processing, especially from the perspective of the Internet of Medical Things (IoMT). Together, these technologies allow for the quick and effective handling of the enormous volumes of medical data produced by different IoMT devices in smart healthcare systems. The IoMT brings cutting-edge technologies, social benefits, and economic advantages that are transforming modern healthcare systems. Machine learning is also transforming digital healthcare, using sophisticated algorithms to forecast patients' health status efficiently. These approaches analyze large medical datasets to predict disease onset and hospital readmissions and to support treatment customization. Strong data security and good forecast accuracy remain open issues: the quality and variety of training data are key to accurate predictions, and strict encryption, safe storage, and regulatory compliance are necessary for data security. Drawing on significant components reported in existing research, the current study seeks to determine the most common features. The goal of the study is to offer a systematic approach for assessing the identified features using AHP and WASPAS. These approaches are effective for efficient big data analytics in the context of a smart home energy management system based on IoMT.
Volume 41, Article 100534.
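
The abstract names AHP and WASPAS but does not describe how the study applies them. As a rough illustration only, the sketch below shows the textbook form of the two steps: AHP weights approximated by the row geometric-mean method, and the WASPAS joint score as a blend of the weighted-sum and weighted-product models. The function names, the pairwise comparison matrix, the alternatives, and the criteria are all hypothetical and are not taken from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def waspas_scores(matrix, weights, benefit, lam=0.5):
    """Return the WASPAS joint score for each alternative (one row each).

    benefit[j] is True when larger values of criterion j are better.
    lam blends the weighted-sum model (WSM) and weighted-product model (WPM).
    """
    columns = list(zip(*matrix))
    scores = []
    for row in matrix:
        # Linear normalization: x / max for benefit, min / x for cost criteria.
        norm = [
            x / max(col) if is_benefit else min(col) / x
            for x, col, is_benefit in zip(row, columns, benefit)
        ]
        wsm = sum(w * v for w, v in zip(weights, norm))
        wpm = math.prod(v ** w for w, v in zip(weights, norm))
        scores.append(lam * wsm + (1 - lam) * wpm)
    return scores


# Hypothetical pairwise comparisons of three criteria (made-up judgments).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical alternatives scored on accuracy, latency (a cost), security.
alternatives = [
    [0.90, 120.0, 7.0],
    [0.85, 80.0, 9.0],
    [0.95, 150.0, 6.0],
]
benefit = [True, False, True]
scores = waspas_scores(alternatives, weights, benefit)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(weights, scores, ranking)
```

With `lam=0.5` the two models contribute equally; the source does not say which value the authors chose.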

Citations: 0

Saving food surplus and developing new business models: Exploring the potential of ‘Too Good To Go’ at territorial level using web-scraped data

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-05-10, DOI: 10.1016/j.bdr.2025.100536

Mengting Yu, Luca Secondi, Tiziana Laureti, Luigi Palumbo

Citations: 0

A novel study of kernel graph regularized semi-non-negative matrix factorization with orthogonal subspace for clustering

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-04-22, DOI: 10.1016/j.bdr.2025.100531

Yasong Chen, Wen Li, Junjian Zhao

Abstract: As a nonlinear extension of Non-negative Matrix Factorization (NMF), Kernel Non-negative Matrix Factorization (KNMF) has demonstrated greater effectiveness in revealing latent features from raw data. Building on this, the paper introduces kernel theory and combines the advantages of semi-nonnegative constraints, graph regularization, and orthogonal subspace constraints to propose a novel model: Kernel Graph Regularized Semi-Non-negative Matrix Factorization with Orthogonal Subspaces and Auxiliary Variables (semi-KGNMFOSV). The model introduces auxiliary variables and reformulates the optimization problem, overcoming the convergence proof challenges typically associated with orthogonal subspace-constrained methods. Furthermore, the model uses kernel methods to capture complex nonlinear structures in the data. The semi-nonnegative constraint, along with orthogonal subspace constraints incorporating auxiliary variables, enhances optimization efficiency, while graph regularization preserves the local geometric structure of the data. The authors develop an efficient optimization algorithm to solve the proposed model and conduct extensive experiments on multiple real-world datasets. They also investigate the impact of three different initialization strategies on the performance of the proposed algorithm. Experimental results demonstrate that, compared to classical and state-of-the-art methods, the proposed model exhibits superior performance across all three initialization strategies.
Volume 40, Article 100531.

Citations: 0

Hourglass pattern matching for deep aware neural network text recommendation model

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-04-17, DOI: 10.1016/j.bdr.2025.100532

Li Gao, Hongjun Li, Qingkui Chen, Dunlu Peng

Abstract: In recent years, with the rapid development of deep learning, big data mining, and natural language processing (NLP) technologies, the application of NLP in recommendation systems has attracted significant attention. However, current text recommendation systems still face challenges in handling word distribution assumptions, preprocessing design, network inference models, and text perception technologies. Traditional RNN layers often encounter exploding or vanishing gradients, which hinder their ability to handle long-term dependencies and reverse text inference in long texts. This paper therefore proposes a new depth-aware neural network recommendation model (Hourglass Deep-aware neural network Recommendation Model, HDARM), whose structure has an hourglass shape. The model consists of three parts. The top of the hourglass takes word embeddings as input, using fine-tuned BERT to produce text embeddings that serve as the word distribution assumption, and then combines a bidirectional LSTM with Transformer models to learn critical information. The middle of the hourglass retains key features of the network outputs through CNN layers, which are combined with pooling layers to extract and enhance critical information from user text. The bottom of the hourglass uses deep neural network layers to avoid a decline in generalization performance. Finally, the model performs pattern matching between text vectors and word embeddings, recommending texts based on their relevance. In experiments, the model improved MSE and NDCG@10 by 8.74% and 10.89%, respectively, compared with the best baseline model.
Volume 40, Article 100532.
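
The abstract reports results in terms of MSE and NDCG@10 without defining them. For readers unfamiliar with the two metrics, the following sketch shows one common formulation, assuming linear gain and a log2 position discount for NDCG; the paper may use a different variant (for example exponential gain), and the function names and sample values here are illustrative only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(
        rel / math.log2(position + 2)  # position 0 is discounted by log2(2) = 1
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the given ranking divided by DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mse(y_true, y_pred):
    """Mean squared error between true and predicted ratings."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical relevance labels, listed in the order a model ranked the items.
ranked_relevances = [3, 2, 3, 0, 1, 2]
print(ndcg_at_k(ranked_relevances, k=10))
print(mse([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))
```

Lower MSE is better while higher NDCG is better, so an "improvement" means a decrease in the first and an increase in the second.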

Citations: 0

A decision tree algorithm based on adaptive entropy of feature value importance

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-04-14, DOI: 10.1016/j.bdr.2025.100530

Shaobo Deng, Weili Yuan, Sujie Guan, Xing Lin, Zemin Liao, Min Li

Citations: 0

TE-PADN: A poisoning attack defense model based on temporal margin samples

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-04-09, DOI: 10.1016/j.bdr.2025.100528

Haitao He, Ke Liu, Lei Zhang, Ke Xu, Jiazheng Li, Jiadong Ren

Abstract: With the development of network security research, intrusion detection systems based on deep learning show great potential in network attack detection. Although they are crucial tools for ensuring network information security, these systems are themselves vulnerable to poisoning attacks. Currently, most poisoning attack defense methods cannot effectively utilize network traffic characteristics and are only effective for specific models, showing poor defense results for other models. Furthermore, detection of poisoning attacks is often overlooked, leading to a lack of timely and effective defense against such attacks. The authors therefore propose a data poisoning defense mechanism called TE-PADN. First, they introduce a temporal margin sample generation algorithm that integrates an attention mechanism. By mapping the original data time series into a latent feature space, this algorithm learns the temporal characteristics of the data and uses the attention mechanism to focus on information from different positions, generating temporal margin samples for repairing poisoned models. Second, they propose a multi-level poisoning attack detection method for real-time and accurate detection of previously undetected poisoning attacks. By employing ensemble learning methods, this approach enhances model robustness, repairs model classification boundaries that have shifted due to poisoning attacks, and achieves efficient defense against poisoning attacks. Finally, experimental validation of the proposed method demonstrates promising results. Under a 10% attack intensity, the average accuracy of TE-PADN in recovering poisoned models increased by 6.5% on the NSL-KDD dataset, 5.3% on the UNSW-NB15 dataset, and 5.9% on the CICIDS2017 dataset.
Volume 40, Article 100528.

Citations: 0

Leveraging artificial intelligence for pandemic management: Case of COVID-19 in the United States

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-04-08, DOI: 10.1016/j.bdr.2025.100529

Ehsan Ahmadi, Reza Maihami

Abstract: The COVID-19 pandemic revealed significant limitations in traditional approaches to analyzing time-series data that use one-dimensional data such as historical infection rates. Such approaches do not capture the complex, multifactor influences on disease spread. This paper addresses these challenges by proposing a comprehensive methodology that integrates multiple data sources, including community mobility, census information, Google search trends, socioeconomic variables, vaccination coverage, and political data. In addition, the paper proposes a new cross-learning (CL) methodology that trains machine learning models on multiple related time series simultaneously, enabling more accurate and robust predictions. Applying the CL approach with four machine learning algorithms, the authors forecasted confirmed COVID-19 cases 30 days in advance with greater accuracy than the traditional ARIMAX model and the newer Transformer deep learning technique. The findings identified daily hospital admissions as a significant predictor at the state level and vaccination status at the national level. Random Forest with CL was very effective, performing best in 44 states, while ARIMAX performed best in seven larger states. These findings highlight the importance of advanced predictive modeling in resource optimization and response strategy development for future health emergencies.
Volume 40, Article 100529.
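
The abstract describes cross-learning only as training one model on multiple related time series at once. One plausible reading, sketched below under that assumption, is to pool lagged windows from every series into a single training set that any regressor can then be fitted on. The function name, the region keys, and the case counts are hypothetical; the paper's actual feature set and pipeline are not given in the abstract.

```python
def build_pooled_dataset(series_by_region, n_lags, horizon):
    """Pool lagged windows from every series into one training set.

    Each sample uses n_lags consecutive observations as features and the
    value `horizon` steps after the last observed point as the target.
    """
    features, targets = [], []
    for series in series_by_region.values():
        last_start = len(series) - n_lags - horizon
        for start in range(last_start + 1):
            features.append(list(series[start:start + n_lags]))
            targets.append(series[start + n_lags + horizon - 1])
    return features, targets


# Hypothetical daily case counts for two regions (made-up numbers).
cases = {
    "region_a": [10, 12, 15, 19, 24, 30, 37, 45, 54, 64],
    "region_b": [100, 98, 97, 95, 96, 99, 103, 108, 114, 121],
}

# A short horizon keeps the toy example small; the abstract's 30-day-ahead
# setting would correspond to horizon=30 on much longer series.
X, y = build_pooled_dataset(cases, n_lags=3, horizon=2)
print(len(X), X[0], y[0])
```

Because samples from all regions share one model, patterns learned from data-rich series can inform forecasts for sparser ones, which is the usual motivation for pooled training.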

Citations: 0

Settlement patterns, official statistics and geo-economic dynamics: Evidence from a LADISC approach to Italy

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-03-30, DOI: 10.1016/j.bdr.2025.100525

Gianluigi Salvucci, Luca Salvati, Leonardo Salvatore Alaimo, Ioannis Vardopoulos

Abstract: Territorial and socioeconomic factors long taken as pivotal in explaining settlement patterns, such as elevation or proximity to water bodies or infrastructure, are evolving amid contemporary trends favouring urbanized areas. Urban centers, transformed over the past decades, attract younger populations because of their proximity to services and infrastructure, despite the challenges posed by urban living costs and housing availability. This study extends the Latitude, Altitude, Distance from the Sea, and Proximity to Major Cities (LADISC) model, integrating two additional geographic metrics to provide a refined framework for analyzing population distribution trends. Unlike traditional approaches that rely on administrative boundaries, this model applies geostatistical techniques to high-resolution census data, offering a detailed and dynamic perspective on settlement evolution in Italy. Mining official data with exploratory statistical techniques uncovers a significant concentration of elderly populations within urban centers, underscoring the need for tailored healthcare services and urban amenities. Conversely, younger populations are decentralizing towards suburban areas, reflecting a sudden shift in preferences and mobility patterns. Such trends prompt a reassessment of urban planning and (sustainable) development strategies to accommodate diverse population needs. The study further explores the impact of the COVID-19 pandemic on population distribution, suggesting a potential surge in remote working and digital interactions that is likely to reshape peri-urban settlements. By refining the LADISC framework, this study presents an innovative methodology for handling large-scale census data, allowing for spatially explicit demographic analysis that captures population shifts more precisely than traditional methods.
Volume 40, Article 100525.

Citations: 0

Women in life sciences firms: Gender diversity and roles indicator from data integration

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-03-28, DOI: 10.1016/j.bdr.2025.100526

Laura Benedan, Cinzia Colapinto, Paolo Mariani, Laura Pagani, Mariangela Zenga

Citations: 0

Efficient, interpretable and automated feature engineering for bank data

IF 3.5, Zone 3 (Computer Science)

Big Data Research, Pub Date: 2025-03-28, DOI: 10.1016/j.bdr.2025.100524

Atilla Karaahmetoğlu, Mehmet Yıldız, Erdem Ünal, Uğur Aydın, Murat Koraş, Barış Akgün

Abstract: Banks rely on expert-generated features and simple models to achieve high performance and interpretability at the same time. Interpretability is needed for internal assessment and regulatory compliance in specific problems such as risk assessment, and both expert-generated features and simple models satisfy this need. However, feature generation by experts is a time-consuming process and is susceptible to bias. In addition, features need to be generated fairly often due to the dynamic nature of bank data, and in the case of significant changes or new data sources, expertise might take a while to build up. Complex models, such as deep neural networks, may be able to remedy this. However, interpretability/explainability approaches for complex models are not satisfactory from the banks' point of view. In addition, such models do not always work well with tabular data, which is abundant in banking applications. This paper introduces an automated feature synthesis pipeline that creates informative and domain-interpretable features and consumes significantly less time than brute-force methods. The authors create novel feature synthesis steps, define elimination rules to rule out uninterpretable features, and combine performance-based feature selection methods to pick desirable ones to build their models. Results on two different datasets show that the features generated with the pipeline (1) perform on par with or better than features generated by existing methods, (2) are obtained faster, and (3) are domain-interpretable.
Volume 40, Article 100524.

Citations: 0