Benedetto Leto, Gianvito Urgese, Enrico Macii, Vittorio Fra
{"title":"Variable-precision neuromorphic state space model for on-edge activity classification","authors":"Benedetto Leto, Gianvito Urgese, Enrico Macii, Vittorio Fra","doi":"10.1016/j.future.2025.108193","DOIUrl":"10.1016/j.future.2025.108193","url":null,"abstract":"<div><div>Neuromorphic computing is rising as a promising paradigm for efficient AI, leveraging event-driven computation to achieve low-power and high-performance computing. Due to the real-time processing required by edge devices with minimal power consumption, optimizing neuromorphic models for on-edge applications can be crucial to address the issue of power efficiency and resource-constraint devices. This work explores the definition of a neuromorphic state space model and its deployment on non-dedicated hardware. Structured sparsity and quantization techniques are leveraged to enhance the model’s efficiency. By compressing synaptic operations and memory footprint, we demonstrate how neuromorphic models can be adapted for on-edge deployment, ensuring low-latency and memory efficient inference. This study highlights the potential of neuromorphic models as a scalable solution for real-world embedded systems with limited resources.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108193"},"PeriodicalIF":6.2,"publicationDate":"2025-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guanghui Wang, Qinghua Zeng, Lingfeng Shen, Shuang Ding, Xin He, Zhonghao Zhai, Heng Li, Zongqi Shi
{"title":"Towards efficient privacy-preserving keyword search for outsourced data in intelligent transportation systems","authors":"Guanghui Wang , Qinghua Zeng , Lingfeng Shen , Shuang Ding , Xin He , Zhonghao Zhai , Heng Li , Zongqi Shi","doi":"10.1016/j.future.2025.108192","DOIUrl":"10.1016/j.future.2025.108192","url":null,"abstract":"<div><div>Privacy-preserving keyword search is important for outsourced data in Intelligent Transportation Systems (ITS). Traditional keyword search techniques utilized homomorphic encryption and searchable encryption to achieve privacy protection. However, the techniques generally suffer from high computational and communication costs, especially in high-security and large-scale data scenarios. To address this issue, this paper proposes an efficient privacy-preserving keyword search scheme for outsourced data in ITS. Firstly, by optimizing probabilistic homomorphic encryption to deterministic encryption, the computational cost on the data owner side is reduced and the ciphertext size is decreased, effectively reducing communication costs. Then, a secure comparison protocol and a secure inequality test algorithm are designed to achieve privacy-preserving keyword search, with enhanced privacy of the search results through the introduction of a random number scheme. The decryption operation for the end users is migrated to the cloud, further alleviating the computational and communication burden on the end users while ensuring system privacy. 
Finally, theoretical analysis and experimental results show that the proposed scheme outperforms existing methods in terms of computational efficiency and communication cost, making it particularly suitable for outsourced data scenarios in ITS.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108192"},"PeriodicalIF":6.2,"publicationDate":"2025-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alberto Mulone, Doriana Medić, Iacopo Colonnelli, Marco Aldinucci
{"title":"A formal framework for fault tolerance in hybrid scientific workflows","authors":"Alberto Mulone, Doriana Medic̀, Iacopo Colonnelli, Marco Aldinucci","doi":"10.1016/j.future.2025.108188","DOIUrl":"10.1016/j.future.2025.108188","url":null,"abstract":"<div><div>In large-scale distributed systems, failures are routine events whose occurrences increase with the number of computational tasks and execution locations. The advantage of representing an application as a workflow is the possibility of exploiting Workflow Management System (WMS) features such as portability, scalability, and, crucially, reliability. Among these, reliability is essential for ensuring robust execution in dynamic and failure-prone environments. In recent years, the emergence of hybrid workflows has posed new and intriguing challenges by increasing the possibility of distributing computations involving heterogeneous and independent environments. Consequently, the number of possible points of failure during the execution increased, creating a need for sophisticated fault tolerance mechanisms capable of addressing the specific requirements of hybrid systems. This work introduces a formal framework for a fault tolerance mechanism in hybrid workflows, enabling failure recovery through a rollback approach. The framework is rigorously defined by adapting and extending an existing workflow semantics tailored for hybrid execution. Our method leverages provenance data from workflow execution up to the point of failure, and creates a recovery workflow that spans multiple infrastructures. The rollback approach provides a robust and reliable strategy to ensure resilience against step failures and potential data loss. We then implement this mechanism in the StreamFlow WMS, and evaluate it using two case studies: the 1000 Genomes workflow and a synthetic workflow featuring iterative patterns. 
Experiments showcase the conceptual validity of our approach and assess the overhead introduced by the mechanism, including data availability checks.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108188"},"PeriodicalIF":6.2,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced YOLO with FPGA hardware acceleration for aluminum sheet defect detection","authors":"Fang Xia, Gangyang Nan, Zhongqing Jia, Di Wang","doi":"10.1016/j.future.2025.108189","DOIUrl":"10.1016/j.future.2025.108189","url":null,"abstract":"<div><div>The leap forward in transitioning to intelligent manufacturing—particularly in the area of metal surface defect detection—has been dramatically reinforced by advances in informatization. Convolutional Neural Networks (CNNs), rooted in deep learning, have demonstrated considerable promise in image recognition tasks. However, challenges concerning resource allocation and high power consumption persist, posing notable bottlenecks for practical deployment. To address these concerns, this paper proposes an accelerator for the You-Only-Look-Once (YOLO) v4-Tiny algorithm and its implementation on a System-on-Chip (SoC) architecture. First, the k-means++ clustering algorithm is employed to reposition anchor boxes, and a hardware-friendly activation function is integrated into the model. Moreover, the Field Programmable Gate Array (FPGA) accelerates the network through computational efficiency improvements and lightweight design optimizations. To further enhance performance, the paper employs layer fusion, network parameters quantization to reduce computational complexity and resource consumption. Additionally, for memory efficiency, ping-pong buffering is proposed, significantly improving data interaction. Furthermore, throughput and area optimization are achieved using High-Level Synthesis (HLS) instructions. Ultimately, this design incorporates multi-port I/O and loop tiling strategy to further improve data processing efficiency. The tailored optimization showcases promising outcomes, maintaining a mean average precision (mAP) of 97.76 %, accompanied by a low power consumption of 2.77 W and a runtime of 0.279 s. 
It achieves an optimal balance of evaluation metrics, prioritizing competitive detection accuracy and low power consumption over maximal performance across all indicators. This approach fulfills industrial requirements for aluminum sheet flaw identification, demonstrating significant theoretical and practical contributions.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108189"},"PeriodicalIF":6.2,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A dual-layer dynamic graph summarization method based on extendable suffix fingerprints","authors":"Qiang Liu, Longlong Zhao, He Cao, Zheng Liu","doi":"10.1016/j.future.2025.108181","DOIUrl":"10.1016/j.future.2025.108181","url":null,"abstract":"<div><div>Existing graph summarization methods often suffer from high addressing overheads and hash collisions, especially when facing real-world graph streams and power-law distributions, resulting in severe spatial-temporal performance degradation. This paper proposes a dual-layer dynamic graph summarization method (DLS). DLS is composed of inter-block and intra-block layers. In the inter-block layer, DLS employs an extendable suffix hash fingerprint-based addressing method to achieve efficient inter-block addressing and migration. In the intra-block layer, DLS adopts a window-based adaptive extension mechanism, which adjusts the maximum extension size based on the degree statistics to reduce intra-block hash collisions. Extensive experimental results on real-life graph datasets demonstrate DLS’s effectiveness. Compared with existing graph stream summarization methods, DLS achieves an average performance improvement of approximately 50 % and an average memory saving of 23 %.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108181"},"PeriodicalIF":6.2,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145311719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multilevel algorithm for scalable independent task assignment","authors":"H. Burhan Tabak, E. Kartal Tabak, Cevdet Aykanat","doi":"10.1016/j.future.2025.108183","DOIUrl":"10.1016/j.future.2025.108183","url":null,"abstract":"<div><div>Assigning a large number of independent tasks to heterogeneous processors is a fundamental problem in modern computing, with applications in many domains such as cloud services, web crawling, and AI training. Exact and matheuristic approaches deliver high-quality assignments but incur superlinear or even exponential runtime costs, making them impractical, especially on large problem instances. Conversely, lightweight heuristics run efficiently at scale but often produce assignments with much lower quality. To address this issue, we present the first multilevel framework for the independent task assignment problem that maintains an end-to-end linear runtime bound of <span><math><mrow><mi>O</mi><mo>(</mo><mi>K</mi><mi>N</mi><mo>)</mo></mrow></math></span>, where <span><math><mrow><mi>K</mi><mspace></mspace><mo>×</mo><mspace></mspace><mi>N</mi></mrow></math></span> is the size of the expected-time-to-compute matrix, with <span><math><mi>K</mi></math></span> and <span><math><mi>N</mi></math></span> respectively representing the number of processors and tasks. We propose (i) novel high-quality coarsening metrics that numerically define task characteristics and similarity; (ii) an efficient and effective matching algorithm that incorporates these metrics while maintaining linear time complexity with respect to the input size; (iii) an initial solution scheme that generates base solutions using complementary heuristics, which are disjointly projected back through the uncoarsening levels; (iv) an effective and efficient uncoarsening algorithm that iteratively improves assignment quality with different refinement algorithms. 
Extensive experimental evaluations involving hundreds of millions of tasks demonstrate that our algorithm achieves significantly higher quality and runs faster than known high-quality heuristics, making it a practical choice for the problem instances at high scale.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108183"},"PeriodicalIF":6.2,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hamid Ghorbani, Nima Eslami, Mohammad Hossein Moaiyeri
{"title":"Energy-efficient ternary in-memory computing architecture for versatile health monitoring in wearable devices","authors":"Hamid Ghorbani, Nima Eslami, Mohammad Hossein Moaiyeri","doi":"10.1016/j.future.2025.108191","DOIUrl":"10.1016/j.future.2025.108191","url":null,"abstract":"<div><div>The rising prevalence of severe health conditions has increased the demand for e-healthcare solutions, particularly wearable devices for continuous monitoring and early detection. Nevertheless, the limited battery life and the substantial computational demands necessary for disease diagnosis pose significant challenges for these devices when using conventional computing systems. This paper presents a novel ternary memory architecture supporting in-memory computing (IMC) to address these challenges. The design features an advanced 1-transistor 1-RRAM (1T1R) ternary memory basic cell, which reduces storage requirements and enhances latency and power efficiency. This architecture also supports ternary logic operations within memory, utilizing a ternary polymorphic design to efficiently perform ternary operations, including basic logic functions, addition, and multiplication. This functionality enables the development of low-power, energy-efficient ternary neural networks for early disease detection and continuous monitoring. The post-layout simulations performed using the Cadence Virtuoso tool and the well-established TSMC 40 nm CMOS technology indicate that the proposed design achieves a 91 % reduction in storage energy consumption compared to existing alternatives. An in-memory implementation of the ternary full adder and multiplier results in an energy savings of 90.3 % compared to previous designs. Furthermore, the proposed ternary architecture is applied to skin cancer diagnosis in wearable devices, using the ternary IRV2 neural network for classification with the HAM10000 dataset. 
The ternary implementation yields a 97 % power saving compared to full-precision classification while maintaining an accuracy of 88 %.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108191"},"PeriodicalIF":6.2,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The NextGen Quantum-Secure Edge AI-Blockchain System: Enhancing Supply Chain Trust with Vision Transformers","authors":"Israelin Insulata J, J. Roselin","doi":"10.1016/j.future.2025.108179","DOIUrl":"10.1016/j.future.2025.108179","url":null,"abstract":"<div><div>Counterfeit products pose a critical threat to global supply chains, jeopardizing consumer safety, brand reputation, and economic stability. This paper introduces the NextGen Quantum-Secure Edge AI–Blockchain System, a scalable and resilient framework that integrates Vision Transformers (ViT), Federated Learning (FL), Blockchain, and Zero-Knowledge Proofs (ZKP) to achieve high-accuracy counterfeit detection and transparent product authentication. Leveraging multi-modal verification, combining RFID metadata validation, IoT-based anomaly detection, and advanced image analysis, the system ensures robust authentication while preserving data privacy. A hybrid on-chain/off-chain storage model optimizes data management and reduces blockchain congestion, while the novel PoA-X consensus mechanism enhances transaction throughput and minimizes latency. Post-quantum cryptographic primitives further safeguard against emerging quantum threats. Experimental evaluation demonstrates over 96% detection accuracy, stable counterfeit detection times of 355–395 ms, and an average throughput of 174 transactions per second, maintaining strong performance even under high traffic and adversarial conditions. 
By uniting adaptive AI, decentralized verification, and quantum-secure cryptography, the proposed framework delivers a future-proof, privacy-preserving, and highly efficient solution for counterfeit mitigation in global supply chains.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108179"},"PeriodicalIF":6.2,"publicationDate":"2025-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francesco Blefari, Cristian Cosentino, Francesco Aurelio Pironti, Angelo Furfaro, Fabrizio Marozzo
{"title":"CyberRAG: An agentic RAG cyber attack classification and reporting tool","authors":"Francesco Blefari , Cristian Cosentino , Francesco Aurelio Pironti , Angelo Furfaro , Fabrizio Marozzo","doi":"10.1016/j.future.2025.108186","DOIUrl":"10.1016/j.future.2025.108186","url":null,"abstract":"<div><div>Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94 % accuracy per class and a final classification accuracy of 94.92 % through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. 
These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108186"},"PeriodicalIF":6.2,"publicationDate":"2025-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincenzo Maisto, Alessandro Cilardo, Emilio Billi, Chuck Fader
{"title":"A hardware/software architecture for multi-threaded offloading of erasure codes in distributed file systems","authors":"Vincenzo Maisto , Alessandro Cilardo , Emilio Billi , Chuck Fader","doi":"10.1016/j.future.2025.108187","DOIUrl":"10.1016/j.future.2025.108187","url":null,"abstract":"<div><div>Big Data analytics and cloud computing impose an ever-growing demand for data-center providers in terms of computational requirements, latency, and storage. Distributed file systems offer the strategic advantage of scaling-out computing and storage resources, hence allowing for notable speed-ups with massively parallel and distributed computing paradigms. On the other hand, such distributed clusters are constantly challenged with storage failures. Data replication is often deployed to ensure fault tolerance and business continuity, typically in a 3x configuration. This results in expensive 200 % overheads in storage space, write propagation, and energy costs. Erasure codes offer an alternative approach for fault tolerance by allowing reconstruction of erased data chunks, while reducing storage overhead down to 30 %. However, a considerable share of CPU cycles and energy is spent computing such codes, effectively reducing the cluster’s efficiency and starving other user and system tasks. Offloading on a custom accelerator is a non-trivial issue, due to the highly multi-threaded nature of such tasks and the lack of robust multi-threading support in conventional accelerator runtimes.</div><div>In this work, we present a heterogeneous hardware/software architectural design for large-scale and multi-threaded acceleration of distributed erasure codes on PCIe accelerators, and a new abstraction and integration model for distributed accelerators in fault-tolerant storage systems. 
We enable safe and seamless deployment of multi-threaded SYCL-based IP cores through a hardware thread proxying layer providing software thread-isolation, and integration with cluster-level middleware. In addition, our design allows for heterogeneous cluster configurations, with full compatibility and transparent integration of heterogeneously-accelerated and CPU-only nodes. We systematically evaluate the individual layers of our architecture and validate the design’s integration in a container-based HDFS cluster, comparing performance against the state-of-the-art AVX-512-accelerated ISA-L library and other SYCL substrates, such as GPUs and single-threaded FPGAs.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"176 ","pages":"Article 108187"},"PeriodicalIF":6.2,"publicationDate":"2025-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145268852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}