Future Generation Computer Systems-The International Journal of Escience: Latest Articles

Encoder-decoder based watermarking for federated learning models
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-10-01 DOI: 10.1016/j.future.2025.108175
Yuling Luo, Yuanze Li, Xue Ouyang, Siyuan Zu, Zhaohui Chen, Qiang Fu, Sheng Qin, Junxiu Liu
Abstract: Federated learning, as a significant branch of deep learning, addresses issues related to data silos, data privacy, security, and communication bandwidth. In terms of intellectual property, it faces challenges similar to those of deep neural networks, namely vulnerabilities in protecting model ownership. Currently, some protection schemes are available, but existing federated learning protection schemes lack concealment in embedded watermark information, failing to ensure high robustness and security. Moreover, after embedding a large amount of watermark information, the impact on model performance cannot be guaranteed. Therefore, this paper proposes a novel federated learning protection framework consisting of three steps: watermark information generation, embedding, and ownership detection. In the generation of watermark information, an encoder-decoder structure is used for embedding. For embedding watermark information, a threshold processing method is employed to embed watermarks simultaneously in convolutional layers and batch normalization (BN) layers. Experimental results show that the use of an encoder-decoder structure ensures high robustness, security, and concealment. It also allows for embedding a large amount of watermark information with minimal impact on the model's original task, as the accuracy only decreases by 1.16% after embedding watermark information in four types of models. In addition, it exhibits high robustness against various common attacks, including fine-tuning, pruning, and equivalent attacks.
Volume 176, Article 108175
Citations: 0
Distributed compressive genomics: Fundamental pattern matching primitives via Spark
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-29 DOI: 10.1016/j.future.2025.108169
Lorenzo Di Rocco, Umberto Ferraro Petrillo, Raffaele Giancarlo, Giuseppe Cattaneo
Abstract: Compressive genomics leverages compressed data representations to enhance the efficiency of bioinformatics tasks like sequence comparison and search. Surprisingly, the fundamental operation of pattern matching on large DNA sequence collections remains unexplored in the realm of genomic analysis. However, distributed systems like Spark offer the scalability necessary to process increasingly large genomic datasets efficiently. We present the first Spark-based implementation of the FM-Index and Compressed Boyer-Moore (CBM) algorithms, evaluating their performance and providing insights into their advantages for large-scale bioinformatics applications. A comprehensive experimental study demonstrates clear performance gains over uncompressed approaches. Furthermore, we introduce SparkGeco, a distributed compressive genomics software library designed to simplify the integration of FM-Index and CBM algorithms into DNA sequence analysis pipelines within Apache Spark, thus supporting the development of efficient and scalable genomic analysis workflows. This work provides a concrete step towards high-performance, data-centric eScience solutions in computational biology.
Volume 176, Article 108169
Citations: 0
Dynamic transparent streaming in file-based workflows with CAPIO
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-27 DOI: 10.1016/j.future.2025.108159
Marco Edoardo Santimaria, Iacopo Colonnelli, Barbara Cantalupo, Massimo Torquati, Doriana Medić, Nicola Tuccari, Eva Sciacca, Marco Aldinucci
Abstract: Advances in big data and the growth in complexity of modern applications highlight the necessity for optimizing workflow executions on different levels, such as hybrid workflow executions, automatic optimization of data movements, and efficient use of IO. Following this line, streaming features are desirable capabilities for file-based workflows as they can reduce overall execution times. Expanding workflows with streaming capabilities usually requires rewriting the application, which is time-consuming and requires deep knowledge of the application. With this work, we introduce the Cross-Application Programmable IO (CAPIO) methodology, whose stack is composed of two parts: the CAPIO-CL coordination language and the CAPIO middleware (which implements the semantics expressed by the CAPIO-CL coordination language). The CAPIO-CL coordination language annotates synchronization semantics between files produced and consumed by workflow steps. At the same time, the CAPIO middleware improves the performance of file-based workflows, leveraging the information provided by the CAPIO-CL language while not having to change (recompile) the code of the original workflow steps. By design, the CAPIO middleware supports multiple backends and can be extended to support more. It is dynamic, and it supports dynamic job scheduling. Benchmarks, done on both microbenchmarks and real-life workflows, prove that with CAPIO, it is possible to reduce the workflow execution time by up to ∼50%.
Volume 176, Article 108159
Citations: 0
Parallel sorting algorithm classification: is manual instrumentation necessary?
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-27 DOI: 10.1016/j.future.2025.108170
Michael McKinsey, Dewi Yokelson, Stephanie Brink, Tom Scogland, Olga Pearce
Abstract: Understanding parallel algorithms is crucial for accelerating scientific simulations on complex, distributed memory, high-performance computers. Modern algorithm classification approaches learn semantics directly from source code to differentiate between algorithms; however, accessing source code is not always possible. We can learn about parallel algorithms from observing their performance, as programs running the same algorithms and using the same hardware should exhibit similar performance characteristics. We present an approach to learn algorithm classes from parallel performance data directly in order to classify algorithms without access to the source code. We extend previous work to enable classifying parallel sorting algorithms using automatic instrumentation instead of requiring manual region annotations in the source code. In this work, we design and demonstrate a study for classification of parallel sorting algorithms using parallel performance data collected from automatic instrumentation, and evaluate the performance of our new methodology on classification. We leverage Caliper to collect the performance data, Thicket for our exploratory data analysis (EDA), and PyTorch and Scikit-learn to evaluate the effectiveness of random forests, support vector machines (SVMs), decision trees, neural networks, and logistic regressions on parallel performance data. Additionally, we study noise in parallel performance data, whether the removal of noise and pre-processing of the data is necessary to accurately classify parallel sorting algorithms, and determine the effectiveness of features created from performance data. We demonstrate classification accuracy for these five different models of up to 97.7% across four different parallel algorithm classes.
Volume 176, Article 108170
Citations: 0
Optimization-based hybrid offloading framework for IoMT in edge-cloud healthcare systems
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-26 DOI: 10.1016/j.future.2025.108163
Sheharyar Khan, Shijun Liu, Li Pan, Guangxu Mei
Abstract: The Internet of Medical Things (IoMT) produces substantial amounts of real-time data from devices like ECG and EEG monitors, presenting significant issues in latency, energy efficiency, and resource allocation. Traditional offloading methods often fail to satisfy the low-latency and high-reliability requirements of modern healthcare systems. To address these limitations, this study presents a hybrid computing framework that integrates edge and cloud resources to facilitate efficient and scalable data processing. The proposed system integrates Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Graph Neural Networks (GNNs) to enhance task offloading, reduce latency, and optimize resource utilization. Experimental findings indicate that the approach significantly enhances system performance, minimizes energy consumption, and ensures consistent connectivity among diverse IoMT devices. The framework enables adaptable and efficient real-time processing, thereby enhancing advanced healthcare systems and optimizing both clinical decision-making and patient outcomes.
Volume 176, Article 108163
Citations: 0
Adaptive green cloud applications: Balancing emissions, revenue, and user experience through approximate computing
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-25 DOI: 10.1016/j.future.2025.108143
Monica Vitali, Philipp Wiesner, Kevin Kreutz, Roberto Gandola
Abstract: Organisations will soon be required to take an active role in the green transition by minimising the environmental impact of their operations, including emissions from their information systems. National and international regulations are expected to drive this shift by enforcing carbon budgets that organisations must comply with. As a result, applications must not only be aware of their carbon footprint but also operate within these budgetary constraints.

Traditional methods, such as time and location shifting, have been used to mitigate emissions, but their impact is limited and not applicable to all types of applications. Recent research suggests that reducing an application's environmental footprint can be achieved through approximation techniques, where workflows dynamically adjust at runtime by scaling back certain functionalities or features. However, this approach introduces trade-offs: limiting functionalities can reduce revenue, especially when tied to third-party agreements, and may also degrade the user experience. Thus, striking a balance between carbon reduction, business objectives, and user satisfaction is crucial.

We present a carbon-aware application management approach that leverages approximate computing techniques to balance sustainability, user experience, and revenue. Our method dynamically optimises the configuration and scaling of individual software components within a predefined carbon budget. Through simulation-based evaluation across diverse regions, carbon budgets, and application setups, we demonstrate that the approach effectively adapts to fluctuating workloads and regional variations in carbon intensity.
Volume 176, Article 108143
Citations: 0
Fast automatic radiotherapy planning via algorithmic improvements and computational acceleration
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-25 DOI: 10.1016/j.future.2025.108168
Juan José Moreno, Savíns Puertas-Martín, Nelson G. Roman, Juana L. Redondo, Ester M. Garzón
Abstract: Intensity-Modulated Radiation Therapy enhances dose delivery by dynamically adjusting beam intensities to target tumorous tissues while preserving healthy organs. One of the most effective planning approaches uses the Generalized Equivalent Uniform Dose metric, which ensures high-quality treatment plans but requires tuning several hyperparameters for each anatomical structure. Traditionally, this process is performed manually by clinical experts, making it time-consuming and dependent on human expertise. To address these challenges, a previous method combined multi-objective evolutionary search with gradient-based optimization to automate the tuning process. However, this hybrid strategy incurs high computational cost, as each candidate solution must undergo a complete gradient-based optimization step, repeated thousands of times throughout the process. This study introduces two complementary strategies to improve the efficiency of this framework. First, we analyze alternative multi-objective evolutionary algorithms that converge more rapidly, thereby reducing the number of required function evaluations, and we compare three gradient-based optimization methods to identify the one that accelerates convergence without compromising plan quality. Second, we implement a parallel computing framework that distributes the function evaluations across heterogeneous multicore computing clusters using a static batch scheduling strategy adapted to each node's computational capacity. Combined, these algorithmic and computational enhancements yield an acceleration factor of 4049 compared to the original implementation. As a result, high-quality radiotherapy treatment plans can be automatically generated in approximately one hour, making this approach viable for integration into time-constrained clinical workflows.
Volume 176, Article 108168
Citations: 0
A distributive and attentive generative model for multi-party data synthesis in highly imbalanced data
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-25 DOI: 10.1016/j.future.2025.108166
Imam Mustafa Kamal, Chastine Fatichah
Abstract: In the era of Artificial Intelligence (AI), where data plays a pivotal role, researchers are increasingly leveraging synthetic data to address privacy concerns, mitigate data scarcity, and enhance model robustness. This approach is particularly promising in critical domains such as healthcare, finance, government, and autonomous systems, where diverse and representative datasets are essential for effective AI training. The integration of data from multiple sources or parties in the context of big data can significantly enrich the available information. However, the data contributed by each party often exhibits distinct characteristics, leading to highly imbalanced distributions. This challenge introduces an additional layer of complexity known as the double imbalance problem, characterized by imbalances both within individual parties and across multiple parties. To address these challenges, we propose a novel generative adversarial network (GAN) framework incorporating distributed discriminators and dual attention mechanisms. Our approach utilizes a single generator to synthesize data conditioned on multiple parties, with each party maintaining its own Critic and dataset to ensure privacy preservation. We introduce local and global attention mechanisms, along with gradient-casting techniques during training, to effectively address the dual imbalance issues prevalent in multi-party data synthesis. The local attention mechanism addresses imbalances within individual parties, while the global attention mechanism targets imbalances across parties, resulting in a more stable generative model in the presence of highly imbalanced data distributions. To validate our approach, we conducted empirical experiments using six real-world tabular datasets, deliberately setting up dual imbalance scenarios across various intra- and inter-party contexts. We evaluated the utility of the synthetic data generated by multiple parties by assessing its efficacy in machine learning tasks. The results demonstrate that our distributed GAN with dual attention mechanisms outperforms existing generative models in addressing these challenges.
Volume 176, Article 108166
Citations: 0
New fuzzy K-nearest neighbor algorithms for classification performance improvement
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-25 DOI: 10.1016/j.future.2025.108139
Hassan I. Abdalla, Ali A. Amer, Mohammad Nassef
Abstract: In fuzzy k-nearest neighbor, smooth class boundaries are provided by each instance's fuzzy degree of membership. However, there are additional costs associated with calculating the memberships due to memory limitations and runtime overhead. Furthermore, in the presence of class imbalance and outliers, the effectiveness and efficiency of the most advanced fuzzy kNNs continue to decline. Thus, new fuzzy kNNs with straightforward designs are developed in this study to substantially lessen the influence of these problems and improve overall performance. The local mean vectors with the single linkage and the cumulative means of neighbors are combined, establishing these models, which are referred to as LMSL-FkNN and CMDW-FkNN, respectively. A comprehensive evaluation study spanning five experimental stages is carried out against six cutting-edge kNN competitors utilizing fifty-four real-world (balanced, imbalanced, noisy, and time series) datasets in order to illustrate the competitiveness of the established models. With CMDW-FkNN comfortably dominating the competition across the vast majority of datasets (specifically UCI, highly imbalanced, and time series datasets), the results, supported by statistical tests across three assessment metrics (accuracy, F-measure, and ROC), show that both models have significantly more promise than their rivals.
Volume 176, Article 108139
Citations: 0
Design of data access and schedule optimization for VTA compiled instruction streams
IF 6.2, Zone 2, Computer Science
Future Generation Computer Systems-The International Journal of Escience Pub Date: 2025-09-24 DOI: 10.1016/j.future.2025.108165
Ruohan Cheng, Yanshuo Gao, Chenglong Zeng, Yinghai Zhao, Kuizhi Mei
Abstract: In recent years, the development and rapid implementation of convolutional neural network models have become a key research area in the field of deep learning, and model deployment schemes based on deep learning compilers have been widely studied. Acceleration of the process of convolution operations and improvement of model inference performance is one of the important research areas in the field of deep learning compilers. In this paper, based on the open source deep learning compilation framework TVM and the architecture of the deep learning accelerator VTA, we propose a minimum data access design based on the input prioritized schedule and the on-chip weight-memory reuse scheme, which provides an optimization scheme with generality for inference acceleration of convolutional neural networks. By applying the optimized schedule scheme proposed, we can avoid redundant data accesses for convolutional computation with proper shape. Comparison experiments show that the model inference time of YOLOv3 is reduced by about 10% with limited hardware resources.
Volume 176, Article 108165
Citations: 0