IEEE Transactions on Computers: Latest Articles

AVL Function Table for LeafHooks Insertion With Obfuscated Control Flow Integrity
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1334-1347. Pub Date: 2024-12-31. DOI: 10.1109/TC.2024.3524080
Sirong Zhao; Guoqi Xie; Chenglai Xiong; Kenli Li; Xuejun Yu; Bo Wan; Yiwen Jiang
Abstract: Control flow is the execution order of individual statements, instructions, or function calls within an imperative program. Malicious manipulation of control flow (e.g., tampering with normal function addresses) leads to severe consequences such as data leakage and system crashes. Control Flow Integrity (CFI) is a defense that restricts a program's execution order to its Control Flow Graph (CFG). IndexHooks is an existing CFI solution designed against tampering with forward function calls (including direct and indirect jumps). This solution constructs a read-only linear function table that stores function addresses during compilation. Then, IndexHooks checks the table at runtime so that the program jumps to the correct target address. However, IndexHooks faces limitations in backtracking CFG construction, which can lead to excessive memory usage, and the linear structure of the function table is vulnerable to brute-force tampering. Addressing the limitations of IndexHooks, this study develops an obfuscated CFI solution called LeafHooks. LeafHooks is implemented during compilation by the LLVM compiler, which performs static analysis and instrumentation on the LLVM Intermediate Representation (IR) code of a program. We make the following three innovations: 1) we propose a speculation-free identification method for indirect function calls that linearly traverses and analyzes code to obtain legal function information (function addresses); 2) we save this information into a function table in the form of a balanced binary tree (AVL tree), which obfuscates function addresses to defend against brute-force attacks; 3) we design a method to simulate control tampering attacks on the ARM64 architecture to verify the protection capability of LeafHooks. LeafHooks shows less overhead than state-of-the-art solutions, reducing overhead by 2.9% and 0.55% on average on UnixBench and Phoronix, respectively.
Citations: 0
Flexible Job Scheduling With Spatial-Temporal Compatibility for In-Network Aggregation
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1322-1333. Pub Date: 2024-12-27. DOI: 10.1109/TC.2024.3523420
Yulong Li; Wenxin Li; Yuxuan Du; Yinan Yao; Song Zhang; Linxuan Zhong; Keqiu Li
Abstract: In-Network Aggregation (INA) solutions represent the forefront in advancing All-Reduce, utilizing limited switch memory for efficient gradient aggregation. However, existing INA solutions primarily focus on enhancing aggregation efficiency, often overlooking the efficient utilization of memory. Isolation solutions typically pre-allocate resources for each job, leading to memory wastage due to the uncontrolled use of resources. In contrast, sharing solutions encounter significant memory contention, resulting in performance degradation in a multi-tenant environment. In this paper, we propose DynaINA, a flexible job scheduler to support multi-tenant training. The core idea of DynaINA is to provide spatial and temporal compatibility between jobs. For spatial compatibility, DynaINA utilizes multiple dynamic memory pools to provide job isolation. For temporal compatibility, DynaINA employs contention-aware job scheduling to facilitate memory sharing. Furthermore, DynaINA prioritizes communication-intensive jobs, leveraging the benefits of INA to enhance overall performance in training clusters. Extensive experiments with popular vision and language models demonstrate that DynaINA reduces training time by up to 65.16% and improves switch memory utilization by up to 85.02% compared to state-of-the-art solutions in a 100 Gbps network.
Citations: 0
Qu-Trefoil: Large-Scale Quantum Circuit Simulator Working on FPGA With SATA Storages
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1306-1321. Pub Date: 2024-12-23. DOI: 10.1109/TC.2024.3521546
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10812903
Kaijie Wei; Hideharu Amano; Ryohei Niwase; Yoshiki Yamaguchi; Takefumi Miyoshi
Abstract: Quantum circuits are fundamental components of quantum computing, and state-vector-based quantum circuit simulation is a widely used technique for tracking qubit behavior throughout circuit evolution. However, simulating a circuit with $n$ qubits requires $2^{n+4}$ bytes of memory, making simulations of more than 40 qubits feasible only on supercomputers. To address this limitation, we propose Qu-Trefoil, a system designed for large-scale quantum circuit simulations on an FPGA-based platform called Trefoil. Trefoil is a multi-FPGA system connected to eight storage subsystems, each equipped with 32 SATA disks. Qu-Trefoil integrates a suite of HLS-based universal quantum gates, including Clifford gates (Hadamard (H), Pauli-Z (Z), Phase (S), Controlled-NOT (CNOT)), the T gate, and unitary matrix computation, along with HDL-designed modules for system-wide integration. Our extensive evaluation demonstrates the system's robustness and flexibility, covering quantum gate performance, chunk size, disk extensibility, and efficiency across different SATA generations. We successfully simulated quantum circuits with over 43 qubits, which required more than 128 TB of memory, in approximately 3.72 to 13.06 hours on a single storage subsystem equipped with one FPGA. This achievement represents a significant milestone in the advancement of quantum computing simulations. Furthermore, thanks to its unique architecture, Qu-Trefoil is more accessible, flexible, and cost-efficient than other existing simulators for large-scale quantum circuit simulations, making it a viable option for researchers with limited access to supercomputers.
Citations: 0
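The memory figure quoted in the Qu-Trefoil abstract ($2^{n+4}$ bytes for $n$ qubits) can be checked with a little arithmetic. The sketch below assumes the usual reading of that formula, namely $2^n$ complex amplitudes at 16 bytes each (two double-precision floats); that interpretation and the helper name `state_vector_bytes` are illustrative assumptions, not details taken from the paper.

```python
def state_vector_bytes(n_qubits: int) -> int:
    """Bytes for a full state vector: 2**n amplitudes x 16 bytes = 2**(n + 4).

    The 16-bytes-per-amplitude reading is an assumption (complex double).
    """
    return 2 ** (n_qubits + 4)


TIB = 2 ** 40  # one tebibyte, in bytes

if __name__ == "__main__":
    for n in (30, 40, 43):
        size = state_vector_bytes(n)
        print(f"{n} qubits -> {size} bytes ({size / TIB:g} TiB)")
```

Under this reading, 43 qubits correspond to $2^{47}$ bytes, i.e. 128 TiB, which is consistent with the abstract's statement that circuits with over 43 qubits needed more than 128 TB.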
Microarchitectural Attacks and Mitigations on Retire Resources in Modern Processors
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1253-1266. Pub Date: 2024-12-23. DOI: 10.1109/TC.2024.3521225
Ke Xu; Ming Tang; Quancheng Wang; Han Wang
Abstract: In modern processors, the Retire Control Unit (RCU) is responsible for receiving the µops decoded by the frontend and retiring the completed µops in order. Consequently, retirement may stall differently depending on the execution time of the first instruction in the RCU, causing varying stalls in RCU reception. Moreover, we find that RCU reception in AMD processors and retirement in Intel processors are shared between the two logical cores of the same physical core, allowing an attacker to infer the instructions executed by the other logical core from the efficiency of its retire resources. Based on these findings, we introduce the retirement covert channel on Intel processors and the RCU covert channel on AMD processors. Furthermore, we explore additional applications of retire resources. On the one hand, we combine the misprediction penalty mechanism with our covert channels to apply them to Spectre attacks. On the other hand, based on the principle that different programs result in varied usage patterns of retire resources, we propose an attack method that leverages the retire resources to infer the program run by the victim. Finally, we design the corresponding mitigations and extend our mitigation to the fetch unit to reduce the performance overhead.
Citations: 0
Design of a Universal Decoder Model Based on DNA Winner-Takes-All Neural Networks
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1267-1277. Pub Date: 2024-12-23. DOI: 10.1109/TC.2024.3521230
Chun Huang; Jiaying Shao; Baolei Peng; Qingshuang Guo; Panlong Li; Junwei Sun; Yanfeng Wang
Abstract: DNA computing has proven to possess strong parallel processing capabilities, offering notable advantages for multi-objective computation. Traditional, complex nonlinear DNA molecular logic circuits require the pre-construction of basic logic gates, followed by their cascading to achieve logic functions. However, as the number of cascade levels increases, more DNA strands are required to amplify and recover signals, causing the system's reaction time to grow exponentially. This paper introduces a novel approach for building complex nonlinear digital logic circuits using DNA winner-take-all neural networks. The logic circuit comprises four computational modules: weight multiplication, summation, competitive annihilation, and reporting. First, an annihilation strand is designed to control the reaction rate between two competing signals, resolving the interference between weight multiplication and competitive annihilation. Second, new grouping strategies (complementary annihilation, equal annihilation, and denoise annihilation) are introduced. These strategies exponentially reduce the number of competitive strands and significantly decrease system reaction time. The effect becomes more pronounced as the number of input patterns increases. Finally, 2-4, 3-8, and 4-16 decoder circuits are built using the winner-take-all neural networks, and an $n$-$2^{n}$ universal decoder model is further developed. This study presents an effective method for implementing complex nonlinear logic circuits through DNA strand displacement reactions.
Citations: 0
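The decoder entry above describes an $n$-to-$2^n$ decoder built from weight multiplication, summation, and a winner-take-all competition. A purely software analogue of that idea can be sketched as follows. The dual-rail input encoding, the unit weights, and the function name `wta_decode` are assumptions chosen for illustration; they are not the paper's DNA strand design.

```python
from typing import List


def wta_decode(bits: List[int]) -> List[int]:
    """One-hot decode n input bits with a winner-take-all over 2**n weighted sums.

    Assumed scheme (not from the paper): each bit is presented dual-rail
    (the bit and its complement), and output j has weight 1 on the rail
    that matches bit i of j, most significant bit first.
    """
    n = len(bits)
    rails = [(b, 1 - b) for b in bits]  # (bit, complement) per input
    scores = []
    for j in range(2 ** n):
        total = 0
        for i, (pos, neg) in enumerate(rails):
            want_one = (j >> (n - 1 - i)) & 1
            total += pos if want_one else neg  # weight multiplication + summation
        scores.append(total)
    # Winner-take-all: only the exact match reaches the maximum score n.
    winner = max(range(2 ** n), key=scores.__getitem__)
    return [1 if j == winner else 0 for j in range(2 ** n)]


if __name__ == "__main__":
    print(wta_decode([1, 0]))
```

Because only the output whose index equals the input value matches every rail, it is the unique maximum, so the competition step selects exactly one output line.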
A Comprehensive Scan Test Cost Model to Optimize the Production of Very Large SoCs
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1278-1292. Pub Date: 2024-12-23. DOI: 10.1109/TC.2024.3521246
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10812061
Giusy Iaria; Paolo Bernardi; Claudia Bertani; Lorenzo Cardone; Giuseppe Garozzo; Vincenzo Tancorre
Abstract: This paper explores the trade-offs of reducing scan test patterns during Wafer Sort, accepting additional packaging costs, and screening more chips during Package Tests. Previous works proposed ways of selecting or reordering patterns to bring the most efficient ones to the front. Unlike such studies, this work quantifies the benefit of removing patterns directly from the tail of any pattern set. The paper elaborates on novel formulas to propose a comprehensive cost model that combines yield, Wafer Sort, packaging, and Package Test costs. The model evolves from known concepts by assuming that mass production defectivity is non-uniformly distributed over the die population and accounts for sacrificial lots to extract guiding information. It is shown that reducing patterns at Wafer Sort is beneficial under certain conditions of yield and fault coverage, and depending on equipment and production costs. The model accurately estimates the number of patterns to remove for maximum gain in these cases. As a further by-product, the paper shows that a significant cost advantage can be achieved if pattern generation is guided by the basics of the non-uniform failure distribution. This approach is validated with an academic benchmark and by observing six months of production for a real-world microcontroller by STMicroelectronics.
Citations: 0
A New ECC Configuration Method for DRAM System Considering Metadata
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1293-1305. Pub Date: 2024-12-23. DOI: 10.1109/TC.2024.3521545
Jaeil Lim; Jaewon Chung; Donghun Jeong; Daegeun Jee; Euicheol Lim
Abstract: In this paper, a new ECC (error correcting code) solution for DRAM (dynamic random access memory) in computing systems is proposed. Existing papers on ECC for DRAM systems do not consider storage space for metadata. The methodology proposed in this paper considers storing metadata attached to cacheline data in DRAM. We derive the maximum number of single-chip error correction cases that a linear code can support while considering metadata storage space. This can be regarded as the maximum theoretical correction probability for a single-chip error. A methodology to construct a code with maximum single-chip error correction is presented, and a decoding methodology for the code is proposed. The proposed ECC solution can correct not only a single-chip failure but also additional small bit errors. We calculate the correction capability of the proposed methodology and verify it through simulation. The encoder and decoder hardware were synthesized and compared with existing methodologies.
Citations: 0
Trident: The Acceleration Architecture for High-Performance Private Set Intersection
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1152-1167. Pub Date: 2024-12-18. DOI: 10.1109/TC.2024.3517738
Jinkai Zhang; Yinghao Yang; Zhe Zhou; Zhicheng Hu; Xin Zhao; Liang Chang; Hang Lu; Xiaowei Li
Abstract: Private Set Intersection (PSI) is essential for discovering the properties of data held in common by two competing parties without revealing anything else about their respective data assets. Existing PSI solutions such as APSI and ORI-PSI suffer from severe communication and computation overhead due to inefficient communication and FHE polynomial evaluation, which hinders their deployment in practice. This issue is evident in both the upper-level protocol and the lower-level hardware platform. In this paper, we propose a novel software/hardware co-design acceleration architecture for PSI, termed "Trident", which includes two tightly coupled segments. From the protocol perspective, we investigate existing bottlenecks and propose a new PSI protocol with significantly less communication and computation under the security guarantee. In addition, we re-architect the hardware platform by designing a PSI-specific accelerator, implemented with both FPGA and ASIC, targeting the key operations in the proposed protocol. We build a real-world experimental environment with two instantiated parties to verify the acceleration architecture, and highlight the following results: (1) up to 130$\times$/145$\times$ speedup for the computation of the receiver and sender parties; (2) up to 37$\times$ reduction of communication overhead; (3) up to 93,651$\times$ and 74,326$\times$ higher energy efficiency over the CPU-based ORI-PSI and APSI, respectively.
Citations: 0
Big-Computing and Little-Storing STT-MRAM PIM Architecture With Charge Domain Based MAC Operation
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1239-1252. Pub Date: 2024-12-16. DOI: 10.1109/TC.2024.3517754
Yunho Jang; Dongsu Kim; Yeseul Kim; Jongsun Park
Abstract: Spin transfer torque magnetic random access memory (STT-MRAM) is a promising memory technology for processing in memory (PIM) thanks to its high endurance and relatively low device-to-device and cycle-to-cycle variations. However, the low OFF/ON ratio of STT devices limits the number of active row-lines during multiply-accumulate (MAC) operations, degrading energy efficiency and computation speed. In this paper, we present an energy-efficient and high-speed Big-computing and Little-storing STT-MRAM PIM (BCLS-SP) architecture, which can increase the number of active row-lines with almost no area overhead. In the BCLS-SP architecture, a charge domain-based STT-MRAM PIM (CD-SP) structure is employed to concurrently activate many row-lines by improving MAC operation reliability. Filter-wise weight compression (FWC) and weight sharing (WS) are also devised to compress the weights stored in CD-SP, thus reducing area cost. In addition, the proposed architecture performs MAC operations with skipping zero-valued input (SZI) and a zero-conversion scheme (ZCS) for better energy efficiency and performance. Simulations using a 28 nm CMOS process show that the BCLS-SP architecture achieves an energy reduction of 29% and a performance improvement of 3.6 compared to the recent memristive device-based PIM using weight compression and input skipping.
Citations: 0
Distributed Sketch Deployment for Software Switches
IF 3.6, Zone 2, Computer Science
IEEE Transactions on Computers, vol. 74, no. 4, pp. 1210-1223. Pub Date: 2024-12-16. DOI: 10.1109/TC.2024.3517749
Kejun Guo; Fuliang Li; Jiaxing Shen; Xingwei Wang; Jiannong Cao
Abstract: Network measurement is critical for various network applications, but scaling measurement techniques to the network-wide level is challenging for existing sketch-based solutions. In software switches, centralized deployment provides low resource usage but suffers from poor load balancing. In contrast, collaborative measurement achieves load balancing through flow distribution across software switches but requires high resource usage. This paper presents a novel distributed deployment framework that overcomes the above limitations. First, our framework is lightweight: it splits sketches into segments and allocates them across forwarding paths to minimize resource usage and achieve load balancing. This also enables per-packet load balancing by distributing computations across software switches. Second, through a novel collaborative strategy, our framework achieves finer-grained flow distribution and further optimizes load balancing. Third, we further optimize load balancing by eliminating the mutual influence among forwarding paths. We evaluate the proposed framework on various network topologies and different sketches. Results indicate that our solution matches the load balancing of collaborative measurement while approaching the low resource usage of centralized deployment. Moreover, it achieves superior performance in per-packet load balancing, which is not considered in previous deployment solutions.
Citations: 0
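The sketch-deployment entry above mentions splitting a sketch into segments placed along a forwarding path. One way to picture this, assuming a Count-Min sketch whose rows are the segments and one row per switch, is shown below. The class `SketchSegment`, the helpers `forward` and `query`, and the row-per-switch split are all hypothetical choices for illustration; the paper's actual segmentation and allocation strategy is not described in the abstract.

```python
import hashlib
from typing import List


class SketchSegment:
    """One row of a Count-Min sketch, hosted on a single (hypothetical) switch."""

    def __init__(self, row: int, width: int) -> None:
        self.row = row
        self.counters = [0] * width

    def _index(self, flow: str) -> int:
        # Salting the hash with the row number gives each row its own hash function.
        digest = hashlib.sha256(f"{self.row}:{flow}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.counters)

    def update(self, flow: str, count: int = 1) -> None:
        self.counters[self._index(flow)] += count

    def estimate(self, flow: str) -> int:
        return self.counters[self._index(flow)]


def forward(path: List[SketchSegment], flow: str) -> None:
    """Simulate one packet crossing the path: each switch updates only its own row."""
    for segment in path:
        segment.update(flow)


def query(path: List[SketchSegment], flow: str) -> int:
    """Count-Min estimate: the minimum over all rows, which never undercounts."""
    return min(segment.estimate(flow) for segment in path)


if __name__ == "__main__":
    path = [SketchSegment(row, width=64) for row in range(3)]
    for _ in range(5):
        forward(path, "10.0.0.1->10.0.0.2")
    print(query(path, "10.0.0.1->10.0.0.2"))
```

In this arrangement each switch performs one hash and one counter update per packet instead of one per row, which is the kind of per-packet work sharing the abstract alludes to.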