Josie E. Rodriguez Condia;Juan-David Guerrero-Balaguera;Robert Limas Sierra;Matteo Sonza Reorda
{"title":"Investigating and Mitigating Critical Faults in Floating-Point and Posit Arithmetic Hardware","authors":"Josie E. Rodriguez Condia;Juan-David Guerrero-Balaguera;Robert Limas Sierra;Matteo Sonza Reorda","doi":"10.1109/TETC.2025.3615827","DOIUrl":"https://doi.org/10.1109/TETC.2025.3615827","url":null,"abstract":"Mature computing formats, such as Floating-Point (FP), provide optimal accuracy to process real values and are essential in most scientific domains. However, the massive market adoption of highly parallel systems, with advanced technology nodes, in several domains exacerbates the need for highly reliable systems. Formerly, most reliability evaluations targeted FP hardware. Unfortunately, fine-grain assessments on cores with recent arithmetic format alternatives, such as Posit (particularly suited for Artificial Intelligence), have remained partially unexplored. Similarly, the effects of corruption on operations due to faulty hardware are not well-known, which may prevent the proposal of effective mitigation mechanisms. This work exhaustively evaluates the fine-grain effects of permanent faults in the hardware of arithmetic cores for the three most extensively used operations in modern applications (<italic>Add</italic>, <italic>Multiply</italic>, and <italic>Multiply and Add</italic>), including machine learning, implemented in Posit and FP. Our results indicate that Posit cores are less fault-vulnerable than FP ones. However, Posit cores are more prone to induce significant operational corruption than FP ones (5.2% to 7.5%). We also found that absolute errors in faulty FP cores are higher by up to 2 orders of magnitude than in Posit ones. 
Finally, we applied and evaluated three mitigation mechanisms (<italic>Self-Check and repair</italic>, <italic>Dual Modular Redundancy</italic>, and <italic>Triple Modular Redundancy</italic>), effectively reducing the most critical errors with moderate area costs (20% to 110%).","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1605-1617"},"PeriodicalIF":5.4,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nan Wang;Lijun Lu;Songping Liu;Hongqing Zhu;Yu Zhu
{"title":"Security-Driven Task Scheduling Under Deadline Constraints for MPSoCs With Untrusted 3PIP Cores","authors":"Nan Wang;Lijun Lu;Songping Liu;Hongqing Zhu;Yu Zhu","doi":"10.1109/TETC.2025.3614659","DOIUrl":"https://doi.org/10.1109/TETC.2025.3614659","url":null,"abstract":"The high penetration of third-party intellectual property in MPSoCs gives rise to security concerns, and a set of security-driven constraints is imposed into the task scheduling step of the design process to protect MPSoCs against hardware Trojan attacks. Due to the significant performance and area overheads incurred, designers start to selectively apply security-driven constraints to achieve the design targets, but they often ignore that parts of a design may be more vulnerable to hardware Trojan attacks. In this study, the differences in vulnerability to hardware Trojan attacks are also considered in the MPSoC design process, and a security-driven task scheduling method is proposed to minimize both the design vulnerability and chip area under deadline constraints. First, the schedule length is iteratively optimized by a maximum weight independent set-based method that minimizes the vulnerability increment. Second, tasks are assigned to IP vendors with a minimized number of cores required by maximizing the core sharing of tasks. Finally, tasks are scheduled to time periods using the force-directed scheduling method. 
Experimental results demonstrate the effectiveness of the proposed method in reducing the number of cores while maintaining system security under deadline constraints.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1577-1590"},"PeriodicalIF":5.4,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tien-Dung Cao;Nguyen T. Vuong;Thai Q. Le;Hoang V. N. Dao;Tram Truong-Huu
{"title":"Asyn2F: An Asynchronous Federated Learning Framework With Bidirectional Model Aggregation","authors":"Tien-Dung Cao;Nguyen T. Vuong;Thai Q. Le;Hoang V. N. Dao;Tram Truong-Huu","doi":"10.1109/TETC.2025.3609004","DOIUrl":"https://doi.org/10.1109/TETC.2025.3609004","url":null,"abstract":"In federated learning, the models can be trained synchronously or asynchronously. Many existing works have focused on developing an aggregation method for the server to aggregate multiple local models into the global model with improved performance. They ignore the heterogeneity of the training workers, which causes the delay in the training of the local models, leading to the obsolete information issue. In this paper, we design and develop <sc>Asyn2F</sc>, an <sc>Asyn</sc>chronous <sc>F</sc>ederated learning <sc>F</sc>ramework with bidirectional model aggregation. By bidirectional aggregation, <sc>Asyn2F</sc>, on one hand, allows the server to asynchronously aggregate multiple local models and generate a new global model. On the other hand, it allows the training workers to aggregate the new version of the global model into a local model, which is being optimized even in the middle of a training epoch. We develop <sc>Asyn2F</sc> considering various practical implementation requirements with geographically distributed and heterogeneous training workers. Extensive experiments with different datasets show that the models trained by <sc>Asyn2F</sc> achieve higher performance compared to the state-of-the-art techniques. 
The experiments also demonstrate the effectiveness, practicality, and scalability of <sc>Asyn2F</sc>, making it ready for practical deployment.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1618-1632"},"PeriodicalIF":5.4,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alberto Bosio;Samuele Germiniani;Graziano Pravadelli;Marcello Traiola
{"title":"A Genetic Approach for Automatic AxC Design Exploration at RTL Based on Assertion Mining and Fault Analysis","authors":"Alberto Bosio;Samuele Germiniani;Graziano Pravadelli;Marcello Traiola","doi":"10.1109/TETC.2025.3609050","DOIUrl":"https://doi.org/10.1109/TETC.2025.3609050","url":null,"abstract":"In Approximate Computing (AxC), design exploration methods have been introduced to automatically identify approximation targets at the gate level. However, only some of them are applicable at Register Transfer Level (RTL); furthermore, the benefits of combining information from assertions and fault analysis have not been fully explored. This paper proposes a novel methodology for guiding AxC design exploration at RTL considering two approximation techniques: bit-width reduction and statement reduction. Then, it employs fault injection to mimic the approximation effect on the design under approximation. To guide the designer while assessing the approximation choices, assertions, which formally capture the behaviors implemented in the design, are dynamically generated from the RTL simulation traces. Then, the impact of fault injections on the truth values of the assertions is employed as a proxy for measuring the functional accuracy of the corresponding approximations. Based on this evaluation, a genetic algorithm is finally used to rank and cluster the approximation targets, thus providing the designer with an efficient and effective way to automatically analyze AxC variants in terms of the trade-off between accuracy and performance. 
The experiments carried out on state-of-the-art benchmarks show that the proposed approach represents a promising solution for the automation of AxC design exploration at RTL.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1633-1648"},"PeriodicalIF":5.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11174094","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OER-Miner: One-Off Episode Rule Mining for Process Event Logs","authors":"Youxi Wu;Zhihong Dong;Jing Liu;Yan Li;Cong Liu;Lijie Wen;Xindong Wu","doi":"10.1109/TETC.2025.3607892","DOIUrl":"https://doi.org/10.1109/TETC.2025.3607892","url":null,"abstract":"Episode mining is an active subfield of data mining in which the aim is to retrieve important knowledge from temporal data and can be used to analyze fault reports and web navigation logs. However, existing methods generally do not consider time gap constraints, and overestimate the frequency of episodes, which may lead to mining a large number of episodes that users are not interested in. To tackle this problem, this paper investigates one-off episode rule (OER) mining with time gap constraints for process event logs and proposes a one-off episode rule mining algorithm called OER-Miner that can mine frequent one-off episodes and the implicit relationship among them. To generate fewer and prune unpromising candidate episodes, OER-Miner utilizes episode join and pruning strategies, respectively. To efficiently calculate the candidate episode support, position indexes, and depth-first search and backtracking strategies are applied to calculate the number of occurrences. Experimental results verify that OER-Miner yields a better performance than seven other competitive algorithms on nine publicly available event logs. 
More importantly, OER-Miner can be applied to a real-industrial log to identify rework phenomena in the production process by mining strong one-off episode rules, to discover the optimal processes and deficiencies of the system, and provide recommendations for further improvement.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1497-1509"},"PeriodicalIF":5.4,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Training and Neuro-Encoding for Bridging Hybrid ANN and SNN Computation","authors":"Musheer Abdullah;De Xu;Zhaoqi Miao;Yuhao Tai;Sawsan Alhabashi;Chen Zhao;Wu Gao","doi":"10.1109/TETC.2025.3607104","DOIUrl":"https://doi.org/10.1109/TETC.2025.3607104","url":null,"abstract":"The complementary strengths of Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs) have promoted interest in leveraging hybrid ANN/SNN computation. While most existing efforts focus on ANN-SNN conversion for pure SNN inference, hybrid ANN/SNN inference presents unique challenges where complexity and performance in both domains are critical. Key limitations include achieving ultra-low latency, maintaining unified training parameters for resource sharing, and developing efficient neural and encoding models for hybrid data interactions. To address these challenges, we introduce the Adaptive Clip-Floor-Shift (ACFS) activation to bridge the ANN-SNN gap with unified parameters, balancing inference accuracy and complexity across both domains. Our Hybrid Neuro-Encoding Bridge (HNEB) integrates Clipped-ReLU for ANNs, a proposed Selective Integrate-and-Fire (SIF) model for enhanced SNN sparsity, and a Stateless Spike Encoding (SSE) mechanism for resource-efficient activation-spike conversion. Experimental results on VGG16 and ResNet demonstrate SNNs achieving competitive accuracy (<inline-formula><tex-math>$\\leq \\! 0.89\\%$</tex-math></inline-formula> loss) versus ANNs at ultra-low latency (e.g., <inline-formula><tex-math>$T \\leq 4$</tex-math></inline-formula> for CIFAR10, <inline-formula><tex-math>$T \\leq 8$</tex-math></inline-formula> for CIFAR100). 
Experimental analysis reveals Hybrid Neural Networks (HNNs) provide superior energy-accuracy trade-offs, improving energy efficiency by up to 84.13% over pure SNNs while maintaining accuracy through layer-wise ANN/SNN partitioning and minimized encoding overhead.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 4","pages":"1591-1604"},"PeriodicalIF":5.4,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Emerging Topics in Computing Publication Information","authors":"","doi":"10.1109/TETC.2025.3607300","DOIUrl":"https://doi.org/10.1109/TETC.2025.3607300","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"C2-C2"},"PeriodicalIF":5.4,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11159605","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145050788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breakout Local Search Solution to the Offloading Decision Problem in a Multi-Access Edge Computing Cloud-Enabled Network","authors":"Mina Kato;Tiago Koketsu Rodrigues;Nei Kato","doi":"10.1109/TETC.2025.3598369","DOIUrl":"https://doi.org/10.1109/TETC.2025.3598369","url":null,"abstract":"Cloud offloading is an important technique for Internet of Things systems, as it allows devices with limited capabilities to access the powerful resources in the cloud when executing their applications. However, relying solely on the remote cloud is problematic, as the long access time caused by the distance to the server makes real-time applications impossible to execute. Multi-access edge computing addresses this by deploying cloud servers near the devices. The issue then becomes how to allocate devices between the remote cloud and multi-access edge computing, based on the device requirements. In this paper, we propose a Breakout Local Search-based solution that, given our designed binary integer linear programming model of the offloading problem, finds a near-optimal configuration for allocating devices between the two cloud types. The proposal is based on iterating between exploiting the local optimum found so far and perturbing the current solution to explore more of the search space. 
A comparison study shows that our proposal is better than baseline and conventional algorithms, reducing the total service delay of tasks by at least 30 ms.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1328-1338"},"PeriodicalIF":5.4,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incentive Mechanism Design for Hierarchical Federated Learning With Selfishness Queue Stability","authors":"Zhuo Li;Fangxing Geng","doi":"10.1109/TETC.2025.3562336","DOIUrl":"https://doi.org/10.1109/TETC.2025.3562336","url":null,"abstract":"The potential privacy breaches in centralized artificial intelligence model training have raised significant public concern. Hierarchical federated learning, as a technology addressing privacy and network efficiency issues, coordinates local devices using edge servers for model training and parameter updates, thereby reducing communication with central cloud servers and diminishing the risk of privacy leaks. However, in this context, the rise of node selfishness presents a significant challenge, undermining training efficiency and the quality of local models, thereby impacting the overall system’s performance. This paper addresses the issue by introducing a virtual node selfish queue to characterize dynamic selfishness, considering both training costs and rewards, and formulating the problem of maximizing model quality within the bounds of controlled node selfishness. Utilizing Lyapunov optimization, this issue is divided into two subproblems: controlling the quantity of node data and optimizing node associations. To solve these, we propose the Data Quantity Control and Client Association (DCCA) algorithm, based on the Hungarian method. This algorithm is shown to ensure boundedness, stability, and optimality in the system. 
Experimental results demonstrate that the DCCA algorithm enhances model quality by 8.43% and 13.83% compared to the Fmore and FedAvg algorithms, respectively.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1316-1327"},"PeriodicalIF":5.4,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Saurabh Kumar;Cristian Molinaro;Lirika Sola;V. S. Subrahmanian
{"title":"GLAMP: Generative Learning for Adversarially-Robust Malware Prediction","authors":"Saurabh Kumar;Cristian Molinaro;Lirika Sola;V. S. Subrahmanian","doi":"10.1109/TETC.2025.3583872","DOIUrl":"https://doi.org/10.1109/TETC.2025.3583872","url":null,"abstract":"We propose a novel <i>Generative Malware Defense</i> strategy. When an antivirus company detects a malware sample <inline-formula><tex-math>$m$</tex-math></inline-formula>, they should: (i) generate a set <inline-formula><tex-math>${Var}(m)$</tex-math></inline-formula> of several variants of <inline-formula><tex-math>$m$</tex-math></inline-formula> and then (ii) train their malware classifiers on their usual training set augmented with <inline-formula><tex-math>${Var}(m)$</tex-math></inline-formula>. We believe this leads to a more proactive defense by making the classifiers more robust to future malware developed by the attacker. We formally define the malware generation problem as a non-traditional optimization problem. Our novel GLAMP (Generative Learning for Adversarially-robust Malware Prediction) framework analyzes the complexity of the malware generation problem and includes novel malware variant generation algorithms for (i) that leverage the complexity results. Our experiments show that a sufficiently large percentage of samples generated by GLAMP are able to evade both commercial anti-virus and machine learning classifiers with evasion rates up to 83.81% and 50.54%, respectively. GLAMP then proposes an adversarial training model as well. Our experiments show that GLAMP generates running malware that can evade 11 white box classifiers and 4 commercial (i.e., black box) detectors. 
Our experiments show GLAMP’s best adversarial training engine improves the recall by 16.1% and the F1 score by 2.4%-5.4% depending on the test set used.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"1299-1315"},"PeriodicalIF":5.4,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}