Title: LUT-Based Convolutional Tsetlin Machine Accelerator With Dynamic Clause Scaling for Resources-Constrained FPGAs
Authors: Rashed Al Amin; Roman Obermaisser
DOI: 10.1109/JXCDC.2026.3676833
Journal: IEEE Journal on Exploratory Solid-State Computational Devices and Circuits, vol. 12, pp. 45-55 (JCR Q3, Computer Science, Hardware & Architecture; IF 2.7)
Published: 2026-01-01 (Epub 2026-03-23)
URL: https://ieeexplore.ieee.org/document/11454583/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11454583
Citations: 0
Abstract
The rapid growth of machine learning (ML) workloads, particularly in computer vision applications, has significantly increased computational and energy demands in modern electronic systems, motivating the use of hardware accelerators to offload processing from general-purpose processors. Despite advances in computationally efficient ML models, achieving energy-efficient inference on resource-constrained edge devices remains a significant challenge. The Tsetlin machine (TM) has emerged as an attractive alternative for image classification due to its high throughput and inherently energy-efficient learning paradigm. However, existing TM-based hardware accelerators struggle to balance classification accuracy and energy efficiency, limiting their practical deployment at the edge. This article presents a resource- and energy-efficient convolutional TM (CTM) accelerator with dynamic clause scaling, explicitly optimized for edge field-programmable gate array (FPGA) platforms. The proposed architecture employs LUT-based pipelining and targeted resource-optimization techniques to minimize FPGA resource utilization while maintaining high energy efficiency and performance. The accelerator is implemented on a Xilinx Zybo-Z20 FPGA and evaluated using the MNIST, Fashion-MNIST (FMNIST), and Kuzushiji-MNIST (KMNIST) datasets, achieving classification accuracies of 97.78%, 85.53%, and 88.54%, respectively, with an energy consumption of up to $0.3~\mu$J per image classification. Compared with state-of-the-art CTM accelerators, the proposed design achieves up to $40\times$ improvements in resource and energy efficiency, demonstrating its suitability for real-time image and pattern classification on edge FPGA-based systems.
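For readers unfamiliar with the Tsetlin machine inference the abstract refers to, the following is a minimal illustrative sketch of standard TM classification (clauses as conjunctions of literals, class score as positive minus negative clause votes). It is not the paper's accelerator; the function names and toy data are assumptions for exposition only.

```python
# Illustrative Tsetlin machine (TM) inference sketch -- NOT the paper's
# hardware design. Names and the toy clause masks below are assumed.
from typing import List

def eval_clause(literals: List[int], include: List[bool]) -> int:
    """A clause is an AND over its included literals.

    `literals` holds each input bit followed by its negation;
    `include` marks which literals the trained automata kept.
    Returns 1 if every included literal is 1, else 0.
    """
    return int(all(lit == 1 for lit, inc in zip(literals, include) if inc))

def class_sum(x: List[int], pos_clauses, neg_clauses) -> int:
    """Class score = votes of positive clauses minus negative clauses."""
    literals = [b for bit in x for b in (bit, 1 - bit)]  # x and NOT x
    pos = sum(eval_clause(literals, inc) for inc in pos_clauses)
    neg = sum(eval_clause(literals, inc) for inc in neg_clauses)
    return pos - neg

# Toy example: 2-bit input x = [1, 0], literals = [x0, ~x0, x1, ~x1].
x = [1, 0]
pos = [[True, False, False, True]]   # clause: x0 AND (NOT x1) -> fires
neg = [[False, True, False, False]]  # clause: NOT x0 -> does not fire
print(class_sum(x, pos, neg))        # prints 1 (pos votes 1, neg votes 0)
```

Because each clause reduces to bitwise AND/NOT logic with no multiplications, this structure maps naturally onto FPGA LUTs, which is the property the accelerator in the abstract exploits.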