Building Accurate and Interpretable Online Classifiers on Edge Devices

Impact factor: 6 · Zone 2, Computer Science · JCR Q1, COMPUTER SCIENCE, THEORY & METHODS

IEEE Transactions on Parallel and Distributed Systems · Pub Date: 2025-06-13 · DOI: 10.1109/TPDS.2025.3579121

Yuanming Zhang;Pinghui Wang;Kuankuan Cheng;Junzhou Zhao;Jing Tao;Jingxin Hai;Junlan Feng;Chao Deng;Xidian Wang

IEEE Transactions on Parallel and Distributed Systems, vol. 36, no. 8, pp. 1779-1796. Journal article, not open access. Full text: https://ieeexplore.ieee.org/document/11034678/

Citations: 0

Abstract

Integrating machine learning with edge devices, such as IoT devices, household appliances, and wearable technologies, can augment their capabilities. These devices generally run on microcontrollers with inherently limited resources, such as constrained RAM capacity and limited computational power. Nonetheless, they often process data as high-velocity streams, for example sequences of activities and statuses monitored by advanced industrial sensors. In practical scenarios, models must also be interpretable to facilitate troubleshooting and understanding of their behavior. Deploying machine learning models on edge devices is therefore both valuable and challenging, because it requires balancing model efficacy against resource constraints. To address this challenge, we introduce Onfesk, which combines online learning algorithms with an innovative interpretable kernel. Specifically, Onfesk trains an online classifier over the kernel's feature sketches. Thanks to specially designed modules, the kernel's feature sketches can be produced efficiently, and the memory requirements of the classifier are significantly reduced. As a result, Onfesk delivers effective and efficient performance in resource-limited environments without compromising model interpretability. Extensive experiments on diverse real-world datasets show that Onfesk outperforms state-of-the-art methods, achieving up to a 7.4% improvement in accuracy under identical memory constraints.
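The abstract does not describe Onfesk's kernel or its sketching modules, so the following is not the authors' method. It is only a rough illustration of the general idea of training an online classifier over fixed-size feature sketches. It assumes feature hashing (the hashing trick) as a stand-in for the kernel's feature sketches and a perceptron as the online learner. The class name `HashedPerceptron`, the bucket count, and the token names are all hypothetical.

```python
import zlib


class HashedPerceptron:
    """Online binary perceptron over a fixed-size hashed feature sketch.

    Assumption: feature hashing stands in for the paper's kernel feature
    sketches, whose construction the abstract does not specify.
    """

    def __init__(self, num_buckets=64):
        self.num_buckets = num_buckets
        # The weight vector is allocated once, so memory stays constant
        # no matter how many samples or distinct tokens the stream contains.
        self.weights = [0.0] * num_buckets

    def _sketch(self, tokens):
        # Map every token to a (bucket, sign) pair with a deterministic hash.
        sketch = {}
        for tok in tokens:
            h = zlib.crc32(tok.encode("utf-8"))
            bucket = h % self.num_buckets
            sign = 1.0 if (h >> 31) & 1 else -1.0
            sketch[bucket] = sketch.get(bucket, 0.0) + sign
        return sketch

    def score(self, tokens):
        return sum(self.weights[b] * v for b, v in self._sketch(tokens).items())

    def predict(self, tokens):
        # Labels are -1 or +1; a score of exactly zero is treated as +1.
        return 1 if self.score(tokens) >= 0 else -1

    def update(self, tokens, label):
        # Classic perceptron rule: change weights only when the prediction is wrong.
        if self.predict(tokens) == label:
            return False
        for b, v in self._sketch(tokens).items():
            self.weights[b] += label * v
        return True


# Process one sample at a time, as a stream would deliver them.
model = HashedPerceptron(num_buckets=16)
mistake = model.update(["temp_high"], -1)   # first sample is misclassified
prediction = model.predict(["temp_high"])   # now classified as -1
```

Because the weight vector has a fixed length, the memory footprint is set when the model is created rather than by the stream. Each bucket's weight can be inspected directly, although hash collisions mean a bucket may mix several tokens, which is one reason a purpose-built interpretable kernel could be preferable to plain hashing.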

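Source journal

IEEE Transactions on Parallel and Distributed Systems (category: Engineering & Technology, Engineering: Electrical & Electronic)

CiteScore: 11.00

Self-citation rate: 9.40%

Articles published: 281

Review time: 5.6 months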

About the journal: IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers. Particular areas of interest include, but are not limited to:

a) Parallel and distributed algorithms, focusing on topics such as: models of computation; numerical, combinatorial, and data-intensive parallel algorithms; scalability of algorithms and data structures for parallel and distributed systems; communication and synchronization protocols; network algorithms; scheduling; and load balancing.

b) Applications of parallel and distributed computing, including computational and data-enabled science and engineering, big data applications, parallel crowdsourcing, large-scale social network analysis, management of big data, cloud and grid computing, scientific and biomedical applications, mobile computing, and cyber-physical systems.

c) Parallel and distributed architectures, including architectures for instruction-level and thread-level parallelism; design, analysis, implementation, fault resilience and performance measurements of multiple-processor systems; multicore processors, heterogeneous many-core systems; petascale and exascale systems designs; novel big data architectures; special purpose architectures, including graphics processors, signal processors, network processors, media accelerators, and other special purpose processors and accelerators; impact of technology on architecture; network and interconnect architectures; parallel I/O and storage systems; architecture of the memory hierarchy; power-efficient and green computing architectures; dependable architectures; and performance modeling and evaluation.

d) Parallel and distributed software, including parallel and multicore programming languages and compilers, runtime systems, operating systems, Internet computing and web services, resource management including green computing, middleware for grids, clouds, and data centers, libraries, performance modeling and evaluation, parallel programming paradigms, and programming environments and tools.