{"title":"加速描述逻辑归纳学习的硬件方法","authors":"Eyad Algahtani","doi":"10.1145/3665277","DOIUrl":null,"url":null,"abstract":"\n The employment of machine learning (ML) techniques in embedded systems, has seen constant growth in recent years, especially for black-box ML techniques (such as artificial neural networks, ANNs). However, despite the successful employment of ML techniques in embedded environments, yet, their performance potential is constrained by the limited computing resources of their embedded computers. Several hardware based approaches were developed (e.g. using FPGAs and ASICs), to address the constraints of limited computing resources. The scope of this work, focuses on improving the performance for Inductive Logic Programming (ILP) on embedded environments. ILP is a powerful logic-based ML technique that uses logic programming, to construct human-interpretable ML models; where those logic-based ML models, are capable of describing complex and multi-relational concepts. In this work, we present a hardware-based approach that accelerate the hypothesis evaluation task for ILPs in embedded environments, that uses Description Logic (DL) languages as their logic-based representation; In particular, we target the\n \n \\(\\mathcal {ALCQ}^{\\mathcal {(D)}} \\)\n \n language. According to experimental results (through an FPGA implementation), our presented approach has achieved speedups up to 48.7 folds for a disjunction of 32 concepts on 100M individuals; where the baseline performance is the sequential CPU performance of the Raspberry Pi 4. For role and concrete role restrictions, the FPGA implementation achieved speedups up to 2.4 folds (for MIN cardinality role restriction on 1M role assertions); all FPGA implemented role and concrete role restrictions, have achieved similar speedups. In the worst case scenario, the FPGA implementation achieved either a similar or slightly better performance to the baseline (for all DL operations); where worst case scenario results from using a small dataset such as: using conjunction & disjunction on < 100 individuals, and using role & concrete (float/string) role restrictions on < 100, 000 assertions.\n","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":null,"pages":null},"PeriodicalIF":2.8000,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Hardware Approach For Accelerating Inductive Learning In Description Logic\",\"authors\":\"Eyad Algahtani\",\"doi\":\"10.1145/3665277\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n The employment of machine learning (ML) techniques in embedded systems, has seen constant growth in recent years, especially for black-box ML techniques (such as artificial neural networks, ANNs). However, despite the successful employment of ML techniques in embedded environments, yet, their performance potential is constrained by the limited computing resources of their embedded computers. Several hardware based approaches were developed (e.g. using FPGAs and ASICs), to address the constraints of limited computing resources. The scope of this work, focuses on improving the performance for Inductive Logic Programming (ILP) on embedded environments. ILP is a powerful logic-based ML technique that uses logic programming, to construct human-interpretable ML models; where those logic-based ML models, are capable of describing complex and multi-relational concepts. 
In this work, we present a hardware-based approach that accelerate the hypothesis evaluation task for ILPs in embedded environments, that uses Description Logic (DL) languages as their logic-based representation; In particular, we target the\\n \\n \\\\(\\\\mathcal {ALCQ}^{\\\\mathcal {(D)}} \\\\)\\n \\n language. According to experimental results (through an FPGA implementation), our presented approach has achieved speedups up to 48.7 folds for a disjunction of 32 concepts on 100M individuals; where the baseline performance is the sequential CPU performance of the Raspberry Pi 4. For role and concrete role restrictions, the FPGA implementation achieved speedups up to 2.4 folds (for MIN cardinality role restriction on 1M role assertions); all FPGA implemented role and concrete role restrictions, have achieved similar speedups. In the worst case scenario, the FPGA implementation achieved either a similar or slightly better performance to the baseline (for all DL operations); where worst case scenario results from using a small dataset such as: using conjunction & disjunction on < 100 individuals, and using role & concrete (float/string) role restrictions on < 100, 000 assertions.\\n\",\"PeriodicalId\":50914,\"journal\":{\"name\":\"ACM Transactions on Embedded Computing Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-05-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Embedded Computing Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3665277\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Embedded Computing Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3665277","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
A Hardware Approach For Accelerating Inductive Learning In Description Logic
The use of machine learning (ML) techniques in embedded systems has grown steadily in recent years, especially for black-box ML techniques such as artificial neural networks (ANNs). Despite their successful deployment in embedded environments, however, the performance potential of ML techniques is constrained by the limited computing resources of embedded computers. Several hardware-based approaches (e.g., using FPGAs and ASICs) have been developed to address these resource constraints. This work focuses on improving the performance of Inductive Logic Programming (ILP) in embedded environments. ILP is a powerful logic-based ML technique that uses logic programming to construct human-interpretable ML models; these logic-based models are capable of describing complex, multi-relational concepts. In this work, we present a hardware-based approach that accelerates the hypothesis-evaluation task for ILP systems in embedded environments that use Description Logic (DL) languages as their logic-based representation; in particular, we target the \(\mathcal{ALCQ}^{(\mathcal{D})}\) language. According to experimental results from an FPGA implementation, our approach achieves speedups of up to 48.7-fold for a disjunction of 32 concepts over 100M individuals, where the baseline is the sequential CPU performance of a Raspberry Pi 4. For role and concrete-role restrictions, the FPGA implementation achieves speedups of up to 2.4-fold (for the MIN cardinality role restriction on 1M role assertions); all FPGA-implemented role and concrete-role restrictions achieve similar speedups. In the worst case, the FPGA implementation performs similarly to, or slightly better than, the baseline for all DL operations; the worst case arises on small datasets, e.g., conjunction and disjunction on fewer than 100 individuals, or role and concrete (float/string) role restrictions on fewer than 100,000 assertions.
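To make the evaluated operations concrete, the sketch below is a minimal software illustration (not the paper's FPGA design, and all names such as Concept, RoleEdge, concept_disjunction, and satisfies_min_card are hypothetical) of the two kinds of hypothesis-evaluation work the abstract mentions: a disjunction of concepts over a set of individuals, and a MIN cardinality role restriction (at least n R-successors inside a concept) over role assertions. The bitset-over-individuals encoding is an assumption chosen for illustration.

/* Illustrative CPU baseline only, assuming concepts are stored as
 * membership bitsets over individuals and roles as assertion lists. */
#include <stdint.h>
#include <stdio.h>

#define NUM_INDIVIDUALS 1024
#define WORDS ((NUM_INDIVIDUALS + 63) / 64)

typedef struct { uint64_t bits[WORDS]; } Concept;   /* membership bitset over individuals */
typedef struct { uint32_t from, to; } RoleEdge;      /* one role assertion R(from, to) */

/* Disjunction C1 U ... U Ck: an individual is in the result if it is in any Ci,
 * so the evaluation reduces to a word-wide bitwise OR. */
static void concept_disjunction(const Concept *cs, size_t k, Concept *out) {
    for (size_t w = 0; w < WORDS; ++w) {
        uint64_t acc = 0;
        for (size_t i = 0; i < k; ++i)
            acc |= cs[i].bits[w];
        out->bits[w] = acc;
    }
}

/* MIN cardinality restriction (>= n R.C): individual x qualifies if it has
 * at least n R-successors that are members of concept C. */
static int satisfies_min_card(uint32_t x, uint32_t n,
                              const RoleEdge *edges, size_t num_edges,
                              const Concept *c) {
    uint32_t count = 0;
    for (size_t i = 0; i < num_edges; ++i) {
        if (edges[i].from == x &&
            ((c->bits[edges[i].to / 64] >> (edges[i].to % 64)) & 1u)) {
            if (++count >= n) return 1;
        }
    }
    return 0;
}

int main(void) {
    Concept cs[2] = {{{0}}, {{0}}};
    Concept result;
    cs[0].bits[0] = 0x5;   /* individuals 0 and 2 are in C1 */
    cs[1].bits[0] = 0x2;   /* individual 1 is in C2 */
    concept_disjunction(cs, 2, &result);
    printf("C1 U C2, word 0 = 0x%llx\n", (unsigned long long)result.bits[0]);

    RoleEdge edges[] = { {0u, 1u}, {0u, 2u} };
    /* Does individual 0 have at least 2 R-successors inside (C1 U C2)? */
    printf(">=2 R.(C1 U C2) for individual 0: %d\n",
           satisfies_min_card(0, 2, edges, 2, &result));
    return 0;
}

The bitset encoding is what turns concept disjunction (and conjunction) into independent per-word bitwise operations, which is the kind of data-parallel work that maps naturally onto hardware; the paper's actual FPGA architecture and data layout may differ.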
Journal introduction:
The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.