Metasurface-Generated Large and Arbitrary Analog Convolution Kernels for Accelerated Machine Vision

IF 6.7 · Zone 1, Physics and Astrophysics · Q1, Materials Science, Multidisciplinary

ACS Photonics Pub Date: 2024-12-04 DOI: 10.1021/acsphotonics.4c01874

Ruiqi Liang, Shuai Wang, Yiying Dong, Liu Li, Ying Kuang, Bohan Zhang and Yuanmu Yang*

ACS Photonics 2024, volume 11, issue 12, pages 5430–5438. https://pubs.acs.org/doi/10.1021/acsphotonics.4c01874

Citation count: 0

Abstract

In the rapidly evolving field of artificial intelligence, convolutional neural networks are essential for tackling complex challenges such as machine vision and medical diagnosis. To address the limits on processing speed and power consumption of conventional digital convolution, many optical components have recently been proposed to replace the digital convolution layer in the neural network and accelerate machine vision tasks. Nonetheless, the analog nature of the optical convolution kernel has not been fully explored. Here, we develop a spatial-frequency-domain training method to create arbitrarily shaped analog convolution kernels using an optical metasurface as the convolution layer, with a receptive field far larger than that of digital convolution kernels. By employing spatial multiplexing, multiple parallel convolution kernels with both positive and negative weights are generated under incoherent illumination. We experimentally demonstrate a 98.59% classification accuracy on the MNIST data set, with simulations showing 92.63% and 68.67% accuracy on the Fashion-MNIST and CIFAR-10 data sets, respectively, when additional digital layers are used. This work underscores the unique advantage of analog optical convolution, offering a promising avenue to accelerate machine vision tasks, especially on edge devices.
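As a rough illustration of two ideas in the abstract, the sketch below uses NumPy to show (1) that a kernel with positive and negative weights can be obtained as the difference of two non-negative kernels, which is the usual constraint when an incoherent system can only form non-negative intensity responses, and (2) that a large kernel can be applied in the spatial frequency domain via the convolution theorem. This is not the authors' training method or optical model; the function names `fft_convolve2d` and `signed_convolution`, the 28×28 image, and the 15×15 kernel are assumptions chosen for illustration only.

```python
import numpy as np

def fft_convolve2d(image, kernel):
    """Circular 2-D convolution computed with the convolution theorem."""
    # Zero-pad the kernel to the image size so both spectra have the same shape.
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Multiplying the spectra is equivalent to circular convolution in space.
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def signed_convolution(image, kernel):
    """Apply a signed kernel as the difference of two non-negative channels."""
    k_pos = np.maximum(kernel, 0.0)   # non-negative channel for positive weights
    k_neg = np.maximum(-kernel, 0.0)  # non-negative channel for negative weights
    # Each channel uses only non-negative weights; subtraction restores the sign.
    return fft_convolve2d(image, k_pos) - fft_convolve2d(image, k_neg)

# Hypothetical inputs: a random 28x28 "image" and a large 15x15 signed kernel.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
kernel = rng.standard_normal((15, 15))

out = signed_convolution(image, kernel)
reference = fft_convolve2d(image, kernel)  # direct signed convolution for comparison
```

Because convolution is linear, `out` and `reference` agree up to floating-point error. In the frequency-domain form the cost does not grow with kernel size, which is one way to see why a large receptive field is cheap for an analog optical kernel and expensive for a spatial-domain digital one.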




Source journal

ACS Photonics NANOSCIENCE & NANOTECHNOLOGY-MATERIALS SCIENCE, MULTIDISCIPLINARY

CiteScore: 11.90

Self-citation rate: 5.70%

Articles published: 438

Review time: 2.3 months

About the journal: Published as soon as accepted and summarized in monthly issues, ACS Photonics will publish Research Articles, Letters, Perspectives, and Reviews, to encompass the full scope of published research in this field.