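Dimensionality reduction to improve search time and memory footprint in content-retrieval tasks: Application to semiconductor inspection images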

Impact factor: 3.9 · JCR Q2 (Engineering, Industrial)
Thomas Vial, Farah Dhouib, Louison Roger, Annabelle Blangero, Frédéric Duvivier, Karim Sayadi, Marisa N. Faraggi
Journal: Advances in Industrial and Manufacturing Engineering
Publication type: Journal Article
Publication date: 2022-11-01
DOI: 10.1016/j.aime.2022.100097
URL: https://www.sciencedirect.com/science/article/pii/S2666912922000241
Citations: 0

Dimensionality reduction to improve search time and memory footprint in content-retrieval tasks: Application to semiconductor inspection images

Quality control is a crucial step in producing high-quality semiconductor microchips. In recent years, advances in artificial vision have significantly improved image-based quality control techniques. In the semiconductor industry, automated visual inspection is fundamental to avoiding human intervention and keeping the pipeline sanitized. Different types of images are collected during this process, feeding image databases that grow continually and cannot be labelled exhaustively by humans. Advances in image retrieval methods are fundamental to developing more efficient techniques that meet user requirements.

In this work, we propose applying dimensionality reduction to the feature vectors computed by a deep learning classification model while keeping retrieval performance high. To validate this technique, we evaluate four well-known reduction algorithms on a subset of the full database: Principal Component Analysis (PCA), Sparse Random Projection (SRP), Isomap, and Locally Linear Embedding (LLE), in combination with three similarity metrics: Euclidean (L2), cosine, and inner product. As the number of vector components is reduced, image retrieval performance is measured by recall, search time, and the memory footprint of the database.
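The evaluation loop described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are random Gaussian vectors rather than deep-learning features, the sizes (`FULL_DIM`, `REDUCED_DIM`, `N_IMAGES`, `K`) and the `density` value are hypothetical, and recall is assumed to mean the share of full-dimension top-k neighbours that the reduced-dimension search also returns. Sparse random projection is used only because it fits in a short self-contained example; PCA, Isomap, and LLE would normally come from a numerical library.

```python
import math
import random


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    norm = math.sqrt(inner_product(a, a) * inner_product(b, b))
    return inner_product(a, b) / norm if norm else 0.0


def l2_score(a, b):
    # Negated Euclidean distance, so that a larger score always means "closer".
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sparse_projection_matrix(n_features, n_components, density, rng):
    # Each entry is 0 with probability 1 - density, otherwise +scale or -scale.
    scale = math.sqrt(1.0 / (density * n_components))
    return [
        [rng.choice((-scale, scale)) if rng.random() < density else 0.0
         for _ in range(n_features)]
        for _ in range(n_components)
    ]


def project(matrix, vector):
    return [inner_product(row, vector) for row in matrix]


def top_k(query, database, score, k):
    # Brute-force search: rank every database vector by similarity to the query.
    order = sorted(range(len(database)),
                   key=lambda i: score(query, database[i]), reverse=True)
    return order[:k]


def recall_at_k(reference, retrieved):
    # Assumed definition: share of the reference neighbours that were retrieved.
    return len(set(reference) & set(retrieved)) / len(reference)


rng = random.Random(0)
FULL_DIM, REDUCED_DIM, N_IMAGES, K = 64, 16, 200, 10  # hypothetical sizes

database = [[rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)] for _ in range(N_IMAGES)]
queries = [[rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)] for _ in range(5)]

matrix = sparse_projection_matrix(FULL_DIM, REDUCED_DIM, density=0.25, rng=rng)
reduced_db = [project(matrix, v) for v in database]
reduced_queries = [project(matrix, q) for q in queries]

results = {}
for name, score in [("L2", l2_score), ("cosine", cosine), ("inner product", inner_product)]:
    recalls = [
        recall_at_k(top_k(q, database, score, K), top_k(rq, reduced_db, score, K))
        for q, rq in zip(queries, reduced_queries)
    ]
    results[name] = sum(recalls) / len(recalls)
    print(f"{name}: mean recall@{K} = {results[name]:.2f}")
```

Unstructured random vectors are a hard case for any reduction method, so the recall printed here says nothing about the figures reported in the abstract; the sketch only shows how the three metrics and the recall measurement fit together.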

PCA offers the best results, allowing a significant reduction in search time and memory usage, while SRP becomes an option only when the cosine distance is used. With PCA, we were able to divide the memory footprint by a factor of 16 and the search time by a factor of 6 while maintaining an average recall of 0.96.
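To make the memory figure concrete, the following sketch computes the size of a dense, uncompressed vector store before and after a 16-fold reduction in the number of components. The abstract does not state the original dimensionality, the database size, or the storage format, so `N_VECTORS`, `FULL_DIM`, and the 4-byte (float32) component size are purely hypothetical. The sketch does not model search time, for which the abstract reports a smaller gain (a factor of 6).

```python
def flat_index_bytes(n_vectors, n_components, bytes_per_component=4):
    """Size in bytes of a dense, uncompressed vector store (float32 by default)."""
    return n_vectors * n_components * bytes_per_component


# Hypothetical numbers for illustration only; none of them come from the paper.
N_VECTORS = 1_000_000
FULL_DIM = 2048
REDUCED_DIM = FULL_DIM // 16

full = flat_index_bytes(N_VECTORS, FULL_DIM)
reduced = flat_index_bytes(N_VECTORS, REDUCED_DIM)

print(f"full:    {full / 2**30:.2f} GiB")
print(f"reduced: {reduced / 2**30:.2f} GiB")
print(f"ratio:   {full // reduced}x")
```

With a fixed storage format, the footprint scales linearly with the number of components, so a 16-fold smaller footprint corresponds to keeping one sixteenth of the components.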

Source journal: Advances in Industrial and Manufacturing Engineering (Engineering: Engineering (miscellaneous))
CiteScore: 6.60
Self-citation rate: 0.00%
Articles published: 31
Review time: 18 days