{"title":"The LanternPredator: A Machine-learning-based Robot for Controlling the Spread of Invasive Species","authors":"Wei Gou, David Banyamin, Mark Rezk, Wictor Fedorowait, Daniel Burbano, Sasan Haghani","doi":"10.1109/ICCE59016.2024.10444478","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444478","url":null,"abstract":"Lanternflies are an invasive pest species that can cause significant economic damage to agriculture by affecting plants and crops and disrupting the balance of natural ecosystems. These insects have a fast reproduction cycle, can withstand high-temperature variations, and have no natural predators in the US, making it very difficult to control their spread. Motivated by this environmental issue, we designed the LanternPredator, an autonomous pest control robot to help control the population growth of lanternflies. The proposed solution integrates machine learning algorithms for detecting the right insect species, an acoustic stimulus to attract the insects to a zap trap, and sonar for autonomous navigation.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"65 5","pages":"1-4"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fast Adaptive Motion Vector Resolution Algorithm for AVS3","authors":"Yukun Zhang, Guoqing Xiang, Chen Li, Peng Zhang, Hao Lin, Wei Yan","doi":"10.1109/ICCE59016.2024.10444166","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444166","url":null,"abstract":"Audio Video Coding (AVS3) is one of the latest generation of video coding standards and has significantly improved the coding performance compared to its predecessor. Its efficiency is achieved by the more flexible block partitioning and new encoding tools. Among them, Adaptive Motion Vector Resolution (AMVR) is one of the tools providing efficient coding for motion vectors. It allows motion vector differences to be coded in five resolutions to reduce the encoding bits. However, it’s costly to select the optimal resolution by exhaustively performing motion estimation and rate-distortion optimization. Therefore, we propose our fast AMVR algorithm to select the most probable resolution by utilizing spatial and temporal information. Experimental results demonstrate our algorithm reduces the overall AMVR encoding time by 41%, 32%, and 53% with coding loss of 0.25%, 0.19%, and 0.27% under the low-delay B, low-delay P, and random access configuration.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"66 10","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Traditional Transformation Theory Guided Model for Learned Image Compression","authors":"Zhiyuan Li, Chenyang Ge, Shun Li","doi":"10.1109/ICCE59016.2024.10444483","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444483","url":null,"abstract":"Recently, many deep image compression methods have been proposed and achieved remarkable performance. However, these methods are dedicated to optimizing the compression performance and speed at medium and high bitrates, while research on ultra-low bitrates is limited. In this work, we propose an ultra-low-bitrate enhanced invertible encoding network guided by traditional transformation theory. Experiments show that our codec outperforms existing methods in both compression and reconstruction performance. Specifically, we introduce the Block Discrete Cosine Transformation to model the sparsity of features and employ the traditional Haar transformation to improve the reconstruction performance of the model without increasing the bitstream cost.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"65 3","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Distributed Anonymous Reputation System for V2X Communication","authors":"Shahidatul Sadiah, Toru Nakanishi","doi":"10.1109/ICCE59016.2024.10444501","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444501","url":null,"abstract":"Real-time traffic and road conditions shared by the V2X system lead to efficient urban management, but privacy is easily compromised when the shared information is bound to vehicle identification. Furthermore, false information shared by malicious vehicles affects the fairness of the information and threatens road safety. Therefore, it is essential to realize privacy-preserving trust management in V2X communication. Previously, Lu et al. proposed a blockchain-based anonymous reputation system (BARS) to establish a privacy-preserving trust model for V2X communication. However, BARS seems to have a scalability disadvantage because the certificate update processes are centralized. In this paper, a distributed anonymous reputation system for V2X communication is proposed. The proposed system distributes the task of updating the vehicles’ reputation certificates to RSUs, in which the nearest RSU updates the certificate anonymously at the end of each interval. This approach resolves the bottleneck in the certificate update process and improves the scalability.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"110 2","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comprehensive Study of Scene Text Recognition in Scene Text Image Super-Resolution with Parametric Frameworks","authors":"Supatta Viriyavisuthisakul, P. Sanguansat, Toshihiko Yamasaki","doi":"10.1109/ICCE59016.2024.10444229","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444229","url":null,"abstract":"Scene Text Recognition (STR) is a technique to detect and recognize text in images. Predicting text in real-world scene images is challenging due to various uncontrollable environmental factors. State-of-the-art text detection and recognition models leverage deep learning and Transformer architectures, consequently achieving impressive accuracy on benchmark datasets. However, challenges persist in accurately processing text within real-world images, often due to unseen data or limited datasets. Both the limitations of STR and the quality of scene text images are crucial factors. Recently, a parametric weight and multiple parametric regularizations were proposed to improve the quality of real-world scene text images. Different from previous surveys in this area, this study has three main objectives. First, to confirm the performance of the parametric methods, text recognition accuracy with and without these methods is compared across different STR methods. Second, for a comprehensive evaluation, the outcomes of each STR method are compared to show their prediction performance. Third, several existing challenges and research directions are discussed.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"95 6","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Indoor Air Quality Using Machine Learning Models","authors":"Ashay Singh, Mohaiminul Islam, Nga Dinh","doi":"10.1109/ICCE59016.2024.10444387","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444387","url":null,"abstract":"As people typically spend a significant portion of their time indoors, indoor air pollution is the primary cause of nausea, dizziness, headaches and other health issues. Therefore, indoor air quality (IAQ) monitoring and prediction is important to protect people from indoor air pollution. The indoor pollutant prediction can be efficiently tackled by using machine learning (ML) models. This paper focuses on predicting IAQ based on several important pollutants including CO2, humidity, PM10, PM2.5, temperature, and volatile organic compounds (VOC). In particular, we evaluate and compare eight ML models, namely Light Gradient Boosting Machines (LightGBM), eXtreme Gradient Boosting (XGBoost), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Regression (SVR), Decision Tree (DT), Linear Regression (LR), and Long Short-Term Memory (LSTM). These ML models are trained and then used to predict pollutants on the GAMS dataset, which is not well investigated in the literature. The evaluation of the models employs standard measures such as mean square error (MSE) and mean absolute percentage error (MAPE). Our results highlight LightGBM, SVR, and XGBoost as optimal models for IAQ prediction. Specifically, LightGBM achieves an impressive CO2 prediction MAPE score of 0.0960%. SVR demonstrates strong MAPE scores for humidity (0.0185%), temperature (0.0264%), and VOC (3.1953%) predictions. XGBoost attains notable MAPE scores of 0.0414% and 0.0399% for PM10 and PM2.5 models respectively. By advocating for their application across diverse settings (residences, offices, educational institutions, and healthcare facilities) to forecast and monitor IAQ, the study contributes to mitigating health risks tied to indoor air pollution.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"92 9","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Viewer Experience: Combining TV and Projector Technologies for a New Era of Entertainment","authors":"Junhee Woo, Daeki Kim, Jisu Lee, Sungchang Jang","doi":"10.1109/ICCE59016.2024.10444273","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444273","url":null,"abstract":"The aim of this paper is to propose a method for enhancing the user's viewing experience by projecting various patterns around the TV using a projector, based on the colors in the content playing on the TV. To achieve this, we extract representative colors from different video segments and consider data composition, transmission frequency settings, and transmission methods. Furthermore, we aim to compare and analyze the connections between multiple projectors and the TV.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"92 4","pages":"1-3"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Practical Framework for Designing and Deploying Tiny Deep Neural Networks on Microcontrollers","authors":"Brenda Zhuang, Danilo Pau","doi":"10.1109/ICCE59016.2024.10444435","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444435","url":null,"abstract":"For many applications, Deep Neural Networks (DNNs) trained on powerful CPUs and GPUs are expected to efficiently perform inference on tiny devices. However, deploying productively unconstrained complex models to microcontrollers (MCUs) remains a time-consuming task. In this paper, a comprehensive methodology is presented that combines advanced optimization techniques in hyperparameter search, model compression, and deployability evaluation using benchmark data. MCUs typically have low-power processors, limited embedded RAM memory and FLASH storage, providing orders of magnitude fewer computational resources than what cloud assets offer. Designing DNNs for such platforms requires effective strategies to balance high accuracy performance with low memory usage and inference latency. To address this challenge, Bayesian optimization, a powerful complexity-bounded technique, has been applied to hyperparameter tuning to select tiny model architecture candidates. Several pruning and quantization methods have been developed to compress all the models, and the numerical performance after compression has been evaluated. Additionally, cloud-based deployment tools have been utilized to iteratively validate the on-device memory and latency performance on off-the-shelf MCUs. Through evaluating the benchmarks against the stringent requirements of tiny devices at the edge, practical insights have been gained into these models. Multiple image classification applications have been applied on a variety of STM32 MCUs. The practical framework can: a) maintain top-1 classification accuracy within tolerance from the floating-point network after compression; b) reduce memory footprint by at least 4 times; c) reduce inference runtime significantly by avoiding external RAM usage; d) adapt to many different applications.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"88 12","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Image Restoration Method Using Semantic and Transformer Splitting Networks","authors":"Hyoungki Choi, Jinsol Choi, Heunseung Lim, Joonki Paik","doi":"10.1109/ICCE59016.2024.10444243","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444243","url":null,"abstract":"This paper delves into the hardware constraints of consumer-grade surveillance camera systems, proposing a unique network architecture that splits into four distinct branches tailored for mainstream consumer electronics. While there have been significant advancements in consumer camera technology, the financial barriers related to surveillance applications in consumer markets remain notably high. Responding to this, our research presents a state-of-the-art method, optimized for everyday consumer devices, to enhance facial regions in videos by utilizing our specialized splitting network design. This model, ideal for consumer technology applications, demonstrates the capacity to precisely reconstruct damaged facial features at the pixel level, all the while preserving the true aesthetics and authenticity of human faces. Recognizing the critical role of facial regions for personal safety in consumer settings, our solution presents a compelling answer to current challenges. This research accentuates the profound potential of advanced deep learning techniques to fortify personal safety in the modern consumer electronics landscape.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"24 5","pages":"1-4"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Strategy to Maximize the Utilization of AI Neural Processors on an Automotive Computing Platform","authors":"Kiwon Sohn, Insup Choi, Seongwan Kim, Jaeho Lee, Jungyong Lee, Joonghang Kim","doi":"10.1109/ICCE59016.2024.10444298","DOIUrl":"https://doi.org/10.1109/ICCE59016.2024.10444298","url":null,"abstract":"Advancements in AI are transforming the automotive industry, creating opportunities for AI-powered software and hardware. AI-driven features in automobiles are increasingly embraced due to their potential to significantly improve the driving experience. High-performance computing, particularly with NPUs, becomes crucial for executing the AI features. To maximize the efficiency and utilization of NPUs, DAIMO-NPU optimizes the inference sequence of the DNN models that form the backbones of the AI features. Not only does it organize and schedule the model inference tasks, but it also supports executing the tasks on heterogeneous NPU settings. Three main components are involved in the implementation of DAIMO-NPU. The schedule-table generator is responsible for creating a detailed plan for the model inference tasks, which is to be updated whenever an AI feature is added, removed, or upgraded. The onboard operator reads the schedule table and carries out the tasks accordingly. Finally, although not mandatory, dividing models into smaller segments allows the schedule table to be further optimized. In subsequent developments, the integration of additional NPU hardware properties into DAIMO-NPU will be pursued.","PeriodicalId":518694,"journal":{"name":"2024 IEEE International Conference on Consumer Electronics (ICCE)","volume":"22 6","pages":"1-4"},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140531782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}