{"title":"A Compact Tri-Modal Camera Unit for RGBDT Vision","authors":"Julian Strohmayer, M. Kampel","doi":"10.1145/3523111.3523116","DOIUrl":"https://doi.org/10.1145/3523111.3523116","url":null,"abstract":"The combination of RGBD and thermal cameras in multi-modal person-centric vision applications has great potential. As a complementary modality, thermal cameras can compensate for weaknesses such as the inability to operate in absolute darkness of conventional RGB cameras or the range limitations associated with consumer depth cameras, resulting in a more robust computer vision system. In addition, the high contrast between persons and their surroundings in thermal images can ease fundamental detection and segmentation tasks. Unfortunately, the market supply of low-cost consumer RGBDT vision systems is non-existent at the moment, which slows down progress in the field of person-centric vision. We address this problem by proposing a Compact Tri-modal CAmera uniT (CTCAT) for RGBDT vision, which can be manufactured from off-the-shelf components and 3D printed parts. CTCAT features a 1280 × 720 RGB camera, a 640 × 480 structured light depth camera with an operating range of 0.6 − 8m, and a 160 × 120 uncooled radiometric thermal camera. RGB, depth, and thermal images can be captured simultaneously at frame rates up to 9 fps. In this work, we describe the components, fabrication, and calibration of CTCAT. In addition, a new multi-modal calibration target suitable for the geometric calibration of RGB, depth, and thermal cameras is presented, which offers advantages over the state of the art in terms of contrast and practicality. 
Moreover, radiometric calibration of CTCAT is performed to evaluate the applicability to person-centric vision applications requiring radiometry.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124701459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stock Volatility Forecast Base on Comparative Learning and Autoencoder Framework","authors":"Yuxiao Du, Qinyu Li, Zeyu Zhang, Yuxin Liu","doi":"10.1145/3523111.3523126","DOIUrl":"https://doi.org/10.1145/3523111.3523126","url":null,"abstract":"Volatility is an important indicator for derivatives pricing, financial risk measurement, and the measurement of market panic sentiment. A reasonable prediction of volatility is of great significance to market participants and regulators. This article proposes a new volatility forecast model. We use comparative learning and autoencoders to improve the accuracy and robustness of the model and to reduce the instability of financial data caused by noise. This article also extends traditional machine learning research methods. The traditional model is compared with other deep learning models. Our model is very competitive in accuracy and loss compared to other models.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121776975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transfer Learning based Precise Pose Estimation with Insufficient Data","authors":"Wonje Choi, Honguk Woo","doi":"10.1145/3523111.3523118","DOIUrl":"https://doi.org/10.1145/3523111.3523118","url":null,"abstract":"With the recent advances in computer vision techniques and the growing utility of real-time human pose detection and tracking, deep learning-based pose estimation has been intensively studied in recent years. These studies rely on large-scale datasets of human pose images, for which expensive annotation jobs are required due to the complex spatial structure of pose keypoints. In this work, we present a transfer learning-based pose estimation model that leverages low-cost synthetic datasets and regressive domain adaptation, enabling sample-efficient learning of precise human poses. In evaluation, we demonstrate that our model achieves highly accurate pose estimation on a dataset of golf swing images, which is targeted for a virtual golf coaching application.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125124877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Study of Emotional Brain to Detect Emotions Using Brain EEG Signals and Improving Accuracy of Emotion Detection System Using Feature Selection Techniques","authors":"N. Kimmatkar, V. Babu","doi":"10.1145/3523111.3523115","DOIUrl":"https://doi.org/10.1145/3523111.3523115","url":null,"abstract":"Nowadays, emotion detection using brain EEG signals is becoming an area of interest for many researchers because of its tremendous applications in healthcare and the BCI field. Database acquisition, pre-processing, feature extraction, and classification are the main stages in this process. In this research study, existing databases of brain EEG signals are studied first. Most researchers have used the DEAP database for emotion classification. The DEAP database was made specifically for music recommendation systems. Because of the non-linear and non-stationary nature and poor spatial resolution of brain EEG signals, researchers face challenges in each phase of the emotion detection process. It is found that the classification accuracy is very low. It therefore becomes necessary to study the emotional brain and to select electrodes for emotion detection accordingly in order to improve classification accuracy. In this research study, a self-created dataset is used. A two-way approach is used for feature selection to improve accuracy. In the first approach, the least correlated features are omitted from the feature set, and in the second approach, the recursive feature elimination (RFE) technique is used for feature ranking. The highly ranked features are included in the feature set. 
It is found that classification accuracy is improved using these techniques.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129097690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous Integration of Multimodal Interfaces for Generating Structured and Reliable Robotic Task Configurations","authors":"S. K. Paul, P. Hoseini, Arjun Vettath Gopinath, M. Nicolescu, M. Nicolescu","doi":"10.1145/3523111.3523120","DOIUrl":"https://doi.org/10.1145/3523111.3523120","url":null,"abstract":"This paper presents a framework that simultaneously integrates multiple input interfaces and extracts task parameters suitable for task execution in a human-robot collaborative environment. We used pointing gestures and natural language instruction as inputs as they provide the most natural interaction interfaces for humans. In the proposed method, the pointing gesture type and the pointing direction are estimated from RGB images, and the object being pointed at is inferred from the prior gesture information and the objects detected in the scene. Subsequently, the verbal command is parsed to extract task action, the object of interest along with its attributes and position in the 2D image frame. This extracted information from gesture recognition and verbal command is used to form task configurations for the desired human-robot collaborative tasks as well as to help resolve any uncertain or missing task parameters. 
The proposed framework shows very promising results in identifying the relevant task parameters for the intended robotic tasks in different real-world interaction scenarios.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128480587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Face Representation Learning Conditioned on the Soft Biometrics","authors":"JongWon Hwang, L. Tiong, A. Teoh","doi":"10.1145/3523111.3523112","DOIUrl":"https://doi.org/10.1145/3523111.3523112","url":null,"abstract":"In this paper, we present a method to leverage soft biometrics as a means of conditioning biometrics for better face representation learning. By conditioning, we mean that the soft biometric trait (age, gender, etc.) is used as an auxiliary biometric for training along with the face modality, while it is absent during the inference stage. We propose a two-stream deep neural network consisting of a multilayer perceptron network (MLP) and a convolutional neural network (CNN), which learn feature representations from soft biometric vectors and face images, respectively. The two-stream network can be optimized simultaneously, and information from both biometrics can be exploited. The learned conditioning soft biometric representation from the MLP serves as a center prototype of the feature learned from the face network, which helps to contract the intra-class variation of the face feature representation. Owing to the lack of face datasets that come with soft biometrics, we construct a database for evaluation purposes. 
Extensive experiments are performed on two face datasets equipped with soft biometrics, and the results show the superiority of our method compared to the face modality alone.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"349 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134074409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis and Example Implementation of Data Visualization Technology","authors":"Xianyu Meng, Liangli Ma, Yingxue Zhou","doi":"10.1145/3523111.3523119","DOIUrl":"https://doi.org/10.1145/3523111.3523119","url":null,"abstract":"The development of visualization technology and data mining technology provides powerful means and tools for the visual analysis of diversified data. In this era of information and data explosion, various information resources are very rich, but due to the large amount of data, the characteristics of the data are not so obvious. This requires us to study the corresponding methods and means to extract the characteristics of data. Data visualization technology has developed rapidly under this background. This paper introduces the basic theory of data visualization technology, and selects the GDP data of the world's major economies over the years as a visualization example.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134282954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depth and Thermal Images in Face Detection - A Detailed Comparison Between Image Modalities","authors":"Wiktor Mucha, M. Kampel","doi":"10.1145/3523111.3523114","DOIUrl":"https://doi.org/10.1145/3523111.3523114","url":null,"abstract":"Face detection is a well-known issue in image processing, and numerous studies are present in this field. A prominent part of the work is devoted to RGB images, leaving depth and thermal data with less interest. However, in some conditions like low-light areas where face detection is needed, non-RGB sensors might perform better. Also, mounting an additional RGB camera could be challenging or not possible, considering privacy concerns. In this work, current deep learning methodologies are employed to train depth and thermal detection models. The training is done using combined publicly available data that is processed by us for this purpose in order to create necessary annotations for a learning process. The resulting models are validated on a new trimodal dataset collected for the purpose of these experiments. It contains images captured with RGB, depth, and thermal sensors. Various scenes with single and multiple faces are included. The results show that non-RGB solutions can be applied in practice with highly robust accuracy and their efficiency is close to RGB detectors. 
However, their performance depends on the environment, and these circumstances are described later in this article.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133641162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ghosting Effect Removal for Multi-Frame Super-Resolution on CCTV Videos with Moving Objects","authors":"Jarrett Ethan Singian, Jade Nicole Tan, Martin Angelo Tierro, Neil Patrick Del Gallego, J. Ilao, Arren Matthew C. Antioquia","doi":"10.1145/3523111.3523117","DOIUrl":"https://doi.org/10.1145/3523111.3523117","url":null,"abstract":"With the increased use of closed-circuit television (CCTV) footage for security and surveillance purposes as well as for object or person recognition and efficiency monitoring, high-quality CCTV videos are necessary. In this paper, we propose Corgi Eye, a moving object removal + super-resolution framework for enhancing CCTV footage to remove ghosting artifacts caused by performing multi-frame super-resolution (MISR) on moving objects. Our method extends the framework of Eagle Eye, which is an existing MISR framework tailored for mobile devices. Our results demonstrate that the system can completely remove ghosting effects caused by moving objects while performing MISR on CCTV footage. Our proposed method demonstrates competitive performance when compared to Eagle Eye, achieving a 16% increase in terms of PSNR metric. Additionally, our method can produce clear images, on par with deep learning approaches such as ESPCN and SOF-VSR.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129642117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Propensity Score Matching on Discrete Treatment: Beijing Pm2.5 Case Study","authors":"J. Hou, Shaofei Shen, Jing Han, Siqi Xu, Yijing Liu","doi":"10.1145/3523111.3523125","DOIUrl":"https://doi.org/10.1145/3523111.3523125","url":null,"abstract":"In causal inference, propensity score matching (PSM) is an effective method to estimate the causal effect between treatment and potential outcomes. PSM with binary treatment has been widely used in medicine, economics, and sociology to evaluate the influence of treatment on the potential outcomes. However, binary treatment is a special case of discrete treatment. Multi-level treatment is also a universal case of discrete treatment. Therefore, this paper focuses on discrete treatment (from binary to multi-level) effect estimation by the propensity score matching method. In the procedure of propensity score matching, apart from the logistic model, other machine learning models can be applied to estimate the propensity score for different types of treatment. This paper aims to combine machine learning models with propensity score matching and apply the methods to the Beijing pm 2.5 dataset.","PeriodicalId":185161,"journal":{"name":"Proceedings of the 2022 5th International Conference on Machine Vision and Applications","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116217792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}