{"title":"LCNet: Location Combination for Object Detection","authors":"Xin Yi, Bo Ma","doi":"10.1145/3529570.3529596","DOIUrl":"https://doi.org/10.1145/3529570.3529596","url":null,"abstract":"Object detection is a widely studied task in the computer vision field. In recent years, several milestone approaches and solid benchmarks have been proposed, which have significantly boosted the development of related research. Previous object detection methods follow a common paradigm: the classification head and the regression head share the same feature extracted by the backbone network. In this paper, we revisit this paradigm for two-stage detectors and show that the regression head can achieve better results by using local features. In our proposed Location Combination Networks (LCNet), we extract the effective region of the feature in a Laplace way, and we introduce an auxiliary confidence gain loss, an Intersection over Union (IoU) gain loss, and a distribution loss to guide its convergence. In the classification head, we combine these local features into the global feature for better classification. In the regression head, by ranking these effective regions in the spatial dimension, we can select the local features closest to each foreground boundary and use the selected features to predict the offset of each foreground boundary. Finally, we combine the locations of the four boundaries to obtain the final bounding box prediction. Extensive experimental results on the MS COCO benchmark validate the effectiveness of our proposed method.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131990430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference and Prediction in Big Data Using Sparse Gaussian Process Method","authors":"Leta Yobsan Bayisa, Weidong Wang, Qing-xian Wang, Meseret Debele Gurmu, Lamessa Bona Debela","doi":"10.1145/3529570.3529580","DOIUrl":"https://doi.org/10.1145/3529570.3529580","url":null,"abstract":"Gaussian processes are computationally expensive for large datasets, and a lack of flexibility to model different datasets is a common problem when applying them. We introduce sparse Gaussian regression with a combination of designed kernels to reduce the computational complexity of a traditional Gaussian process by taking pseudo inputs from large datasets and developing a model with better accuracy, which enables Gaussian process application. We design a better combination of kernels that can fit most of our data points. We demonstrate the approach on a large weather dataset and a sales record dataset. Both are open-source big datasets available online. Numerous experiments and comparisons with traditional Gaussian process methods using both large datasets demonstrate the efficiency and accuracy of sparse Gaussian processes.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114436944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direction-of-Arrival Estimation of Acoustic Sources Using Acoustic Array Based on SOM and BP Neural Network","authors":"Baoliang Sun, C. Jiang, Yuguang Song, K. Xue, Weike Shi","doi":"10.1145/3529570.3529605","DOIUrl":"https://doi.org/10.1145/3529570.3529605","url":null,"abstract":"A direction-of-arrival (DOA) estimation algorithm for acoustic sources using an acoustic array, based on a self-organizing feature map (SOM) and back propagation neural networks (BPNN), is proposed in this paper. Based on time difference of arrival (TDOA), this algorithm maps TDOA vectors with similar topology into one spatial zone and obtains the characteristic TDOA vector of this spatial zone. This characteristic TDOA vector is then input into the BPNN to obtain the DOA estimate. The blind zone of the array, in which the sound localization error of the acoustic array increases dramatically, was identified by analyzing the sound localization of a rectangular pyramid array of five sensors. However, the proposed DOA estimation algorithm can separate the blind zone from the detectable zone, improving the DOA estimation accuracy for acoustic sources in different regions. Simulation tests and an actual experiment demonstrated that the algorithm has high DOA estimation accuracy and robustness.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129444054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Frequency-Dependent Head-Related Transfer Functions Modeling Approach Based on Spherical Harmonic Expansion","authors":"Yunan Wang, Hongbo Zhao, W. Feng, Dingding Yao","doi":"10.1145/3529570.3529603","DOIUrl":"https://doi.org/10.1145/3529570.3529603","url":null,"abstract":"Modeling head-related transfer functions (HRTFs) using spherical harmonic (SH) expansion is an efficient solution for HRTF-related tasks, such as interpolation and binaural rendering. However, the accurate reconstruction of HRTFs requires a large number of SH coefficients. To model HRTFs for accurate perceptual localization performance with fewer SH expansion coefficients, this study proposes a frequency-dependent HRTF modeling approach that uses a higher-order SH expansion for the frequency regions that play more important roles in sound localization. The reconstructed HRTFs are then evaluated by an auditory model that predicts psychoacoustic measures of localization performance. The experimental results show that the proposed method can achieve better HRTF reconstruction for sound source localization with fewer additional SH coefficients, and can thus be used to reduce the complexity of binaural playback for spatial audio applications.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129444664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human Motion Generation Using Variational Recurrent Neural Network","authors":"Makoto Murakami, Takahiro Ikezawa","doi":"10.1145/3529570.3529588","DOIUrl":"https://doi.org/10.1145/3529570.3529588","url":null,"abstract":"Human motion control, editing, and synthesis are important tasks in creating 3D computer graphics video games or movies, because in most of them some characters act like humans. The purpose of this study is to construct a system which can generate various natural character motions. In this study, we consider that the process of human motion generation is complicated and non-linear, and that it can be modeled by a deep neural network. Since the motion generation process (the deep neural network parameters) cannot be observed directly, it needs to be estimated by learning from observable human motion data recorded by a motion capture system. On the other hand, the process of inference, which is the opposite of generation, is also expressed by a deep neural network. Inference and generation are performed on human motion data, and the parameters of both deep neural networks are optimized based on the criterion that the original motion should be recovered through the inference and generation processes. In this study, we constructed a human motion generative model using a recurrent neural network and variational autoencoders, and confirmed that various human motions can be generated from a low-dimensional latent space.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128485217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Effective Method for Weak Multi-target Detection and Tracking in Clutter Environment","authors":"Chun Li, X. Bai, Juan Zhao, T. Shan","doi":"10.1145/3529570.3529593","DOIUrl":"https://doi.org/10.1145/3529570.3529593","url":null,"abstract":"Weak target detection and tracking is a difficult problem, especially with multiple targets and strong clutter. Track-before-detect (TBD) is a common method for dealing with this problem, and this paper proposes a new effective method based on TBD. First, the keystone transform (KT) and phase gradient autofocus (PGA) are used for migration compensation to improve the signal-to-noise ratio (SNR) of moving targets. Then, dynamic-programming-based TBD (DP-TBD) with joint intensity-spatial CFAR (J-CA-CFAR) is presented for noncoherent integration, where J-CA-CFAR uses both intensity and spatial information to achieve automatic target detection. Finally, the effectiveness of the proposed method is demonstrated by experimental results on real data.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125293051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Iriscode Matching Comparator to Improve Decidability of Human Iris Recognition","authors":"Yenlung Lai, Tong-Yuen Chai, MingJie Lee, B. Goi","doi":"10.1145/3529570.3529591","DOIUrl":"https://doi.org/10.1145/3529570.3529591","url":null,"abstract":"The iris has been widely recognized as one of the strongest biometrics owing to its high accuracy. However, any compromise of iris data potentially leads to severe security and privacy issues because the human iris is permanently linked to an individual and is not revocable. Existing protection schemes protect the iris data at the expense of decreased accuracy. This paper introduces a new protection scheme to generate a protected template from iris data that can be safely stored in the database for future authentication. Experimental results showed that the proposed scheme enjoys a particular S-curve property required to offer strong system security while ensuring high system usability in terms of low false acceptance and false rejection rates.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134412347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Kindergarten Learning Environment via Interactive Projection Design: A Concept Framework","authors":"Bing Lai","doi":"10.1145/3529570.3529585","DOIUrl":"https://doi.org/10.1145/3529570.3529585","url":null,"abstract":"A child's thoughts, feelings, and conduct are influenced by their physical environment. The phrase \"physical environment\" relates to how structures such as classrooms and schools are organised and designed. As a result, a comfortable kindergarten atmosphere is critical for increasing children's productivity, learning, and well-being. The advancement of digital technology has significantly improved the living conditions of children. Building on the rapid growth of interactive technology, the aim is to create a system that enables children to interact autonomously while learning and that provides multiple interactive modalities, as well as intuitive interactive spaces, in kindergarten. The purpose of this paper is to identify the planning criteria for specific interactive projection methods used in kindergarten, to present the fundamental design concepts, and to discuss various aspects of the interactive projection mechanism, with the goal of providing a safe living space, entertainment, and learning for children's experience development in terms of motivation, self-involvement, joy, physical needs, communication, and a balanced flow of experiences. It is intended that this conceptual framework will provide some direction for kindergarten instructors and designers in improving the quality of the physical environment, particularly in providing interactive environments for children that fulfil current needs.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134506739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multilinear Compressed Sensing using Tensor Least Angle Regression (T-LARS)","authors":"Ishan Wickramasingha, S. Sherif","doi":"10.1145/3529570.3529571","DOIUrl":"https://doi.org/10.1145/3529570.3529571","url":null,"abstract":"Multilinear compressed sensing generalizes the compressed sensing formulation to tensor signals, where the tensor signal is reconstructed using far fewer samples obtained in a sparse domain by solving a multilinear sparse coding problem. Kronecker-OMP, a generalization of Orthogonal Matching Pursuit (OMP), solves L0-constrained multilinear sparse least-squares problems. However, the space and computational cost of Kronecker-OMP increase polynomially with the problem dimensions and the number of iterations. The authors have previously developed a generalization of least-angle regression (LARS), known as Tensor Least Angle Regression (T-LARS), with lower asymptotic space and computational complexity than Kronecker-OMP, to efficiently solve both L0- and L1-constrained multilinear sparse least-squares problems. In this paper, we use T-LARS to solve multilinear compressed sensing problems and compare the results with Kronecker-OMP; T-LARS is 56 times faster than Kronecker-OMP in reconstructing 3D PET-CT images from compressed sensing samples.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130407414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploration of Depth Algorithm Applied to Time-Frequency Image Processing Method of ECG Signal","authors":"Peng-yu Ran, Jinjie Xie, Jingwen Wang","doi":"10.1145/3529570.3529608","DOIUrl":"https://doi.org/10.1145/3529570.3529608","url":null,"abstract":"The classification of arrhythmia is of great significance for the prevention and treatment of heart disease. Deep learning algorithms have excellent performance in image classification and recognition. The ECG signal is divided into two cases, abnormal interval and abnormal amplitude, to perform signal image classification. The time-domain abnormal signal is directly processed into a two-dimensional image set, and the time-domain information of the amplitude-abnormal signal is Fourier transformed to obtain a two-dimensional time-frequency image set. The different image sets are then transferred to the VGG16 model. After reduction by the PCA algorithm, the model can clearly distinguish between normal ECG signals and ECG signals with abnormal intervals or abnormal amplitudes. Finally, after a fine-tuned fully connected layer, the abnormal intervals and amplitudes can be obtained. The accuracy rates of abnormal classification were 96.15% and 92.98%, respectively. After the image processing of the ECG signal, this method can effectively distinguish the abnormal signal from the normal signal.","PeriodicalId":430367,"journal":{"name":"Proceedings of the 6th International Conference on Digital Signal Processing","volume":"407 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123381308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}