{"title":"Stereo Matching and Image Inpainting Based on Binocular Camera","authors":"Yibo Du, Ke-bin Jia, Chang Liu","doi":"10.1109/APSIPAASC47483.2019.9023172","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023172","url":null,"abstract":"Stereo matching is one of the key technologies in the field of computer vision. The depth map obtained by stereo matching contains the three-dimensional information of the scene. The use of depth map is of great significance in the three-dimensional reconstruction of the map and the autonomous navigation of the robot. Aiming at the accuracy and speed of stereo matching, this paper applies a semi-global stereo matching method to match corrected left and right perspective images. Because there are noise points and holes in the matched disparity map, which affect the image quality, a sample block filling method which combines mean filtering and point-by-point scanning is proposed to repair the image. Then a gradient priority selection mechanism is proposed to maintain the edge structure of the object in the process of restoration. Experimental results show that the proposed method is good for the restoration of holes and noises in disparity maps, and the processing speed is improved by about 30% compared with the traditional Criminisi algorithm.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132949349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Study of Chinese Text Steganography using Typos","authors":"Linna Zhou, Derui Liao","doi":"10.1109/APSIPAASC47483.2019.9023029","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023029","url":null,"abstract":"Nowadays, with the information explosion and the rapid development of information technology, huge amounts of data are generated every day on the Internet. Much of the text provided online contains many typos, which are very common among individual users, self-media, etc. However, humans are naturally good at disambiguation, so these typos often do not hinder understanding of the text, and some typos are even difficult to notice. This phenomenon appears in both English and Chinese, so it seems to be cross-lingual. Therefore, in such texts, it is not surprising that one can hide information by judiciously injecting typos. We studied Chinese typos in text content on Weibo or WeChat, and propose a text steganography method based on Chinese typos with the help of NLP, which embeds secret information through carefully injected typos while guaranteeing the security of the secret and the readability of the text. Unlike format-based steganography algorithms, our algorithm can resist format adjustments, OCR re-inputs, etc. Furthermore, the Weibo and WeChat platforms contain many kinds of media, so by combining other algorithms, cross-media or even cross-social-network information hiding is practical.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133037425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Random Signal Estimation by Ergodicity associated with Linear Canonical Transform","authors":"Liyun Xu","doi":"10.1109/APSIPAASC47483.2019.9023088","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023088","url":null,"abstract":"The linear canonical transform (LCT) provides a general mathematical tool for solving problems in optical and quantum mechanics. For random signals, which are bandlimited in the LCT domain, the linear canonical correlation function and the linear canonical power spectral density can form an LCT pair. The linear canonical translation operator, which is used to define the convolution and correlation functions, also plays a significant role in the analysis of random signal estimation. Firstly, the eigenfunctions that are invariant under the linear canonical translation, and the unitarity of this operator, are discussed. Secondly, it is shown that these properties connect the LCT sampling theorem and the von Neumann ergodic theorem in the sense of distribution, which yields an estimation method for the power spectral density of a chirp stationary random signal from one sampled signal in the LCT domain. Finally, the potential applications and future work are discussed.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133404832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speaker Embedding Extraction with Multi-feature Integration Structure","authors":"Zheng Li, Hao Lu, Jianfeng Zhou, Lin Li, Q. Hong","doi":"10.1109/APSIPAASC47483.2019.9023103","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023103","url":null,"abstract":"Recently, the x-vector has achieved promising performance on the speaker verification task and has become one of the mainstream systems. In this paper, we analyze feature engineering based on the x-vector structure and propose a multi-feature integration method to further improve the feature representation of speaker characteristics. The proposed multi-feature integration method can be implemented in two ways, with symmetric branches or asymmetric branches, to incorporate different types of acoustic features in one neural network. While each branch processes one type of acoustic feature at the frame level, the outputs of the two branches for each frame are spliced together as a super vector before being input into the statistics pooling layer. The experiments were conducted on the VoxCeleb1 data set, and the results show that the proposed multi-feature integration method obtains a 22.8% relative improvement over the baseline in EER.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133308274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization-Based Fundus Image Decomposition for Diagnosis Support of Diabetic Retinopathy","authors":"D. Kitahara, Swathi Ananda, A. Hirabayashi","doi":"10.1109/APSIPAASC47483.2019.9023233","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023233","url":null,"abstract":"Diabetes mellitus often leads to a serious eye disease called diabetic retinopathy, which is one major cause of blindness among adults. Since this blindness can be prevented if the diabetic retinopathy is detected at an early stage and appropriate medical treatment is provided, routine screening tests with fundus images are very important. However, as the number of diabetic patients increases, the routine screening tests are becoming big burdens for ophthalmologists. To reduce these burdens, in this paper, we propose a diagnosis support method by using convex optimization. The proposed method decomposes a green channel fundus image into a basic image composed of non-disease parts, a positive image including exudates, and a negative image including hemorrhages. Numerical experiments show the effectiveness of our method.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132103134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study on Mispronunciation Detection Based on Fine-grained Speech Attribute","authors":"Minghao Guo, Cai Rui, Wei Wang, Binghuai Lin, Jinsong Zhang, Yanlu Xie","doi":"10.1109/APSIPAASC47483.2019.9023156","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023156","url":null,"abstract":"Over the last decade, several studies have investigated speech attribute detection (SAD) for improving computer assisted pronunciation training (CAPT) systems. The predefined speech attribute categories are either IPA-based or language-dependent, which makes it difficult to handle mispronunciation detection across multiple languages. In this paper, we propose a fine-grained speech attribute (FSA) modeling method, which defines types of Chinese speech attributes by combining Chinese phonetics with the international phonetic alphabet (IPA). To verify FSA, a large-scale Chinese corpus was used to train time-delay neural network (TDNN) based speech attribute models, and the trained models were tested on a Russian learner data set. Experimental results showed that the accuracy of all FSAs on the Chinese test set is about 95% on average, and the diagnosis accuracy of the FSA-based mispronunciation detection achieved a 2.2% improvement compared to that of the segment-based baseline system. Besides, as the FSA is theoretically capable of modeling language-universal speech attributes, we also tested the trained FSA-based method on a native English corpus, where it achieved an accuracy of about 50%.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115668976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-band Spectral Entropy Information for Detection of Replay Attacks","authors":"Yitong Liu, Rohan Kumar Das, Haizhou Li","doi":"10.1109/APSIPAASC47483.2019.9023062","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023062","url":null,"abstract":"Replay attacks have been proven to be a potential threat to practical automatic speaker verification systems. In this work, we explore a novel feature based on spectral entropy for the detection of replay attacks. Spectral entropy is a measure that captures spectral distortions and flatness. It is found that replay speech carries artifacts from the process of recording and playback. We hypothesize that spectral entropy can provide useful information to capture such artifacts. In this regard, we explore a multi-band spectral entropy feature for replay attack detection. The studies are conducted on the ASVspoof 2017 Version 2.0 database, which deals with replay speech attacks. A baseline system with the popular constant-Q cepstral coefficient (CQCC) feature is also developed. Finally, a combined system with multi-band spectral entropy and CQCC features is proposed that outperforms the baseline. The experiments validate the idea of the multi-band spectral entropy feature.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114712411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Universal Intelligence Measurement Method Based on Meta-analysis","authors":"Zheming Yang, Wen Ji","doi":"10.1109/APSIPAASC47483.2019.9023076","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023076","url":null,"abstract":"The multiple factors of intelligence measurement are critical in intelligent science. Intelligence measurement is typically built on a model based on these multiple factors. Different digital selves are generally difficult to measure due to the uncertainty among multiple factors. Effective methods for universal intelligence measurement are therefore important for different digital selves. In this paper, we propose a universal intelligence measurement method based on meta-analysis. Firstly, we retrieve study data through keywords in a database and delete the low-quality data. Secondly, after encoding the data, we compute the effect value by odds ratio, relative risk and risk difference. Then we test the homogeneity by Q-test and analyze the bias by funnel plots. Thirdly, we select the fixed-effect and random-effect models as statistical models. Finally, simulation results confirm that our method can effectively handle the multiple factors of different digital selves, especially for the intelligence of humans, machines, companies, governments and institutions.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123226011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Restoration of Minute Light Emissions Observed by Streak Camera Based on N-CUP Method","authors":"Masahiro Tsumori, S. Nagai, Ryosuke Harakawa, Toru Sasaki, M. Iwahashi","doi":"10.1109/APSIPAASC47483.2019.9023298","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023298","url":null,"abstract":"To observe high-speed phenomena such as discharge plasma, it is necessary to restore minute light emissions from an image observed by a streak camera, which includes multiple light emissions at each time. The CUP method has been proposed for restoring minute light emissions via a compressed sensing scheme; however, artefacts can occur in the restoration results depending on the initial values of the optimization for restoration. To overcome this limitation, this paper proposes the N-CUP method, which enables successful restoration of minute light emissions. The N-CUP method estimates initial values suitable for the optimization by iteratively performing the CUP method. Through simulation using image datasets emulating phenomena of fundamental light emissions, it was confirmed that the N-CUP method obtained successful restoration results.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123679639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Training ResCNN-based Voice Activity Detection with Speech Enhancement","authors":"Tianjiao Xu, Hui Zhang, Xueliang Zhang","doi":"10.1109/APSIPAASC47483.2019.9023101","DOIUrl":"https://doi.org/10.1109/APSIPAASC47483.2019.9023101","url":null,"abstract":"Voice activity detection (VAD) is considered a solved problem in noise-free conditions, but it is still a challenging task in low signal-to-noise ratio (SNR) conditions. Intuitively, reducing noise will improve VAD. Therefore, in this study, we introduce a speech enhancement module to reduce noise. Specifically, a convolutional recurrent neural network (CRN) based encoder-decoder speech enhancement module is trained to reduce noise. Then the low-dimensional feature code from its encoder, together with the raw spectrum of the noisy speech, is fed into a deep residual convolutional neural network (ResCNN) based VAD module. The speech enhancement and VAD modules are connected and trained jointly. To balance the training speed of the two modules, an empirical dynamic gradient balance strategy is proposed. Experimental results show that the proposed joint-training method has obvious advantages in generalization ability.","PeriodicalId":145222,"journal":{"name":"2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117241831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}