2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE), 6 July 2016.

"Improving the convergence of co-training for audio-visual person identification"
Nicolai Bæk Thomsen, Xiaodong Duan, Z. Tan, B. Lindberg, S. H. Jensen. DOI: 10.1109/SPLIM.2016.7528400.
Abstract: Person identification is a very important task for intelligent devices when communicating or interacting with humans. A potential problem in real applications is that the amount of enrollment data is insufficient. When multiple modalities are available, it is possible to re-train the system online by exploiting the conditional independence between the modalities and thus improve classification accuracy. This can be achieved by the well-known co-training algorithm [1]. In this work we present a novel modification to the co-training algorithm, concerned with how new observations/samples are chosen at each iteration to re-train the system in order to improve the classification accuracy faster, i.e., to achieve better convergence. In our method, the new data are chosen based not only on the score from the other modality but also on the score from the modality itself. We demonstrate the proposed method on a multimodal person identification task using the MOBIO database and show that it outperforms the baseline method, in terms of convergence, by a large margin.
{"title":"Towards neural art-based face de-identification in video data","authors":"K. Brkić, T. Hrkać, I. Sikirić, Z. Kalafatić","doi":"10.1109/SPLIM.2016.7528406","DOIUrl":"https://doi.org/10.1109/SPLIM.2016.7528406","url":null,"abstract":"We propose a computer vision-based pipeline that enables altering the appearance of faces in videos. Assuming a surveillance scenario, we combine GMM-based background subtraction with an improved version of the GrabCut algorithm to find and segment pedestrians. Independently, we detect faces using a standard face detector. We apply the neural art algorithm, utilizing the responses of a deep neural network to obfuscate the detected faces through style mixing with reference images. The altered faces are combined with the original frames using the extracted pedestrian silhouettes as a guideline. Experimental evaluation indicates that our method has potential in producing de-identified versions of the input frames while preserving the utility of the de-identified data.","PeriodicalId":297318,"journal":{"name":"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126790753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GMM-based speaker gender and age classification after voice conversion","authors":"J. Pribil, A. Přibilová, J. Matoušek","doi":"10.1109/SPLIM.2016.7528391","DOIUrl":"https://doi.org/10.1109/SPLIM.2016.7528391","url":null,"abstract":"This paper describes an experiment using the Gaussian mixture models (GMM) for classification of the speaker gender/age and for evaluation of the achieved success in the voice conversion process. The main motivation of the work was to test whether this type of the classifier can be utilized as an alternative approach instead of the conventional listening test in the area of speech evaluation. The proposed two-level GMM classifier was first verified for detection of four age categories (child, young, adult, senior) as well as discrimination of gender for all but children's voices in Czech and Slovak languages. Then the classifier was applied for gender/age determination of the basic adult male/female original speech together with its conversion. The obtained resulting classification accuracy confirms usability of the proposed evaluation method and effectiveness of the performed voice conversions.","PeriodicalId":297318,"journal":{"name":"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134232519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

"Cycled merging registration of point clouds for 3D human body modeling"
Yanjie Chen, Yuhong Li, F. Qi, Zhanyu Ma, Honggang Zhang. DOI: 10.1109/SPLIM.2016.7528394.
Abstract: In this paper, we present a cycled merging registration method based on Iterative Closest Point (ICP). We capture the point clouds with a static Kinect while the object rotates on a turntable. Scans from different views are combined by ICP to obtain a globally consistent human model. Our method simplifies the successive registration process that is usually used to solve multi-view registration from a single cycle. The main contribution of this paper is a pairwise-to-global registration method, which aligns several partially integrated views in a merging order. Our method is consistent with cycled registration constraints that are suitable for non-rigid registration. After all point clouds are merged, the surface of the model is estimated by Moving Least Squares (MLS). A model of part of a non-rigid human body is constructed in our experiments.

"Cancelable biometrics for finger vein recognition"
Emanuela Piciucco, E. Maiorana, Christof Kauba, A. Uhl, P. Campisi. DOI: 10.1109/SPLIM.2016.7528396.
Abstract: Cancelable biometrics is one of the possible solutions to security and privacy problems in biometrics-based recognition systems. In this paper we propose the use of two classical transformations, block re-mapping and image warping, to define cancelable biometrics from finger vein pattern images. Specifically, we investigate the impact of the employed distortions on matching performance, as well as the effects of their parameter selection. An analysis of the renewability of the employed approaches is also provided. Performance comparable with that achieved by the unprotected approach can be reached with the block re-mapping transformation, which also provides renewability.

"Effect of multi-condition training and speech enhancement methods on spoofing detection"
Hong Yu, A. K. Sarkar, Dennis Alexander Lehmann Thomsen, Z. Tan, Zhanyu Ma, Jun Guo. DOI: 10.1109/SPLIM.2016.7528399.
Abstract: Many researchers have demonstrated the good performance of spoofing detection systems under clean training and testing conditions. However, it is well known that the performance of speaker and speech recognition systems degrades significantly in noisy conditions. It is therefore of great interest to investigate the effect of noise on spoofing detection. In this paper, we investigate a multi-condition training method in which spoofing detection models are trained on a mix of clean and noisy data. In addition, we study the effect of different noise types as well as speech enhancement methods on a state-of-the-art spoofing detection system based on dynamic linear frequency cepstral coefficient (LFCC) features and a Gaussian mixture model maximum-likelihood (GMM-ML) classifier. In the experiments we consider three additive noise types (canteen, babble, and white Gaussian noise) at different signal-to-noise ratios, and two mainstream speech enhancement methods, Wiener filtering and minimum mean-square error estimation. The experimental results show that the enhancement methods are not suitable for the spoofing detection task, as spoofing detection accuracy drops after speech enhancement. Multi-condition training, however, shows potential for reducing spoofing detection error rates.

"Piecewise linear definition of transformation functions for speaker de-identification"
Carmen Magariños, Paula Lopez-Otero, Laura Docío Fernández, E. R. Banga, C. García-Mateo, D. Erro. DOI: 10.1109/SPLIM.2016.7528408.
Abstract: The main drawback of speaker de-identification approaches based on voice conversion is the need for parallel corpora to train transformation functions between source and target speakers. In this paper, a voice conversion approach that does not require training any parameters is proposed: it consists of manually defining frequency warping (FW) based transformations using piecewise linear approximations. We analyze the de-identification capabilities of the proposed approach using FW alone or combined with FW modification and spectral amplitude scaling (AS). Experimental results show that, with the manually defined transformations using FW only, it is not possible to obtain de-identified yet natural-sounding speech. When the FW is modified, however, both de-identification accuracy and naturalness increase to a great extent. A slight improvement in de-identification was also obtained when applying spectral amplitude scaling.

"Employing speech and location information for automatic assessment of child language environments"
M. Najafian, Dwight W. Irvin, Ying Luo, B. Rous, J. Hansen. DOI: 10.1109/SPLIM.2016.7528412.
Abstract: Assessing the language environment of children in early childhood is a challenging task for both humans and machines, and understanding the classroom environment of early learners is an essential step towards facilitating language acquisition and development. This paper explores an approach to intelligent language environment monitoring based on the duration of child-to-child and adult-to-child conversations and on a child's physical location in classrooms within a childcare center. The amount of each child's communication with other children and adults was measured using an i-vector based child-adult diarization system (developed at CRSS). Furthermore, the average time spent by each child in different activity areas within the classroom was measured using a location tracking system. The proposed solution offers unique opportunities to assess speech and language interaction for children and to quantify location context, which would contribute to improved language environments.
{"title":"Efficient fingerprint image protection principles using selective JPEG2000 encryption","authors":"Martin Draschl, Jutta Hämmerle-Uhl, A. Uhl","doi":"10.1109/SPLIM.2016.7528392","DOIUrl":"https://doi.org/10.1109/SPLIM.2016.7528392","url":null,"abstract":"Biometric system security requires cryptographic protection of sample data under certain circumstances. We introduce and assess low complexity selective encryption schemes applied to JPEG2000 compressed fingerprint data. From the results we are able to deduce design principles for such schemes which will guide to finally design recognition system aware encryption schemes with low encryption complexity and decent protection capability.","PeriodicalId":297318,"journal":{"name":"2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129957822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

"Kernel subclass support vector description for face and human action recognition"
V. Mygdalis, Alexandros Iosifidis, A. Tefas, I. Pitas. DOI: 10.1109/SPLIM.2016.7528409.
Abstract: In this paper, we present the Kernel Subclass Support Vector Data Description classifier. We focus on face recognition and human action recognition applications, where we argue that subclasses are formed within the training class. We modify the standard SVDD optimization problem so that it exploits subclass information in its optimization process. We extend the proposed method to work in feature spaces of arbitrary dimensionality. We evaluate the proposed method on publicly available face recognition and human action recognition datasets. Experimental results show that increased performance can be obtained by employing the proposed method.