{"title":"How Private is Machine Learning?","authors":"Nicolas Carlini","doi":"10.1145/3437880.3458440","DOIUrl":"https://doi.org/10.1145/3437880.3458440","url":null,"abstract":"A machine learning model is private if it doesn't reveal (too much) about its training data. This three-part talk examines to what extent current networks are private. Standard models are not private. We develop an attack that extracts rare training examples (for example, individual people's names, phone numbers, or addresses) out of GPT-2, a language model trained on gigabytes of text from the Internet [2]. As a result there is a clear need for training models with privacy-preserving techniques. We show that InstaHide, a recent candidate, is not private. We develop a complete break of this scheme and can again recover verbatim inputs [1]. Fortunately, there exists provably-correct \"differentiallyprivate\" training that guarantees no adversary could ever succeed at the above attacks. We develop techniques to that allow us to empirically evaluate the privacy offered by such schemes, and find they may be more private than can be proven formally [3].","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130847286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DNN Watermarking: Four Challenges and a Funeral","authors":"M. Barni, F. Pérez-González, B. Tondi","doi":"10.1145/3437880.3460399","DOIUrl":"https://doi.org/10.1145/3437880.3460399","url":null,"abstract":"The demand for methods to protect the Intellectual Property Rights (IPR) associated to Deep Neural Networks (DNNs) is rising. Watermarking has been recently proposed as a way to protect the IPR of DNNs and track their usages. Although a number of techniques for media watermarking have been proposed and developed over the past decades, their direct translation to DNN watermarking faces the problem of the embedding being carried out on functionals instead of signals. This originates differences not only in the way performance, robustness and unobtrusiveness are measured, but also on the embedding domain, since there is the possibility of hiding information in the model behavior. In this paper, we discuss these dissimilarities that lead to a DNN-specific taxonomy of watermarking techniques. Then, we present four challenges specific to DNN watermarking that, for their practical importance and theoretical interest, should occupy the agenda of researchers in the next years. Finally, we discuss some bad practices that negatively affected research in media watermarking and that should not be repeated in the case of DNNs.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128566330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"General Requirements on Synthetic Fingerprint Images for Biometric Authentication and Forensic Investigations","authors":"A. Makrushin, Christof Kauba, Simon Kirchgasser, Stefan Seidlitz, Christian Kraetzer, A. Uhl, J. Dittmann","doi":"10.1145/3437880.3460410","DOIUrl":"https://doi.org/10.1145/3437880.3460410","url":null,"abstract":"Generation of synthetic biometric samples such as, for instance, fingerprint images gains more and more importance especially in view of recent cross-border regulations on security of private data. The reason is that biometric data is designated in recent regulations such as the EU GDPR as a special category of private data, making sharing datasets of biometric samples hardly possible even for research purposes. The usage of fingerprint images in forensic research faces the same challenge. The replacement of real datasets by synthetic datasets is the most advantageous straightforward solution which bears, however, the risk of generating \"unrealistic\" samples or \"unrealistic distributions\" of samples which may visually appear realistic. Despite numerous efforts to generate high-quality fingerprints, there is still no common agreement on how to define \"high-quality'' and how to validate that generated samples are realistic enough. Here, we propose general requirements on synthetic biometric samples (that are also applicable for fingerprint images used in forensic application scenarios) together with formal metrics to validate whether the requirements are fulfilled. Validation of our proposed requirements enables establishing the quality of a generative model (informed evaluation) or even the quality of a dataset of generated samples (blind evaluation). Moreover, we demonstrate in an example how our proposed evaluation concept can be applied to a comparison of real and synthetic datasets aiming at revealing if the synthetic samples exhibit significantly different properties as compared to real ones.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114647318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving EfficientNet for JPEG Steganalysis","authors":"Yassine Yousfi, Jan Butora, J. Fridrich, Clément Fuji Tsang","doi":"10.1145/3437880.3460397","DOIUrl":"https://doi.org/10.1145/3437880.3460397","url":null,"abstract":"In this paper, we study the EfficientNet family pre-trained on ImageNet when used for steganalysis using transfer learning. We show that certain \"surgical modifications\" aimed at maintaining the input resolution in EfficientNet architectures significantly boost their performance in JPEG steganalysis, establishing thus new benchmarks. The modified models are evaluated by their detection accuracy, the number of parameters, the memory consumption, and the total floating point operations (FLOPs) on the ALASKA II dataset. We also show that, surprisingly, EfficientNets in their \"vanilla form\" do not perform as well as the SRNet in BOSSbase+BOWS2. This is because, unlike ALASKA II images, BOSSbase+BOWS2 contains aggressively subsampled images with more complex content. The surgical modifications in EfficientNet remedy this underperformance as well.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126189403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Neural Exposure: You Can Run, But Not Hide Your Neural Network Architecture!","authors":"Sayed Erfan Arefin, Abdul Serwadda","doi":"10.1145/3437880.3460415","DOIUrl":"https://doi.org/10.1145/3437880.3460415","url":null,"abstract":"Deep Neural Networks (DNNs) are at the heart of many of today's most innovative technologies. With companies investing lots of resources to design, build and optimize these networks for their custom products, DNNs are now integral to many companies' tightly guarded Intellectual Property. As is the case for every high-value product, one can expect bad actors to increasingly design techniques aimed to uncover the architectural designs of proprietary DNNs. This paper investigates if the power draw patterns of a GPU on which a DNN runs could be leveraged to glean key details of its design architecture. Based on ten of the most well-known Convolutional Neural Network (CNN) architectures, we study this line of attack under varying assumptions about the kind of data available to the attacker. We show the attack to be highly effective, attaining an accuracy in the 80 percentage range for the best performing attack scenario.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130217138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"White-Box Watermarking Scheme for Fully-Connected Layers in Fine-Tuning Model","authors":"M. Kuribayashi, Takuro Tanaka, Shunta Suzuki, Tatsuya Yasui, N. Funabiki","doi":"10.1145/3437880.3460402","DOIUrl":"https://doi.org/10.1145/3437880.3460402","url":null,"abstract":"For the protection of trained deep neural network(DNN) models, embedding watermarks into the weights of the DNN model have been considered. However, the amount of change in the weights is large in the conventional methods, and it is reported that the existence of hidden watermark can be detected from the analysis of weight variance. This helps attackers to modify the watermark by effectively adding noise to the weight. In this paper, we focus on the fully-connected layers of fine-tuning models and apply a quantization-based watermarking method to the weights sampled from the layers. The advantage of the proposed method is that the change caused by watermark embedding is much smaller and the distortion converges gradually without using any loss function. The validity of the proposed method was evaluated by varying the conditions during the training of DNN model. The results shows the impact of training for DNN model, effectiveness of the embedding method, and high robustness against pruning attacks.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126542645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRNU-based Deepfake Detection","authors":"Florian Lugstein, S. Baier, Gregor Bachinger, A. Uhl","doi":"10.1145/3437880.3460400","DOIUrl":"https://doi.org/10.1145/3437880.3460400","url":null,"abstract":"As deepfakes become harder to detect by humans, more reliable detection methods are required to fight the spread of fake images and videos. In our work, we focus on PRNU-based detection methods, which, while popular in the image forensics scene, have not been given much attention in the context of deepfake detection. We adopt a PRNU-based approach originally developed for the detection of face morphs and facial retouching, and performed the first large scale test of PRNU-based deepfake detection methods on a variety of standard datasets. We show the impact of often neglected parameters of the face extraction stage on detection accuracy. We also document that existing PRNU-based methods cannot compete with state of the art methods based on deep learning but may be used to complement those in hybrid detection schemes.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123066166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meta and Media Data Stream Forensics in the Encrypted Domain of Video Conferences","authors":"R. Altschaffel, Jonas Hielscher, Stefan Kiltz, J. Dittmann","doi":"10.1145/3437880.3460412","DOIUrl":"https://doi.org/10.1145/3437880.3460412","url":null,"abstract":"Our paper presents a systematic approach to investigate whether and how events can be identified and extracted during the use of video conferencing software. Our approach is based on the encrypted meta and multimedia data exchanged during video conference sessions. It relies on the network data stream which contains data interpretable without decryption (plain data) and encrypted data (encrypted content) some of which is decrypted using our approach (decrypted content). This systematic approach uses a forensic process model and the fission of network data streams before applying methods on the specific individual data types. Our approach is applied exemplary to the Zoom Videoconferencing Service with Client Version 5.4.57862.0110 [4], the mobile Android App Client Version 5.5.2 (1328) [4], the webbased client and the servers (accessed between Jan 21st and Feb 4th). The investigation includes over 50 different configurations. For the heuristic speaker identification, two series of nine sets for eight different speakers are collected. The results show that various user data can be derived from characteristics of encrypted media streams, even if end-to-end encryption is used. The findings suggest user privacy risks. Our approach offers the identification of various events, which enable activity tracking (e.g. camera on/off, increased activity in front of camera) by evaluating heuristic features of the network streams. Further research into user identification within the encrypted audio stream based on pattern recognition using heuristic features of the corresponding network data stream is conducted and suggests the possibility to identify users within a specific set.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122037346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fake Speech Detection Using Residual Network with Transformer Encoder","authors":"Zhenyu Zhang, Xiaowei Yi, Xianfeng Zhao","doi":"10.1145/3437880.3460408","DOIUrl":"https://doi.org/10.1145/3437880.3460408","url":null,"abstract":"Fake speech detection aims to distinguish fake speech from natural speech. This paper presents an effective fake speech detection scheme based on residual network with transformer encoder (TE-ResNet) for improving the performance of fake speech detection. Firstly, considering inter-frame correlation of the speech signal, we utilize transformer encoder to extract contextual representations of the acoustic features. Then, a residual network is used to process deep features and calculate score that the speech is fake. Besides, to increase the quantity of training data, we apply five speech data augmentation techniques on the training dataset. Finally, we fuse the different fake speech detection models on score-level by logistic regression for compensating the shortcomings of each single model. The proposed scheme is evaluated on two public speech datasets. Our experiments demonstrate that the proposed TE-ResNet outperforms the existing state-of-the-art methods both on development and evaluation datasets. In addition, the proposed fused model achieves improved performance for detection of unseen fake speech technology, which can obtain equal error rates (EERs) of 3.99% and 5.89% on evaluation set of FoR-normal dataset and ASVspoof 2019 LA dataset respectively.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132066016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to Pretrain for Steganalysis","authors":"Jan Butora, Yassine Yousfi, J. Fridrich","doi":"10.1145/3437880.3460395","DOIUrl":"https://doi.org/10.1145/3437880.3460395","url":null,"abstract":"In this paper, we investigate the effect of pretraining CNNs on ImageNet on their performance when refined for steganalysis of digital images. In many cases, it seems that just 'seeing' a large number of images helps with the convergence of the network during the refinement no matter what the pretraining task is. To achieve the best performance, the pretraining task should be related to steganalysis, even if it is done on a completely mismatched cover and stego datasets. Furthermore, the pretraining does not need to be carried out for very long and can be done with limited computational resources. An additional advantage of the pretraining is that it is done on color images and can later be applied for steganalysis of color and grayscale images while still having on-par or better performance than detectors trained specifically for a given source. The refining process is also much faster than training the network from scratch. The most surprising part of the paper is that networks pretrained on JPEG images are a good starting point for spatial domain steganalysis as well.","PeriodicalId":120300,"journal":{"name":"Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130133923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}