{"title":"Privacy Protection in Transformer-based Neural Network","authors":"Jiaqi Lang, Linjing Li, Weiyun Chen, D. Zeng","doi":"10.1109/ISI.2019.8823346","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823346","url":null,"abstract":"With the great success of neural networks, it is important to improve the information security of application systems based on them. This paper investigates a scenario where an attacker eavesdrops on the intermediate representation computed by the encoder layers and tries to recover the private information of the input text. We propose a new metric to evaluate the encoder’s ability to protect privacy and use it to evaluate the Transformer-based encoder, which is the first privacy research conducted on Transformer-based neural networks. We also propose an adversarial training method to enhance the privacy of Transformer-based neural networks.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124529336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pre-trained Contextualized Representation for Chinese Conversation Topic Classification","authors":"Yujun Zhou, Changliang Li, Saike He, Xiaoqi Wang, Yiming Qiu","doi":"10.1109/ISI.2019.8823172","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823172","url":null,"abstract":"Topic classification plays an important role in facilitating security-related applications, helping people reduce data scope and acquire key information quickly. Conversation is one of the important ways people communicate. The utterances in a conversation may contain vital clues, such as people’s opinions, emotions and political slants. To explore more effective approaches for Chinese conversational topic classification, we propose a neural network architecture with pre-trained contextualized representation. We first fine-tune a pre-trained BERT model to generate the conversational embeddings, which are the inputs of our neural network models. Then we design several neural network models to extract task-oriented advanced features for topic classification. Experimental results indicate that the models based on our architecture all outperform the baseline that only fine-tunes the pre-trained BERT model. This demonstrates that the pre-trained representations are effective for Chinese conversational topic classification, and that the proposed architecture can further capture the salient features from the representations. The code and dataset of this paper are available at https://github.com/njoe9/pretrained_representation.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115494284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep Learning Approach to Modeling Temporal Social Networks on Reddit","authors":"Wingyan Chung, Cagri Toraman, Yifan Huang, M. Vora, Jinwei Liu","doi":"10.1109/ISI.2019.8823399","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823399","url":null,"abstract":"As terrorists are losing against counter-terrorism efforts, they turn to manipulating cryptocurrency prices through online social communities to gain illicit profit to fund their operations. Modeling temporal online social networks (OSNs) of these communities can possibly help to provide useful intelligence about these malicious activities. However, existing techniques do not learn sufficiently from diverse features to enable prediction and simulation of online social behavior. Research on simulating temporal OSN behavior is not widely available. This research developed and validated a deep learning approach, named Temporal Network Model (TNM), for modeling the complex features and dynamic behavior exhibited in the temporal OSNs of online communities. Using extensive features extracted from fine-grained data, TNM consists of weighted time series models, user and link prediction models, and a temporal dependency model that predict, respectively, the macroscopic behavior, microscopic user participation and events, and time stamps of the events. Evaluation was done in comparison with a benchmark approach to examine TNM’s performance on predicting and simulating behavior of 42,627 users in 440,906 events on the Reddit cryptocurrency community during July-August of 2017. Results show that TNM outperformed the benchmark in 5 out of 8 simulation metrics. TNM achieved consistently better performance in user activity prediction, and performed generally better in structural (network-level) prediction. The research provides new findings on simulating temporal OSNs and new predictive analytics for understanding online social behavior.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126616840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The entropy source of pseudo random number generators: from low entropy to high entropy","authors":"Jizhi Wang, Jingshan Pan, Xueli Wu","doi":"10.1109/ISI.2019.8823457","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823457","url":null,"abstract":"A pseudo random number generator (PRNG) is a type of deterministic function. The information entropy of the output sequences depends on the entropy of the input seeds. The output sequences can be predicted if attackers know or control the input seeds of the PRNG. To prevent this, the input seeds must be unpredictable; that is, the information entropy of the seeds must be high enough. However, if there are no sufficiently high entropy sources in the environment, how can the seeds of a PRNG be generated? In other words, how can the entropy of the input seeds be increased? Many approaches for extracting entropy from the physical environment have been proposed, but they lack theoretical analysis. The condition under which entropy increases is given. A model is built to verify the condition based on the functional programming language F*. An example of increasing entropy is proposed that utilizes the execution time randomness of arbitrary code. Then an algorithm is described that can generate the seed when the entropy value is given.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129207627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Rumor Detection in Social Media Using Dynamic Propagation Structures","authors":"Shuai Wang, Qingchao Kong, Yuqi Wang, Lei Wang","doi":"10.1109/ISI.2019.8823266","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823266","url":null,"abstract":"Social media, such as Facebook and Twitter, has become one of the most important channels for information dissemination. However, these social media platforms are often misused to spread rumors, which has brought about severe social problems, and consequently, there is an urgent need for automatic rumor detection techniques. Existing work on rumor detection concentrates more on the utilization of textual features, but the diffusion structure itself can provide critical propagation information for identifying rumors. Previous works that have considered structural information utilize only limited propagation structures. Moreover, little related research has considered the dynamic evolution of diffusion structures. To address these issues, in this paper, we propose a Neural Model using Dynamic Propagation Structures (NM-DPS) for rumor detection in social media. Firstly, we propose a partition approach to model the dynamic evolution of the propagation structure and then use a temporal attention-based neural model to learn a representation for the dynamic structure. Finally, we fuse the structure representation and content features into a unified framework for effective rumor detection. Experimental results on two real-world social media datasets demonstrate the salience of dynamic propagation structure information and the effectiveness of our proposed method in capturing the dynamic structure.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122299736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ISI 2019 Committees","authors":"","doi":"10.1109/isi.2019.8823499","DOIUrl":"https://doi.org/10.1109/isi.2019.8823499","url":null,"abstract":"","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"351 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134145228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information Hiding in OOXML Format Data based on the Splitting of Text Elements","authors":"Wenjie Guo, Li Yang, Yonggang Lu, Yi Yang, Lian Li, Zongli Liu","doi":"10.1109/ISI.2019.8823564","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823564","url":null,"abstract":"In this paper, a novel information hiding method is proposed to embed data in Word documents that use the OOXML format. The 2007 version and more recent versions of MS Word are all based on the OOXML format. The main document body of an OOXML document consists of the text elements that correspond to the content of the document. It is found that, in the OOXML format, the printable text in an element can be “split” by separating the element into multiple elements. The digital code of the information determines whether the adjacent characters in the text will be “split” or not, so as to achieve the purpose of information hiding. Since the format and all the other properties of the text in the OOXML format document are unchanged after the embedding, the embedded information is imperceptible. The code of the information is embedded circularly to enhance robustness. Experiments show that the proposed method can deal with many kinds of attacks, including “content”, “save as”, “copy” and part of “format”. The proposed method can be used in information security and copyright protection for OOXML format documents.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131918724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Corporation Lawsuit Prediction based on Guiding Learning and Collaborative Filtering Recommendation","authors":"Zhenyu Wu, Guangda Chen, Jingjing Yao","doi":"10.1109/ISI.2019.8823537","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823537","url":null,"abstract":"It is meaningful to use data mining technology to predict the type of lawsuit that a company may receive so that enterprises can avoid lawsuit risks. We therefore propose a corporation lawsuit prediction algorithm based on guiding learning and collaborative filtering recommendation. Firstly, we use the adaptive synthetic sampling approach (ADASYN) to generate more synthetic data for different minority classes according to their different levels of difficulty in learning, so that the training focuses on the minority classes that are difficult to learn and reduces the learning bias introduced by the imbalance of the data distribution. Secondly, insufficient samples make it difficult for the model to learn enough knowledge, resulting in a large fluctuation of final scores during training and poor model stability. To solve this problem, we use guiding learning to integrate the basic knowledge of all types of lawsuits a company may receive in the future, obtained by the multi-label classification model, into the training process of the TOP-1 and TOP-2 predictive models. Finally, in order to further improve the prediction accuracy, we use the collaborative filtering recommendation algorithm (CFRA) to select the sample from the training set that is most similar to each test sample, and the lawsuit type of the selected sample is directly used as the predicted lawsuit type of the corresponding test sample, thereby improving the total prediction accuracy. The experimental results show that the proposed algorithm can effectively predict the Top-2 most probable lawsuit types for corporations.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"215 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132118052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Forensic Evidence Acquisition Model for Data Leakage Attacks","authors":"Weifeng Xu, Jie Yan, H. Chi","doi":"10.1109/ISI.2019.8823391","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823391","url":null,"abstract":"Data leakage attack is a serious threat to daily business operations. Reconstructing scenes after attacks is critical because the reconstructed scenarios help security analysts to understand these attacks and prevent future incidents. In this paper, we have proposed a systematic approach to reconstruct attack scenes based on a forensic evidence acquisition model. We first build the model, i.e., data leakage-evidence tree, from which digital forensic examiners can collect forensic evidence, then we formalize the tree and evaluate the semantics of the tree based on the evidence found on digital devices and their supporting environments. Finally, we reconstruct the data leakage scenarios based on the semantics of the tree. Our empirical study reconstructs a data breach scenario using a real-world example.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133773449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the Dark Web for Cyber Threat Intelligence using Machine Leaning","authors":"Masashi Kadoguchi, S. Hayashi, Masaki Hashimoto, Akira Otsuka","doi":"10.1109/ISI.2019.8823360","DOIUrl":"https://doi.org/10.1109/ISI.2019.8823360","url":null,"abstract":"In recent years, cyber attack techniques are increasingly sophisticated, and blocking the attack is more and more difficult, even if a kind of counter measure or another is taken. In order for a successful handling of this situation, it is crucial to have a prediction of cyber attacks, appropriate precautions, and effective utilization of cyber intelligence that enables these actions. Malicious hackers share various kinds of information through particular communities such as the dark web, indicating that a great deal of intelligence exists in cyberspace. This paper focuses on forums on the dark web and proposes an approach to extract forums which include important information or intelligence from huge amounts of forums and identify traits of each forum using methodologies such as machine learning, natural language processing and so on. This approach will allow us to grasp the emerging threats in cyberspace and take appropriate measures against malicious activities.","PeriodicalId":156130,"journal":{"name":"2019 IEEE International Conference on Intelligence and Security Informatics (ISI)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133050490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}