{"title":"An APT Attack Analysis Framework Based on Self-define Rules and Mapreduce","authors":"Yulu Qi, Rong Jiang, Yan Jia, Aiping Li","doi":"10.1109/DSC50466.2020.00017","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00017","url":null,"abstract":"The essence of Internet security is information security. As more and more industries rely on the Internet, local area networks (LANs), intranets and similar networks have emerged to protect the information security of these industries. With the development of information sensor technology, the Internet of Things (IoT), which interconnects physical devices, has emerged. As a unity of computing and physical processes, the cyber-physical system (CPS) is the next-generation intelligent system that integrates computing, communication and control. Cyber-physical systems cover a wide range of applications, including intelligent transportation systems, telemedicine, smart grids, aerospace, and many other fields, many of which involve critical infrastructure. APT attacks are typically directed against this critical infrastructure around the world. Detecting APT attacks in a timely and accurate manner and taking effective defensive measures is therefore important for protecting national information security. Although APT attacks seem destructive and their attack processes are complex and changeable, in essence they usually follow certain rules. This paper proposes an APT attack analysis framework based on APT attack rules and current mainstream detection technologies. The framework iteratively matches the collected data against the cyber security knowledge graph and applies constraints that rely on the cyber security knowledge graph and self-defined attack rules, thereby determining the current security status of the network in real time.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115005909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Which DGA Family does A Malicious Domain Name Belong To","authors":"Yunyi Zhang, Yuelong Wu, Shuyuan Jin","doi":"10.1109/DSC50466.2020.00016","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00016","url":null,"abstract":"The Domain Generation Algorithm (DGA) is a technique that generates a large number of domains in a short time; attackers commonly use it in malware to circumvent security mechanisms such as domain blacklists. Besides discovering DGA domains, identifying DGA families is also significant for detecting and analyzing malware, as it gives security professionals a more comprehensive perspective for analysis. In this paper, we investigate 22 different DGA families and propose an effective approach to portray and classify DGA families, which uses strong host association and family portraits to identify different DGA families among massive numbers of DGA domains. The approach mitigates the hurdle caused by the nearly 100-fold difference in data volume among families and implements DGA family clustering. The experimental results show that the proposed approach accurately identifies all of the DGA families in a network that contains six families.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"2010 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127339918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Refining Co-operative Competition of Robocup Soccer with Reinforcement Learning","authors":"Zhengqiao Wang, Yufan Zeng, Yue Yuan, Yibo Guo","doi":"10.1109/DSC50466.2020.00049","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00049","url":null,"abstract":"Reinforcement learning (RL) has been widely applied in RoboCup soccer games because of its great potential for enhancing performance in model-free competitive scenarios. In recent years, researchers have put considerable effort into reducing the input size of RL in order to speed up the training process of RoboCup soccer agents. In this work, we propose an improved DQN algorithm named Hierarchical Movement Grouped Deep-Q-Network (HMG-DQN). The algorithm can be trained when actions are organized in a high-level hierarchy of movement groups, especially in co-operative competition scenarios such as 2v1 and 3v2 break-throughs. We conducted experiments on a simulation platform based on RoboCup SPL rules, and the results show that our improved algorithm significantly improves the winning rate compared with DQN.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124780617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Class Overlap Model for System Log Anomaly Detection Based on Ensemble Learning","authors":"Yitong Ren, Zhaojun Gu, Lanlan Pan, Chunbo Liu","doi":"10.1109/DSC50466.2020.00064","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00064","url":null,"abstract":"Using machine learning to detect anomalies in system log data is essential. Because many system log entries are similar, such detection is prone to class overlap, which seriously affects anomaly detection on system logs. To solve the problem of class overlap in system logs, this paper proposes an anomaly detection model for class overlap on system logs. We first calculate the relationship between the sample data and their membership of the different classes, normal or anomalous, and use fuzziness to separate the sample data in the overlapping parts of the classes from the data in the other parts. AdaBoost, an ensemble learning approach, is used to detect overlapping data. Compared with machine learning algorithms, ensemble learning can better classify the data in the overlapping parts and thus detect anomalies in the system logs. Experimental results show that our model can be effectively applied to a variety of basic algorithms, and the results on every measure are improved.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125065545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incorporating Entity Type Information into Knowledge Representation Learning","authors":"Wenyu Huang, Guohua Wang, Huakui Zhang, Feng Chen","doi":"10.1109/DSC50466.2020.00023","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00023","url":null,"abstract":"Knowledge Representation Learning (KRL), also known as Knowledge Embedding, is a very useful method for representing complex relations in knowledge graphs. The low-dimensional representations learned by KRL models contribute to many tasks, such as recommender systems and question answering. Many recent KRL models are trained using square loss or cross-entropy loss based on the Closed World Assumption (CWA). Although CWA makes training easy, it conflicts with the link prediction task that KRL is used for. To overcome this drawback, we introduce a new method, Type-based Prior Possibility Assumption (TPPA). TPPA calculates type-based prior possibilities for missing triplets, instead of zeros, in the training process of KRL to weaken the negative influence of CWA. We compare TPPA with the baseline method CWA in ConvE and TuckER, two common frameworks for knowledge representation learning. Experimental results on the FB15k-237 dataset show that TPPA-based training outperforms CWA-based training on the link prediction task.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126179603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Parallel Algorithm for Graph Transaction Based Frequent Subgraph Mining","authors":"Fengcai Qiao, Yuanfa Zhang, Jinsheng Deng, Zhaoyun Ding, Aiping Li","doi":"10.1109/DSC50466.2020.00061","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00061","url":null,"abstract":"Frequent subgraph patterns play an important role in feature mining for graph data. The problem of mining these patterns is defined as finding subgraphs that appear frequently according to a given frequency threshold. Usually, frequent subgraph mining (FSM) is conducted in the graph transaction setting, in which the graph database contains many small graphs. Since multicore processors are quite popular these days, many algorithms can be accelerated with multi-threading techniques. This paper proposes a multi-threaded frequent subgraph mining algorithm that achieves considerable acceleration in experiments. The algorithm, named PTRGRAM (Parallel Transaction based Graph Mining), can take full advantage of the multi-core performance of current processors. In the algorithm, data synchronization between multiple threads is based on the producer-consumer model. In addition, to speed up support computation, an embedding node list is introduced for optimization. Finally, experimental performance evaluations were conducted on two graph datasets, demonstrating that the proposed algorithm outperforms the single-threaded gSpan and FFSM algorithms.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114161215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tweet Stance Detection: A Two-stage DC-BILSTM Model Based on Semantic Attention","authors":"Yuanyu Yang, Bin Wu, Kai Zhao, Wenying Guo","doi":"10.1109/DSC50466.2020.00012","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00012","url":null,"abstract":"Stance classification in tweets aims at detecting whether the author of a tweet is in FAVOR of, AGAINST, or has no stance (NONE) towards a pre-chosen target entity. The recently proposed Densely Connected BI-LSTM can effectively relieve overfitting and vanishing-gradient problems as well as deal with long-term dependencies during multi-layer LSTM training. Based on this, we propose a two-stage deep attention neural network (T-DAN) for target-specific stance detection. This model employs densely connected BI-LSTM to encode tweet tokens and traditional bidirectional LSTM to encode target tokens. In addition, we decompose this ternary classification problem into two binary classification problems to mitigate the imbalanced distribution of labels. In the first stage, we determine whether the tweet is neutral or subjective about the specific target. In the second stage, we classify the stance of a given subjective tweet. Moreover, we propose a novel method of attention calculation based on the semantic similarity of tweet tokens and target tokens, which can locate the crucial words related to the target. Experimental results on English and Chinese datasets demonstrate that our proposed method surpasses some strong baselines and achieves state-of-the-art performance.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121663184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Unified Labeling Model for Open-Domain Aspect-Based Sentiment Analysis","authors":"Qian Ji, Xiang Lin, Yinghua Ma, Gongshen Liu, Shilin Wang","doi":"10.1109/DSC50466.2020.00035","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00035","url":null,"abstract":"Aspect-based sentiment analysis involves aspect term extraction and sentiment prediction towards aspect terms. Recently, more researchers have proposed integrated approaches to accomplish the two tasks simultaneously. However, such approaches always limit the domain, quantity, length and category of aspect terms, which greatly restricts their use. This paper models the joint task as an extension of sequence labeling and presents a novel unified labeling model that supports a wide range of aspect terms. Unlike a conventional tagging scheme that predicts the boundary of an aspect term and classifies its sentiment step by step, our proposed model deals with the two tasks simultaneously through one set of labels. Sentiment polarities are labeled directly on aspect term tokens, thus combining the boundary information with sentiment polarity in this unified tagging scheme. We take Bidirectional Encoder Representations from Transformers (BERT) as the first representation layer to capture contextual features of the entire sentence. A Conditional Random Field (CRF) follows BERT to minimize empirical risk and label each token representation within the given label sets based on the learned transition matrix. In our experiments, the proposed method demonstrates superior performance against multiple baselines on three benchmark datasets and one Twitter dataset that we collected, which contains open-domain sentences and aspect terms with various categories, lengths and quantities.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133724202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word Level Domain-Diversity Attention Based LSTM Model for Sentiment Classification","authors":"Haoliang Zhang, Hongbo Xu, Jinqiao Shi, Tingwen Liu, Chun Liao","doi":"10.1109/DSC50466.2020.00032","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00032","url":null,"abstract":"Sentiment classification is an important task in Natural Language Processing research and has considerable practical significance. The complexity of human sentiment implies that hidden information, such as the application scenes or domains that lie behind the text, may play an important role in predicting sentiment polarity. This paper presents a novel model for sentiment classification, the Domain-Diversity Attention Mechanism based LSTM Model (DDAM-LSTM), which integrates word-level domain-relevant features into an input-side attention mechanism of the LSTM model. First, we propose a method for representing and calculating domain-relevant features for each word according to its context. We then find that common words and certain domain-specific words show clearly different distributions in terms of domain tendency. On this basis, an attention mechanism is designed to assign scale weights to the words at the input side of the LSTM network according to the diversity of their domain tendency. By combining this attention mechanism with the LSTM model, we fuse the implied domain knowledge with the neural network. Experimental results on three public benchmark datasets show that our proposed model yields a clear performance improvement.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128315309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Depth Estimation Based on Computer Vision","authors":"Yang Liu, Jie Jiang, Jiahao Sun, L. Bai, Qi Wang","doi":"10.1109/DSC50466.2020.00028","DOIUrl":"https://doi.org/10.1109/DSC50466.2020.00028","url":null,"abstract":"Computer-vision-based methods for depth information extraction and depth estimation are currently widely used. They can obtain depth information from 2D images, depth maps, or binocular vision images and have become popular in artificial intelligence applications such as depth detection, pose estimation, and 3D reconstruction. This paper introduces the basic theory and some implementation methods of depth information acquisition based on computer vision. It also briefly summarizes existing research results and gives an outlook on future development trends in the field.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130045764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}