Detecting web attacks with end-to-end deep learning

Impact factor: 0.9 | JCR quartile: Q3 | Category: Computer Science, Information Systems

Journal of Internet Services and Applications | Pub Date: 2019-08-27 | DOI: 10.1186/s13174-019-0115-x

Yao Pan, Fangzhou Sun, Zhongwei Teng, Jules White, Douglas C. Schmidt, Jacob Staples, Lee Krause


Citations: 60

Abstract

Web applications are popular targets for cyber-attacks because they are network-accessible and often contain vulnerabilities. An intrusion detection system monitors web applications and issues alerts when an attack attempt is detected. Existing implementations of intrusion detection systems usually extract features from network packets or string characteristics of input that are manually selected as relevant to attack analysis. Manually selecting features, however, is time-consuming and requires in-depth security domain knowledge. Moreover, large amounts of labeled legitimate and attack request data are needed by supervised learning algorithms to classify normal and abnormal behaviors, which is often expensive and impractical to obtain for production web applications. This paper provides three contributions to the study of autonomic intrusion detection systems. First, we evaluate the feasibility of an unsupervised/semi-supervised approach for web attack detection based on the Robust Software Modeling Tool (RSMT), which autonomically monitors and characterizes the runtime behavior of web applications. Second, we describe how RSMT trains a stacked denoising autoencoder to encode and reconstruct the call graph for end-to-end deep learning, where a low-dimensional representation of the raw features with unlabeled request data is used to recognize anomalies by computing the reconstruction error of the request data. Third, we analyze the results of empirically testing RSMT on both synthetic datasets and production applications with intentional vulnerabilities. Our results show that the proposed approach can efficiently and accurately detect attacks, including SQL injection, cross-site scripting, and deserialization, with minimal domain knowledge and little labeled training data.
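The anomaly-detection idea in the abstract (train an autoencoder on unlabeled requests, then flag requests whose reconstruction error is high) can be illustrated with a minimal sketch. This is not RSMT's implementation: it uses a single hidden layer rather than a stacked network, plain NumPy gradient descent, and hypothetical 8-dimensional vectors standing in for per-request call-graph features. The traffic patterns, the masking-noise rate, and the 99th-percentile threshold are all assumptions chosen for illustration.

```python
import numpy as np


def train_dae(X, hidden=4, mask_prob=0.1, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer denoising autoencoder with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        # Denoising step: randomly zero out input features (masking noise).
        keep = rng.random(X.shape) >= mask_prob
        X_noisy = X * keep
        H = np.tanh(X_noisy @ W1 + b1)   # low-dimensional encoding
        R = H @ W2 + b2                  # reconstruction of the clean input
        # Gradients of the squared reconstruction error, averaged over requests.
        grad_R = 2.0 * (R - X) / n
        grad_H = (grad_R @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ grad_R)
        b2 -= lr * grad_R.sum(axis=0)
        W1 -= lr * (X_noisy.T @ grad_H)
        b1 -= lr * grad_H.sum(axis=0)
    return W1, b1, W2, b2


def reconstruction_error(params, X):
    """Mean squared reconstruction error per request (row of X)."""
    W1, b1, W2, b2 = params
    R = np.tanh(X @ W1 + b1) @ W2 + b2
    return ((R - X) ** 2).mean(axis=1)


def make_normal_traffic(n, rng):
    """Hypothetical benign requests: two call patterns over 8 monitored methods."""
    type_a = np.array([1, 1, 1, 0, 0, 0, 0, 0], dtype=float)
    type_b = np.array([0, 0, 1, 1, 1, 0, 0, 0], dtype=float)
    base = np.where(rng.random((n, 1)) < 0.5, type_a, type_b)
    return np.clip(base + rng.normal(0.0, 0.05, base.shape), 0.0, 1.0)


rng = np.random.default_rng(42)
train = make_normal_traffic(200, rng)      # unlabeled, assumed mostly benign
holdout = make_normal_traffic(50, rng)
params = train_dae(train)

# Assumed decision rule: flag anything above the 99th percentile of training error.
threshold = np.percentile(reconstruction_error(params, train), 99)

# Hypothetical attack: a familiar call pattern plus two methods never seen in training.
attack = np.array([[1, 1, 1, 0, 0, 1, 1, 0]], dtype=float)
attack_error = reconstruction_error(params, attack)[0]
holdout_errors = reconstruction_error(params, holdout)

print("threshold:", threshold)
print("median benign error:", np.median(holdout_errors))
print("attack error:", attack_error, "flagged:", attack_error > threshold)
```

Because the model only ever sees benign call patterns, it has no way to reconstruct the two unfamiliar method calls, so the attack vector's error lands well above the threshold. No attack labels are used during training, which is the property the abstract emphasizes.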
