A high-performance Webshell detection model
Wenhao Yuan, Shanfeng Wang, Yixuan Feng, Shujie Li, Songhua Li, Ruyin Sun
International Conference on Signal Processing and Communication Security, 2022-11-02
DOI: 10.1117/12.2655320
Abstract
A Webshell is a command execution environment in the form of a web page file, and it is often referred to as a backdoor. After compromising a website, attackers usually upload the Webshell to the web directory of the web server, mix it in with the normal web files, and then access the backdoor through a browser in order to control the server. Web backdoors come in many forms, including asp, php, jsp, and cgi files; here we choose the more popular php files as the research object. In this paper, the Webshell dataset comes from common Webshell samples on the Internet, and the white (benign) samples are drawn mainly from common open source software written in PHP. We use bag-of-words and TF-IDF models for feature extraction and construct Webshell detection models based on the LightGBM algorithm. The results show that our model achieves more than 98% accuracy and performs better in both space and time than currently popular classification models.
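
The feature extraction step named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the tokenizer, vocabulary size, or TF-IDF variant, so the identifier-based tokenizer, the smoothed IDF formula, and the three toy samples below are all assumptions made for illustration. The resulting vectors are the kind of input that would then be passed to a gradient-boosted classifier such as LightGBM; that training step is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: two benign snippets and one Webshell-like snippet.
SAMPLES = [
    "<?php echo htmlspecialchars($title); ?>",
    "<?php echo date('Y'); ?>",
    "<?php eval($_POST['cmd']); ?>",
]

# Assumed tokenizer: identifier-like runs of letters, digits, and underscores.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(source):
    """Split PHP source into lowercase identifier-like tokens."""
    return [tok.lower() for tok in TOKEN_RE.findall(source)]


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[term] for term in vocab] for c in counts]


def tfidf(count_vectors):
    """Weight raw counts by a smoothed inverse document frequency."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    # Smoothed IDF, ln((1 + n) / (1 + df)) + 1; one common variant among several.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    return [[vec[j] * idf[j] for j in range(n_terms)] for vec in count_vectors]


vocab, counts = bag_of_words(SAMPLES)
features = tfidf(counts)
```

In this toy corpus, a token such as `php` that appears in every sample receives the minimum weight, while `eval`, which appears only in the Webshell-like sample, is weighted more heavily. This is the property that makes TF-IDF a natural complement to raw bag-of-words counts.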