Rajat Subhra Bhowmick, Rahul Indra, Isha Ganguli, Jayanta Paul, J. Sil
Digital Threats: Research and Practice. Published 2023-02-23. DOI: 10.1145/3584974 (https://doi.org/10.1145/3584974)
Breaking Captcha System with Minimal Exertion through Deep Learning: Real-time Risk Assessment on Indian Government Websites
Captchas are used to prevent bots from launching spam attacks and automatically extracting data available on websites. Government websites mostly contain sensitive data related to the citizens and assets of a country, so vulnerabilities in their captcha systems pose a major security challenge. This work focuses on the real-time captcha systems used by Indian government websites and identifies their risk level. To analyze captcha security effectively, we approach the problem from an attacker's perspective. For an attacker, building an effective captcha solver from scratch with limited feature-engineering knowledge of text and image processing is a challenge. Neural network models are useful for automated feature extraction, and a simple model can be trained with a minimal number of manually annotated real captchas. Along with the popular text captchas, Indian government websites use text instructions–based captchas. We analyze an effective neural network pipeline for solving text captchas. Text instructions captchas are relatively new, and this work provides novel end-to-end neural network architectures to break different types of them. The proposed models achieve more than 80% accuracy and, on a desktop GPU, have a maximum inference time of 1.063 seconds. The study also presents an ecosystem and a procedure for rating the overall risk of a captcha system used on a website. We observe that, given the importance of the information available on these government websites, the low effort required of an attacker to solve their captcha systems is alarming.