{"title":"Understanding Human-Chosen PINs: Characteristics, Distribution and Security","authors":"Ding Wang, Qianchen Gu, Xinyi Huang, Ping Wang","doi":"10.1145/3052973.3053031","DOIUrl":null,"url":null,"abstract":"Personal Identification Numbers (PINs) are ubiquitously used in embedded computing systems where user input interfaces are constrained. Yet, little attention has been paid to this important kind of authentication credentials, especially for 6-digit PINs which dominate in Asian countries and are gaining popularity worldwide. Unsurprisingly, many fundamental questions (e.g., what's the distribution that human-chosen PINs follow?) remain as intact as about fifty years ago when they first arose. In this work, we conduct a systematic investigation into the characteristics, distribution and security of both 4-digit PINs and 6-digit PINs that are chosen by English users and Chinese users. Particularly, we, for the first time, perform a comprehensive comparison of the PIN characteristics and security between these two distinct user groups. Our results show that there are great differences in PIN choices between these two groups of users, a small number of popular patterns prevail in both groups, and surprisingly, over 50% of every PIN datasets can be accounted for by just the top 5%~8% most popular PINs. What's disturbing is the observation that, as online guessing is a much more serious threat than offline guessing in the current PIN-based systems, longer PINs only attain marginally improved security: human-chosen 4-digit PINs can offer about 6.6 bits of security against online guessing and 8.4 bits of security against offline guessing, and this figure for 6-digit PINs is 7.2 bits and 13.2 bits, respectively. We, for the first time, reveal that Zipf's law is likely to exist in PINs. Despite distinct language/cultural backgrounds, both user groups choose PINs with almost the same Zipf distribution function, and such Zipf PIN-distribution from one source (about which we may know little information) can be well predicted by real-world attackers by running Markov-Chains with PINs from another known source. Our Zipf theory would have foundational implications for analyzing PIN-based protocols and for designing PIN creation policies, while our security measurements provide guidance for bank agencies and financial authorities that are planning to conduct PIN migration from 4-digits to 6-digits.","PeriodicalId":20540,"journal":{"name":"Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"55","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3052973.3053031","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 55
Abstract
Personal Identification Numbers (PINs) are ubiquitously used in embedded computing systems where user input interfaces are constrained. Yet, little attention has been paid to this important kind of authentication credentials, especially for 6-digit PINs which dominate in Asian countries and are gaining popularity worldwide. Unsurprisingly, many fundamental questions (e.g., what's the distribution that human-chosen PINs follow?) remain as intact as about fifty years ago when they first arose. In this work, we conduct a systematic investigation into the characteristics, distribution and security of both 4-digit PINs and 6-digit PINs that are chosen by English users and Chinese users. Particularly, we, for the first time, perform a comprehensive comparison of the PIN characteristics and security between these two distinct user groups. Our results show that there are great differences in PIN choices between these two groups of users, a small number of popular patterns prevail in both groups, and surprisingly, over 50% of every PIN datasets can be accounted for by just the top 5%~8% most popular PINs. What's disturbing is the observation that, as online guessing is a much more serious threat than offline guessing in the current PIN-based systems, longer PINs only attain marginally improved security: human-chosen 4-digit PINs can offer about 6.6 bits of security against online guessing and 8.4 bits of security against offline guessing, and this figure for 6-digit PINs is 7.2 bits and 13.2 bits, respectively. We, for the first time, reveal that Zipf's law is likely to exist in PINs. Despite distinct language/cultural backgrounds, both user groups choose PINs with almost the same Zipf distribution function, and such Zipf PIN-distribution from one source (about which we may know little information) can be well predicted by real-world attackers by running Markov-Chains with PINs from another known source. Our Zipf theory would have foundational implications for analyzing PIN-based protocols and for designing PIN creation policies, while our security measurements provide guidance for bank agencies and financial authorities that are planning to conduct PIN migration from 4-digits to 6-digits.