Cheat Detection Through Temporal Inference of Constrained Orders for Subsequences
Authors: Jon Rogers, R. S. Aygün, L. Etzkorn
Published in: 2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 2022-09-01
DOI: 10.1109/AIKE55402.2022.00014 (https://doi.org/10.1109/AIKE55402.2022.00014)
In some domains and datasets, duplicate records may be, in part or in whole, instances of cheating. This is observable on Sony's PlayStation Network (PSN), which serves the world's most popular gaming platform. The key to cheat detection in such domains is the ability to perform temporal deduplication. Temporal data is increasingly prevalent and is not well suited to traditional similarity- and distance-based deduplication techniques. We strengthen the well-established Adaptive Sorted Neighborhood Method (ASNM) with an approach for temporal data domains ($\text{ASNM}+\text{LCS}$) that applies ASNM, infers attribute metadata, and further detects duplicates by inferring temporal ordering requirements with Longest Common Subsequence (LCS) across records of a shared type. Using LCS, we split each record's temporal sequence into constrained and unconstrained sequences. We flag as suspicious (errant) any record that does not adhere to the inferred constrained order, and we flag a record as a duplicate if its unconstrained order, of sufficient length, matches that of another record. ASNM and $\text{ASNM}+\text{LCS}$ were evaluated against a labeled dataset of 22,794 records from PSN trophy data, where duplication may be indicative of cheating. $\text{ASNM}+\text{LCS}$ achieved a higher F1 score than ASNM at every similarity threshold, with an improvement of at least 32%. ASNM's best performance was an F1 of 0.708 at the 0.99 threshold; $\text{ASNM}+\text{LCS}$ yielded an F1 of 0.938. This improvement comes with little overhead: $\text{ASNM}+\text{LCS}$ averaged only 3.79% additional runtime.
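The LCS-based steps described in the abstract (infer a constrained order, split each record, flag errant records, match unconstrained orders) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the constrained order is inferred, so the sketch assumes it can be approximated by folding a pairwise LCS over a small set of trusted reference records, which yields a common subsequence but not necessarily the longest one across all records. The ASNM and attribute-metadata steps are omitted, each event is assumed to occur at most once per record, and all function names, event names, and the `min_len` threshold are hypothetical.

```python
from functools import reduce
from itertools import combinations


def lcs(a, b):
    """Return one longest common subsequence of a and b (O(len(a)*len(b)) DP)."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of the suffixes a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table forward to recover one optimal subsequence.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i, j = i + 1, j + 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def infer_constrained_order(reference_records):
    """Assumption: fold pairwise LCS over trusted records of one shared type."""
    return reduce(lcs, reference_records)


def split_record(record, constrained_order):
    """Split a record into (constrained, unconstrained) parts, keeping record order."""
    constrained = set(constrained_order)
    return ([e for e in record if e in constrained],
            [e for e in record if e not in constrained])


def is_errant(record, constrained_order):
    """True if the record's constrained events violate the inferred relative order."""
    constrained_part, _ = split_record(record, constrained_order)
    present = set(constrained_part)
    expected = [e for e in constrained_order if e in present]
    return constrained_part != expected


def find_duplicates(records, constrained_order, min_len):
    """Pair up record ids whose unconstrained orders match and have length >= min_len."""
    pairs = []
    for (id_a, rec_a), (id_b, rec_b) in combinations(sorted(records.items()), 2):
        un_a = split_record(rec_a, constrained_order)[1]
        un_b = split_record(rec_b, constrained_order)[1]
        if len(un_a) >= min_len and un_a == un_b:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical trophy sequences for one game (illustrative data only).
reference = [
    ["collect", "combo", "ch1", "ch2", "ch3", "final", "speed"],
    ["ch1", "ch2", "collect", "speed", "ch3", "final", "combo"],
    ["ch1", "speed", "ch2", "combo", "ch3", "collect", "final"],
]
candidates = {
    "p1": ["ch1", "collect", "ch2", "speed", "combo", "ch3", "final"],
    "p2": ["collect", "ch1", "speed", "ch2", "ch3", "combo", "final"],
    "p3": ["ch1", "ch3", "ch2", "combo", "final", "speed", "collect"],
    "p4": ["speed", "ch1", "ch2", "combo", "ch3", "final"],
}

order = infer_constrained_order(reference)
errant = [rid for rid, rec in sorted(candidates.items()) if is_errant(rec, order)]
duplicates = find_duplicates(candidates, order, min_len=3)
print(order, errant, duplicates)
```

In this toy data the story trophies (`ch1` through `final`) appear in the same relative order in every reference record, so they form the inferred constrained order, while `collect`, `speed`, and `combo` are left unconstrained. A threshold of three unconstrained events is far too short to be meaningful in practice; it stands in for the "sufficient length" requirement, whose actual value the abstract does not state.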