Cheat Detection Through Temporal Inference of Constrained Orders for Subsequences

2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE) Pub Date : 2022-09-01 DOI:10.1109/AIKE55402.2022.00014

Jon Rogers, R. S. Aygün, L. Etzkorn


Citations: 3

Abstract


Cheat Detection Through Temporal Inference of Constrained Orders for Subsequences

For select domains and datasets, duplicates may be, in part or in whole, instances of cheating. We observe this specifically for Sony's PlayStation Network (PSN), which services the world's most popular gaming platform. The key to cheat detection in such domains is the ability to perform temporal deduplication. Temporal data is increasingly prevalent and is not well suited to traditional similarity- and distance-based deduplication techniques. We strengthen the well-established Adaptive Sorted Neighborhood Method (ASNM) with an approach for temporal data domains ($\text{ASNM}+\text{LCS}$) that applies ASNM, infers attribute metadata, and further detects duplicates by inferring temporal ordering requirements using Longest Common Subsequence (LCS) for records of a shared type. Using LCS, we split each record's temporal sequence into constrained and unconstrained sequences. We flag as suspicious (errant) any record that does not adhere to the inferred constrained order, and we flag a record as a duplicate if its unconstrained order, of sufficient length, matches that of another record. ASNM and $\text{ASNM}+\text{LCS}$ were evaluated against a labeled dataset of 22,794 records from PSN trophy data, where duplication may be indicative of cheating. $\text{ASNM}+\text{LCS}$ F1 scores outperformed ASNM at every similarity threshold, with at least 32% improvement. ASNM's best performance was an F1 of 0.708 at the 0.99 threshold; $\text{ASNM}+\text{LCS}$ yielded an F1 of 0.938. The significant performance improvement costs little overhead, as $\text{ASNM}+\text{LCS}$ averaged only 3.79% additional runtime.
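The abstract does not give the details of the LCS stage, so the Python below is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes that each record is a list of event names in the order they were earned, that an event appears at most once per record, and that the constrained order can be approximated by folding a pairwise LCS over a small set of reference records. The function names, the event names, and the `min_len` parameter are all hypothetical, and the ASNM stage and attribute-metadata inference are omitted.

```python
from functools import reduce
from typing import Dict, List, Tuple


def lcs(a: List[str], b: List[str]) -> List[str]:
    """Longest common subsequence of two sequences (standard DP)."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table forward to recover one LCS.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i, j = i + 1, j + 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def infer_constrained_order(reference: List[List[str]]) -> List[str]:
    """Assumption: fold pairwise LCS over reference records of one type."""
    return reduce(lcs, reference)


def split_record(record: List[str], constrained: List[str]) -> Tuple[List[str], List[str]]:
    """Split a record into its constrained and unconstrained subsequences."""
    fixed = set(constrained)
    return ([e for e in record if e in fixed],
            [e for e in record if e not in fixed])


def is_errant(record: List[str], constrained: List[str]) -> bool:
    """A record is errant if its constrained events are out of order."""
    present = set(record)
    expected = [e for e in constrained if e in present]
    observed, _ = split_record(record, constrained)
    return observed != expected


def find_duplicates(records: Dict[str, List[str]], constrained: List[str],
                    min_len: int = 3) -> List[List[str]]:
    """Group record ids whose unconstrained order matches exactly."""
    groups: Dict[Tuple[str, ...], List[str]] = {}
    for rid, rec in records.items():
        _, free = split_record(rec, constrained)
        if len(free) >= min_len:  # short sequences match too easily by chance
            groups.setdefault(tuple(free), []).append(rid)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical reference records: story events are ordered, side events are not.
reference = [
    ["story_1", "side_a", "side_b", "story_2", "side_c", "story_3"],
    ["story_1", "story_2", "side_c", "side_b", "side_a", "story_3"],
    ["side_c", "story_1", "side_a", "story_2", "side_b", "story_3"],
]
records = {
    "p1": ["story_1", "side_a", "side_b", "story_2", "side_c", "story_3"],
    "p2": ["side_a", "story_1", "side_b", "side_c", "story_2", "story_3"],
    "p3": ["story_3", "story_1", "side_c", "story_2", "side_a"],
    "p4": ["story_1", "side_b", "story_2", "side_a"],
}

constrained = infer_constrained_order(reference)
errant = [rid for rid, rec in records.items() if is_errant(rec, constrained)]
duplicates = find_duplicates(records, constrained)
print(constrained, errant, duplicates)
```

In this toy data, `p3` earns `story_3` before `story_1`, so it violates the inferred order, while `p1` and `p2` share the same three-event unconstrained order and are grouped as duplicates. Folding a pairwise LCS is a heuristic: when two records have several equally long common subsequences, the inferred order depends on tie-breaking, so a real system would need a more robust inference step.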
