Detecting sexually explicit content in the context of child sexual abuse materials (CSAM): end-to-end classifiers and region-based networks

Weronika Gutfeter, Joanna Gajewska, Andrzej Pacut
arXiv:2406.14131 · arXiv - CS - Emerging Technologies · Published 2024-06-20
Citations: 0

Abstract

Child sexual abuse materials (CSAM) pose a significant threat to the safety and well-being of children worldwide. Detecting and preventing the distribution of such materials is a critical task for law enforcement agencies and technology companies. As content moderation is often manual, developing an automated detection system can help reduce human reviewers' exposure to potentially harmful images and accelerate countermeasures. This study presents methods for classifying sexually explicit content, which play a crucial role in an automated CSAM detection system. Several approaches are explored to solve the task: an end-to-end classifier, a classifier with person detection, and a private body parts detector. All proposed methods are tested on images obtained from an online tool for reporting illicit content. Due to legal constraints, access to the data is limited, and all algorithms are executed remotely on an isolated server. The end-to-end classifier yields the most promising results, with an accuracy of 90.17%, after augmenting the training set with additional neutral samples and adult pornography. While detection-based methods may not achieve higher accuracy rates and cannot serve as a final classifier on their own, their inclusion in the system can be beneficial. Human body-oriented approaches generate results that are easier to interpret, and obtaining more interpretable results is essential when analyzing models that are trained without direct access to data.
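The abstract describes combining an end-to-end classifier with region-based detectors whose outputs are easier to interpret. The paper does not publish code, so the following is only an illustrative sketch of one way such signals could be fused; the `fuse_scores` function, its weighting scheme, and all parameter values are hypothetical assumptions, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A single region-based detector output (hypothetical schema)."""
    label: str         # e.g. "person" or a body-part class
    confidence: float  # detector score in [0, 1]


def fuse_scores(classifier_prob: float,
                detections: list[Detection],
                clf_weight: float = 0.7,
                det_threshold: float = 0.5) -> float:
    """Blend an end-to-end classifier probability with detector evidence.

    The detector signal is the maximum confidence among detections above
    a threshold, mixed with the classifier score by a fixed weight. The
    weight and threshold here are illustrative, not taken from the paper.
    """
    strong = [d.confidence for d in detections if d.confidence >= det_threshold]
    det_signal = max(strong, default=0.0)
    return clf_weight * classifier_prob + (1.0 - clf_weight) * det_signal


# Example: classifier gives 0.9; one detection scored 0.8 passes the threshold,
# so the fused score is 0.7 * 0.9 + 0.3 * 0.8 = 0.87.
score = fuse_scores(0.9, [Detection("person", 0.8)])
```

Keeping the detector contribution explicit in this way is one route to the interpretability the abstract highlights: a reviewer can see which regions, if any, drove the final score.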