A semantic approach to understanding GDPR fines: From text to compliance insights

IF 3.2 3区 社会学 Q1 LAW
Albina Orlando, Mario Santoro
{"title":"A semantic approach to understanding GDPR fines: From text to compliance insights","authors":"Albina Orlando,&nbsp;Mario Santoro","doi":"10.1016/j.clsr.2025.106187","DOIUrl":null,"url":null,"abstract":"<div><div>This study introduces an explainable Artificial Intelligence (XAI) framework that couples legal-domain NLP with Structural Topic Modeling (STM) and WordNet semantic graphs to rigorously analyze over 1,900 GDPR enforcement decision summaries from a public dataset. Our methodology focuses on demonstrating the pipeline’s validity respect to manual analyses by inspecting the results of four well-know research questions: (1) cross-country fine distribution disparities (automated metadata extraction); (2) the violation severity–fine amount relationship (keyness and semantic analysis); (3) structural text patterns (network analysis and STM); and (4) prevalent enforcement triggers (topic prevalence modeling) The pipeline’s validity is underscored by its ability to replicate key findings from previous manual analyses while enabling a more nuanced exploration of GDPR enforcement trends. Our results confirm significant disparities in enforcement across EU member states and reveal that monetary penalties do not consistently correlate with violation severity. Specifically, serious infringements, particularly those involving video surveillance, frequently result in low-value fines, especially when committed by individuals or smaller entities. This highlights that a substantial proportion of severe violations are attributed to smaller actors. Methodologically, the framework’s ability to quickly replicate such well-known patterns, alongside its transparency and reproducibility, establishes its potential as a scalable tool for transparent and explainable GDPR enforcement analytics.</div></div>","PeriodicalId":51516,"journal":{"name":"Computer Law & Security Review","volume":"59 ","pages":"Article 106187"},"PeriodicalIF":3.2000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Law & Security Review","FirstCategoryId":"90","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2212473X25000598","RegionNum":3,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"LAW","Score":null,"Total":0}
引用次数: 0

Abstract

This study introduces an explainable Artificial Intelligence (XAI) framework that couples legal-domain NLP with Structural Topic Modeling (STM) and WordNet semantic graphs to rigorously analyze over 1,900 GDPR enforcement decision summaries from a public dataset. Our methodology focuses on demonstrating the pipeline’s validity respect to manual analyses by inspecting the results of four well-know research questions: (1) cross-country fine distribution disparities (automated metadata extraction); (2) the violation severity–fine amount relationship (keyness and semantic analysis); (3) structural text patterns (network analysis and STM); and (4) prevalent enforcement triggers (topic prevalence modeling) The pipeline’s validity is underscored by its ability to replicate key findings from previous manual analyses while enabling a more nuanced exploration of GDPR enforcement trends. Our results confirm significant disparities in enforcement across EU member states and reveal that monetary penalties do not consistently correlate with violation severity. Specifically, serious infringements, particularly those involving video surveillance, frequently result in low-value fines, especially when committed by individuals or smaller entities. This highlights that a substantial proportion of severe violations are attributed to smaller actors. Methodologically, the framework’s ability to quickly replicate such well-known patterns, alongside its transparency and reproducibility, establishes its potential as a scalable tool for transparent and explainable GDPR enforcement analytics.
理解GDPR罚款的语义方法:从文本到合规性洞察
本研究引入了一个可解释的人工智能(XAI)框架,该框架将法律领域的NLP与结构主题建模(STM)和WordNet语义图相结合,以严格分析来自公共数据集的1,900多个GDPR执行决策摘要。我们的方法侧重于通过检查四个众所周知的研究问题的结果来证明管道在人工分析方面的有效性:(1)跨国精细分布差异(自动元数据提取);(2)违规严重程度-罚款金额关系(关键字和语义分析);(3)结构文本模式(网络分析和STM);(4)普遍执行触发器(主题流行度建模)该管道的有效性强调了它能够复制以前手工分析的关键发现,同时能够更细致地探索GDPR执行趋势。我们的研究结果证实了欧盟成员国在执法方面的显著差异,并揭示了罚款并不总是与违规严重程度相关。具体来说,严重的侵权行为,特别是涉及视频监控的侵权行为,往往会导致小额罚款,尤其是个人或较小的实体犯下的侵权行为。这突出表明,很大一部分严重侵犯行为是由较小的行为者造成的。在方法上,该框架能够快速复制这些众所周知的模式,以及它的透明度和可重复性,确立了它作为透明和可解释的GDPR执行分析的可扩展工具的潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
5.60
自引率
10.30%
发文量
81
审稿时长
67 days
期刊介绍: CLSR publishes refereed academic and practitioner papers on topics such as Web 2.0, IT security, Identity management, ID cards, RFID, interference with privacy, Internet law, telecoms regulation, online broadcasting, intellectual property, software law, e-commerce, outsourcing, data protection, EU policy, freedom of information, computer security and many other topics. In addition it provides a regular update on European Union developments, national news from more than 20 jurisdictions in both Europe and the Pacific Rim. It is looking for papers within the subject area that display good quality legal analysis and new lines of legal thought or policy development that go beyond mere description of the subject area, however accurate that may be.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信