{"title":"Unpacking copyright infringement issues in the GenAI development lifecycle and a peek into the future","authors":"Cheng L. SAW, Bryan Zhi Yang TAN","doi":"10.1016/j.clsr.2025.106163","DOIUrl":null,"url":null,"abstract":"<div><div>Generative AI (“GAI”) refers to deep learning models that ingest input data and “learn” to produce output that mimics such data when duly prompted. This feature, however, has given rise to numerous claims of infringement by the owners of copyright in the training material. Relevantly, three questions have emerged for the law of copyright: (1) whether <em>prima facie</em> acts of infringement are disclosed at each stage of the GAI development lifecycle; (2) whether such acts fall within the scope of the text and data mining (“TDM”) exceptions; and (3) whether (and, if so, how successfully) the fair use exception may be invoked by GAI developers as a defence to infringement claims. This paper critically examines these questions in turn and considers, in particular, their interplay with the so-called “memorisation” phenomenon. It is argued that although infringing acts might occur in the process of downloading in-copyright training material and training the GAI model in question, TDM and fair use exceptions (where available) may yet exonerate developers from copyright liability under the right conditions.</div></div>","PeriodicalId":51516,"journal":{"name":"Computer Law & Security Review","volume":"58 ","pages":"Article 106163"},"PeriodicalIF":3.3000,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Law & Security Review","FirstCategoryId":"90","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2212473X25000367","RegionNum":3,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"LAW","Score":null,"Total":0}
引用次数: 0
Abstract
Generative AI (“GAI”) refers to deep learning models that ingest input data and “learn” to produce output that mimics such data when duly prompted. This feature, however, has given rise to numerous claims of infringement by the owners of copyright in the training material. Relevantly, three questions have emerged for the law of copyright: (1) whether prima facie acts of infringement are disclosed at each stage of the GAI development lifecycle; (2) whether such acts fall within the scope of the text and data mining (“TDM”) exceptions; and (3) whether (and, if so, how successfully) the fair use exception may be invoked by GAI developers as a defence to infringement claims. This paper critically examines these questions in turn and considers, in particular, their interplay with the so-called “memorisation” phenomenon. It is argued that although infringing acts might occur in the process of downloading in-copyright training material and training the GAI model in question, TDM and fair use exceptions (where available) may yet exonerate developers from copyright liability under the right conditions.
期刊介绍:
CLSR publishes refereed academic and practitioner papers on topics such as Web 2.0, IT security, Identity management, ID cards, RFID, interference with privacy, Internet law, telecoms regulation, online broadcasting, intellectual property, software law, e-commerce, outsourcing, data protection, EU policy, freedom of information, computer security and many other topics. In addition it provides a regular update on European Union developments, national news from more than 20 jurisdictions in both Europe and the Pacific Rim. It is looking for papers within the subject area that display good quality legal analysis and new lines of legal thought or policy development that go beyond mere description of the subject area, however accurate that may be.