Large Language Models to make museum archive collections more accessible

IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Manon Reusens, Amy Adams, Bart Baesens
{"title":"Large Language Models to make museum archive collections more accessible","authors":"Manon Reusens,&nbsp;Amy Adams,&nbsp;Bart Baesens","doi":"10.1007/s00146-025-02227-8","DOIUrl":null,"url":null,"abstract":"<div><p>Keywords are essential to the searchability and therefore discoverability of museum and archival collections in the modern world. Without them, the collection management systems (CMS) and online collections these cultural organisations rely on to record, organise, and make their collections accessible, do not operate efficiently. However, generating these keywords manually is time consuming for these already resource strapped organisations. Artificial intelligence (AI), particularly generative AI and Large Language Models (LLMs), could hold the key to generating, even automating, this key data and as such be considered a co-creative add-on. This study contributes to the literature by introducing the use of Meta’s open-source LLM, Llama, to generate keywords from curator/archivist written descriptions of museum and archival collection items. Our findings suggest that these technologies add significant value compared to current manual methods for keyword generation. In particular, we find that through using carefully crafted prompts, successful keyword augmentations could be established making museum and archival collections much more accessible to wider and more diverse audiences. However, the results also showed that generative AI has biases (e.g., hallucinations, over generalisations, outdated language), though the frequency of occurrence was not as high as general perception may insist. Hence, we also discuss mitigation strategies to address these and how cultural institutions can recognise the risks and errors while getting the most from the systems. Finally, we discuss options to achieve structured results which allow easier ingestion of data back into CMS. Ultimately, LLMs hold significant potential to enhance accessibility to museum and archival collections, yet they are not without imperfection as we extensively discuss.</p></div>","PeriodicalId":47165,"journal":{"name":"AI & Society","volume":"40 6","pages":"4485 - 4497"},"PeriodicalIF":4.7000,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI & Society","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s00146-025-02227-8","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Keywords are essential to the searchability and therefore discoverability of museum and archival collections in the modern world. Without them, the collection management systems (CMS) and online collections these cultural organisations rely on to record, organise, and make their collections accessible, do not operate efficiently. However, generating these keywords manually is time consuming for these already resource strapped organisations. Artificial intelligence (AI), particularly generative AI and Large Language Models (LLMs), could hold the key to generating, even automating, this key data and as such be considered a co-creative add-on. This study contributes to the literature by introducing the use of Meta’s open-source LLM, Llama, to generate keywords from curator/archivist written descriptions of museum and archival collection items. Our findings suggest that these technologies add significant value compared to current manual methods for keyword generation. In particular, we find that through using carefully crafted prompts, successful keyword augmentations could be established making museum and archival collections much more accessible to wider and more diverse audiences. However, the results also showed that generative AI has biases (e.g., hallucinations, over generalisations, outdated language), though the frequency of occurrence was not as high as general perception may insist. Hence, we also discuss mitigation strategies to address these and how cultural institutions can recognise the risks and errors while getting the most from the systems. Finally, we discuss options to achieve structured results which allow easier ingestion of data back into CMS. Ultimately, LLMs hold significant potential to enhance accessibility to museum and archival collections, yet they are not without imperfection as we extensively discuss.

大型语言模型,使博物馆档案收藏更容易访问
关键词对现代世界的博物馆和档案收藏的可搜索性和可发现性至关重要。没有它们,这些文化组织赖以记录、组织和提供馆藏的馆藏管理系统(CMS)和在线馆藏就无法有效运作。然而,对于这些已经资源紧张的组织来说,手动生成这些关键字非常耗时。人工智能(AI),特别是生成式人工智能和大型语言模型(llm),可以掌握生成甚至自动化这些关键数据的关键,因此可以被视为共同创造的附加组件。本研究通过引入Meta的开源法学硕士Llama,从馆长/档案管理员对博物馆和档案收藏项目的书面描述中生成关键词,为文献做出了贡献。我们的研究结果表明,与当前手动生成关键字的方法相比,这些技术增加了显著的价值。特别是,我们发现通过使用精心制作的提示,可以建立成功的关键词增强,使博物馆和档案收藏更容易被更广泛和更多样化的受众所访问。然而,结果也表明,生成式人工智能有偏见(例如,幻觉、过度概括、过时的语言),尽管出现的频率并没有一般认知所坚持的那么高。因此,我们还讨论了缓解策略,以解决这些问题,以及文化机构如何在从系统中获得最大收益的同时认识到风险和错误。最后,我们讨论了实现结构化结果的选项,这些结果允许更容易地将数据摄取回CMS。最终,法学硕士在提高博物馆和档案收藏的可访问性方面具有巨大的潜力,但正如我们广泛讨论的那样,它们并非没有缺陷。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
AI & Society
AI & Society COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-
CiteScore
8.00
自引率
20.00%
发文量
257
期刊介绍: AI & Society: Knowledge, Culture and Communication, is an International Journal publishing refereed scholarly articles, position papers, debates, short communications, and reviews of books and other publications. Established in 1987, the Journal focuses on societal issues including the design, use, management, and policy of information, communications and new media technologies, with a particular emphasis on cultural, social, cognitive, economic, ethical, and philosophical implications. AI & Society has a broad scope and is strongly interdisciplinary. We welcome contributions and participation from researchers and practitioners in a variety of fields including information technologies, humanities, social sciences, arts and sciences. This includes broader societal and cultural impacts, for example on governance, security, sustainability, identity, inclusion, working life, corporate and community welfare, and well-being of people. Co-authored articles from diverse disciplines are encouraged. AI & Society seeks to promote an understanding of the potential, transformative impacts and critical consequences of pervasive technology for societies. Technological innovations, including new sciences such as biotech, nanotech and neuroscience, offer a great potential for societies, but also pose existential risk. Rooted in the human-centred tradition of science and technology, the Journal acts as a catalyst, promoter and facilitator of engagement with diversity of voices and over-the-horizon issues of arts, science, technology and society. AI & Society expects that, in keeping with the ethos of the journal, submissions should provide a substantial and explicit argument on the societal dimension of research, particularly the benefits, impacts and implications for society. This may include factors such as trust, biases, privacy, reliability, responsibility, and competence of AI systems. Such arguments should be validated by critical comment on current research in this area. Curmudgeon Corner will retain its opinionated ethos. The journal is in three parts: a) full length scholarly articles; b) strategic ideas, critical reviews and reflections; c) Student Forum is for emerging researchers and new voices to communicate their ongoing research to the wider academic community, mentored by the Journal Advisory Board; Book Reviews and News; Curmudgeon Corner for the opinionated. Papers in the Original Section may include original papers, which are underpinned by theoretical, methodological, conceptual or philosophical foundations. The Open Forum Section may include strategic ideas, critical reviews and potential implications for society of current research. Network Research Section papers make substantial contributions to theoretical and methodological foundations within societal domains. These will be multi-authored papers that include a summary of the contribution of each author to the paper. Original, Open Forum and Network papers are peer reviewed. The Student Forum Section may include theoretical, methodological, and application orientations of ongoing research including case studies, as well as, contextual action research experiences. Papers in this section are normally single-authored and are also formally reviewed. Curmudgeon Corner is a short opinionated column on trends in technology, arts, science and society, commenting emphatically on issues of concern to the research community and wider society. Normal word length: Original and Network Articles 10k, Open Forum 8k, Student Forum 6k, Curmudgeon 1k. The exception to the co-author limit of Original and Open Forum (4), Network (10), Student (3) and Curmudgeon (2) articles will be considered for their special contributions. Please do not send your submissions by email but use the "Submit manuscript" button. NOTE TO AUTHORS: The Journal expects its authors to include, in their submissions: a) An acknowledgement of the pre-accept/pre-publication versions of their manuscripts on non-commercial and academic sites. b) Images: obtain permissions from the copyright holder/original sources. c) Formal permission from their ethics committees when conducting studies with people.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信