Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952-2012.

IF 6.9 | Zone 2, comprehensive journals | Q1, MULTIDISCIPLINARY SCIENCES
Adam Breuer, Bryce J Dietrich, Michael H Crespin, Matthew Butler, J A Pryse, Kosuke Imai
Scientific Data, 12(1), 1552. Published 2025-09-24. Publication type: Journal Article.
DOI: 10.1038/s41597-025-05558-9 (https://doi.org/10.1038/s41597-025-05558-9)
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12460618/pdf/
Source platform: Semantic Scholar
Citations: 0

Abstract


Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952-2012.


This paper introduces the largest and most comprehensive dataset of US presidential campaign television advertisements, available in digital format. The dataset also includes machine-searchable transcripts and high-quality summaries designed to facilitate a variety of academic research. To date, there has been great interest in collecting and analyzing US presidential campaign advertisements, but the need for manual procurement and annotation has led many to rely on smaller subsets. We design a large-scale, parallelized, AI-based analysis pipeline that automates the laborious process of preparing, transcribing, storyboarding, and summarizing videos. We then apply this methodology to the 9,707 presidential ads from the Julian P. Kanter Political Commercial Archive. We conduct extensive human evaluations to show that these transcripts and summaries match the quality of manually generated alternatives. We illustrate the value of this data by including an application that tracks the genesis and evolution of current focal issue areas over seven decades of presidential elections. Our analysis pipeline and codebase also show how to use LLM-based tools to obtain high-quality summaries for other video datasets.
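
The abstract names four stages (preparing, transcribing, storyboarding, summarizing) run in parallel across ads. The following is only a minimal sketch of that general shape, not the authors' pipeline: every function, the `AdRecord` type, and the ad identifiers are hypothetical placeholders, and the stage bodies are stubs where a real system would call a video tool, a speech-to-text model, a frame sampler, and an LLM.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class AdRecord:
    """Hypothetical container for one ad as it moves through the stages."""
    ad_id: str
    transcript: str = ""
    storyboard: list = field(default_factory=list)
    summary: str = ""


# All four stage functions are illustrative stubs (assumptions), standing in
# for real video preparation, transcription, frame selection, and LLM calls.
def prepare(ad_id: str) -> AdRecord:
    return AdRecord(ad_id=ad_id)


def transcribe(rec: AdRecord) -> AdRecord:
    rec.transcript = f"[transcript of {rec.ad_id}]"
    return rec


def make_storyboard(rec: AdRecord) -> AdRecord:
    rec.storyboard = [f"{rec.ad_id}-frame-{i}" for i in range(3)]
    return rec


def summarize(rec: AdRecord) -> AdRecord:
    rec.summary = f"Summary of {rec.ad_id} from {len(rec.storyboard)} frames"
    return rec


def process_ad(ad_id: str) -> AdRecord:
    """Run the four stages in order for a single ad."""
    return summarize(make_storyboard(transcribe(prepare(ad_id))))


def run_pipeline(ad_ids, max_workers: int = 4):
    """Process ads concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_ad, ad_ids))


if __name__ == "__main__":
    for rec in run_pipeline(["ad-0001", "ad-0002"]):
        print(rec.ad_id, "->", rec.summary)
```

Each ad is independent of the others, which is why a simple per-ad worker pool is a natural fit here; a thread pool is assumed on the guess that the heavy stages would be I/O-bound calls to external tools or services.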

Source journal
Scientific Data
Scientific Data Social Sciences-Education
CiteScore: 11.20
Self-citation rate: 4.10%
Articles published: 689
Review time: 16 weeks
About the journal: Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data. The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.