Revisiting Weimar Film Reviewers’ Sentiments: Integrating Lexicon-Based Sentiment Analysis with Large Language Models

Journal quartile: Q1 (Arts and Humanities)
Isadora Campregher Paiva, Josephine Diecke
{"title":"Revisiting Weimar Film Reviewers’ Sentiments: Integrating Lexicon-Based Sentiment Analysis with Large Language Models","authors":"Isadora Campregher Paiva, Josephine Diecke","doi":"10.22148/001c.118497","DOIUrl":null,"url":null,"abstract":"Film reviews are an obvious area for the application of sentiment analysis, but while this is common in the field of computer science, it has been mostly absent in film studies. Film scholars have quite rightly been skeptical of such techniques due to their inability to grasp nuanced critical texts. Recent technological developments have, however, given us cause to re-evaluate the usefulness of automated sentiment analysis for historical film reviews. The release of ever more sophisticated Large Language Models (LLMs) has shown that their capacity to handle nuanced language could overcome some of the shortcomings of lexicon-based sentiment analysis. Applying it to historical film reviews seemed logical and promising to us. Some of our early optimism was misplaced: while LLMs, and in particular ChatGPT, proved indeed to be much more adept at dealing with nuanced language, they are also difficult to control and implement in a consistent and reproducible way – two things that lexicon-based sentiment analysis excels at. Given these contrasting sets of strengths and weaknesses, we propose an innovative solution which combines the two, and has more accurate results. In a two-step process, we first harness ChatGPT’s more nuanced grasp of language to undertake a verbose sentiment analysis, in which the model is prompted to explain its judgment of the film reviews at length. We then apply a lexicon-based sentiment analysis (with Python’s NLTK library and its VADER lexicon) to the result of ChatGPT’s analysis, thus achieving systematic results. When applied to a corpus of 80 reviews of three canonical Weimar films (Das Cabinet des Dr. Caligari, Metropolis and Nosferatu), this approach successfully recognized the sentiments of 88.75% of reviews, a considerable improvement when compared to the accuracy rate of the direct application of VADER to the reviews (66.25%). These results are particularly impressive given that this corpus is especially challenging for automated sentiment analysis, with a prevalence of macabre themes, which can easily trigger falsely negative results, and a high number of mixed reviews. We believe this hybrid approach could prove useful for application in large corpora, for which close reading of all reviews would be humanly impossible.","PeriodicalId":33005,"journal":{"name":"Journal of Cultural Analytics","volume":" 26","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cultural Analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22148/001c.118497","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Arts and Humanities","Score":null,"Total":0}
Citation count: 0

Abstract

Film reviews are an obvious area for the application of sentiment analysis, but while this is common in the field of computer science, it has been mostly absent in film studies. Film scholars have quite rightly been skeptical of such techniques due to their inability to grasp nuanced critical texts. Recent technological developments have, however, given us cause to re-evaluate the usefulness of automated sentiment analysis for historical film reviews. The release of ever more sophisticated Large Language Models (LLMs) has shown that their capacity to handle nuanced language could overcome some of the shortcomings of lexicon-based sentiment analysis. Applying it to historical film reviews seemed logical and promising to us. Some of our early optimism was misplaced: while LLMs, and in particular ChatGPT, proved indeed to be much more adept at dealing with nuanced language, they are also difficult to control and implement in a consistent and reproducible way – two things that lexicon-based sentiment analysis excels at. Given these contrasting sets of strengths and weaknesses, we propose an innovative solution which combines the two, and has more accurate results. In a two-step process, we first harness ChatGPT’s more nuanced grasp of language to undertake a verbose sentiment analysis, in which the model is prompted to explain its judgment of the film reviews at length. We then apply a lexicon-based sentiment analysis (with Python’s NLTK library and its VADER lexicon) to the result of ChatGPT’s analysis, thus achieving systematic results. When applied to a corpus of 80 reviews of three canonical Weimar films (Das Cabinet des Dr. Caligari, Metropolis and Nosferatu), this approach successfully recognized the sentiments of 88.75% of reviews, a considerable improvement when compared to the accuracy rate of the direct application of VADER to the reviews (66.25%). These results are particularly impressive given that this corpus is especially challenging for automated sentiment analysis, with a prevalence of macabre themes, which can easily trigger falsely negative results, and a high number of mixed reviews. We believe this hybrid approach could prove useful for application in large corpora, for which close reading of all reviews would be humanly impossible.
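
The two-step pipeline described in the abstract can be illustrated with a minimal sketch. It assumes a plain Python setup with the `openai` client and NLTK; the model name, prompt wording, and the ±0.05 compound-score thresholds are illustrative assumptions, not the authors' exact settings. Step 1 asks a chat model for a verbose explanation of a review's sentiment; step 2 scores that explanation with NLTK's VADER lexicon.

```python
# Hypothetical sketch of the hybrid approach: the chat model produces a verbose
# sentiment explanation, and VADER then scores that explanation systematically.
# Model name, prompt, and thresholds are illustrative assumptions.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from openai import OpenAI

nltk.download("vader_lexicon")  # fetch the VADER lexicon once
client = OpenAI()               # assumes OPENAI_API_KEY is set in the environment
sia = SentimentIntensityAnalyzer()

def verbose_sentiment(review: str) -> str:
    """Step 1: ask the chat model to explain the review's overall sentiment at length."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model; the paper used ChatGPT
        messages=[{
            "role": "user",
            "content": ("Explain in detail whether the following film review is "
                        "positive, negative, or mixed, and why:\n\n" + review),
        }],
    )
    return response.choices[0].message.content

def classify(review: str) -> str:
    """Step 2: apply lexicon-based VADER scoring to the model's explanation."""
    explanation = verbose_sentiment(review)
    compound = sia.polarity_scores(explanation)["compound"]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "mixed/neutral"

# Example call on a (hypothetical) historical review text:
# print(classify("Ein düsteres, aber meisterhaftes Werk ..."))
```

Because VADER is an English-language lexicon, scoring the model's English explanation rather than the original (often German) review text is what makes the lexicon step applicable here; the ±0.05 cut-offs follow the common VADER convention rather than any threshold reported in the paper.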
Source journal

Journal of Cultural Analytics (Arts and Humanities, Literature and Literary Theory)
CiteScore: 2.90
Self-citation rate: 0.00%
Articles published: 9
Review time: 10 weeks