Spotting Acronyms and Initialisms with the Help of Informatics

Q3 Arts and Humanities
Attila Imre
Journal: Acta Universitatis Sapientiae, Philologica
Publication date: 2022-12-01
Publication type: Journal Article
DOI: 10.2478/ausp-2022-0025 (https://doi.org/10.2478/ausp-2022-0025)
Citations: 0

Abstract

The growing popularity of streaming services has made a vast amount of audiovisual material available to audiences. As movies, documentaries, and TV shows are part of the entertainment industry, they aim to reach viewers worldwide through dubbed and subtitled versions. Our aim is to collect the acronyms used in the transcripts/subtitles of several American political TV shows (24, Designated Survivor, House of Cards, and The West Wing) and to analyse their Hungarian translations. However, opening each subtitle file one by one and browsing through it to spot and collect the acronyms and initialisms would be strenuous and require countless mouse clicks. Hence, dedicated software (SRT Manager) was designed to speed up the process. As most definitions of acronyms and initialisms focus on the fact that they result from the combination of at least two capital letters, once the software receives its input (multiple subtitle files of entire seasons), it returns every run of two or more consecutive capital letters (with or without periods) found in the raw data, such as AA or A.A. Further statistical data (the source file of each instance, a count of all unique values and their occurrences, and sample lines from the subtitles) also save a lot of time and energy, as they can easily be exported to spreadsheet programs for further data analysis.
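The workflow described in the abstract can be illustrated with a minimal Python sketch. It is not SRT Manager itself, whose implementation is not given here; the regular expression, the function names (`collect_acronyms`, `to_csv`), the column names, and the sample file names are all assumptions made for illustration. The sketch scans the text of several subtitle files for runs of two or more capital letters, with or without periods, records the count, source files, and one sample line for each unique value, and serialises the result as CSV for a spreadsheet program.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed pattern: letters each followed by a period (A.A.) or a bare run (AA).
# The rule actually used by SRT Manager is not published in the abstract.
ACRONYM_RE = re.compile(r"\b(?:[A-Z]\.){2,}|\b[A-Z]{2,}\b")


def collect_acronyms(files):
    """Map each unique match to its count, source files, and a sample line.

    `files` is a dict of file name -> subtitle text (hypothetical input shape).
    """
    stats = defaultdict(lambda: {"count": 0, "files": set(), "sample": ""})
    for name, text in files.items():
        for line in text.splitlines():
            for match in ACRONYM_RE.finditer(line):
                entry = stats[match.group()]
                entry["count"] += 1
                entry["files"].add(name)
                if not entry["sample"]:
                    # Keep the first line in which the value was seen.
                    entry["sample"] = line.strip()
    return stats


def to_csv(stats):
    """Serialise the statistics as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["acronym", "occurrences", "source_files", "sample_line"])
    for acronym in sorted(stats):
        entry = stats[acronym]
        writer.writerow(
            [acronym, entry["count"], ";".join(sorted(entry["files"])), entry["sample"]]
        )
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented two-file sample in SubRip layout (index, timestamp, text).
    demo = {
        "s01e01.srt": "1\n00:00:01,000 --> 00:00:03,000\nThe CIA called the F.B.I. again.\n",
        "s01e02.srt": "1\n00:00:05,000 --> 00:00:07,000\nGet me the CIA director.\n",
    }
    print(to_csv(collect_acronyms(demo)))
```

A purely pattern-based search like this also returns capitalised words that are not acronyms (for example, shouted dialogue in all caps), so the exported list would still need manual review.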
Source journal
Acta Universitatis Sapientiae, Philologica (Arts and Humanities: Language and Linguistics)
CiteScore: 0.50
Self-citation rate: 0.00%
Articles published: 0
Review time: 10 weeks
Journal description: Series Philologica is published in cooperation with Sciendo by De Gruyter. It publishes original, previously unpublished articles in the wide field of philological studies and has appeared in three issues a year since 2014. The printed and online versions of the papers are identical.