Collecting Large-scale Comparative Text Data on Legislative Debates

Jan Schwalbach, Christian Rauh
Published in: The Politics of Legislative Debates, 2021-10-08
DOI: https://doi.org/10.1093/oso/9780198849063.003.0006
Citations: 1

Abstract

Parliamentary speeches are one of the most consistently available sources of information about political priorities, actor positions, and conflict structures in democratic states. Recent advances in automated text analysis offer a growing set of tools for tapping into this information reservoir systematically. However, collecting the high-quality text data needed to unleash the comparative potential of these text analysis algorithms is costly and faces various practical hurdles. To address this challenge, this chapter offers three contributions. First, we outline best-practice guidelines and useful tools for researchers wishing to collect new legislative debate corpora or to extend existing ones. Second, we present an extended version of the ParlSpeech Corpus. Third, we highlight the difficulties of comparing text-as-data outputs across parliaments, pointing to varying languages, varying traditions and conventions, and varying metadata availability.