Applications of Web Scraping in Economics and Finance

P. Śpiewanowski, Oleksandr Talavera, Linh Vi
Oxford Research Encyclopedia of Economics and Finance, published 2021-12-22
DOI: 10.1093/acrefore/9780190625979.013.652 (https://doi.org/10.1093/acrefore/9780190625979.013.652)
Citations: 0

Abstract

The 21st-century economy is increasingly built around data. Firms and individuals upload and store enormous amounts of data. Most of these data are stored on private servers, but a considerable part is publicly available across the 1.83 billion websites online. Researchers can access these data using web-scraping techniques.

Web scraping is the process of collecting data from web pages, either manually or with automation tools or specialized software. It is possible, and relatively simple, because the code behind websites designed for display in web browsers has a regular structure. Websites built with HTML can be scraped using standard text-mining tools: either scripts in popular (statistical) programming languages such as Python, Stata, and R, or stand-alone dedicated web-scraping tools. Some of those tools require no prior programming skills.

Since about 2010, with social and economic activity omnipresent on the Internet, web scraping has become increasingly popular among academic researchers. In contrast to proprietary data, which may be out of reach because of substantial costs, web scraping can make interesting data sources accessible to everyone.

Thanks to web scraping, data are now available in real time and in significantly more detail than statistical offices or commercial data vendors have traditionally offered. In fact, many statistical offices have started using web-scraped data, for example, to calculate price indices. Data collected through web scraping have been used in numerous economics and finance projects and can easily complement traditional data sources.
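
To make the point about HTML's regular structure and price indices concrete, the following is a minimal sketch in Python using only the standard library. It is not the authors' method. The HTML snippets, the class names `name` and `price`, and the helper names `PriceParser`, `scrape_prices`, and `jevons_index` are all hypothetical, and the Jevons formula (the geometric mean of price relatives over matched products) is just one elementary index formula chosen for illustration; the abstract does not say which formula statistical offices use.

```python
from html.parser import HTMLParser
from math import exp, log

# Hypothetical product listings for two dates; real pages differ and
# their markup has to be inspected before writing a parser.
BASE_HTML = """
<ul>
  <li><span class="name">Milk 1L</span><span class="price">1.00</span></li>
  <li><span class="name">Bread</span><span class="price">2.00</span></li>
</ul>
"""
CURRENT_HTML = """
<ul>
  <li><span class="name">Milk 1L</span><span class="price">1.21</span></li>
  <li><span class="name">Bread</span><span class="price">2.00</span></li>
  <li><span class="name">Eggs</span><span class="price">3.10</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collect product prices from <span class="name"> / <span class="price"> pairs."""

    def __init__(self):
        super().__init__()
        self._field = None   # which span we are currently inside, if any
        self._name = None    # most recent product name seen
        self.prices = {}     # product name -> price

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "price" and self._name is not None:
            self.prices[self._name] = float(data)
            self._name = None


def scrape_prices(html):
    """Return a dict mapping product name to price for one page."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


def jevons_index(base, current):
    """Geometric mean of price relatives over products present in both periods, times 100."""
    matched = sorted(set(base) & set(current))
    if not matched:
        raise ValueError("no products appear in both periods")
    mean_log_relative = sum(log(current[k] / base[k]) for k in matched) / len(matched)
    return 100 * exp(mean_log_relative)


base_prices = scrape_prices(BASE_HTML)
current_prices = scrape_prices(CURRENT_HTML)
print(round(jevons_index(base_prices, current_prices), 2))
```

Only products observed in both periods enter the index, so the new item (Eggs) is ignored here. A real project would fetch pages over the network, respect each site's terms of use, and handle missing or malformed prices, all of which this sketch leaves out.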