The use of web-scraped data to analyse the dynamics of clothing and footwear prices

Adam Juszczak
Wiadomosci statystyczne (Warsaw, Poland : 1956). Journal article, published 2023-09-29. DOI: 10.59139/ws.2023.09.2 (https://doi.org/10.59139/ws.2023.09.2)

Abstract

Web scraping is a technique for obtaining information from websites automatically. As online shopping has grown in popularity, it has become an abundant source of information on the prices of goods sold by retailers. Besides significantly reducing the cost of price research, scraped data usually improve the precision of inflation estimates and allow prices to be tracked in real time. For this reason, web scraping is a popular research tool both at statistical offices (Eurostat, the UK Office for National Statistics, Belgium's Statbel) and at universities (e.g. the Billion Prices Project conducted at the Massachusetts Institute of Technology). However, using scraped data to calculate inflation poses many challenges at the stages of data collection, processing, and aggregation. The aim of the study is to compare various methods of calculating price indices for clothing and footwear on the basis of scraped data. Using data from one of the largest online stores selling clothing and footwear for the period February 2018–November 2019, the author compared the results of the chained Jevons index, the GEKS-J index, and the GEKS-J expanding and updating window methods. The calculations confirmed a high chain index drift and showed that the window extension and window updating methods give very similar results (with the exception of the FBEW method).
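
The abstract names the chained Jevons index and the GEKS-J index without defining them. The sketch below illustrates the textbook definitions only; it is not the author's code or data. It assumes unweighted prices, a bilateral Jevons index taken over items priced in both periods, and a GEKS-J index computed over the whole set of periods as a single window. The function names and the toy prices are hypothetical, and the expanding and updating window methods (including FBEW) are not covered.

```python
import math


def jevons(prices, s, t):
    """Bilateral Jevons index from period s to t: the geometric mean of
    price relatives over items priced in both periods."""
    matched = [item for item in prices[s] if item in prices[t]]
    if not matched:
        raise ValueError(f"no matched items between periods {s} and {t}")
    log_sum = sum(math.log(prices[t][i] / prices[s][i]) for i in matched)
    return math.exp(log_sum / len(matched))


def chained_jevons(prices, t):
    """Product of period-on-period Jevons indices from period 0 to t."""
    index = 1.0
    for k in range(1, t + 1):
        index *= jevons(prices, k - 1, k)
    return index


def geks_j(prices, t):
    """GEKS-J index from period 0 to t: the geometric mean, over every
    link period l in the window, of jevons(0, l) * jevons(l, t)."""
    n = len(prices)
    log_sum = sum(
        math.log(jevons(prices, 0, l) * jevons(prices, l, t))
        for l in range(n)
    )
    return math.exp(log_sum / n)


# Hypothetical toy data: prices[period][item_id] -> price.
# The jacket enters the assortment in period 1; the shirt and jacket go
# on sale in different periods and every price ends where it started.
prices = [
    {"shirt": 100.0, "boots": 200.0},
    {"shirt": 50.0, "boots": 200.0, "jacket": 300.0},
    {"shirt": 100.0, "boots": 200.0, "jacket": 150.0},
    {"shirt": 100.0, "boots": 200.0, "jacket": 300.0},
]

if __name__ == "__main__":
    last = len(prices) - 1
    print("direct Jevons :", jevons(prices, 0, last))
    print("chained Jevons:", chained_jevons(prices, last))
    print("GEKS-J        :", geks_j(prices, last))
```

In this toy example the direct Jevons comparison of the first and last period shows no price change, while the chained index ends below 1 because the item set changes between links. This is the kind of discrepancy usually described as chain drift. On a fully matched panel (the same items in every period) the three functions coincide, since the Jevons index is then transitive.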