Description of Datasets

J. Petersen
Robust Statistics. DOI: 10.1002/9781119214656.ch11 (https://doi.org/10.1002/9781119214656.ch11)

Abstract

We use two independent data sets in this work.

• NBER trade data [1]. Compiled by the National Bureau of Economic Research, this set of bilateral trade data by commodity spans the period 1962-2000. Trade flows (in USD) are reported in product categories following the 4-digit SITC rev.2 classification. The dataset combines two others, spanning 1962-1983 and 1984-2000 respectively. We work with the timespan 1984-2000 to exclude possible artifacts in the results caused by changes in data collection between these two timespans. The NBER trade data introduces artificial product categories (containing 'A's and 'X's in the SITC code) to account for differences between import and export records (e.g. if country A exports to countries B and C, but A's export record deviates from the combined import records of B and C). We focus only on export data and exclude these artificial product categories. Finally, we include only 'real' countries (the dataset also lists world regions, such as Southern Asia or Oceania). This results in longitudinal trade data for 200 countries in 800 product categories over 17 years.

• COMTRADE trade data [2]. The United Nations Commodity Trade Statistics Database (UN COMTRADE) publishes annual international trade statistics by commodity and partner country. We use data from the timespan 1990-2010. Export values (in USD) are reported in HS1992 product categories for over 170 countries (again, leaving aside world regions), amounting to roughly 5000 categories over 21 years.

Let A(p, c, t) be an indicator function for the appearance of product p in country c between year t − 1 and year t:

$$
A(p, c, t) =
\begin{cases}
1 & \text{if } x(p, c, t-1) = 0 \text{ and } x(p, c, t) > 0, \\
0 & \text{otherwise.}
\end{cases}
\tag{1}
$$

Similarly, the indicator function for a disappearance event is

$$
D(p, c, t) =
\begin{cases}
1 & \text{if } x(p, c, t-1) > 0 \text{ and } x(p, c, t) = 0, \\
0 & \text{otherwise.}
\end{cases}
\tag{2}
$$

Note that these definitions are only useful if a data record exists for c at both t and t − 1.
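The two indicator functions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: x is taken to be the export value of product p from country c in year t (the excerpt does not define it), a missing product entry is treated as zero exports when the country did report that year, and the names `appearance`, `disappearance`, `x` and `reported` are hypothetical.

```python
from typing import Mapping, Optional, Set, Tuple

FlowKey = Tuple[str, str, int]   # (product, country, year)
ReportKey = Tuple[str, int]      # (country, year)


def _values(x: Mapping[FlowKey, float], reported: Set[ReportKey],
            p: str, c: str, t: int) -> Optional[Tuple[float, float]]:
    """Return (x(p,c,t-1), x(p,c,t)), or None if c has no record in either year."""
    if (c, t - 1) not in reported or (c, t) not in reported:
        return None
    # Assumption: a product absent from a reporting country's record means zero exports.
    return x.get((p, c, t - 1), 0.0), x.get((p, c, t), 0.0)


def appearance(x: Mapping[FlowKey, float], reported: Set[ReportKey],
               p: str, c: str, t: int) -> Optional[int]:
    """A(p, c, t) as in Eq. (1); None when the definition does not apply."""
    pair = _values(x, reported, p, c, t)
    if pair is None:
        return None
    prev, curr = pair
    return int(prev == 0 and curr > 0)


def disappearance(x: Mapping[FlowKey, float], reported: Set[ReportKey],
                  p: str, c: str, t: int) -> Optional[int]:
    """D(p, c, t) as in Eq. (2); None when the definition does not apply."""
    pair = _values(x, reported, p, c, t)
    if pair is None:
        return None
    prev, curr = pair
    return int(prev > 0 and curr == 0)
```

Returning `None` rather than 0 when a country lacks a record at t − 1 or t keeps "no event" distinct from "undefined", which mirrors the caveat that the definitions are only useful when both records exist.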
We exclude small countries from the analysis by requiring a population of at least 1.2 million people and total exports of at least 1 billion USD, which leaves a list of 125 countries. The reported results for the SPI were computed over the timespan 1984-2000. Individual trade flows between countries are included only if they exceed 100,000 USD. Furthermore, appearance and disappearance events are …
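The filtering rules described in this abstract (dropping artificial SITC codes, applying the minimum flow size, and keeping only sufficiently large countries) might be sketched as follows. The record layout `Flow`, the function names, and the `population` lookup are hypothetical. The excerpt also does not say over which period total exports are summed or in which order the filters are applied, so this sketch simply sums over whatever flows it is given.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, NamedTuple, Set


class Flow(NamedTuple):
    exporter: str
    importer: str
    product: str      # SITC code, e.g. "0011"
    year: int
    value_usd: float


MIN_FLOW_USD = 100_000                   # flows must exceed this value
MIN_POPULATION = 1_200_000               # at least 1.2 million people
MIN_TOTAL_EXPORTS_USD = 1_000_000_000    # at least 1 billion USD


def is_artificial(code: str) -> bool:
    """Artificial NBER categories contain 'A' or 'X' in the SITC code."""
    return "A" in code or "X" in code


def filter_flows(flows: Iterable[Flow]) -> List[Flow]:
    """Keep real product categories whose value exceeds the minimum flow size."""
    return [f for f in flows
            if not is_artificial(f.product) and f.value_usd > MIN_FLOW_USD]


def eligible_countries(flows: Iterable[Flow],
                       population: Mapping[str, int]) -> Set[str]:
    """Exporters meeting both thresholds (assumption: totals over the given flows)."""
    totals: Dict[str, float] = defaultdict(float)
    for f in flows:
        totals[f.exporter] += f.value_usd
    return {c for c, total in totals.items()
            if population.get(c, 0) >= MIN_POPULATION
            and total >= MIN_TOTAL_EXPORTS_USD}
```

Excluding world regions is left out because it depends on the dataset's own country list. A country missing from `population` is treated as ineligible here, which is a conservative choice rather than something the text specifies.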