{"title":"Where is Everyone? The Importance of Population Density Data: A Data Artefact Study of the Facebook Population Density Map","authors":"S. Verhulst, A. Ramesh, Andrew Young, A. Zahuranec","doi":"10.2139/ssrn.3937599","DOIUrl":"https://doi.org/10.2139/ssrn.3937599","url":null,"abstract":"In this paper, we explore new and traditional approaches to measuring population density, and ways in which density information has frequently been used by humanitarian, private-sector and government actors to advance a range of private and public goals. We explain how new innovations are leading to fresh ways of collecting data—and fresh forms of data—and how this may open up new avenues for using density information in a variety of contexts. Section III examines one particular example: Facebook’s High-Resolution Population Density Maps (also referred to as HRSL, or high resolution settlement layer). This recent initiative, created in collaboration with a number of external organizations, shows not only the potential of mapping innovations but also the potential benefits of inter-sectoral partnerships and sharing. We examine three particular use cases of HRSL, and we follow with an assessment and some lessons learned. These lessons are applicable to HRSL in particular, but also more broadly. We conclude with some thoughts on avenues for future research.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114254623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asynchronous Fieldwork in Cross-Country Surveys: An Application to Physical Activity","authors":"S. Poupakis, Francesco Salustri","doi":"10.2139/ssrn.3890036","DOIUrl":"https://doi.org/10.2139/ssrn.3890036","url":null,"abstract":"Multi-country surveys often aim at cross-country comparisons. A common quality standard is conducting these surveys within a common fieldwork period, across all participating countries. However, the rate at which the target sample is achieved within that fieldwork period varies substantially across countries. Thus, the distribution of the interview month often varies substantially in the final sample. This may lead to biased estimates of cross-country differences, especially if the variable of interest exhibits a non-constant trend over time. This paper aims at demonstrating when such a problem causes biased estimates of country differences in physical activity. We demonstrate the implications of such asynchronous fieldwork in cross-country surveys, using the European Social Survey Round 7. Our analytical sample focuses on 6 countries with data collected between September 2014 and January 2015. We present results for modelling physical activity using regression analysis. We compare unadjusted and adjusted regression coefficients accounting for fieldwork month. Moreover, we present a set of different postestimation predictions obtained from such pooled cross-country analyses. We found that physical activity varies across interview month, with the highest activity reported in September, decreasing thereafter, reaching the lowest level in January. Thus, countries with more observations during autumn were upward-biased, compared to countries with more observations during winter. Our results demonstrate how comparisons between countries are affected when interview month is omitted. This is prevalent using both unweighted and weighted regression techniques. Studies using pooled samples of cross-country surveys are commonplace. While a common fieldwork period accounts for severe biases in country comparison, often the bias remains when the outcome of interest has substantial seasonal variation. Our study suggests that accounting for interview month in analyses is an easy way to mitigate this problem.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121932609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Full-Information Estimation of Heterogeneous Agent Models Using Macro and Micro Data","authors":"L. Liu, Mikkel Plagborg-Møller","doi":"10.2139/ssrn.3765532","DOIUrl":"https://doi.org/10.2139/ssrn.3765532","url":null,"abstract":"We develop a generally applicable full‐information inference method for heterogeneous agent models, combining aggregate time series data and repeated cross‐sections of micro data. To handle unobserved aggregate state variables that affect cross‐sectional distributions, we compute a numerically unbiased estimate of the model‐implied likelihood function. Employing the likelihood estimate in a Markov Chain Monte Carlo algorithm, we obtain fully efficient and valid Bayesian inference. Evaluation of the micro part of the likelihood lends itself naturally to parallel computing. Numerical illustrations in models with heterogeneous households or firms demonstrate that the proposed full‐information method substantially sharpens inference relative to using only macro data, and for some parameters micro data is essential for identification.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125751314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Perils of Working with Big Data and a SMALL Framework You Can Use to Avoid Them","authors":"Scott A. Brave, R. Butters, M. Fogarty","doi":"10.21033/wp-2020-35","DOIUrl":"https://doi.org/10.21033/wp-2020-35","url":null,"abstract":"The use of “Big Data” to explain fluctuations in the broader economy or guide the business decisions of a firm is now so commonplace that in some instances it has even begun to rival more traditional government statistics and business analytics. Big data sources can very often provide advantages when compared to these more traditional data sources, but with these advantages also comes the potential for pitfalls. We lay out a framework called SMALL that we have developed in order to help interested parties as they navigate the big data minefield. Based on a set of five questions, the SMALL framework should help users of big data spot concerns in their own work and that of others who rely on such data to draw conclusions with actionable public policy or business implications. To demonstrate, we provide several case studies that show a healthy dose of skepticism can be warranted when working with and interpreting these new big data sources.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133251662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fighting the Learning Crisis in Developing Countries: A Randomized Experiment of Self-Learning at the Right Level","authors":"Y. Sawada, Minhaj Mahmud, Mai Seki, A. Le, H. Kawarazaki","doi":"10.2139/ssrn.3471021","DOIUrl":"https://doi.org/10.2139/ssrn.3471021","url":null,"abstract":"This paper investigates the effectiveness of a globally popular method of self-learning at the right level in improving the cognitive and non-cognitive abilities of disadvantaged pupils in a developing country, Bangladesh. Using a randomized control trial design, we find substantial improvement in cognitive ability measured by mathematics test scores and catch-up effects on non-cognitive ability measured by a pupil self-esteem measure. These findings are consistent with a longer-term impact found in take-up rates and scores on a national-level primary school completion exam. Moreover, the teachers' ability to assess student performance substantially improves. Based on our estimates, program benefit exceeds cost in a plausible way. These findings suggest that self-learning at the right level can effectively address the learning crisis by improving the quality of primary education in developing countries.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115585497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate Occupancy Detection of an Office Room From Light, Temperature, Humidity and CO2 Measurements Using Statistical Learning Models","authors":"Alex Mirugwe","doi":"10.2139/ssrn.3686755","DOIUrl":"https://doi.org/10.2139/ssrn.3686755","url":null,"abstract":"This project aims at developing, validating, and testing several statistical classification models that could predict whether or not an office room is occupied using several data features, namely temperature (°C), light (lx), humidity (%), CO2 (ppm), and a humidity ratio. The data is modeled using classification techniques, i.e. Logistic regression, Classification tree, Bagging-Random forest, and Gradient boosted trees. These models were trained and then evaluated against validation and test sets, using confusion matrices to obtain classification and misclassification rates. The logistic model was trained using the glmnet R package, the tree package for the classification tree model, randomForest for both the Bagging and Random Forest Models, and the gbm package for the Gradient Boosted Model. The best accuracy was obtained from the Random Forest Model with a classification rate of 93.21% when it was evaluated against the test set. The light sensor is also the most significant variable in predicting whether the office room is occupied or not; this was observed in all five models.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131983700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Descriptive Study of High-Frequency Trade and Quote Option Data","authors":"T. Andersen, Ilya Archakov, Leon Eric Grund, N. Hautsch, Yifan Li, S. Nasekin, Ingmar Nolte, Manh Cuong Pham, Stephen J. Taylor, V. Todorov","doi":"10.2139/ssrn.3446690","DOIUrl":"https://doi.org/10.2139/ssrn.3446690","url":null,"abstract":"This paper provides a guide to high frequency option trade and quote data disseminated by the Options Price Reporting Authority (OPRA). We present a comprehensive overview of the U.S. option market, including details on market regulation and the trading processes for all 16 constituent option exchanges. We review the existing literature that utilizes high-frequency options data, summarize the general structure of the OPRA dataset and present a thorough empirical description of the observed option trades and quotes for a selected sample of underlying assets that contains more than 25 billion records. We outline several types of irregular observations and provide recommendations for data filtering and cleaning. Finally, we illustrate the usefulness of the high frequency option data with two empirical applications: option-implied variance estimation and risk-neutral density estimation. Both applications highlight the rich information content of the high frequency OPRA data.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125027955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the Underlying Components of High Frequency Financial Data: Finite Sample Performance and Microstructure Noise Effects","authors":"Rodrigo Hizmeri, M. Izzeldin","doi":"10.2139/ssrn.3639110","DOIUrl":"https://doi.org/10.2139/ssrn.3639110","url":null,"abstract":"This paper examines the finite sample properties of novel theoretical tests that evaluate the presence of: a) Brownian motion, b) jumps; c) finite vs. infinite activity jumps. In allowing for Gaussian, t-distributed, and Gaussian-T mixture noise, our Monte Carlo experiment guides a search for optimal performance across sampling frequencies. Using 100 stocks and SPY, we find that: i) a Brownian and a jump component characterize 1-min stock data; ii) Jumps should allow for both finite and infinite activity; iii) Rejection rates are time-varying, such that more jump days are usually associated with an increase of infinite jumps vis-a-vis finite jumps.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127527776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Analysis of Time Series Data on Regression, Heuristic, and ARIMA Models","authors":"N. Mammadova","doi":"10.2139/ssrn.3609131","DOIUrl":"https://doi.org/10.2139/ssrn.3609131","url":null,"abstract":"The preliminary objective of this paper is to help students perform some basic operations in R programming, as well as to find the model that best fits the original data in Time Series Analysis.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114289560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"All Ads Are Not Created Equal: Display Advertisement’s Copy and Placement Effects on Clicks and Conversions","authors":"Lara Lobschat, N. Holtrop, Norris I. Bruce, R. Rao","doi":"10.2139/ssrn.3531468","DOIUrl":"https://doi.org/10.2139/ssrn.3531468","url":null,"abstract":"In today’s media ecosystem, advertisers face the challenge of creating ad campaigns with the ability to engage consumers and ultimately increase conversions. Hence, they need guidance on how to design promising ad copies and on which websites to deliver these ads. In this study, the authors analyze how different message content and ad format elements influence consumers’ likelihood to click on an ad and subsequently convert on the advertising website. Additionally, they explore the effects of placing an ad on different websites. The main results suggest that message content and ad format that enhance engagement do not necessarily enhance conversions. Furthermore, several novel findings pertain to placement in social media and use of influencer marketing: display ads are effective on social media, especially with hard sell messages. Influencer ads increase engagement more than conversion and are more effective outside social media. A simulation reveals the impact of optimizing ad copy and placement decisions: optimizing ad copy decisions has larger effects than placement decisions and synergy exists between both decisions, offering further guidance to managers.","PeriodicalId":433005,"journal":{"name":"Econometrics: Data Collection & Data Estimation Methodology eJournal","volume":"172 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124191355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}