{"title":"BayesSRW: Bayesian Sampling and Re-weighting approach for variance reduction","authors":"Carol Liu","doi":"arxiv-2408.15454","DOIUrl":"https://doi.org/arxiv-2408.15454","url":null,"abstract":"In this paper, we address the challenge of sampling in scenarios where limited resources prevent exhaustive measurement across all subjects. We consider a setting where samples are drawn from multiple groups, each following a distribution with unknown mean and variance parameters. We introduce a novel sampling strategy, motivated simply by the Cauchy-Schwarz inequality, which minimizes the variance of the population mean estimator by allocating samples proportionally to both the group size and the standard deviation. This approach improves the efficiency of sampling by focusing resources on groups with greater variability, thereby enhancing the precision of the overall estimate. Additionally, we extend our method to a two-stage sampling procedure in a Bayes approach, named BayesSRW, where a preliminary stage is used to estimate the variance, which then informs the optimal allocation of the remaining sampling budget. Through simulation examples, we demonstrate the effectiveness of our approach in reducing estimation uncertainty and providing more reliable insights in applications ranging from user experience surveys to high-dimensional peptide array studies.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"67 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The effects of data preprocessing on probability of default model fairness","authors":"Di Wu","doi":"arxiv-2408.15452","DOIUrl":"https://doi.org/arxiv-2408.15452","url":null,"abstract":"In the context of financial credit risk evaluation, the fairness of machine learning models has become a critical concern, especially given the potential for biased predictions that disproportionately affect certain demographic groups. This study investigates the impact of data preprocessing, with a specific focus on Truncated Singular Value Decomposition (SVD), on the fairness and performance of probability of default models. Using a comprehensive dataset sourced from Kaggle, various preprocessing techniques, including SVD, were applied to assess their effect on model accuracy, discriminatory power, and fairness.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Endogenous Treatment Models with Social Interactions: An Application to the Impact of Exercise on Self-Esteem","authors":"Zhongjian Lin, Francis Vella","doi":"arxiv-2408.13971","DOIUrl":"https://doi.org/arxiv-2408.13971","url":null,"abstract":"We address the estimation of endogenous treatment models with social interactions in both the treatment and outcome equations. We model the interactions between individuals in an internally consistent manner via a game theoretic approach based on discrete Bayesian games. This introduces a substantial computational burden in estimation which we address through a sequential version of the nested fixed point algorithm. We also provide some relevant treatment effects, and procedures for their estimation, which capture the impact on both the individual and the total sample. Our empirical application examines the impact of an individual's exercise frequency on her level of self-esteem. We find that an individual's exercise frequency is influenced by her expectation of her friends'. We also find that an individual's level of self-esteem is affected by her level of exercise and, at relatively lower levels of self-esteem, by the expectation of her friends' self-esteem.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling the Dynamics of Growth in Master-Planned Communities","authors":"Christopher K. Allsup, Irene S. Gabashvili","doi":"arxiv-2408.14214","DOIUrl":"https://doi.org/arxiv-2408.14214","url":null,"abstract":"This paper describes how a time-varying Markov model was used to forecast housing development at a master-planned community during a transition from high to low growth. Our approach draws on detailed historical data to model the dynamics of the market participants, producing results that are entirely data-driven and free of bias. While traditional time series forecasting methods often struggle to account for nonlinear regime changes in growth, our approach successfully captures the onset of buildout as well as external economic shocks, such as the 1990 and 2008-2011 recessions and the 2021 post-pandemic boom. This research serves as a valuable tool for urban planners, homeowner associations, and property stakeholders aiming to navigate the complexities of growth at master-planned communities during periods of both system stability and instability.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Double/Debiased CoCoLASSO of Treatment Effects with Mismeasured High-Dimensional Control Variables","authors":"Geonwoo Kim, Suyong Song","doi":"arxiv-2408.14671","DOIUrl":"https://doi.org/arxiv-2408.14671","url":null,"abstract":"We develop an estimator for treatment effects in high-dimensional settings with additive measurement error, a prevalent challenge in modern econometrics. We introduce the Double/Debiased Convex Conditioned LASSO (Double/Debiased CoCoLASSO), which extends the double/debiased machine learning framework to accommodate mismeasured covariates. Our principal contributions are threefold. (1) We construct a Neyman-orthogonal score function that remains valid under measurement error, incorporating a bias correction term to account for error-induced correlations. (2) We propose a method of moments estimator for the measurement error variance, enabling implementation without prior knowledge of the error covariance structure. (3) We establish the $\\sqrt{N}$-consistency and asymptotic normality of our estimator under general conditions, allowing for both the number of covariates and the magnitude of measurement error to increase with the sample size. Our theoretical results demonstrate the estimator's efficiency within the class of regularized high-dimensional estimators accounting for measurement error. Monte Carlo simulations corroborate our asymptotic theory and illustrate the estimator's robust performance across various levels of measurement error. Notably, our covariance-oblivious approach nearly matches the efficiency of methods that assume known error variance.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"67 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference on Consensus Ranking of Distributions","authors":"David M. Kaplan","doi":"arxiv-2408.13949","DOIUrl":"https://doi.org/arxiv-2408.13949","url":null,"abstract":"Instead of testing for unanimous agreement, I propose learning how broad of a consensus favors one distribution over another (of earnings, productivity, asset returns, test scores, etc.). Specifically, given a sample from each of two distributions, I propose statistical inference methods to learn about the set of utility functions for which the first distribution has higher expected utility than the second distribution. With high probability, an \"inner\" confidence set is contained within this true set, while an \"outer\" confidence set contains the true set. Such confidence sets can be formed by inverting a proposed multiple testing procedure that controls the familywise error rate. Theoretical justification comes from empirical process results, given that very large classes of utility functions are generally Donsker (subject to finite moments). The theory additionally justifies a uniform (over utility functions) confidence band of expected utility differences, as well as tests with a utility-based \"restricted stochastic dominance\" as either the null or alternative hypothesis. Simulated and empirical examples illustrate the methodology.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-sectional Dependence in Idiosyncratic Volatility","authors":"Ilze Kalnina, Kokouvi Tewou","doi":"arxiv-2408.13437","DOIUrl":"https://doi.org/arxiv-2408.13437","url":null,"abstract":"This paper introduces an econometric framework for analyzing cross-sectional dependence in the idiosyncratic volatilities of assets using high frequency data. We first consider the estimation of standard measures of dependence in the idiosyncratic volatilities such as covariances and correlations. Naive estimators of these measures are biased due to the use of the error-laden estimates of idiosyncratic volatilities. We provide bias-corrected estimators and the relevant asymptotic theory. Next, we introduce an idiosyncratic volatility factor model, in which we decompose the variation in idiosyncratic volatilities into two parts: the variation related to the systematic factors such as the market volatility, and the residual variation. Again, naive estimators of the decomposition are biased, and we provide bias-corrected estimators. We also provide the asymptotic theory that allows us to test whether the residual (non-systematic) components of the idiosyncratic volatilities exhibit cross-sectional dependence. We apply our methodology to the S&P 100 index constituents, and document strong cross-sectional dependence in their idiosyncratic volatilities. We consider two different sets of idiosyncratic volatility factors, and find that neither can fully account for the cross-sectional dependence in idiosyncratic volatilities. For each model, we map out the network of dependencies in residual (non-systematic) idiosyncratic volatilities across all stocks.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning and the Yield Curve: Tree-Based Macroeconomic Regime Switching","authors":"Siyu Bie, Francis X. Diebold, Jingyu He, Junye Li","doi":"arxiv-2408.12863","DOIUrl":"https://doi.org/arxiv-2408.12863","url":null,"abstract":"We explore tree-based macroeconomic regime-switching in the context of the dynamic Nelson-Siegel (DNS) yield-curve model. In particular, we customize the tree-growing algorithm to partition macroeconomic variables based on the DNS model's marginal likelihood, thereby identifying regime-shifting patterns in the yield curve. Compared to traditional Markov-switching models, our model offers clear economic interpretation via macroeconomic linkages and ensures computational simplicity. In an empirical application to U.S. Treasury bond yields, we find (1) important yield curve regime switching, and (2) evidence that macroeconomic variables have predictive power for the yield curve when the short rate is high, but not in other regimes, thereby refining the notion of yield curve \"macro-spanning\".","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Difference-in-differences with as few as two cross-sectional units -- A new perspective to the democracy-growth debate","authors":"Gilles Koumou, Emmanuel Selorm Tsyawo","doi":"arxiv-2408.13047","DOIUrl":"https://doi.org/arxiv-2408.13047","url":null,"abstract":"Pooled panel analyses tend to mask heterogeneity in unit-specific treatment effects. For example, existing studies on the impact of democracy on economic growth do not reach a consensus as empirical findings are substantially heterogeneous in the country composition of the panel. In contrast to pooled panel analyses, this paper proposes a Difference-in-Differences (DiD) estimator that exploits the temporal dimension in the data and estimates unit-specific average treatment effects on the treated (ATT) with as few as two cross-sectional units. Under weak identification and temporal dependence conditions, the DiD estimator is asymptotically normal. The estimator is further complemented with a test of identification granted at least two candidate control units. Empirical results using the DiD estimator suggest Benin's economy would have been 6.3% smaller on average over the 1993-2018 period had she not democratised.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating an agent-based behavioral model in microtransit forecasting and revenue management","authors":"Xiyuan Ren, Joseph Y. J. Chow, Venktesh Pandey, Linfei Yuan","doi":"arxiv-2408.12577","DOIUrl":"https://doi.org/arxiv-2408.12577","url":null,"abstract":"As an IT-enabled multi-passenger mobility service, microtransit has the potential to improve accessibility, reduce congestion, and enhance flexibility in transportation options. However, due to its heterogeneous impacts on different communities and population segments, there is a need for better tools in microtransit forecast and revenue management, especially when actual usage data are limited. We propose a novel framework based on an agent-based mixed logit model estimated with microtransit usage data and synthetic trip data. The framework involves estimating a lower-branch mode choice model with synthetic trip data, combining lower-branch parameters with microtransit data to estimate an upper-branch ride pass subscription model, and applying the nested model to evaluate microtransit pricing and subsidy policies. The framework enables further decision-support analysis to consider diverse travel patterns and heterogeneous tastes of the total population. We test the framework in a case study with synthetic trip data from Replica Inc. and microtransit data from Arlington Via. The lower-branch model results in a rho-square value of 0.603 on weekdays and 0.576 on weekends. Predictions made by the upper-branch model closely match the marginal subscription data. In a ride pass pricing policy scenario, we show that a discount in weekly pass (from $25 to $18.9) and monthly pass (from $80 to $71.5) would surprisingly increase total revenue by $102/day. In an event- or place-based subsidy policy scenario, we show that a 100% fare discount would reduce 80 car trips during peak hours at AT&T Stadium, requiring a subsidy of $32,068/year.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}