Multiplicative Models For Continuous Dependent Variables: Estimation on Unlogged versus Logged Form

Impact factor: 2.4 · Zone 2 (Sociology) · JCR Q1 (SOCIOLOGY)
Trond Petersen
Sociological Methodology 47(1): 113-164. Published 2017-08-01. DOI: 10.1177/0081175017730108 (https://doi.org/10.1177/0081175017730108)
Citations: 15

Abstract

In regression analysis with a continuous and positive dependent variable, a multiplicative relationship between the unlogged dependent variable and the independent variables is often specified. The model can then be estimated on its unlogged or its logged form. The two procedures may yield major differences in estimates, even opposite signs. The reason is that estimation on the unlogged form yields coefficients for the relative arithmetic mean of the unlogged dependent variable, whereas estimation on the logged form gives coefficients for the relative geometric mean of the unlogged dependent variable (or, equivalently, for absolute differences in the arithmetic mean of the logged dependent variable). Because the two forms have different foci, relative arithmetic versus relative geometric means, their estimated coefficients may vary widely. The first goal of this article is to explain why major divergences in coefficients can occur. Although well understood in the statistical literature, this point is not widely understood in sociological research, and it is hence of significant practical interest. The second goal is to derive conditions under which divergences will not occur, that is, under which estimation on the logged form gives unbiased estimators for relative arithmetic means. First, the article derives the necessary and sufficient conditions under which estimation on the logged form gives unbiased estimators of the parameters for the relative arithmetic mean. This requires not only arithmetic mean independence of the unlogged error term but also geometric mean independence. Second, it shows that statistical independence of the error terms from the regressors implies both arithmetic and geometric mean independence of the error terms, and is hence a sufficient condition for absence of bias. Third, it shows that although statistical independence is a sufficient condition for lack of bias, it is not a necessary one. Fourth, it demonstrates that homoskedasticity of the error terms is neither a necessary nor a sufficient condition for absence of bias. Fifth, it shows that in the semi-logarithmic specification, for a logged error term that has the same qualitative distributional shape at each value of the independent variables (e.g., normal) and exhibits arithmetic mean independence but heteroskedasticity, estimation on the logged form gives biased estimators of the parameters for the arithmetic mean (whereas with homoskedasticity, which in this case amounts to statistical independence, the estimators are unbiased by the second result above).
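The contrast between relative arithmetic and relative geometric means can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not the article's own procedure: a single binary regressor, a normal logged error whose variance differs by group, and an unlogged error normalized so that its arithmetic mean is 1 in both groups. With a binary regressor, estimation on the unlogged form reduces to the log ratio of group arithmetic means, and least squares on the logged form reduces to the difference in group means of log y (the log ratio of geometric means). The function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate(n, beta0, beta1, sigma_by_group, seed=0):
    """Draw (x, y) pairs from y = exp(beta0 + beta1 * x) * eps, x in {0, 1}.

    Assumption for illustration: log(eps) | x ~ Normal(-sigma_x^2 / 2, sigma_x^2),
    so the arithmetic mean of eps is 1 in both groups, while the mean of
    log(eps) is -sigma_x^2 / 2 and differs by group when sigma_x does.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)  # binary regressor, both endpoints included
        sigma = sigma_by_group[x]
        log_eps = rng.gauss(-0.5 * sigma ** 2, sigma)
        data.append((x, math.exp(beta0 + beta1 * x + log_eps)))
    return data


def unlogged_slope(data):
    """Unlogged form: log ratio of the group arithmetic means of y."""
    mean1 = statistics.fmean(y for x, y in data if x == 1)
    mean0 = statistics.fmean(y for x, y in data if x == 0)
    return math.log(mean1 / mean0)


def logged_slope(data):
    """Logged form: difference in group means of log(y), i.e. the log
    ratio of the group geometric means of y."""
    log_mean1 = statistics.fmean(math.log(y) for x, y in data if x == 1)
    log_mean0 = statistics.fmean(math.log(y) for x, y in data if x == 0)
    return log_mean1 - log_mean0


# Hypothetical parameter values chosen only to make the contrast visible.
hetero = simulate(200_000, beta0=1.0, beta1=0.2, sigma_by_group={0: 0.5, 1: 1.0}, seed=1)
homo = simulate(200_000, beta0=1.0, beta1=0.2, sigma_by_group={0: 0.5, 1: 0.5}, seed=1)

for label, data in (("heteroskedastic", hetero), ("homoskedastic", homo)):
    print(f"{label}: unlogged={unlogged_slope(data):.3f}, logged={logged_slope(data):.3f}")
```

Under these assumptions, the logged-form slope converges to beta1 minus half the difference in log-error variances, here 0.2 - (1.0² - 0.5²)/2 = -0.175, so in large samples the two forms give coefficients of opposite sign in the heteroskedastic run and agree (both near 0.2) in the homoskedastic run.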
Source journal: Sociological Methodology
CiteScore: 4.50 · Self-citation rate: 0.00% · Articles published: 12
Journal description: Sociological Methodology is a compendium of new and sometimes controversial advances in social science methodology. Contributions come from diverse areas and have something useful, and often surprising, to say about a wide range of topics, from legal and ethical issues surrounding data collection to the methodology of theory construction. In short, Sociological Methodology holds something of value, and an interesting mix of lively controversy too, for nearly everyone who participates in the enterprise of sociological research.