Supervised latent Dirichlet allocation with covariates: A Bayesian structural and measurement model of text and covariates.

IF 7.6 | Zone 1, Psychology | JCR Q1, PSYCHOLOGY, MULTIDISCIPLINARY
Psychological Methods | Pub Date: 2023-10-01 | Epub Date: 2023-01-05 | Pages: 1178-1206 | DOI: 10.1037/met0000541
Kenneth Tyler Wilcox, Ross Jacobucci, Zhiyong Zhang, Brooke A Ammerman
Citations: 0

Abstract

Supervised latent Dirichlet allocation with covariates: A Bayesian structural and measurement model of text and covariates.

Text is a burgeoning data source for psychological researchers, but little methodological research has focused on adapting popular modeling approaches for text to the context of psychological research. One popular measurement model for text, topic modeling, uses a latent mixture model to represent the topics underlying a body of documents. Recently, psychologists have studied relationships between these topics and other psychological measures by using estimates of the topics as regression predictors along with other manifest variables. While similar two-stage approaches involving estimated latent variables are known to yield biased estimates and incorrect standard errors, two-stage topic modeling approaches have received limited statistical study and, as we show, are subject to the same problems. To address these problems, we propose a novel statistical model, supervised latent Dirichlet allocation with covariates (SLDAX), that jointly incorporates a latent variable measurement model of text and a structural regression model, allowing the latent topics and other manifest variables to serve as predictors of an outcome. In a simulation study with data characteristics consistent with psychological text data, we found that SLDAX estimates were generally more accurate and more efficient than two-stage estimates. To illustrate, we provide an empirical clinical application that compares the two-stage and SLDAX approaches. Finally, we implemented the SLDAX model in an open-source R package to facilitate its use and further study. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
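
The joint structure described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the abstract gives no equations, so the code follows the standard supervised LDA generative story (Dirichlet topic proportions, categorical topic and word draws, a Gaussian outcome) with an added covariate term. The function name `simulate_sldax`, the parameter names, and every numeric value are hypothetical, and none of it reflects the API of the authors' R package.

```python
import numpy as np


def simulate_sldax(n_docs=200, n_topics=3, vocab_size=50, doc_len=40, seed=0):
    """Simulate documents, a covariate, and an outcome from an sLDA-style model.

    Assumed generative story (illustrative, not taken from the paper):
      theta_d ~ Dirichlet(alpha)          topic proportions per document
      z_dn    ~ Categorical(theta_d)      topic of each word
      w_dn    ~ Categorical(beta[z_dn])   observed word
      y_d     ~ Normal(zbar_d @ eta_topics + x_d @ eta_x, sigma^2)
    where zbar_d is the empirical topic frequency vector of document d.
    """
    rng = np.random.default_rng(seed)

    # Measurement model: topic-word distributions and document-topic proportions.
    beta = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)  # (K, V)
    theta = rng.dirichlet(np.full(n_topics, 0.5), size=n_docs)     # (D, K)

    # One manifest covariate per document.
    x = rng.normal(size=(n_docs, 1))

    # Structural model coefficients. No separate intercept is used because the
    # topic frequencies in zbar sum to 1 and would be collinear with it.
    eta_topics = np.linspace(-1.0, 1.0, n_topics)
    eta_x = np.array([0.5])
    sigma = 0.5

    docs = []
    zbar = np.zeros((n_docs, n_topics))
    for d in range(n_docs):
        z = rng.choice(n_topics, size=doc_len, p=theta[d])
        words = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])
        docs.append(words)
        zbar[d] = np.bincount(z, minlength=n_topics) / doc_len

    # Outcome depends jointly on the latent topics and the manifest covariate.
    y = zbar @ eta_topics + x @ eta_x + rng.normal(scale=sigma, size=n_docs)
    return docs, x, zbar, y
```

In a two-stage analysis, `zbar` would first be estimated from `docs` alone and then entered into a regression of `y` on the estimates and `x`. A joint model estimates the topic structure and the regression coefficients together, which is the contrast the abstract draws.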

Source journal
Psychological Methods (PSYCHOLOGY, MULTIDISCIPLINARY)
CiteScore: 13.10
Self-citation rate: 7.10%
Articles published: 159
Journal description: Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.