Confidence Set for Group Membership
Authors: Andreas Dzemski, R. Okui
DOI: 10.2139/ssrn.3133878 (https://doi.org/10.2139/ssrn.3133878)
Journal: Mathematics eJournal
Publication date: 2017-12-31
Citations: 11
Abstract
We develop new procedures to quantify the statistical uncertainty of data-driven clustering algorithms. In our panel setting, each unit belongs to one of a finite number of latent groups with group-specific regression curves. We propose methods for computing unit-wise and joint confidence sets for group membership. The unit-wise sets give the possible group memberships for a given unit, and the joint sets give the possible vectors of group memberships for all units. We also propose an algorithm that can improve the power of our procedures by detecting units that are easy to classify. The confidence sets invert a test for group membership that is based on a characterization of the true group memberships by a system of moment inequalities. To construct the joint confidence set, we solve a high-dimensional testing problem that tests group membership simultaneously for all units. We justify this procedure under $N, T \to \infty$ asymptotics, where we allow $T$ to be much smaller than $N$. As part of our theoretical arguments, we develop new simultaneous anti-concentration inequalities for the MAX and the QLR statistics. Monte Carlo results indicate that our confidence sets have adequate coverage and are informative. We illustrate the practical relevance of our confidence sets in two applications.
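The test-inversion idea behind the unit-wise sets can be illustrated with a deliberately simplified sketch. It is not the authors' procedure. It assumes that the group-specific curves are known rather than estimated, that there are no covariates, and that a Bonferroni-style normal quantile serves as the critical value. The function name `unitwise_confidence_set` and the toy data are hypothetical. For a candidate group g, the moment inequalities state that the squared-error loss under g is, on average, no larger than under any other group h. A group stays in the confidence set when the largest studentized loss difference does not exceed the critical value.

```python
import math
from statistics import NormalDist, mean, stdev


def unitwise_confidence_set(y, group_means, alpha=0.05):
    """Return the groups g for which 'unit belongs to g' is not rejected.

    y           : list of T outcomes for one unit
    group_means : list of G lists, each with T group-specific means
                  (treated as known here; an assumption of this sketch)
    Requires G >= 2 and group curves that differ from one another.
    """
    T, G = len(y), len(group_means)
    # Bonferroni-style critical value for the G - 1 one-sided comparisons
    # (a stand-in chosen for this sketch, not the paper's critical value).
    crit = NormalDist().inv_cdf(1 - alpha / (G - 1))
    accepted = []
    for g in range(G):
        max_stat = -math.inf
        for h in range(G):
            if h == g:
                continue
            # If the true group is g, this loss difference has mean <= 0.
            d = [(y[t] - group_means[g][t]) ** 2
                 - (y[t] - group_means[h][t]) ** 2 for t in range(T)]
            stat = math.sqrt(T) * mean(d) / stdev(d)
            max_stat = max(max_stat, stat)
        # Keep g when no alternative group fits significantly better.
        if max_stat <= crit:
            accepted.append(g)
    return accepted


# Toy unit observed for T = 40 periods: true group 0 plus a fixed,
# mean-zero noise pattern (deterministic so the example is reproducible).
y = [e for _ in range(10) for e in (1.0, -1.0, 0.5, -0.5)]
group_means = [[0.0] * 40, [0.2] * 40, [3.0] * 40]
print(unitwise_confidence_set(y, group_means))  # [0, 1]
```

In this toy example the set contains group 0 and the nearby group 1 but excludes the distant group 2, which is the sense in which a confidence set can be informative without pinning down a single group.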