Comments on Divergence vs. Decision P-values
Paul W. Vos
Scandinavian Journal of Statistics, volume 50 1, pages 920-922. Published 2023-04-12. DOI: 10.1111/sjos.12647 (https://doi.org/10.1111/sjos.12647)

Abstract:
The distinction between the two uses of p-values described by Professor Greenland is related to two distinct interpretations of frequentist probability, that is, probability used to describe a random event. I will illustrate with a simple example. In the North Carolina Pick-4 lottery, 10 ping pong balls labeled with distinct digits from I_9 = {0, 1, ..., 9} are mixed in a clear container, and opening a door allows a single ball to be selected. Prior to opening the door, blown air mixes the balls, making it plausible that each ball is equally likely to be selected. This is repeated with three identical containers to obtain the remaining three digits. If a winning ticket is defined as one where the sum of the four digits exceeds 28, the state can charge $5 for a ticket with a $100 prize and expect a profit. There are 330 of 10^4 possible outcomes where the sum exceeds 28, so the expected value of the prize is 0.033 × $100 = $3.30. This calculation requires no repeated sampling, but it is natural for the state to interpret this value in the long run. For an individual ticket holder, all that is required is that each ball is given an equal chance to be selected for the drawing associated with his ticket. The ticket holder does not need to imagine a long sequence of draws, just as a cancer patient does not need to consider a long sequence of 5-year periods to understand a 30% 5-year survival. Using terminology from Vos and Holbert (2022), the scope for the ticket holder is specific while that of the state is generic. The uniform distribution on 4-tuples I_9^4 = I_9 × I_9 × I_9 × I_9 provides a model for repeated draws of the Pick-4 lottery, that is, of the data generation process. For most inference applications, the distribution of an unknown population can be modeled rather than the process that generated the data. We modify this example to consider inference.
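The count of winning outcomes and the expected prize quoted above can be checked by brute force. The following is a minimal sketch, assuming the uniform model on 4-tuples of digits described in the abstract. The names `count_winning`, `TICKET_PRICE`, and `PRIZE` are illustrative and not taken from the paper. The closed-form cross-check uses a standard stars-and-bars argument, which is one way to see the count and is not claimed to be the author's derivation.

```python
from fractions import Fraction
from itertools import product
from math import comb

TICKET_PRICE = 5   # dollars, as stated in the abstract
PRIZE = 100        # dollars, as stated in the abstract


def count_winning(threshold=28, containers=4, n_labels=10):
    """Count draws whose digit sum exceeds `threshold` (hypothetical helper)."""
    draws = product(range(n_labels), repeat=containers)
    return sum(1 for draw in draws if sum(draw) > threshold)


total_outcomes = 10 ** 4            # size of the sample space of 4-tuples
wins = count_winning()              # outcomes with digit sum > 28

# Cross-check: with y_i = 9 - x_i, "sum of x > 28" becomes "sum of y <= 7",
# and the number of non-negative solutions in 4 variables is C(7 + 4, 4).
assert wins == comb(11, 4)

p_win = Fraction(wins, total_outcomes)
expected_prize = p_win * PRIZE                   # expected payout per ticket
expected_profit = TICKET_PRICE - expected_prize  # state's expected margin per ticket

print(wins, float(p_win), float(expected_prize), float(expected_profit))
```

Exact fractions avoid rounding in the probability. Under these assumptions the expected prize is below the $5 ticket price, which is the sense in which the state can expect a profit.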
We are told the sum of a single lottery draw, and we are to infer whether the draw came from the NC lottery or from lottery A, which also has four containers, each containing 8 balls with labels from I_7 = {0, 1, ..., 7}. The sum of the digits is 29, but no other information is given. A reduction-to-contradiction argument establishes that the result came from the NC lottery. Premise: lottery A produced our data. Every possible sum from lottery A belongs to the set {0, 1, ..., 28}, and 29 is not in this set. Conclusion: the contradiction means it is impossible that the premise is true.
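The contradiction argument rests on the set of sums each lottery can produce. A short sketch, assuming four independent containers per lottery as described above, enumerates both supports and confirms that the observed sum lies outside lottery A's. The helper name `sum_support` is hypothetical.

```python
from itertools import product


def sum_support(n_labels, containers=4):
    """Set of digit sums reachable when each container holds labels 0..n_labels-1."""
    return {sum(draw) for draw in product(range(n_labels), repeat=containers)}


support_a = sum_support(8)    # lottery A: labels 0..7
support_nc = sum_support(10)  # NC lottery: labels 0..9
observed_sum = 29

# The premise "lottery A produced the data" is contradicted exactly when
# the observed sum cannot occur under lottery A.
impossible_under_a = observed_sum not in support_a
possible_under_nc = observed_sum in support_nc

print(max(support_a), impossible_under_a, possible_under_nc)
```

Only membership in the support matters here. No probabilities are computed, which mirrors the purely logical form of the argument.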
About the journal:
The Scandinavian Journal of Statistics is internationally recognised as one of the leading statistical journals in the world. It was founded in 1974 by four Scandinavian statistical societies. Today more than eighty per cent of the manuscripts are submitted from outside Scandinavia.
It is an international journal devoted to reporting significant and innovative original contributions to statistical methodology, both theory and applications.
The journal specializes in statistical modelling showing particular appreciation of the underlying substantive research problems.
The emergence of specialized methods for analysing longitudinal and spatial data is just one example of an area of important methodological development in which the Scandinavian Journal of Statistics has a particular niche.