Title: Improving confidence intervals and central value estimation in small datasets through hybrid parametric bootstrapping
Author: Victor V. Golovko
Journal: Information Sciences, vol. 716, Article 122254
Published: 2025-05-02
DOI: 10.1016/j.ins.2025.122254
URL: https://www.sciencedirect.com/science/article/pii/S002002552500386X
Journal impact factor: 8.1 (JCR category: Computer Science, Information Systems)
Citations: 0
Abstract
We developed a hybrid parametric bootstrapping (HPB) method for analyzing small datasets with high precision. This method addresses the challenge of estimating confidence intervals (CI) and central values when traditional distribution assumptions do not apply. Our HPB is combined with Steiner's Most Frequent Value (MFV) technique. The MFV method minimizes the information loss associated with small datasets, while the HPB accounts for the uncertainty of each individual element. As a practical example, we applied this statistical methodology to refine prior measurements of the half-life of $^{97}$Ru. Using the MFV technique integrated with the HPB method, we obtained a significantly more precise half-life estimate, $T_{1/2,\text{MFV(HPB)}} = 2.8385^{+0.0022}_{-0.0075}$ days. This refined value has a 68.27% confidence interval from 2.8310 to 2.8407 days and a 95.45% confidence interval from 2.8036 to 2.8485 days, as calculated using the percentile method. Our analysis demonstrates a substantial reduction in uncertainty (over 30 times lower than that reported in nuclear data sheets), indicating the potential for widespread analytical impact. In addition, employing alternative minimization strategies can reduce the statistical uncertainty by a further 44%.
The HPB method effectively addresses the uncertainties inherent in small datasets, as demonstrated by re-evaluating the specific activity measurements for $^{39}$Ar using underground data. We report $SA_{\text{MFV(HPB)}} = 0.966^{+0.027}_{-0.020}$ Bq/kg$_{\text{atmAr}}$, with confidence intervals (68.27%: 0.946–0.993; 95.45%: 0.921–1.029) derived using the percentile method. Advances in statistical methods are important for making data analysis more accurate and reliable, especially when combining and interpreting information from different sources. The developed tools help handle complex data more effectively, thereby improving the processing and understanding of information in real-world applications where precision is essential.
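The abstract names the ingredients (an MFV central value, a bootstrap that uses each element's own uncertainty, and percentile intervals) but not the implementation. The Python sketch below shows one plausible reading and should not be taken as the author's code. It assumes the commonly quoted fixed-point form of Steiner's MFV iteration, and it assumes that "hybrid" means resampling the measurements with replacement and then drawing each resampled value from a normal distribution with that measurement's reported standard deviation. The function names, the seed, the number of replicates, and the toy numbers are all hypothetical. The asymmetric errors follow from the percentile bounds in the same way as in the abstract, where 2.8385 − 2.8310 = 0.0075 and 2.8407 − 2.8385 = 0.0022.

```python
import numpy as np


def mfv(x, tol=1e-12, max_iter=500):
    """Most frequent value via a Steiner-style fixed-point iteration (assumed form)."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)                      # robust starting point
    span = x.max() - x.min()
    if span == 0.0:
        return float(m)                   # all values identical
    eps2 = 0.75 * span**2                 # (sqrt(3)/2 * range)^2, usual starting scale
    floor = 1e-12 * span**2               # numerical guard against eps2 collapsing to 0
    for _ in range(max_iter):
        d2 = (x - m) ** 2
        denom = (eps2 + d2) ** 2
        eps2_new = max(3.0 * np.sum(d2 / denom) / np.sum(1.0 / denom), floor)
        w = 1.0 / (eps2_new + d2)         # Cauchy-like weights; outliers get small weight
        m_new = np.sum(w * x) / np.sum(w)
        converged = abs(m_new - m) < tol * span and abs(eps2_new - eps2) < tol * span**2
        m, eps2 = m_new, eps2_new
        if converged:
            break
    return float(m)


def hpb_percentile_ci(values, sigmas, n_boot=2000, levels=(0.6827, 0.9545), seed=0):
    """Hybrid bootstrap sketch: resample indices, then jitter by each point's sigma."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    n = values.size
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # nonparametric step
        draw = rng.normal(values[idx], sigmas[idx])   # parametric step
        estimates[b] = mfv(draw)
    center = mfv(values)                  # assumption: centre is the MFV of the raw data
    intervals = {}
    for level in levels:
        lo, hi = np.quantile(estimates, [(1 - level) / 2, (1 + level) / 2])
        intervals[level] = (float(lo), float(hi))
    return center, intervals


if __name__ == "__main__":
    # Toy numbers for illustration only; these are NOT the Ru-97 or Ar-39 measurements.
    toy_values = [10.00, 10.10, 9.90, 10.05, 9.95, 10.60]
    toy_sigmas = [0.05, 0.08, 0.06, 0.05, 0.07, 0.30]
    center, intervals = hpb_percentile_ci(toy_values, toy_sigmas)
    for level, (lo, hi) in intervals.items():
        print(f"{level:.2%}: {center:.4f} -{center - lo:.4f} +{hi - center:.4f}")
```

Percentile intervals are taken directly from the quantiles of the bootstrap estimates, so they can be asymmetric around the central value, as the intervals quoted in the abstract are.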
About the journal:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.