Improving confidence intervals and central value estimation in small datasets through hybrid parametric bootstrapping

Impact factor: 8.1 · Zone 1 (Computer Science) · JCR category: COMPUTER SCIENCE, INFORMATION SYSTEMS
Victor V. Golovko
Information Sciences, vol. 716, Article 122254. Published 2025-05-02. DOI: 10.1016/j.ins.2025.122254
https://www.sciencedirect.com/science/article/pii/S002002552500386X
Citations: 0

Abstract

Improving confidence intervals and central value estimation in small datasets through hybrid parametric bootstrapping
We developed a hybrid parametric bootstrapping (HPB) method for analyzing small datasets with high precision. The method addresses the challenge of estimating confidence intervals (CI) and central values when traditional distribution assumptions do not apply. HPB is combined with Steiner's Most Frequent Value (MFV) technique: the MFV method minimizes the information loss associated with small datasets, while HPB accounts for the uncertainty of each individual data point. As a practical example, we applied this statistical methodology to refine prior measurements of the half-life of $^{97}$Ru. Using the MFV technique integrated with the HPB method, we obtained a significantly more precise half-life estimate, $T_{1/2,\mathrm{MFV(HPB)}} = 2.8385^{+0.0022}_{-0.0075}$ days. The 68.27% confidence interval runs from 2.8310 to 2.8407 days and the 95.45% confidence interval from 2.8036 to 2.8485 days, both calculated using the percentile method. The resulting uncertainty is more than 30 times lower than that reported in nuclear data sheets, indicating the potential for widespread analytical impact. In addition, employing alternative minimization strategies can reduce the statistical uncertainty by a further 44%.

The HPB method effectively addresses the uncertainties inherent in small datasets, as demonstrated by re-evaluating the specific activity measurements for $^{39}$Ar using underground data. We report $SA_{\mathrm{MFV(HPB)}} = 0.966^{+0.027}_{-0.020}$ Bq/kg$_{\mathrm{atmAr}}$, with confidence intervals (68.27%: 0.946–0.993; 95.45%: 0.921–1.029) derived using the percentile method. Advances in statistical methods are important for making data analysis more accurate and reliable, especially when combining and interpreting information from different sources. The developed tools help handle complex data more effectively, thereby improving the processing and understanding of information in real-world applications where precision is essential.
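The abstract does not spell out the algorithm, so the following Python sketch is only one plausible reading of the workflow it describes, not the author's implementation. It assumes (a) a commonly quoted form of Steiner's fixed-point iteration for the most frequent value M and its scale parameter ε (the dihesion), (b) that "hybrid" means resampling the measurements with replacement and then drawing each selected value from a normal distribution whose width is that measurement's reported uncertainty, and (c) percentile intervals at the 68.27% and 95.45% levels. The function names `mfv`, `hybrid_parametric_bootstrap`, and `percentile_ci`, the starting values, and the input numbers are all hypothetical and chosen for illustration.

```python
import numpy as np


def mfv(x, tol=1e-9, max_iter=1000):
    """Most frequent value M and scale eps via a Steiner-style iteration (assumed form)."""
    x = np.asarray(x, dtype=float)
    spread = x.max() - x.min()
    m = np.median(x)                      # assumed starting location
    eps = np.sqrt(3.0) / 2.0 * spread     # assumed starting scale
    if spread == 0.0:                     # all values identical
        return m, 0.0
    for _ in range(max_iter):
        d2 = (x - m) ** 2
        denom = eps ** 2 + d2
        # Scale update: eps^2 = 3 * sum(d2 / denom^2) / sum(1 / denom^2)
        eps_new = np.sqrt(3.0 * np.sum(d2 / denom ** 2) / np.sum(1.0 / denom ** 2))
        if eps_new <= 1e-12 * spread:     # guard against collapse on exact ties
            return m, eps_new
        # Location update: weighted mean with Cauchy-like weights
        w = 1.0 / (eps_new ** 2 + d2)
        m_new = np.sum(w * x) / np.sum(w)
        converged = abs(m_new - m) <= tol * spread and abs(eps_new - eps) <= tol * spread
        m, eps = m_new, eps_new
        if converged:
            break
    return m, eps


def hybrid_parametric_bootstrap(values, sigmas, n_boot=2000, seed=0):
    """Bootstrap distribution of the MFV (assumed reading of the 'hybrid' scheme)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    n = values.size
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # resample with replacement
        draw = rng.normal(values[idx], sigmas[idx])   # perturb by each point's uncertainty
        estimates[b] = mfv(draw)[0]
    return estimates


def percentile_ci(samples, level):
    """Central percentile interval covering the fraction `level` of the samples."""
    tail = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(samples, [tail, 100.0 - tail])
    return lo, hi


# Hypothetical measurements and one-sigma uncertainties (not taken from the paper).
values = [2.79, 2.83, 2.84, 2.88, 2.90]
sigmas = [0.03, 0.01, 0.02, 0.03, 0.10]

center, _ = mfv(values)
boot = hybrid_parametric_bootstrap(values, sigmas)
lo68, hi68 = percentile_ci(boot, 0.6827)
lo95, hi95 = percentile_ci(boot, 0.9545)

# Asymmetric uncertainties are the distances from the central value to the 68.27% bounds.
print(f"MFV = {center:.4f}  -{center - lo68:.4f} / +{hi68 - center:.4f}")
print(f"95.45% CI: [{lo95:.4f}, {hi95:.4f}]")
```

The same arithmetic links the numbers quoted in the abstract: 2.8385 − 0.0075 = 2.8310 and 2.8385 + 0.0022 = 2.8407 reproduce the 68.27% interval for the half-life, and 0.966 − 0.020 = 0.946 and 0.966 + 0.027 = 0.993 reproduce the one for the specific activity.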
Source journal
Information Sciences (Engineering & Technology: Computer Science, Information Systems)
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
Journal description: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.