Journal of the Korean Statistical Society: Latest Articles

Return prediction by machine learning for the Korean stock market
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-12-20 DOI: 10.1007/s42952-023-00245-0
Wonwoo Choi, Seongho Jang, Sanghee Kim, Chayoung Park, Sunyoung Park, Seongjoo Song
Abstract: In this study, we aim to forecast monthly stock returns and analyze factors influencing stock prices in the Korean stock market. To find a model that maximizes the cumulative return of the portfolio of stocks with high predicted returns, we use machine learning models such as linear models, tree-based models, neural networks, and learning-to-rank algorithms. We employ a novel validation metric, which we call the Cumulative net Return of a Portfolio with top 10% predicted return (CRP10), for tuning hyperparameters to increase the cumulative return of the selected portfolio. With the data that we used, CRP10 tends to provide higher cumulative returns than out-of-sample R-squared as a validation metric. Our findings indicate that Light Gradient Boosting Machine (LightGBM) and Gradient Boosted Regression Trees (GBRT) demonstrate better performance than other models when we apply a single model for the entire test period. We also tried the strategy of changing the model on a yearly basis by assessing the best model annually, and observed that it did not outperform the approach of using a single model such as LightGBM or GBRT for the entire period.
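The abstract names CRP10 but does not spell out how it is computed. The following is only a minimal sketch of one plausible reading: each month, hold an equal-weighted portfolio of the stocks in the top 10% by predicted return, subtract a cost, and compound the net returns. The function name `crp10`, the equal weighting, and the flat per-month `cost` are assumptions made for illustration, not the authors' definition.

```python
import numpy as np

def crp10(pred, real, top_frac=0.10, cost=0.0):
    """Cumulative net return of a top-decile portfolio (illustrative sketch).

    pred, real: arrays of shape (n_months, n_stocks) with predicted and
    realized monthly returns. Equal weighting and the flat per-month
    `cost` are assumptions, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    real = np.asarray(real, dtype=float)
    n_months, n_stocks = pred.shape
    # Number of stocks held each month: top 10% by default, at least one.
    k = max(1, int(round(top_frac * n_stocks)))
    wealth = 1.0
    for t in range(n_months):
        # Indices of the k stocks with the highest predicted return.
        top = np.argsort(pred[t])[-k:]
        # Equal-weighted realized return of the selection, net of cost.
        net = real[t, top].mean() - cost
        wealth *= 1.0 + net
    return wealth - 1.0
```

Used as a validation metric, a value like this would be computed on the validation months for each hyperparameter setting, and the setting with the highest value kept, in place of the one with the highest out-of-sample R-squared.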
Citations: 0
Spatially integrated estimator of finite population total by integrating data from two independent surveys using spatial information
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-12-19 DOI: 10.1007/s42952-023-00244-1
Nobin Chandra Paul, Anil Rai, Tauqueer Ahmad, Ankur Biswas, Prachi Misra Sahoo
Abstract: A major goal of survey sampling is finite population inference. In recent years, large-scale survey programs have encountered many practical challenges, including higher data collection costs, increasing non-response rates, increasing demand for disaggregated-level statistics, and the desire for timely estimates. Data integration is a new field of research that provides a timely solution to the above-mentioned challenges by integrating data from multiple surveys. It is now possible to develop a framework that can efficiently combine information from several surveys to obtain more precise estimates of population parameters. In many surveys, parameters of interest are often spatial in nature, which means that the relationship between the study variable and covariates varies across locations in the study area; this situation is referred to as spatial non-stationarity. Hence, there is a need for a sampling methodology that can efficiently tackle this spatial non-stationarity problem and integrate spatially referenced data to get more detailed information. In this study, a Geographically Weighted Spatially Integrated (GWSI) estimator of the finite population total was developed by integrating data from two independent surveys using spatial information. The statistical properties of the proposed spatially integrated estimator were then evaluated empirically through a spatial simulation study. Three different spatial populations with high spatial autocorrelation were generated. The proposed spatially integrated estimator performed better than the usual design-based estimator under all three populations. Furthermore, a Spatial Proportionate Bootstrap (SPB) method was developed for variance estimation of the proposed spatially integrated estimator.
Citations: 0
Statistical integration of allele frequencies from several organizations
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-12-18 DOI: 10.1007/s42952-023-00243-2
Su Jin Jeong, Hyo-jung Lee, Soong Deok Lee, Su Jeong Park, Seung Hwan Lee, Jae Won Lee
Abstract: Genetic evidence, especially evidence based on short tandem repeats, is of paramount importance for human identification in forensic inferences. In recent years, the identification of kinship using DNA evidence has drawn much attention in various fields. In particular, it is employed, using a criminal database, to confirm blood relations in forensics. The interpretation of the likelihood ratio when identifying an individual or a relationship depends on the allele frequencies that are used, and thus it is crucial to obtain an accurate estimate of allele frequency. Organizations in Korea such as the Supreme Prosecutors’ Office and the Korean National Police Agency provide different statistical interpretations due to differing estimates of the allele frequency, which can lead to confusion in forensic identification. Therefore, it is very important to estimate allele frequency accurately, and doing so requires a certain amount of information. However, simply using a weighted average for each allele frequency may not be sufficient to determine biological independence. In this study, we propose a new statistical method for estimating allele frequency by integrating the data obtained from several organizations, and we analyze biological independence and differences in allele frequency relative to the weighted average of allele frequencies in various subgroups. Finally, our proposed method is illustrated using real data from 576 Korean individuals.
Citations: 0
Classification of repeated measurements using bias corrected Euclidean distance discriminant function
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-12-12 DOI: 10.1007/s42952-023-00246-z
Edward Kanuti Ngailo, Saralees Nadarajah
Abstract: This paper introduces a novel approach for approximating misclassification probabilities in the Euclidean distance classifier when the group means exhibit a bilinear structure, such as in the growth curve model first proposed by Potthoff and Roy (Biometrika 51:313–326, 1964). Initially, by leveraging certain statistical relationships, we establish two general results for the improved Euclidean discriminant function in both weighted and unweighted growth curve mean structures. We derive these approximations for the expected misclassification probabilities with respect to the distribution of the improved Euclidean discriminant function. Additionally, we compare the misclassification probabilities of the improved Euclidean discriminant function, the standard Euclidean discriminant function, and the linear discriminant function. It is important to note that in cases where the mean structure is weighted, a higher number of repeated measurements yields better classification results with the improved Euclidean discriminant function and the standard Euclidean discriminant function, allowing for more information to be acquired, as opposed to the linear discriminant function, which performs well with a smaller number of repeated measurements. Furthermore, we evaluate the accuracy of the suggested approximations by Monte Carlo simulations.
Citations: 0
Sparse functional linear models via calibrated concave-convex procedure
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-12-03 DOI: 10.1007/s42952-023-00242-3
Young Joo Lee, Yongho Jeon
Abstract: In this paper, we propose a calibrated ConCave-Convex Procedure (CCCP) for variable selection in high-dimensional functional linear models. The calibrated CCCP approach for the Smoothly Clipped Absolute Deviation (SCAD) penalty is known to produce a consistent solution path with probability converging to one in linear models. We incorporate the SCAD penalty into function-on-scalar regression models and phrase them as a type of group-penalized estimation using a basis expansion approach. We then implement the calibrated CCCP method to solve the nonconvex group-penalized problem. For the tuning procedure, we use the Extended Bayesian Information Criterion (EBIC) to ensure consistency in high-dimensional settings. In simulation studies, we compare the performance of the proposed method with two existing convex-penalized estimators in terms of variable selection consistency and prediction accuracy. Lastly, we apply the method to the gene expression dataset for sparsely estimating the time-varying effects of transcription factors on the regulation of yeast cell cycle genes.
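For readers unfamiliar with the penalty named in this abstract, the standard SCAD penalty (Fan and Li) can be written as below, together with the convex-plus-concave split that CCCP-type algorithms rely on. The symbols $\lambda$, $a$, $t$ and $J_\lambda$ are introduced here for illustration and do not come from the abstract; in a group-penalized setting $t$ would presumably be the norm of a coefficient group, which is an assumption about the paper's formulation.

```latex
% SCAD penalty for t >= 0, with tuning parameters lambda > 0 and a > 2
p_\lambda(t) =
\begin{cases}
  \lambda t, & 0 \le t \le \lambda, \\[4pt]
  \dfrac{2a\lambda t - t^2 - \lambda^2}{2(a-1)}, & \lambda < t \le a\lambda, \\[8pt]
  \dfrac{(a+1)\lambda^2}{2}, & t > a\lambda.
\end{cases}

% Decomposition used by concave-convex procedures:
% a convex term plus a concave remainder
p_\lambda(t) = \underbrace{\lambda t}_{\text{convex}}
             + \underbrace{J_\lambda(t)}_{\text{concave}},
\qquad J_\lambda(t) = p_\lambda(t) - \lambda t .
```

Each CCCP iteration replaces the concave part $J_\lambda$ by its tangent at the current estimate, so every step reduces to a convex, lasso-type problem.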
Citations: 0
Nonparametric longitudinal regression model to analyze shape data using the Procrustes rotation
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-12-03 DOI: 10.1007/s42952-023-00241-4
Meisam Moghimbeygi, Mousa Golalizadeh
Abstract: Shape, as an intrinsic concept, can be considered a source of information in some statistical analysis contexts. For instance, one of the important topics in morphology is the study of shape changes over time. From a topological viewpoint, shape data are points on a particular manifold, so constructing a longitudinal model for shape variation is not as trivial as it may seem. Rather than using the common parametric models for this task, we invoke Procrustes analysis in a nonparametric framework and propose a simple, yet useful, model to deal with shape changes. After casting the problem as a nonparametric regression model, we use the weighted least squares method to estimate the related parameters. We also illustrate the implementation of this new model in simulation studies and in the analysis of two biological data sets. Our proposed model shows its superiority when compared with its counterpart models.
Citations: 0
Variable selection for semiparametric accelerated failure time models with nonignorable missing data
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-11-19 DOI: 10.1007/s42952-023-00238-z
Tianqing Liu, Xiaohui Yuan, Liuquan Sun
Abstract: The regularization approach for variable selection was well developed for semiparametric accelerated failure time (AFT) models, where the response variable is right censored. In the presence of missing data, this approach needs to be tailored to different missing data mechanisms. In this paper, we propose a flexible and generally applicable missing data mechanism for AFT models, which contains both ignorable and nonignorable missing data mechanism assumptions. We propose weighted rank (WR) estimators and corresponding penalized estimators of regression parameters under this missing data mechanism. An advantage of the WR estimators and corresponding penalized estimators is that they do not require specifying a missing data model for the proposed missing data mechanism. The theoretical properties of the WR and corresponding penalized estimators are established. Comprehensive simulation studies and a real data application further demonstrate the merits of our approach.
Citations: 0
Robust and Efficient derivative estimation under correlated errors
IF 0.6, Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-11-18 DOI: 10.1007/s42952-023-00240-5
Deru Kong, Wei Shen, Shengli Zhao, WenWu Wang
Abstract: In real applications, correlated data are commonly encountered. To model such data, many techniques have been proposed. However, the emphasis of these techniques has been on mean function estimation under correlated errors, with scant attention paid to derivative estimation. In this paper, we propose locally weighted least squares regression based on different difference quotients to estimate derivatives of different orders under correlated errors. For the proposed estimators, which dramatically reduce the estimation variance compared with traditional methods, we derive the asymptotic bias and variance under errors with different covariance structures. Furthermore, we establish their asymptotic normality for constructing confidence intervals. Based on the asymptotic mean integrated squared error, we provide a data-driven criterion for selecting the tuning parameters. Simulation studies show that the proposed method is more robust and efficient than four other popular methods. Finally, we illustrate the usefulness of the proposed method with a real data example.
Citations: 0
Asymptotic bias of the $\ell_2$-regularized error variance estimator
Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-11-14 DOI: 10.1007/s42952-023-00239-y
Semin Choi, Gunwoong Park
Citations: 0
A review on concomitants of order statistics and its application in parameter estimation under ranked set sampling
Zone 4, Mathematics
Journal of the Korean Statistical Society Pub Date : 2023-11-13 DOI: 10.1007/s42952-023-00235-2
Rohan D. Koshti, Kirtee K. Kamalja
Citations: 0