Proceedings of the 4th International Conference on Statistics: Theory and Applications: Latest Articles

Machine Learning for Precision Medicine: Model Selection, Estimation, and Inference
Yi Li
DOI: https://doi.org/10.11159/icsta22.003
Published: 2022-08-01
Abstract: In the era of precision medicine, high-throughput data are routinely collected. These high-dimensional data defy classical regression models, which are either infeasible to fit or likely to predict poorly because of overfitting. In this talk we will introduce several cutting-edge machine learning methods, developed by my group in the last few years, for modeling (censored) outcome data with high-dimensional predictors. Specifically, we will introduce a Dantzig selector for fitting survival models with high-dimensional predictors, followed by various semiparametric and nonparametric feature screening methods for handling ultra-high-dimensional predictors. We will also discuss statistical inference for regression models with high-dimensional predictors. With high-dimensional outcome data, we will introduce a new class of high-dimensional Gaussian graphical regression models with predictors. The talk focuses on the statistical principles and concepts behind these methods, which are motivated and illustrated by various biomedical examples from precision medicine contexts.
Citations: 0
Exploring the Factors Which Impact the Customers’ Online Purchase Intentions
Isidora Albijanić, Milica Milošević, V. Jeremic
DOI: https://doi.org/10.11159/icsta22.102
Published: 2022-08-01
Abstract: The e-market and e-commerce are growing rapidly year by year. Additionally, the coronavirus epidemic greatly affected the expansion of e-commerce among consumers during 2020 and 2021. As the market grows, the number of stores in it is also growing. As a result, the sector is becoming highly competitive, and stakeholders are eager to better understand consumers’ behaviour, their decision-making process, and the motives which lead to online purchases. This paper rests on the belief that the existing models for measuring the factors that influence a customer’s decision to make a purchase through e-commerce can be further improved by including new constructs. Accordingly, a new conceptual model which explores the factors that influence the decision to purchase products and services through e-commerce is proposed. The validity of the model was tested on data collected through an online survey. The results indicate that the model is supported by the data and that the most important aspects are assortment size, seller recognition, free delivery, and website functionality.
Citations: 0
A New Multivariate Dispersion Control Chart
Su-Fen Yang, Yen-ling Liu
DOI: https://doi.org/10.11159/icsta22.109
Published: 2022-08-01
Abstract: Statistical process control methods are useful for improving or maintaining a manufacturing or service process in a stable and satisfactory state. The problem of monitoring a multivariate process with several related quality variables is of current interest. So far, only a few papers in the literature have discussed monitoring process dispersion for cases in which the process has a multivariate normal or non-normal distribution. In this article, we develop a new Phase II dispersion control chart which is independent of the out-of-control process mean and allows individual observations or multiple observations. It overcomes a problem in many existing covariance matrix control charts, which assume that there are no shifts in the process mean vector; when shifts in the mean do exist, this assumption can lead to an increased false alarm rate. The proposed dispersion sample charting statistics are independent among samples. Moreover, the new Phase II dispersion control chart is constructed under the assumption of a multivariate normal distribution. For a single quality variable, Yang and Arnold [1][2] developed a process dispersion control chart which is independent of mean shifts. In this article, we extend the method to the multivariate case. A Shewhart-type chart and a one-sided exponentially weighted moving average (EWMA) dispersion control chart are developed to monitor upward multivariate process dispersion, assuming that the process dispersion can only go out of control upward. To investigate the out-of-control detection performance of the proposed EWMA dispersion control chart, we adopt four scenarios for the variance-covariance matrix. They are increasing in variances, increasing in covariances
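The one-sided EWMA idea mentioned in the abstract above can be illustrated with a generic sketch. This is not the authors' charting statistic, which the abstract does not specify: it assumes some scalar dispersion statistic per sample is already available, and the names `one_sided_ewma`, `first_signal`, `lam`, `start`, and `ucl` are hypothetical. The reset-to-baseline form used here is one common way to build an upper one-sided EWMA.

```python
def one_sided_ewma(stats, lam=0.2, start=0.0):
    """Upper one-sided EWMA path for a sequence of charting statistics.

    Assumption: `stats` are already centred so that `start` is the
    in-control baseline. The recursion is
        z_t = max(start, lam * x_t + (1 - lam) * z_{t-1}),
    so the path never drops below the baseline and only upward
    shifts accumulate.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    z = start
    path = []
    for x in stats:
        z = max(start, lam * x + (1.0 - lam) * z)
        path.append(z)
    return path


def first_signal(path, ucl):
    """Return the 1-based index of the first point above `ucl`, or None."""
    for t, z in enumerate(path, start=1):
        if z > ucl:
            return t
    return None
```

In practice the upper control limit `ucl` would be calibrated (for example by simulation) to a target in-control run length; that calibration is outside this sketch.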
Citations: 1
A Unified Framework for Principal Subspace Analysis from the Hamiltonian Viewpoint
K. Yu
DOI: https://doi.org/10.11159/icsta22.143
Published: 2022-08-01
Extended abstract: In this work, we develop a general and unified framework for principal component analysis (PCA) applicable to Riemannian, sub-Riemannian and symplectic manifold-valued data and functional data. Almost all existing statistical methods for manifold data rely on the tangent bundle of the manifold, with the purpose of transforming the nonlinear manifold to the linear tangent spaces. However, such methods become invalid when the tangent vectors are constrained to lie in a subspace of the tangent space, since the exponential map will no longer be a local diffeomorphism. This scenario, known as sub-Riemannian geometry, has attracted considerable attention in recent years. We propose to shift away from the tangent space viewpoint and move towards the dual spaces of the tangent spaces, i.e., the cotangent spaces, and to build subspaces based on initial covectors. More generally, motivated by the Arnold-Liouville theorem, we propose the anchor-compatible identification for subspaces with first integrals (ACISFI), which constructs a properly nested sequence of subspaces as the fibres of a carefully chosen set of functionally independent functions defined on the cotangent bundle, i.e., the first integrals of the Hamiltonian system, generalising the ideas of obtaining subspaces from linearly independent tangent vectors [1] or from affinely independent points [2]. There are several advantages of the ACISFI over existing PCA methods on manifolds. First, the subspaces can be learnt from sample points in a completely data-driven way. We do not impose a particular form for the Hamiltonian, e.g., the Hamiltonian which induces the geodesic flow; it can be chosen as any smooth function on the manifold, nor is there any a priori restriction on the form of the first integrals. Second, the submanifolds can be
Citations: 0
New Statistical Methods For Association Studies And Genomic Prediction
Charles‐Elie Rabier, Céline Delmas
DOI: https://doi.org/10.11159/icsta22.135
Published: 2022-08-01
Extended abstract: "Selective genotyping" is a well-known concept in genetics. It was introduced by Lebowitz et al. (1987) and was studied in more detail by Lander and Botstein (1989). It consists in genotyping (collecting DNA information at specific positions) only the individuals with extreme phenotypes. Indeed, Lebowitz et al. (1987) noticed that the highest or the lowest observations contain most of the signal on Quantitative Trait Loci (QTL), i.e. genes with a quantitative effect on a trait. Today, although genotyping costs have dropped drastically, selective genotyping is still heavily used (e.g. [1]), since the statistical experiment can be optimized by focusing on extreme individuals instead of "random" individuals. Although selective genotyping was introduced in the eighties, biologists still lack tools to properly analyze data sampled from this experimental design. Indeed, classical methods such as penalized regression (e.g. Lasso [2]) are not dedicated to extreme observations. As a consequence, we recently introduced the SgenoLasso [3], a new L1 penalized regression that explicitly models the extremes. SgenoLasso builds on "Interval Mapping", a well-known concept in genetics that consists in scanning the genome by testing for the presence of a QTL at each location. From a statistical point of view, SgenoLasso is based on new limiting results on stochastic processes along the genome. SgenoLasso presents all the nice properties of Lasso since we have replaced the problem in a
Citations: 0
A Time Series Analysis Using Shannon Index of Annual Domestic Crop Production and Area Planted in Jamaica from 2007 to 2021
Videsh S. Jagroo, Annika Minott, Lisa James
DOI: https://doi.org/10.11159/icsta22.166
Published: 2022-08-01
Abstract: This paper aims to determine agricultural non-tree crop diversification with respect to crop production and area reaped in Jamaica from 2007 to 2021 using the Shannon-Wiener index. The Shannon index was calculated to determine diversification nationally and by parish. Time series analysis was performed on the resulting transformed dataset, and ten-year forecasts were computed for national production and area reaped using linear and quadratic models. Two variables of the Shannon index, namely quantity and abundance, classified as the number of crops and the percent area reaped respectively, were regressed against the Shannon indices of area reaped and production for each parish. At the national level, a decreasing trend was found in the Shannon indices for production and area reaped. This indicates that crop production and area reaped will become less diversified without intervention. Additionally, all parishes except Trelawny, Portland and St. Ann were diverse in crop production and area reaped. Despite moderate to high Shannon indices in most parishes, their forecasts showed decreases in production diversification over the next 10 years, except for Portland and for Kingston and St Andrew, which showed increases. All parishes showed decreases in diversification of area reaped except Portland, Kingston and St Andrew, St Catherine, and St Elizabeth. Regressions for Clarendon and St James were significant for the Shannon indices of production and area reaped; however, they showed a decrease in the Shannon indices for an increase in percent area reaped, implying that as agricultural land usage increases, the diversity of production and area reaped still decreases. For St James only, an increase in the number of crops results in an increase in the Shannon indices for production and area reaped, implying that as the number of crops increases, the diversification of production and area reaped increases.
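For readers unfamiliar with the Shannon-Wiener index used in the abstract above, the standard definition is H = -Σ p_i ln p_i, where p_i is the share of crop i in the total. The sketch below is a minimal illustration of that formula only; the function name `shannon_index` and the example figures are hypothetical and are not taken from the paper's data.

```python
import math


def shannon_index(quantities):
    """Shannon-Wiener diversity index H = -sum(p_i * ln(p_i)).

    `quantities` holds one non-negative amount per crop (for example
    tonnes produced or hectares reaped). Zero entries contribute
    nothing to the sum, following the convention 0 * ln(0) = 0.
    """
    total = sum(quantities)
    if total <= 0:
        raise ValueError("quantities must have a positive total")
    shares = [q / total for q in quantities if q > 0]
    return -sum(p * math.log(p) for p in shares)


# Hypothetical hectares reaped for four crops in two parishes.
evenly_spread = [100, 100, 100, 100]
concentrated = [370, 10, 10, 10]

h_even = shannon_index(evenly_spread)          # equals ln(4), the maximum for 4 crops
h_concentrated = shannon_index(concentrated)   # lower: one crop dominates
```

With the same number of crops, the index is largest when the shares are equal and falls as one crop comes to dominate, which is why a declining index is read as declining diversification.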
Citations: 0
Combining Statistical and Rule-Based Expert Knowledge to Measure Employment Precarity
Penelope Stamou, Elena Stringli, Glykeria Stamatopoulou, Dimitrios Parsanoglou, M. Symeonaki
DOI: https://doi.org/10.11159/icsta22.153
Published: 2022-08-01
Abstract: The measurement of precarity and the identification of a set of indicators that can be used for its assessment has been established as a key issue in Europe, central to the disciplines of labour statistics, social policy, and sociology of work. Most recent studies agree upon the basic characteristics that a worker should have to be considered precarious: insecurity, vulnerability, and no or limited entitlements. The present paper offers an innovative method that combines statistical analysis of nine key indicators that are linked with precarity to a lesser or greater extent with a rule-based expert system that rates each worker’s precarity. Raw data are drawn from the EU-Labour Force Survey (EU-LFS) for the case of Greece. However, the suggested method can be applied with minor modifications to the remaining thirty-four countries participating in the EU-LFS, since a common questionnaire is used for all countries. The estimated indicators refer to three domains that are linked with precarity: labour market conditions and job insecurity, limited entitlements, and insufficient resources. Having estimated a precarity score for each worker, the socio-demographic characteristics of precarious workers are identified, extracting valuable knowledge on their profile.
Citations: 1
Mutual Information in the Analysis of Trust Gains from Subsets of Information
R. Bustin, C. V. Goldman
DOI: https://doi.org/10.11159/icsta22.110
Published: 2022-08-01
Abstract: Information can increase human trust in automated machines. However, assessing the impact of all combinations of information pieces on the trust level of humans might not be practical. This paper assumes that data can be collected from human participants who have interacted with some automated machine. We assume a two-stage study in which the participants initially submit their ranking (trust level) when no information is provided, and then provide additional independent rankings for each piece of additional information. The goal is to determine the best combination of information pieces over all combinations without directly asking the participants to rank the possible combinations. The impact of the combinations on the trust ranking is evaluated using the mutual information quantity. We further consider the question of statistical significance in this unique setting, and suggest an optimization objective that examines the trade-off between the impact of the subset on the trust measure, on the one hand, and the complexity of the subset, measured by its size (the number of additional pieces of information), on the other hand. We provide a numerical example that shows all aspects discussed in this work.
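As background for the mutual information quantity named in the abstract above, here is a minimal sketch of the standard plug-in estimate of I(X;Y) from paired discrete observations. The abstract does not say which variables the authors pair, so the example pairing (whether a piece of information was shown versus a discretised trust ranking) is an assumption, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    Uses empirical frequencies:
        I = sum over (x, y) of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to count * n / (px[x] * py[y]).
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# Assumed illustration: 1 = information shown, 0 = not shown,
# paired with a binary "high trust" (1) or "low trust" (0) ranking.
shown = [0, 0, 1, 1]
trust_follows_info = [0, 0, 1, 1]   # ranking fully determined by the information
trust_unrelated = [0, 1, 0, 1]      # ranking unrelated to the information
```

The plug-in estimate is biased upward for small samples, which is one reason a significance assessment such as the one the abstract mentions is needed before comparing subsets.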
Citations: 1
Jackknife Estimator Consistency for Nonlinear Mixture
R. Maiboroda, Vitaliy Miroshnychenko
DOI: https://doi.org/10.11159/icsta22.149
Published: 2022-08-01
Extended abstract: This paper continues our studies of the application of the jackknife (JK) technique to the estimation of estimators’ covariance matrices in models of mixtures with varying concentrations (MVC) [2, 3]. On JK applications for homogeneous samples, see [1]. In MVC models one deals with a non-homogeneous sample, which consists of subjects belonging to $M$ different sub-populations (mixture components). One knows the probabilities with which a subject belongs to the mixture components, and these probabilities are different for different subjects. Therefore, the considered observations are independent but not identically distributed. We consider objects from a mixture with varying concentrations. Every object from the sample $\Xi_n$ belongs to one of $M$ different mixture components. Each object from the sample $\Xi_n = (\xi_j)_{j=1}^{n}$ has observed characteristics $\xi_j = (X_j, Y_j) \in \mathbb{R}^D$ and one hidden characteristic $\kappa_j$: $\kappa_j = m$ if the $j$-th object belongs to the $m$-th component. These numbers are unknown, but we know the mixing probabilities $p_{j;n}^{m} = P\{\kappa_j = m\}$. Here $X_j$ is a vector of regressors and $Y_j$ is a response in the regression model $Y_j = g(X_j; b^{(\kappa_j)}) + \varepsilon_j$. Here $b^{(m)} \in \Theta \subseteq \mathbb{R}^d$ is a vector of unknown regression parameters for the $m$-th component, $g: \mathbb{R}^{D-1} \times \Theta \to \mathbb{R}$ is a known regression function, and $\varepsilon_j$ is a regression error term. Random variables $X_j$ and $\varepsilon_j$ are independent and their distribution is different
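For orientation, the classical delete-one jackknife for a homogeneous sample (the setting the abstract above contrasts with via reference [1]) can be sketched as follows. This is not the authors' MVC procedure, which must account for the varying mixing probabilities; it only shows the basic leave-one-out variance formula for a scalar estimator. The names `jackknife_variance` and `sample_mean` are hypothetical.

```python
def jackknife_variance(sample, estimator):
    """Delete-one jackknife variance estimate for a scalar estimator.

    For leave-one-out estimates t_1, ..., t_n with mean t_bar:
        var_JK = (n - 1) / n * sum((t_i - t_bar) ** 2)
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    t_bar = sum(leave_one_out) / n
    return (n - 1) / n * sum((t - t_bar) ** 2 for t in leave_one_out)


def sample_mean(values):
    """Arithmetic mean, used here as the estimator being assessed."""
    return sum(values) / len(values)


# For the sample mean, the jackknife variance equals s^2 / n exactly,
# where s^2 is the unbiased sample variance.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
var_of_mean = jackknife_variance(data, sample_mean)
```

For a vector-valued estimator, as in the regression setting of the abstract, the squared deviations would be replaced by outer products to give a covariance matrix.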
Citations: 0
A Theoretical Formulism for Evidential Reasoning and Logic Based Bias Reduction in Geo-Intelligence Processing
Nicholas V. Scott
DOI: https://doi.org/10.11159/icsta22.117
Published: 2022-08-01
Abstract: Geo-intelligence processing is strongly based on the need to bring together the analytical viewpoints of the multiple members of a geo-intelligence team so that unified answers to problems can be provided to the leadership responsible for decision making. To facilitate this aim, a three-tier evidential reasoning formulism is proposed and explained as a guide for the statistical/cognitive processing of geo-intelligence sensor information. The first tier comprises computational modeling used in conjunction with informal logic-based bias reduction by a multiple-analyst team to interpret geo-intelligence information and create geo-intelligence reports. In the second tier, each analyst creates Bayesian belief networks over the distinct provinces under geo-intelligence analytical investigation by amalgamating the statistical information provided by the geo-intelligence reports. Bayesian belief network (BBN) results, coupled with ancillary intelligence and analyst beliefs, provide a set of propositions and probability masses summarizing the state of each province analyzed by each team member. The BBN state levels denote three conditions (lack of nefarious substance presence, probable nefarious substance presence, and definite nefarious substance presence) and are taken to be related directly, via a one-to-one mapping, to a new set of decision-based propositions: lack of adversary attack, probable adversary attack, and definite adversary attack. In the third tier, team member probability masses associated with these propositions, along with conjunctive and disjunctive combinations, are gradually amalgamated using Dezert-Smarandache (DS) evidential theory. A numerical example demonstrates the mechanics of the third-tier information fusion process, which takes into account logical paradoxes and results in a single virtual analyst probability mass distribution associated with the geo-intelligence information amalgamation problem.
Citations: 0