{"title":"A note on the simultaneous computation of thousands of Pearson's X2-Statistics","authors":"H. Schwender","doi":"10.17877/DE290R-14818","DOIUrl":"https://doi.org/10.17877/DE290R-14818","url":null,"abstract":"In genetic association studies, important and common goals are the identification of single nucleotide polymorphisms (SNPs) showing a distribution that differs between several groups and the detection of SNPs with a coherent pattern. In the former situation, tens of thousands of SNPs should be tested, whereas in the latter case typically several tens of SNPs are considered, leading to thousands of statistics that need to be computed. A test statistic appropriate for both goals is Pearson’s χ²-statistic. However, computing this (or another) statistic for each SNP or pair of SNPs separately is very time-consuming. In this article, we show how simple matrix computation can be employed to calculate the χ²-statistic for all SNPs simultaneously.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72781419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Testing large-dimensional correlation","authors":"Matthias Arnold, R. Weißbach","doi":"10.17877/DE290R-267","DOIUrl":"https://doi.org/10.17877/DE290R-267","url":null,"abstract":"This paper introduces a test for zero correlation in situations where the correlation matrix is large compared to the sample size. The test statistic is the sum of the squared correlation coefficients in the sample. We derive its limiting null distribution as the number of variables as well as the sample size converge to infinity. A Monte Carlo simulation finds both size and power for finite samples to be suitable. We apply the test to the vector of default rates, a risk factor in portfolio credit risk, in different sectors of the German economy.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"106 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80753767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constructing a regular histogram - a comparison of methods","authors":"P. L. Davies, U. Gather, D. Nordman, Henrike Weinert","doi":"10.17877/DE290R-15737","DOIUrl":"https://doi.org/10.17877/DE290R-15737","url":null,"abstract":"Even for a well-trained statistician the construction of a histogram for a given real-valued set is a difficult problem. It is even more difficult to construct a fully automatic procedure which specifies the number and widths of the bins in a satisfactory manner for a wide range of data sets. In this paper we compare several histogram construction methods by means of a simulation study. The study includes plug-in methods, cross-validation, penalized maximum likelihood and the taut string procedure. Their performance on different test beds is measured by the Hellinger distance and the ability to identify the modes of the underlying density.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84256077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deriving a statistical model for the prediction of spiralling in BTA deep hole drilling from a physical model","authors":"C. Weihs, N. Raabe, O. Webber","doi":"10.1007/978-3-642-00668-5_11","DOIUrl":"https://doi.org/10.1007/978-3-642-00668-5_11","url":null,"abstract":"","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"132 1","pages":"107-114"},"PeriodicalIF":0.0,"publicationDate":"2007-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86818203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimization of Boolean expressions using matrix algebra","authors":"H. Schwender","doi":"10.17877/DE290R-265","DOIUrl":"https://doi.org/10.17877/DE290R-265","url":null,"abstract":"The more variables a logic expression contains, the more complicated is the interpretation of this expression. Since in a statistical sense prime implicants can be interpreted as interactions of binary variables, it is thus advantageous to convert such a logic expression into a disjunctive normal form consisting of prime implicants. In this paper, we present two algorithms based on matrix algebra for the identification of all prime implicants comprised in a logic expression and for the minimization of this set of prime implicants.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"80 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83956663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A note on the choice of the number of slices in sliced inverse regression","authors":"C. Becker, U. Gather","doi":"10.17877/DE290R-283","DOIUrl":"https://doi.org/10.17877/DE290R-283","url":null,"abstract":"Sliced inverse regression (SIR) is a clever technique for reducing the dimension of the predictor in regression problems, thus avoiding the curse of dimensionality. There exist many contributions on various aspects of the performance of SIR. Up to now, little attention has been paid to the problem of choosing the number of slices within the SIR procedure appropriately. The aim of this paper is to show that especially the estimation of the reduced dimension can be strongly influenced by the chosen number of slices.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73305000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Steel railway bridge deck design for noise emission and maintenance cost reduction","authors":"B. Hesselink, Bert H. H. Snijder","doi":"10.2749/222137807796120166","DOIUrl":"https://doi.org/10.2749/222137807796120166","url":null,"abstract":"Recently, new developments in steel railway bridge deck design have been induced by noise emission and maintenance cost reduction. The tendency is towards simple and smooth deck designs because they need less maintenance to prevent corrosion. In addition, composite and concrete deck systems are designed for minimum noise emission and lower (track) maintenance costs. Traditional deck designs, consisting of cross and longitudinal steel beams with bridge sleepers on top of them, frequently give problems with respect to fatigue. Therefore, the bridge sleepers were replaced by new special silent longitudinal deck sections enhancing the lifetime of these bridges. In this paper, the developments in steel railway bridge deck design to meet noise emission and maintenance requirements for new and existing steel railway bridges are illustrated. These developments bring new opportunities for the use of steel as a construction material for railway bridges.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74416870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixed Signals Among Panel Cointegration Tests","authors":"C. Hanck","doi":"10.17877/DE290R-14299","DOIUrl":"https://doi.org/10.17877/DE290R-14299","url":null,"abstract":"Time series cointegration tests, even in the presence of large sample sizes, often yield conflicting conclusions (“mixed signals”) as measured by, inter alia, a low correlation of empirical p-values [see Gregory et al., 2004, Journal of Applied Econometrics]. Using their methodology, we present evidence suggesting that the problem of mixed signals persists for popular panel cointegration tests. As expected, there is weaker correlation between residual and system-based tests than between tests of the same group.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73025408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Making Indefinite Kernel Learning Practical","authors":"Ingo Mierswa","doi":"10.17877/DE290R-1946","DOIUrl":"https://doi.org/10.17877/DE290R-1946","url":null,"abstract":"In this paper we embed evolutionary computation into statistical learning theory. First, we outline the connection between large margin optimization and statistical learning and see why this paradigm is successful for many pattern recognition problems. We then embed evolutionary computation into the most prominent representative of this class of learning methods, namely into Support Vector Machines (SVM). In contrast to former applications of evolutionary algorithms to SVM we do not only optimize the method or kernel parameters. We rather use evolution strategies in order to directly solve the posed constrained optimization problem. Transforming the problem into the Wolfe dual reduces the total runtime and allows the usage of kernel functions just as for traditional SVM. We will show that evolutionary SVM are at least as accurate as their quadratic programming counterparts on eight real-world benchmark data sets in terms of generalization performance. They always outperform traditional approaches in terms of the original optimization problem. Additionally, the proposed algorithm is more generic than existing traditional solutions since it will also work for non-positive semidefinite or indefinite kernel functions. The evolutionary SVM variants frequently outperform their quadratic programming competitors in cases where such an indefinite Kernel function is used.","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"53 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88705119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OLS-based estimation of the disturbance variance under spatial autocorrelation","authors":"W. Krämer, C. Hanck","doi":"10.1007/978-3-7908-2064-5_19","DOIUrl":"https://doi.org/10.1007/978-3-7908-2064-5_19","url":null,"abstract":"","PeriodicalId":10841,"journal":{"name":"CTIT technical reports series","volume":"35 1","pages":"357-366"},"PeriodicalIF":0.0,"publicationDate":"2006-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86564920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}