Principal Curves

Pub Date : 1900-01-01 DOI:10.2307/2289936
T. Hastie, W. Stuetzle
Citations: 925

Abstract

Principal curves are smooth one-dimensional curves that pass through the middle of a p-dimensional data set, providing a nonlinear summary of the data. They are nonparametric, and their shape is suggested by the data. The algorithm for constructing principal curves starts with some prior summary, such as the usual principal-component line. The curve in each successive iteration is a smooth or local average of the p-dimensional points, where the definition of local is based on the distance in arc length of the projections of the points onto the curve found in the previous iteration. In this article principal curves are defined, an algorithm for their construction is given, some theoretical results are presented, and the procedure is compared to other generalizations of principal components. Two applications illustrate the use of principal curves. The first describes how the principal-curve procedure was used to align the magnets of the Stanford linear collider. The collider uses about 950 magnets in a roughly circular arrangement to bend electron and positron beams and bring them to collision. After construction, it was found that some of the magnets had ended up significantly out of place. As a result, the beams had to be bent too sharply and could not be focused. The engineers realized that the magnets did not have to be moved to their originally planned locations, but rather to a sufficiently smooth arc through the middle of the existing positions. This arc was found using the principal-curve procedure. In the second application, two different assays for gold content in several samples of computer-chip waste appear to show some systematic differences that are blurred by measurement error. The classical approach using linear errors-in-variables regression can detect systematic linear differences but is not able to account for nonlinearities. When the first linear principal component is replaced with a principal curve, a local "bump" is revealed, and bootstrapping is used to verify its presence.
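
The iteration described in the abstract (start from the principal-component line, then alternate between projecting the points onto the current curve and locally averaging them in arc length) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the `span` and `n_iter` parameters, the Gaussian-kernel local average, and the polyline representation of the curve are all assumptions made for this example.

```python
import numpy as np

def first_pc_line(X):
    """Starting summary: scores and fitted points on the first principal-component line."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    lam = (X - mu) @ vt[0]                 # position of each point along the line
    return lam, mu + np.outer(lam, vt[0])

def project_to_polyline(X, curve):
    """Project each point onto a polyline; return arc length and squared distance."""
    seg = curve[1:] - curve[:-1]                         # segment vectors
    seg_len = np.linalg.norm(seg, axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])    # arc length at each vertex
    diff = X[:, None, :] - curve[None, :-1, :]           # point minus segment start
    denom = np.where(seg_len > 0, seg_len ** 2, 1.0)     # guard zero-length segments
    t = np.clip((diff * seg[None]).sum(-1) / denom, 0.0, 1.0)
    foot = curve[None, :-1, :] + t[..., None] * seg[None]
    d2 = ((X[:, None, :] - foot) ** 2).sum(-1)
    best = d2.argmin(axis=1)                             # closest segment per point
    idx = np.arange(len(X))
    lam = cum[best] + t[idx, best] * seg_len[best]
    return lam, d2[idx, best]

def kernel_smooth(lam, X, bandwidth):
    """Local average of the points, with 'local' measured in arc length."""
    w = np.exp(-0.5 * ((lam[:, None] - lam[None, :]) / bandwidth) ** 2)
    return (w @ X) / w.sum(axis=1, keepdims=True)

def principal_curve(X, n_iter=10, span=0.1):
    """Alternate smoothing and projection, starting from the first PC line."""
    lam, _ = first_pc_line(X)
    for _ in range(n_iter):
        # Bandwidth is a fraction (span) of the current arc-length range.
        bandwidth = max(span * (lam.max() - lam.min()), 1e-12)
        fitted = kernel_smooth(lam, X, bandwidth)    # new curve = local average
        curve = fitted[np.argsort(lam)]              # order vertices along the curve
        lam, d2 = project_to_polyline(X, curve)      # re-project onto the new curve
    return curve, lam, d2
```

Calling `principal_curve(X)` on an `(n, p)` array returns the ordered curve vertices, the arc-length position of each point, and its squared distance to the curve. A plain kernel average is biased toward the center of curvature and shortens the curve at its ends, so this sketch should be read as a demonstration of the iteration rather than a tuned smoother.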