Space Shuttle Fault Tolerance: Analog and Digital Teamwork

H. Blair-Smith
2009 IEEE/AIAA 28th Digital Avionics Systems Conference. Published 2009-12-04. DOI: 10.1109/DASC.2009.5347450 (https://doi.org/10.1109/DASC.2009.5347450)
Citations: 9

Abstract

The Space Shuttle control system (including the avionics suite) was developed during the 1970s to meet stringent survivability requirements that were then extraordinary but today may serve as a standard against which modern avionics can be measured. In 30 years of service, only two major malfunctions have occurred, both due to failures far beyond the reach of fault tolerance technology: the explosion of an external fuel tank, and the destruction of a launch-damaged wing by re-entry friction. The Space Shuttle is among the earliest systems (if not the earliest) designed to a “FO-FO-FS” criterion, meaning that it had to Fail (fully) Operational after any one failure, then Fail Operational after any second failure (even of the same kind of unit), then Fail Safe after most kinds of third failure. The computer system had to meet this criterion using a Redundant Set of 4 computers plus a backup of the same type, which was (ostensibly!) a COTS type. Quadruple redundancy was also employed in the hydraulic actuators for elevons and rudder. Sensors were installed with quadruple, triple, or dual redundancy. For still greater fault tolerance, these three redundancies (sensors, computers, actuators) were made independent of each other so that the reliability criterion applies to each category separately. The mission rule for Shuttle flights, as distinct from the design criterion, became “FO-FS,” so that a mission continues intact after any one failure, but is terminated with a safe return after any second failure of the same type. To avoid an unrecoverable flat spin during the most dynamic flight phases, the overall system had to continue safe operation within 400 msec of any failure, but the decision to shut down a computer had to be made by the crew. 
Among the interesting problems to be solved were “control slivering” and “sync holes.” The first flight test (Approach and Landing only) was the proof of the pudding: when a key wire harness solder joint was jarred loose by the Shuttle's being popped off the back of its 747 mother ship, one of the computers “went bananas” (actual quote from an IBM expert).
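
The "FO-FO-FS" design criterion and the "FO-FS" mission rule described in the abstract can be illustrated with a minimal Python sketch. The names `Status`, `design_status`, and `mission_continues` are hypothetical, and the sketch assumes that every third failure is covered, whereas the abstract says only "most kinds" of third failure. It does not represent the actual Shuttle flight software.

```python
from enum import Enum


class Status(Enum):
    OPERATIONAL = "fail operational"
    SAFE = "fail safe"
    UNCOVERED = "beyond the design criterion"


def design_status(failures: int) -> Status:
    """FO-FO-FS: status the design must guarantee after `failures`
    failures within one category (sensors, computers, or actuators)."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    if failures <= 2:
        # First and second failures: remain fully operational.
        return Status.OPERATIONAL
    if failures == 3:
        # Assumption: treats every third failure as covered.
        return Status.SAFE
    return Status.UNCOVERED


def mission_continues(failures_same_type: int) -> bool:
    """FO-FS mission rule: continue after one failure, end the mission
    with a safe return after a second failure of the same type."""
    if failures_same_type < 0:
        raise ValueError("failures_same_type must be non-negative")
    return failures_same_type <= 1


if __name__ == "__main__":
    for n in range(5):
        print(n, design_status(n).value, mission_continues(n))
```

Because the three redundancies are independent, a sketch like this would be evaluated per category rather than on a single total failure count.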