Floating-point arithmetic

IF 16.3 · Zone 1 (Mathematics) · Q1 MATHEMATICS
S. Boldo, C. Jeannerod, G. Melquiond, Jean-Michel Muller
Journal: Acta Numerica, volume 32, pages 203-290
Published: 2023-05-01
Publication type: Journal Article
DOI: 10.1017/S0962492922000101
URL: https://doi.org/10.1017/S0962492922000101
Citations: 14

Abstract

Floating-point numbers have an intuitive meaning when it comes to physics-based numerical computations, and they have thus become the most common way of approximating real numbers in computers. The IEEE-754 Standard has played a large part in making floating-point arithmetic ubiquitous today, by specifying its semantics in a strict yet useful way as early as 1985. In particular, floating-point operations should be performed as if their results were first computed with an infinite precision and then rounded to the target format. A consequence is that floating-point arithmetic satisfies the ‘standard model’ that is often used for analysing the accuracy of floating-point algorithms. But that is only scraping the surface, and floating-point arithmetic offers much more. In this survey we recall the history of floating-point arithmetic as well as its specification mandated by the IEEE-754 Standard. We also recall what properties it entails and what every programmer should know when designing a floating-point algorithm. We provide various basic blocks that can be implemented with floating-point arithmetic. In particular, one can actually compute the rounding error caused by some floating-point operations, which paves the way to designing more accurate algorithms. More generally, properties of floating-point arithmetic make it possible to extend the accuracy of computations beyond working precision.
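
The abstract's remark that one can compute the rounding error of some floating-point operations can be illustrated with a short Python sketch. It is not taken from the survey. It assumes that Python floats are IEEE-754 binary64 values rounded to nearest, which holds on common platforms, and the names two_sum, naive_sum, compensated_sum and U are chosen here for illustration only. The function two_sum is the classic branch-free error-free transformation usually attributed to Knuth; compensated_sum accumulates the recovered errors separately, in the style of compensated summation, as one possible way to extend accuracy beyond working precision.

```python
from fractions import Fraction

# Unit roundoff for binary64 with round-to-nearest (assumption: Python
# floats are IEEE-754 binary64, as on common platforms).
U = 2.0 ** -53


def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and s + e == a + b exactly.

    Valid in round-to-nearest arithmetic as long as no overflow occurs.
    """
    s = a + b
    a_virtual = s - b
    b_virtual = s - a_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def naive_sum(values):
    """Plain left-to-right summation; every addition is rounded."""
    s = 0.0
    for x in values:
        s = s + x
    return s


def compensated_sum(values):
    """Left-to-right summation that also adds up the rounding errors."""
    s = 0.0
    err = 0.0
    for x in values:
        s, e = two_sum(s, x)
        err += e  # the error term itself is accumulated in working precision
    return s + err


if __name__ == "__main__":
    a, b = 0.1, 0.2
    s, e = two_sum(a, b)
    exact = Fraction(a) + Fraction(b)  # exact rational sum of the two floats

    # The rounding error of the addition is recovered exactly.
    assert Fraction(s) + Fraction(e) == exact

    # 'Standard model': the rounded sum has relative error at most U.
    assert abs(Fraction(s) - exact) <= Fraction(U) * abs(exact)

    data = [1.0, 1e100, 1.0, -1e100]  # exact sum is 2
    print("naive:      ", naive_sum(data))
    print("compensated:", compensated_sum(data))
```

The checks use fractions.Fraction because converting a float to a Fraction is exact, so the comparison itself involves no rounding. An explicit loop is used for the naive sum instead of the built-in sum, since recent Python versions may apply their own compensation to float sums.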
Source journal: Acta Numerica
Subject category: MATHEMATICS
CiteScore: 26.00
Self-citation rate: 0.70%
Articles published: 7
About the journal: Acta Numerica stands as the preeminent mathematics journal, ranking highest in both Impact Factor and MCQ metrics. This annual journal features a collection of review articles that showcase survey papers authored by prominent researchers in numerical analysis, scientific computing, and computational mathematics. These papers deliver comprehensive overviews of recent advances, offering state-of-the-art techniques and analyses. Encompassing the entirety of numerical analysis, the articles are crafted in an accessible style, catering to researchers at all levels and serving as valuable teaching aids for advanced instruction. The broad subject areas covered include computational methods in linear algebra, optimization, ordinary and partial differential equations, approximation theory, stochastic analysis, nonlinear dynamical systems, as well as the application of computational techniques in science and engineering. Acta Numerica also delves into the mathematical theory underpinning numerical methods, making it a versatile and authoritative resource in the field of mathematics.