IEEE 754 revision

"This article describes the revision process of the IEEE 754 standard, 2000-2008, and the changes included in the revision. For a description of the standard itself, see IEEE 754-2008."

IEEE 754-2008 (previously known as "IEEE 754r") was published in August 2008 and is a significant revision to, and replaces, the IEEE 754-1985 floating-point standard. The revision extended the previous standard where necessary, added decimal arithmetic and formats, tightened up certain areas of the original standard that had been left undefined, and merged in IEEE 854 (the radix-independent floating-point standard).

In a few cases, where stricter definitions of binary floating-point arithmetic might be performance-incompatible with some existing implementation, they were made optional.

Revision process

The standard had been under revision since 2000, with a target completion date of December 2006. The revision of an IEEE standard broadly follows three phases:
#Working group – a committee that creates a draft standard
#Ballot – interested parties subscribe to the "balloting group" and vote on the draft (75% of the group must participate, and 75% must approve for the draft to go forward); comments from the votes are resolved by a "Ballot Resolution Committee" (BRC) and changes made have to be recirculated with a new ballot if they are substantive
#When all comments are resolved and there are no further changes, the draft is submitted to the IEEE for review, approval, and publication (this can also result in changes and ballots, although this is rare).

On 11 June 2008, it was approved unanimously by the IEEE Revision Committee (RevCom), and it was formally approved by the IEEE-SA Standards Board on 12 June 2008. It was published on 29 August 2008.

754r Working Group phase

Participation in drafting the standard was open to people with a solid knowledge of floating-point arithmetic. More than 90 people attended at least one of the monthly meetings, which were held in Silicon Valley, and many more participated through the mailing list.

Progress at times was slow, leading the chairman to declare at the [http://nonabelian.com/754/minutes/754/050915 September 15, 2005 meeting] that "no progress is being made, I am suspending these meetings until further notice on those grounds". In December 2005, the committee reorganized under new rules with a target completion date of December 2006.

New policies and procedures were adopted in February 2006. In September 2006 a working draft was approved to be sent to the parent sponsoring committee (the IEEE Microprocessor Standards Committee, or MSC) for editing and to be sent to sponsor ballot.

754r Ballot phase

The MSC accepted the draft on 9 October 2006; the draft sent to the MSC can be found [http://www.validlab.com/754R/drafts/archive/2006-10-04.pdf here]. Note that the draft was changed significantly in detail during the balloting process, although the content remained broadly the same.

The first sponsor ballot took place from 2006-11-29 through 2006-12-28. Of the 84 members of the voting body, 85.7% responded, and 78.6% voted approval. There were negative votes (and over 400 comments), so there was a recirculation ballot in March 2007; this received an 84% approval. There were sufficient comments (over 130) from that ballot that a third draft was prepared for a second, 15-day recirculation ballot, which started in mid-April 2007.

For a technical reason, the ballot process was restarted with the 4th ballot in October 2007; there were also substantial changes in the draft resulting from 650 voters' comments and from requests from the sponsor (the IEEE MSC). This ballot just failed to reach the required 75% approval. The 5th ballot had a 98.0% response rate with 91.0% approval, with comments leading to relatively small changes. The 6th, 7th, and 8th ballots sustained approval ratings of over 90% with progressively fewer comments on each draft. The 8th (which had no in-scope comments: 9 were repeats of previous comments and one referred to material not in the draft) was submitted to the IEEE Standards Revision Committee ("RevCom") for approval as an IEEE standard.

754r Review and Approval phase

The IEEE Standards Revision Committee (RevCom) considered and unanimously approved the IEEE 754r draft at its June 2008 meeting, and it was approved by the IEEE-SA Standards Board on 12 June 2008. Final editing is complete and the document has now been forwarded to the IEEE Standards Publications Department for publication.

IEEE Std 754-2008 publication

The new IEEE 754 (formally IEEE Std 754-2008, the IEEE Standard for Floating-Point Arithmetic) was published by the IEEE Computer Society on 29 August 2008, and is available from the [http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4610935 IEEE Xplore website].

This standard replaces IEEE 754-1985. IEEE 854, the radix-independent floating-point standard, will be withdrawn in December 2008.

Summary of the revisions

The most obvious enhancements to the standard are the addition of a 16-bit and a 128-bit binary type and three decimal types, some new operations, and many recommended functions. However, there have been significant clarifications in terminology throughout. This summary highlights the main differences in each major clause of the standard.

Clause 1: Overview

The scope (determined by the sponsor of the standard) has been widened to include decimal formats and arithmetic, and adds extendable formats.

Clause 2: Definitions

Many of the definitions have been rewritten for clarification and consistency. A few terms have been renamed for clarity (for example, "denormalized" has been renamed to "subnormal").

Clause 3: Formats

The description of formats has been made more regular, with a distinction between "arithmetic formats" (in which arithmetic may be carried out) and "interchange formats" (which have a standard encoding). Conformance to the standard is now defined in these terms.

The specification levels of a floating-point format have been enumerated, to clarify the distinction between
#the theoretical real numbers (an extended number line)
#the entities which can be represented in the format (a finite set of numbers, together with −0, infinities, and NaN)
#the particular representations of the entities: sign-exponent-significand, "etc."
#the bit-pattern (encoding) used.

The sets of representable entities are then explained in detail, showing that they can be treated with the significand being considered either as a fraction or an integer. The particular sets known as "basic formats" are defined, and the encodings used for interchange of binary and decimal formats are explained.
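The two views of the significand can be illustrated with a minimal Python sketch, assuming the host `float` is a binary64 value (true for CPython on common platforms). The function name `decode_binary64` and its return layout are hypothetical choices for this illustration and are not taken from the standard. It walks from the bit pattern up to the represented number and checks that the fraction view and the integer view give the same value.

```python
import struct
from fractions import Fraction

def decode_binary64(x: float):
    """Decode a normal binary64 number into fields (illustrative helper)."""
    # Level 4: the bit pattern (encoding).
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Level 3: the sign-exponent-significand representation.
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    trailing = bits & ((1 << 52) - 1)
    if biased_exponent in (0, 0x7FF):
        raise ValueError("this sketch handles normal numbers only")
    e = biased_exponent - 1023
    # Fraction view: significand m with 1 <= m < 2, exponent e.
    m = 1 + Fraction(trailing, 1 << 52)
    # Integer view: 53-bit integer significand c, exponent q = e - 52.
    c = (1 << 52) | trailing
    q = e - 52
    # Levels 2 and 1: the represented number, computed exactly both ways.
    value_fraction_view = (-1) ** sign * m * Fraction(2) ** e
    value_integer_view = (-1) ** sign * c * Fraction(2) ** q
    return sign, e, m, q, c, value_fraction_view, value_integer_view
```

For example, `decode_binary64(-6.5)` reports sign 1, exponent 2 and significand 13/8 in the fraction view, and both views evaluate to exactly -13/2.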

The binary interchange formats have the "half precision" (16-bit storage format) and "quad precision" (128-bit format) added, together with generalized formulae for some wider formats; the basic formats have 32-bit, 64-bit, and 128-bit encodings.
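As a rough illustration of the 16-bit format, Python's `struct` module can pack a value as binary16 using the `e` format character. The sketch below also includes a helper, `wide_binary_params`, which is a hypothetical name; the formula in it reflects my reading of the standard's parameter table for interchange formats of 128 bits and wider (exponent width w = round(4 × log2(k)) − 13), so treat it as an assumption to check against the standard itself.

```python
import math
import struct

def half_bits(x: float) -> int:
    """Return the binary16 encoding of x as a 16-bit integer."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    return bits

def to_half_and_back(x: float) -> float:
    """Round x to binary16 and widen the result back to the host float."""
    (y,) = struct.unpack("<e", struct.pack("<e", x))
    return y

def wide_binary_params(k: int):
    """Return (exponent width, precision, emax) for binary{k}.

    Assumed formula, intended for k >= 128 and k a multiple of 32.
    """
    if k < 128 or k % 32 != 0:
        raise ValueError("formula assumed valid only for k >= 128, k % 32 == 0")
    w = round(4 * math.log2(k)) - 13   # exponent field width in bits
    p = k - w                          # precision, including the implicit bit
    emax = 2 ** (w - 1) - 1
    return w, p, emax
```

Here `half_bits(1.0)` is `0x3C00`, `to_half_and_back(0.1)` is `0.0999755859375` (binary16 carries only 11 bits of precision), and `wide_binary_params(128)` gives `(15, 113, 16383)`, matching the quad precision format.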

Three new decimal formats are described, matching the lengths of the 32-bit, 64-bit, and 128-bit binary formats. These give decimal interchange formats with 7-, 16-, and 34-digit significands, which may be normalized or unnormalized. For maximum range and precision, the formats merge part of the exponent and part of the significand into a "combination field", and encode the remainder of the significand using either a decimal encoding ("Densely Packed Decimal", or DPD, a compressed form of BCD) or a conventional binary integer encoding. The basic formats are the two larger sizes, which have 64-bit and 128-bit encodings. Generalized formulae for some other interchange formats are also specified.
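The digit counts above can be reproduced with a short Python sketch. The helper names `decimal_params` and `make_context` are hypothetical, and the formulae (p = 9k/32 − 2 digits, emax = 3 × 2^(k/16 + 3)) are my recollection of the standard's generalized parameters, so they are an assumption to verify. Python's `decimal` module is used only to emulate the precision and exponent range; it does not use the DPD or binary integer encodings described here.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

def decimal_params(k: int):
    """Return (precision in digits, emax) for decimal{k} (assumed formulae)."""
    if k < 32 or k % 32 != 0:
        raise ValueError("k must be a positive multiple of 32")
    p = 9 * k // 32 - 2
    emax = 3 * 2 ** (k // 16 + 3)
    return p, emax

def make_context(k: int) -> Context:
    """Build a decimal context with the precision and range of decimal{k}."""
    p, emax = decimal_params(k)
    return Context(prec=p, Emax=emax, Emin=1 - emax, rounding=ROUND_HALF_EVEN)

# Emulate decimal64: inputs are rounded to 16 significant digits.
decimal64 = make_context(64)
rounded = decimal64.create_decimal("1.2345678901234567890")

# Unnormalized members of one cohort: equal values, different representations.
a, b = Decimal("1.20"), Decimal("1.2")
same_value = (a == b)
same_representation = (a.as_tuple() == b.as_tuple())
```

With these definitions `decimal_params` yields 7, 16, and 34 digits for the 32-, 64-, and 128-bit formats, `rounded` prints as `1.234567890123457`, and `a` and `b` compare equal while keeping distinct exponents.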

Extended and extendable formats allow for arithmetic at other precisions and ranges.

Clause 4: Attributes and rounding

This clause has been changed to encourage the use of static attributes for controlling floating-point operations, and (in addition to required rounding attributes) allow for alternate exception handling, widening of intermediate results, value-changing optimizations, and reproducibility.

The "round-to-nearest, ties away from zero" rounding attribute has been added (required for decimal operations only).
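The difference between the two round-to-nearest attributes only shows on exact ties. A small sketch using Python's `decimal` module, where `ROUND_HALF_EVEN` corresponds to ties-to-even and `ROUND_HALF_UP` rounds ties away from zero (the helper name `round_to_integer` is hypothetical):

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_to_integer(text: str, rounding: str) -> Decimal:
    """Round a decimal literal to an integer under the given rounding mode."""
    return Decimal(text).quantize(Decimal("1"), rounding=rounding)

ties = ["0.5", "1.5", "2.5", "-2.5"]
ties_to_even = [round_to_integer(t, ROUND_HALF_EVEN) for t in ties]
ties_away = [round_to_integer(t, ROUND_HALF_UP) for t in ties]
```

Here `ties_to_even` holds 0, 2, 2, -2 while `ties_away` holds 1, 2, 3, -3; non-tie inputs round identically under both modes.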

Clause 5: Operations

This section has numerous clarifications (notably in the area of comparisons), and several previously recommended operations (such as copy, negate, abs, and class) are now required.

New operations include fused multiply-add (FMA), explicit conversions, classification predicates (isNaN("x"), "etc."), various min and max functions, a total ordering predicate, and two decimal-specific operations (sameQuantum and quantize).
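To show why a fused multiply-add differs from a separate multiply and add, the sketch below emulates the single rounding with exact rational arithmetic. The function `fma_exact` is a hypothetical helper, not a library call; it assumes finite binary64 inputs, relies on Python's integer division being correctly rounded when a `Fraction` is converted to `float`, and ignores the sign of a zero result.

```python
from fractions import Fraction

def fma_exact(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once to the host float."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# Two roundings: a*b = 1 - 2**-60 rounds to 1.0, so the sum is 0.0.
unfused = a * b + c
# One rounding: the exact result -2**-60 is representable and is returned.
fused = fma_exact(a, b, c)
```

The unfused expression loses the whole result to the intermediate rounding, while the fused form returns `-2.0 ** -60`.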

min and max

The min and max operations are defined but leave some leeway for the case where the inputs are equal in value but differ in representation. In particular:
* min(+0,−0) or min(−0,+0) must produce something with a value of zero but may always return the first argument.

In order to support operations such as windowing in which a NaN input should be quietly replaced with one of the end points, min and max are defined to select a number, x, in preference to a quiet NaN:
* min(x,NaN) = min(NaN,x) = x
* max(x,NaN) = max(NaN,x) = x

In the standard, these operations are named "minNum" and "maxNum" to indicate their preference for a number over a NaN.
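A minimal Python sketch of the NaN-preferring behaviour described above follows. The names `min_num` and `max_num` are hypothetical, the sketch treats every NaN as quiet (Python does not expose signaling NaNs portably), and on ties such as +0 versus −0 it simply returns the first argument, which is one of the permitted choices.

```python
import math

def min_num(x: float, y: float) -> float:
    """Return the smaller argument, preferring a number over a NaN."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x <= y else y   # on equal values, the first argument wins

def max_num(x: float, y: float) -> float:
    """Return the larger argument, preferring a number over a NaN."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x >= y else y

nan = float("nan")
builtin_is_order_dependent = math.isnan(min(nan, 1.0)) and min(1.0, nan) == 1.0
```

Unlike Python's built-in `min`, whose result with a NaN depends on argument order, `min_num(nan, 1.0)` and `min_num(1.0, nan)` both return `1.0`. If both inputs are NaN, a NaN is returned.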

Decimal arithmetic

Decimal arithmetic, compatible with that used in Java, C#, PL/I, COBOL, Python, REXX, "etc.", is also defined in this section. In general, decimal arithmetic follows the same rules as binary arithmetic (results are correctly rounded, and so on), with additional rules that define the exponent of a result (more than one is possible in many cases).
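The extra rules for the exponent of a result can be seen in Python's `decimal` module, which the paragraph above lists among the compatible implementations. This is only an illustration under the module's default context, not a statement of the standard's exact wording; `quantize` and `same_quantum` are the module's counterparts of the two decimal-specific operations.

```python
from decimal import Decimal

# Addition keeps the smaller exponent of the operands: 1.20 + 1.3 gives 2.50.
total = Decimal("1.20") + Decimal("1.3")

# Multiplication adds the exponents: 1.20 * 2.0 gives 2.400.
product = Decimal("1.20") * Decimal("2.0")

# quantize rounds a value to the exponent of its second operand.
price = Decimal("12.3456").quantize(Decimal("0.01"))

# same_quantum compares exponents only, not values.
same = Decimal("1.20").same_quantum(Decimal("9.99"))
different = Decimal("1.20").same_quantum(Decimal("1.2"))
```

The results print as `2.50`, `2.400`, and `12.35`; `same` is true because both operands have exponent −2, and `different` is false even though 1.20 and 1.2 are numerically equal.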

Correctly rounded base conversion

Unlike in 854, 754r requires correctly rounded base conversion between decimal and binary floating point within a range which depends on the format.
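What correct rounding means for decimal-to-binary conversion can be checked with a small Python sketch. It assumes that CPython's `float()` parsing and formatting are correctly rounded on the host platform (the case on common platforms) and needs Python 3.9 or later for `math.nextafter`; the helper name `is_nearest_double` is hypothetical.

```python
import math
from decimal import Decimal

def is_nearest_double(text: str) -> bool:
    """Check that float(text) is at least as close to text as its neighbours."""
    target = Decimal(text)
    x = float(text)
    error = abs(Decimal(x) - target)   # Decimal(x) is the exact binary value
    neighbours = (math.nextafter(x, -math.inf), math.nextafter(x, math.inf))
    return all(error <= abs(Decimal(n) - target) for n in neighbours)

x = 0.1
exact_value = Decimal(x)               # exact expansion of the binary64 value
seventeen_digits = format(x, ".17g")   # 17 significant digits
round_trips = float(seventeen_digits) == x
```

Seventeen significant decimal digits are enough for a binary64 value to survive a round trip through text, which is why `round_trips` is true here.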

Clause 6: Infinity, NaNs, and sign bit

This clause has been revised and clarified, but with no major additions.

Clause 7: Default exception handling

This clause has been revised and considerably clarified, but with no major additions.

Clause 8: Alternate exception handling

This clause has been extended from the previous Clause 8 ('Traps') to allow optional exception handling in various forms, including traps and other models such as try/catch. Traps and other exception mechanisms remain optional, as they were in IEEE 754-1985.
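The contrast between default handling (deliver a result and raise a status flag) and an alternate, try/catch style of handling can be sketched with Python's `decimal` module. The module is used purely as an illustration of the two models and is not claimed to be a conforming implementation of these clauses.

```python
from decimal import Decimal, DivisionByZero, localcontext

# Default-style handling: no trap, the operation returns infinity
# and the divide-by-zero flag is raised in the context.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False
    ctx.clear_flags()
    result = Decimal(1) / Decimal(0)
    flag_raised = bool(ctx.flags[DivisionByZero])

# Alternate handling: the same exception is trapped and surfaces
# as a language-level exception that the program can catch.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = True
    try:
        Decimal(1) / Decimal(0)
        trapped = False
    except DivisionByZero:
        trapped = True
```

In the first block `result` is `Infinity` and `flag_raised` is true; in the second the division never returns a value and `trapped` is true.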

Clause 9: Recommended operations

This clause is new; it recommends fifty operations, including log, power, and trigonometric functions, that language standards should define. These are all optional (none are required in order to conform to the standard). The operations include some on dynamic modes for attributes, and also a set of reduction operations (sum, scaled product, "etc.").

Clause 10: Expression evaluation

This clause is new; it recommends how language standards should specify the semantics of sequences of operations, and points out the subtleties of literal meanings and optimizations that change the value of a result.

Clause 11: Reproducibility

This clause is new; it recommends that language standards should provide a means to write reproducible programs ("i.e.", programs that will produce the same result in all implementations of a language), and describes what needs to be done to achieve reproducible results.

Annex A: Bibliography

This annex is new; it lists some useful references.

Annex B: Program debugging support

This annex is new; it provides guidance to debugger developers for features that are desired for supporting the debugging of floating-point code.

Index of operations

This is a new index, which lists all the operations described in the standard (required or optional).

Discussed but not included

* Annex "L" recommended to language developers how to bind items in the standard to features in a language.

* Annex "U" provided guidance on the choice of numeric underflow definitions.

:In 754 the definition of underflow was that the result is tiny and encounters a loss of accuracy.

:Two definitions were allowed for the determination of the 'tiny' condition: before or after rounding the infinitely precise result to working precision, with unbounded exponent.

:Two definitions of loss of accuracy were permitted: inexact result or loss due only to denormalization. No known hardware systems implemented the latter and it has been removed from the revised standard as an option.

:Annex U of 754r recommended that only tininess after rounding and inexact as loss of accuracy be a cause for underflow signal.

* Annex "Z" introduced optional data types for supporting other fixed width floating point formats, as well as arbitrary precision formats ("i.e.", where the precision of representation and rounding is determined at execution time) – some of this material was moved into the body of the draft by generalizing section 5. Arbitrary precision was dropped.

* Inheritance and propagation of modes (exception handling, presubstitution, rounding) and flags (inexact, underflow, overflow, divide by zero, invalid). The desire was for mode changes to be inherited by a callee without affecting the caller, and for flags to propagate out to the caller.

* Interval and other arithmetics were discussed but not included as being outside scope (and a large piece of work in their own right). Work is starting in 2008 on a proposed IEEE standard for interval arithmetic.
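The two definitions of "tiny" mentioned under Annex "U" above can disagree for results just below the smallest normal number. The following sketch works this out for binary64 with exact rational arithmetic, assuming round-to-nearest, ties-to-even; the helper `round_unbounded` and the chosen test value are hypothetical illustrations, not material from the annex.

```python
from fractions import Fraction

P = 53                                  # binary64 precision in bits
MIN_NORMAL = Fraction(1, 2 ** 1022)     # smallest normal binary64 number

def round_unbounded(v: Fraction, p: int = P) -> Fraction:
    """Round positive v to p significant bits as if the exponent were unbounded."""
    # Find e such that 2**e <= v < 2**(e + 1).
    e = v.numerator.bit_length() - v.denominator.bit_length()
    if Fraction(2) ** e > v:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)
    # round() on a Fraction rounds to the nearest integer, ties to even.
    return round(v / ulp) * ulp

# An infinitely precise result slightly below the smallest normal number.
v = MIN_NORMAL * (1 - Fraction(1, 2 ** 55))

tiny_before_rounding = v < MIN_NORMAL
tiny_after_rounding = round_unbounded(v) < MIN_NORMAL
```

For this value `tiny_before_rounding` is true, but rounding carries the result up to the smallest normal number, so `tiny_after_rounding` is false. An implementation detecting tininess before rounding would treat this inexact result as an underflow, while one detecting it after rounding would not.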

External links

* Committee working page: [http://grouper.ieee.org/groups/754/ IEEE 754: Standard for Binary Floating-Point Arithmetic]
* Final working group draft: [http://www.validlab.com/754R/drafts/archive/2006-10-04.pdf DRAFT Standard for Floating-Point Arithmetic P754] (1.2.5, 4 October 2006; the 8 ballot drafts are not available for copyright reasons).
* [http://www2.hursley.ibm.com/decimal/DPDecimal.html Densely Packed Decimal]
* Prof. Kahan's paper on [http://www.cs.berkeley.edu/~wkahan/Mindless.pdf How Futile are Mindless Assessments of Roundoff in Floating-Point Computation]
* [http://www.open-std.org/JTC1/SC22/WG11 ISO Language Independent Arithmetic Standard]
* RFC 1832 - XDR: External Data Representation RFC


Wikimedia Foundation. 2010.
