The abstract data `IEEE' may be used to implement a cross-compiler. It implements IEEE floating point arithmetic in a machine-independent way with the aid of package `arithm'. It is needed because the host machine may not support the arithmetic of the target machine. For example, VAX does not support IEEE floating point arithmetic. Floating point numbers are represented by bytes in big endian order. The package functions are not efficient enough for run-time use; they are intended for implementing constant folding in compilers. All integer sizes (see the transformation functions) are given in bytes and must be positive.
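The big endian byte representation can be illustrated without the package itself. The following is a minimal sketch, assuming that the host `float' is an IEEE single precision number; the function name `float_to_big_endian' is hypothetical and is not part of `IEEE.h'.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of IEEE.h): store the bit pattern of a
   host float into 4 bytes, most significant byte first.  Assumes that
   the host float is an IEEE single precision number.  */
void
float_to_big_endian (float value, unsigned char bytes[4])
{
  uint32_t bits;

  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &value, sizeof bits);     /* copy the raw bit pattern */
  bytes[0] = (unsigned char) (bits >> 24); /* sign and high exponent bits */
  bytes[1] = (unsigned char) (bits >> 16);
  bytes[2] = (unsigned char) (bits >> 8);
  bytes[3] = (unsigned char) bits;         /* low fraction bits */
}
```

With this layout the value 1.0 is stored as the bytes 3F 80 00 00 regardless of the byte order of the host.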
The functions for addition, subtraction, multiplication, division, and conversion between floating point formats can signal input exceptions. If an operand of such an operation is a trapping (signaling) NaN, then the invalid operation and reserved operand exceptions are signaled and the result is a (quiet) NaN; otherwise, if an operand is a (quiet) NaN, then only the reserved operand exception is signaled and the result is a (quiet) NaN. Operation-specific processing of the remaining special operand values is given with the description of each operation. In the general case a function can also signal output exceptions and produces results for them according to the following table. The result and status for a given exceptional operation are determined by the highest priority exception. If, for example, an operation produces both overflow and imprecise result exceptions, the overflow exception, having higher priority, determines the behavior of the operation, which is therefore described by the Overflow entry of the table.
    Exception|Condition|                     |Result |Status
  -----------|---------|---------------------|-------|-------------
             |masked   |         IEEE_RN(_RP)| +Inf  |IEEE_OFL and
             |overflow | sign +  IEEE_RZ(_RM)| +Max  |IEEE_IMP    
             |exception|---------------------|-------|-------------
    Overflow |         | sign -  IEEE_RN(_RM)| -Inf  |IEEE_OFL and
             |         |         IEEE_RZ(_RP)| -Max  |IEEE_IMP    
             |---------|---------------------|-------|-------------
             |unmasked | Precise result      |See    |IEEE_OFL
             |overflow |---------------------|above  |-------------
             |exception| Imprecise result    |       |IEEE_OFL and
             |         |                     |       |IEEE_IMP    
  -----------|---------|---------------------|-------|-------------
             |masked   |                     |Rounded|IEEE_UFL and
             |underflow| Imprecise result    |result |IEEE_IMP    
   Underflow |exception|                     |       |     
             |---------|---------------------|-------|-------------
             |unmasked | Precise result      |result |IEEE_UFL
             |underflow|---------------------|-------|-------------
             |exception| Imprecise result    |Rounded|IEEE_UFL and
             |         |                     |result |IEEE_IMP    
  -----------|-------------------------------|-------|-------------
             |masked imprecise exception     |Rounded|IEEE_IMP
   Imprecise |                               |result |             
             |-------------------------------|-------|-------------
             |unmasked imprecise exception   |Rounded|IEEE_IMP
             |                               |result |             
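The masked overflow rows of the table can be restated as a small decision function. This is only a sketch of the table, not the code of `IEEE.c'; the enumeration names below are hypothetical and stand in for the rounding mode constants declared in `IEEE.h'.

```c
#include <assert.h>

/* Hypothetical names; the real rounding mode constants are declared in
   IEEE.h and their values are not assumed here.  */
enum round_mode { ROUND_NEAREST, ROUND_MINUS_INF, ROUND_PLUS_INF, ROUND_ZERO };
enum overflow_result { RESULT_POS_INF, RESULT_POS_MAX,
                       RESULT_NEG_INF, RESULT_NEG_MAX };

/* Result of a masked overflow for the given sign and rounding mode,
   following the Overflow entry of the table.  */
enum overflow_result
masked_overflow_result (int negative, enum round_mode mode)
{
  assert ((int) mode >= 0 && (int) mode <= (int) ROUND_ZERO);
  if (!negative)
    /* Positive overflow: only modes that may round up give +Inf.  */
    return (mode == ROUND_NEAREST || mode == ROUND_PLUS_INF
            ? RESULT_POS_INF : RESULT_POS_MAX);
  /* Negative overflow: only modes that may round down give -Inf.  */
  return (mode == ROUND_NEAREST || mode == ROUND_MINUS_INF
          ? RESULT_NEG_INF : RESULT_NEG_MAX);
}
```

For example, a positive overflow under round toward zero yields the maximal positive value rather than infinity.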
The package uses package `bits'. The interface part of the abstract data is file `IEEE.h'. The implementation part is file `IEEE.c'. The interface contains the following external definitions:
`IEEE_FLOAT_SIZE', `IEEE_DOUBLE_SIZE', and `IEEE_QUAD_SIZE' have values which are the sizes of IEEE single, double, and quad precision floating point numbers (`4', `8', and `16', respectively).
have values which are the maximal lengths of the strings generated by the functions creating the decimal ASCII representation of IEEE floats (see functions `IEEE_single_to_string', `IEEE_double_to_string', and `IEEE_quad_to_string').
have values which are the maximal lengths of the strings generated by the functions creating the binary ASCII representation of IEEE floats with a given base (see functions `IEEE_single_to_binary_string', `IEEE_double_to_binary_string', and `IEEE_quad_to_binary_string').
`IEEE_float_t', `IEEE_double_t', and `IEEE_quad_t' represent IEEE single, double, and quad precision floating point numbers, respectively. The sizes of these types are equal to `IEEE_FLOAT_SIZE', `IEEE_DOUBLE_SIZE', and `IEEE_QUAD_SIZE'.
        `void IEEE_reset (void)'
        
        `IEEE_get_sticky_status_bits',
        `IEEE_get_status_bits', and
        `IEEE_get_trap_mask'.
        
        `int IEEE_get_trap_mask (void)'
        
        `int IEEE_set_trap_mask (int mask)'
        
If the mask bit corresponding to a given exception is set, a floating point exception trap does not occur for that exception. Such an exception is said to be masked. The initial exception trap mask is zero. Remember that more than one exception may occur simultaneously.
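The effect of the trap mask can be shown with plain bit operations. The bit values and the function name below are hypothetical and chosen only for illustration; the real exception bits are those declared in `IEEE.h'.

```c
#include <assert.h>

/* Hypothetical bit values for illustration only; the real exception
   bits are declared in IEEE.h.  */
enum { EXC_INV = 1, EXC_OFL = 2, EXC_UFL = 4, EXC_IMP = 8 };

/* Return the subset of RAISED exceptions that are not masked by
   TRAP_MASK, i.e. those for which a trap would occur.  */
int
trapping_exceptions (int raised, int trap_mask)
{
  assert (raised >= 0 && trap_mask >= 0);
  return raised & ~trap_mask;
}
```

With the initial mask of zero every exception is unmasked. When overflow and imprecise result occur together and only the imprecise result bit is set in the mask, only the overflow remains unmasked.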
        `int IEEE_set_sticky_status_bits (int mask)'
        
Function
        `int IEEE_get_sticky_status_bits (void)'
        
        `int IEEE_get_status_bits (void)'
        
defines rounding control (round to nearest representable number, round toward minus infinity, round toward plus infinity, round toward zero).
Round to nearest means the result produced is the representable value nearest to the infinitely precise result. A special case arises when the infinitely precise result falls exactly halfway between two representable values. In this case the result is whichever of those two representable values has a fractional part whose least significant bit is zero.
Round toward minus infinity means the result produced is the representable value closest to but no greater than the infinitely precise result.
Round toward plus infinity means the result produced is the representable value closest to but no less than the infinitely precise result.
Round toward zero means the result produced is the representable value closest to but no greater in magnitude than the infinitely precise result. There are two functions
        `int IEEE_set_round (int round_mode)'
        
        `int IEEE_get_round (void)'
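The four rounding modes can be illustrated on an integer magnitude from which some low-order bits must be dropped. This is a minimal sketch of the rounding rules themselves, not of the package implementation; the enumeration and the function `round_magnitude' are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, not the constants of IEEE.h.  */
enum round_mode { ROUND_NEAREST, ROUND_MINUS_INF, ROUND_PLUS_INF, ROUND_ZERO };

/* Drop the DROP low-order bits of the magnitude MAG of a value with the
   given sign and round the remaining bits according to MODE.  */
uint64_t
round_magnitude (uint64_t mag, int negative, unsigned drop,
                 enum round_mode mode)
{
  uint64_t kept, lost, half;

  assert (drop > 0 && drop < 64);
  kept = mag >> drop;
  lost = mag & ((UINT64_C (1) << drop) - 1);
  half = UINT64_C (1) << (drop - 1);
  switch (mode)
    {
    case ROUND_NEAREST:
      /* Ties go to the value whose least significant bit is zero.  */
      if (lost > half || (lost == half && (kept & 1) != 0))
        kept++;
      break;
    case ROUND_MINUS_INF:
      /* Moving toward minus infinity grows a negative magnitude.  */
      if (negative && lost != 0)
        kept++;
      break;
    case ROUND_PLUS_INF:
      /* Moving toward plus infinity grows a positive magnitude.  */
      if (!negative && lost != 0)
        kept++;
      break;
    case ROUND_ZERO:
      break; /* plain truncation */
    }
  return kept;
}
```

Dropping one bit from the magnitude 5 (that is, rounding 2.5) gives 2 under round to nearest, while the magnitude 7 (3.5) gives 4, because in both halfway cases the result with a zero least significant bit is chosen.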
        
        `void default_floating_point_exception_trap (void)'
        
        `void (*IEEE_set_floating_point_exception_trap
                (void (*function) (void))) (void)'
        
        `IEEE_float_t IEEE_positive_zero (void)'
        
        `IEEE_negative_zero',
        `IEEE_NaN',
        `IEEE_trapping_NaN',
        `IEEE_positive_infinity',
        `IEEE_negative_infinity',
        `IEEE_double_positive_zero',
        `IEEE_double_negative_zero',
        `IEEE_double_NaN',
        `IEEE_double_trapping_NaN',
        `IEEE_double_positive_infinity',
        `IEEE_double_negative_infinity',
        `IEEE_quad_positive_zero',
        `IEEE_quad_negative_zero',
        `IEEE_quad_NaN',
        `IEEE_quad_trapping_NaN',
        `IEEE_quad_positive_infinity',
        `IEEE_quad_negative_infinity'.
        
According to the IEEE standard, a NaN (and a trapping NaN) can be represented by more than one bit string. However, all functions of the package generate and use only the one representation created by function `IEEE_NaN' (and `IEEE_trapping_NaN', `IEEE_double_NaN', `IEEE_double_trapping_NaN', `IEEE_quad_NaN', `IEEE_quad_trapping_NaN'). A (quiet) NaN does not cause an invalid operation exception and can be reported as an operation result. A trapping NaN causes an invalid operation exception if used as an input operand of a floating point operation. A trapping NaN cannot be reported as an operation result.
        `int IEEE_is_positive_zero (IEEE_float_t single_float)'
        
        `IEEE_is_negative_zero',
        `IEEE_is_NaN',
        `IEEE_is_trapping_NaN',
        `IEEE_is_positive_infinity',
        `IEEE_is_negative_infinity',
        `IEEE_is_positive_maximum' (positive max value),
        `IEEE_is_negative_maximum',
        `IEEE_is_positive_minimum' (positive min value),
        `IEEE_is_negative_minimum',
        `IEEE_is_double_positive_zero',
        `IEEE_is_double_negative_zero',
        `IEEE_is_double_NaN',
        `IEEE_is_double_trapping_NaN',
        `IEEE_is_double_positive_infinity',
        `IEEE_is_double_negative_infinity',
        `IEEE_is_double_positive_maximum',
        `IEEE_is_double_negative_maximum',
        `IEEE_is_double_positive_minimum',
        `IEEE_is_double_negative_minimum',
        `IEEE_is_quad_positive_zero',
        `IEEE_is_quad_negative_zero',
        `IEEE_is_quad_NaN',
        `IEEE_is_quad_trapping_NaN',
        `IEEE_is_quad_positive_infinity',
        `IEEE_is_quad_negative_infinity',
        `IEEE_is_quad_positive_maximum',
        `IEEE_is_quad_negative_maximum',
        `IEEE_is_quad_positive_minimum',
        `IEEE_is_quad_negative_minimum'.
        
        `int IEEE_is_normalized (IEEE_float_t single_float)'
        
        `IEEE_is_denormalized'
        
        `IEEE_is_double_normalized',
        `IEEE_is_double_denormalized',
        `IEEE_is_quad_normalized', and
        `IEEE_is_quad_denormalized'
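The distinctions drawn by these predicates follow from the exponent and fraction fields of the number. The sketch below classifies a single precision value given as 4 big endian bytes; it is an illustration of the IEEE format only, and the names `float_class' and `classify_single' are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum float_class { CLASS_ZERO, CLASS_DENORMALIZED, CLASS_NORMALIZED,
                   CLASS_INFINITY, CLASS_NAN };

/* Classify an IEEE single precision number given as 4 bytes in big
   endian order: 1 sign bit, 8 exponent bits, 23 fraction bits.  */
enum float_class
classify_single (const unsigned char bytes[4])
{
  unsigned exponent;
  unsigned long fraction;

  assert (bytes != NULL);
  exponent = ((bytes[0] & 0x7Fu) << 1) | (bytes[1] >> 7);
  fraction = ((unsigned long) (bytes[1] & 0x7Fu) << 16)
             | ((unsigned long) bytes[2] << 8) | bytes[3];
  if (exponent == 0xFF) /* all exponent bits set */
    return fraction == 0 ? CLASS_INFINITY : CLASS_NAN;
  if (exponent == 0)    /* no exponent bits set */
    return fraction == 0 ? CLASS_ZERO : CLASS_DENORMALIZED;
  return CLASS_NORMALIZED;
}
```

The sign bit does not take part in the classification, so positive and negative zero (or infinity) fall into the same class here.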
        
        `IEEE_float_t IEEE_add_single (IEEE_float_t single1,
                                       IEEE_float_t single2)'
        
        `IEEE_subtract_single',
        `IEEE_multiply_single',
        `IEEE_divide_single',
        `IEEE_add_double',
        `IEEE_subtract_double',
        `IEEE_multiply_double',
        `IEEE_divide_double',
        `IEEE_add_quad',
        `IEEE_subtract_quad',
        `IEEE_multiply_quad',
        `IEEE_divide_quad'.
        
Results and input exceptions for special operand values in addition:
            first  |         second operand
            operand|---------------------------------------
                   |    +Inf      |    -Inf     |   Others
            -------|--------------|-------------|----------
            +Inf   |    +Inf      |     NaN     |   +Inf
                   |    none      |IEEE_INV(_RO)|   none
            -------|--------------|-------------|----------
            -Inf   |    NaN       |    -Inf     |   -Inf
                   |IEEE_INV(_RO) |    none     |   none
            -------|--------------|-------------|----------
            Others |    +Inf      |    -Inf     |
                   |    none      |    none     |          
Results and input exceptions for special operand values in subtraction:
            first  |         second operand
            operand|---------------------------------------
                   |    +Inf     |    -Inf      |   Others
            -------|-------------|--------------|----------
            +Inf   |     NaN     |    +Inf      |   +Inf
                   |IEEE_INV(_RO)|    none      |   none
            -------|-------------|--------------|----------
            -Inf   |    -Inf     |    NaN       |   -Inf
                   |    none     |IEEE_INV(_RO) |   none
            -------|-------------|--------------|----------
            Others |    -Inf     |    +Inf      |
                   |    none     |    none      |          
Results and input exceptions for special operand values in multiplication:
        first  |         second operand
        operand|---------------------------------------------------
               |    +Inf     |    -Inf     |    0        |   Others
        -------|-------------|-------------|-------------|---------
        +Inf   |    +Inf     |    -Inf     |    NaN      |  (+-)Inf
               |    none     |    none     |IEEE_INV(_RO)|   none  
        -------|-------------|-------------|-------------|---------
        -Inf   |    -Inf     |    +Inf     |    NaN      |  (+-)Inf
               |    none     |    none     |IEEE_INV(_RO)|   none  
        -------|-------------|-------------|-------------|---------
        0      |     NaN     |    NaN      |   (+-)0     |  (+-)0  
               |IEEE_INV(_RO)|IEEE_INV(_RO)|   none      |  none   
        -------|-------------|-------------|-------------|---------
        Others |   (+-)Inf   |   (+-)Inf   |   (+-)0     |         
               |    none     |    none     |   none      |         
Results and input exceptions for special operand values in division:
        first  |         second operand
        operand|---------------------------------------------------
               |    +Inf     |    -Inf     |    0        |   Others
        -------|-------------|-------------|-------------|---------
        +Inf   |     NaN     |     NaN     |   (+-)Inf   |  (+-)Inf
               |IEEE_INV(_RO)|IEEE_INV(_RO)|   none      |   none  
        -------|-------------|-------------|-------------|---------
        -Inf   |     NaN     |     NaN     |   (+-)Inf   |  (+-)Inf
               |IEEE_INV(_RO)|IEEE_INV(_RO)|   none      |   none  
        -------|-------------|-------------|-------------|---------
        0      |   (+-)0     |   (+-)0     |     NaN     |  (+-)0  
               |   none      |   none      |IEEE_INV(_RO)|  none   
        -------|-------------|-------------|-------------|---------
        Others |   (+-)0     |   (+-)0     |   (+-)Inf   |         
               |   none      |    none     |   IEEE_DZ   |         
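The last table above (the one containing `IEEE_DZ') can be restated as a decision function over operand classes. This is a sketch of the table only; all names are hypothetical, the sign of the result is not modelled, and `EXC_INVALID' stands for the `IEEE_INV(_RO)' entries.

```c
#include <assert.h>

/* Hypothetical operand classes and outcomes for illustration.  */
enum operand_class { OP_POS_INF, OP_NEG_INF, OP_ZERO, OP_OTHER };
enum div_result { RES_NAN, RES_INF, RES_ZERO, RES_ORDINARY };
enum div_exception { EXC_NONE, EXC_INVALID, EXC_DIVIDE_BY_ZERO };

struct div_outcome
{
  enum div_result result;       /* the sign is not modelled here */
  enum div_exception exception;
};

struct div_outcome
divide_special_case (enum operand_class first, enum operand_class second)
{
  struct div_outcome out = { RES_ORDINARY, EXC_NONE };
  int first_inf = first == OP_POS_INF || first == OP_NEG_INF;
  int second_inf = second == OP_POS_INF || second == OP_NEG_INF;

  assert ((int) first >= 0 && (int) first <= (int) OP_OTHER);
  assert ((int) second >= 0 && (int) second <= (int) OP_OTHER);
  if ((first_inf && second_inf) || (first == OP_ZERO && second == OP_ZERO))
    {
      out.result = RES_NAN;     /* Inf / Inf and 0 / 0 */
      out.exception = EXC_INVALID;
    }
  else if (first_inf)
    out.result = RES_INF;       /* Inf / 0 and Inf / others */
  else if (second_inf || first == OP_ZERO)
    out.result = RES_ZERO;      /* anything finite / Inf, 0 / others */
  else if (second == OP_ZERO)
    {
      out.result = RES_INF;     /* others / 0 */
      out.exception = EXC_DIVIDE_BY_ZERO;
    }
  return out;
}
```

Note that dividing an infinity by zero gives an infinity without any exception, while dividing an ordinary number by zero gives an infinity together with the divide by zero exception.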
        `int IEEE_eq_single (IEEE_float_t single1,
                             IEEE_float_t single2)'
        
        `IEEE_ne_single',
        `IEEE_gt_single',
        `IEEE_lt_single',
        `IEEE_ge_single',
        `IEEE_le_single',
        `IEEE_eq_double',
        `IEEE_ne_double',
        `IEEE_gt_double',
        `IEEE_lt_double',
        `IEEE_ge_double',
        `IEEE_le_double',
        `IEEE_eq_quad',
        `IEEE_ne_quad',
        `IEEE_gt_quad',
        `IEEE_lt_quad',
        `IEEE_ge_quad',
        `IEEE_le_quad'.
        
        
        first  |         second operand                
        operand|---------------------------------------
               |    SNaN     |    QNaN      |   Others
        -------|-------------|--------------|----------
        SNaN   |   FALSE     |   FALSE      |  FALSE
               |  IEEE_INV   |  IEEE_INV    | IEEE_INV
        -------|-------------|--------------|----------
        QNaN   |   FALSE     |   FALSE      |  FALSE
               |  IEEE_INV   |    none      |   none
        -------|-------------|--------------|----------
        Others |   FALSE     |   FALSE      |
               |  IEEE_INV   |    none      |          
        
        first  |         second operand                
        operand|---------------------------------------
               |    SNaN     |    QNaN      |   Others
        -------|-------------|--------------|----------
        SNaN   |   FALSE     |   FALSE      |  FALSE
               |  IEEE_INV   |  IEEE_INV    | IEEE_INV
        -------|-------------|--------------|----------
        QNaN   |   FALSE     |   FALSE      |  FALSE
               |  IEEE_INV   |  IEEE_INV    | IEEE_INV
        -------|-------------|--------------|----------
        Others |   FALSE     |   FALSE      |
               |  IEEE_INV   |  IEEE_INV    |          
        `IEEE_double_t IEEE_single_to_double
                       (IEEE_float_t single_float)',
        
        `IEEE_float_t IEEE_double_to_single
                      (IEEE_double_t double_float)',
        
        `IEEE_quad_t IEEE_single_to_quad
                     (IEEE_float_t single_float)',
        
        `IEEE_float_t IEEE_quad_to_single
                      (IEEE_quad_t quad_float)',
        
        `IEEE_quad_t IEEE_double_to_quad
                     (IEEE_double_t double_float)',
        
        `IEEE_double_t IEEE_quad_to_double
                      (IEEE_quad_t quad_float)',
        
        `IEEE_float_t IEEE_single_from_integer
                      (int size, const void *integer)',
        
        `IEEE_float_t IEEE_single_from_unsigned_integer
                      (int size, const void *unsigned_integer)',
        
        `IEEE_double_t IEEE_double_from_integer
                       (int size, const void *integer)',
        
        `IEEE_double_t IEEE_double_from_unsigned_integer
                       (int size, const void *unsigned_integer)',
        
        `IEEE_quad_t IEEE_quad_from_integer
                     (int size, const void *integer)',
        
        `IEEE_quad_t IEEE_quad_from_unsigned_integer
                     (int size, const void *unsigned_integer)',
        
        `void IEEE_single_to_integer
              (int size, IEEE_float_t single_float, void *integer)',
        
        `void IEEE_single_to_unsigned_integer
              (int size, IEEE_float_t single_float,
               void *unsigned_integer)',
        
        `void IEEE_double_to_integer
              (int size, IEEE_double_t double_float, void *integer)',
        
        `void IEEE_double_to_unsigned_integer
              (int size, IEEE_double_t double_float,
               void *unsigned_integer)',

        `void IEEE_quad_to_integer
              (int size, IEEE_quad_t quad_float, void *integer)',
        
        `void IEEE_quad_to_unsigned_integer
              (int size, IEEE_quad_t quad_float,
               void *unsigned_integer)'.
        
Results and input exceptions for conversion to a signed integer:
                    Operand     | Result & Exception
                  --------------|-------------------
                      SNaN      |     0  
                                |IEEE_INV(_RO)
                  --------------|-------------------
                      QNaN      |     0    
                                |IEEE_INV(_RO)     
                  --------------|-------------------
                      +Inf      |    IMax     
                                |  IEEE_INV     
                  --------------|-------------------
                      -Inf      |    IMin     
                                |  IEEE_INV     
                  --------------|-------------------
                      Others    |             
                                |               
Results and input exceptions for conversion to an unsigned integer:
                    Operand     | Result & Exception
                  --------------|-------------------
                      SNaN      |     0  
                                |IEEE_INV(_RO)
                  --------------|-------------------
                      QNaN      |     0    
                                |IEEE_INV(_RO)     
                  --------------|-------------------
                      +Inf      |    IMax     
                                |  IEEE_INV     
                  --------------|-------------------
                      -Inf or   |    0     
                 negative number|  IEEE_INV     
                  --------------|-------------------
                      Others    |             
                                |               
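The values `IMax' and `IMin' in the tables depend on the integer size passed to the conversion function. The sketch below builds them for a signed integer of a given size in bytes. It assumes two's complement integers stored in big endian byte order, which is an assumption of this example and is not stated here for the integer operands; the function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Store the largest signed integer of SIZE bytes (IMax) into BYTES.
   Big endian byte order and two's complement representation are
   assumptions of this sketch.  */
void
store_signed_max (int size, unsigned char *bytes)
{
  assert (size > 0 && bytes != NULL);
  memset (bytes, 0xFF, (size_t) size);
  bytes[0] = 0x7F; /* clear the sign bit */
}

/* Store the smallest signed integer of SIZE bytes (IMin) into BYTES
   under the same assumptions.  */
void
store_signed_min (int size, unsigned char *bytes)
{
  assert (size > 0 && bytes != NULL);
  memset (bytes, 0, (size_t) size);
  bytes[0] = 0x80; /* only the sign bit is set */
}
```

The assertions mirror the requirement that integer sizes are given in bytes and must be positive.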
        `char *IEEE_single_to_string (IEEE_float_t single_float,
                                      char *result)'
        
        `IEEE_double_to_string'
        `IEEE_quad_to_string'
        
        `char *IEEE_single_to_binary_string (IEEE_float_t single_float,
                                             int base, char *result)'
        
        `IEEE_double_to_binary_string'
        `IEEE_quad_to_binary_string'
        
        `char *IEEE_single_from_string (const char *operand,
                                        IEEE_float_t *result)'
        
           ['+' | '-'] [<decimal digits>] [ '.' [<decimal digits>] ]
                [ ('e' | 'E') ['+' | '-'] <decimal digits>]
        
The function can signal output exceptions as described above. There are analogous functions
        `IEEE_double_from_string'
        `IEEE_quad_from_string'
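The decimal syntax accepted by these functions can be checked with a small recognizer. The sketch below only scans the text according to the grammar given above and does not convert it; the function name `skip_decimal_float' is hypothetical, and returning a pointer past the recognized prefix is a choice of this example rather than a documented property of the package functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return a pointer just past the longest prefix of STR that matches
   the decimal grammar.  Every part of the grammar is optional, so the
   result may be STR itself.  */
const char *
skip_decimal_float (const char *str)
{
  const char *p = str;
  const char *q;

  assert (str != NULL);
  if (*p == '+' || *p == '-')
    p++;                        /* optional sign */
  while (isdigit ((unsigned char) *p))
    p++;                        /* optional integer digits */
  if (*p == '.')
    {
      p++;
      while (isdigit ((unsigned char) *p))
        p++;                    /* optional fraction digits */
    }
  if (*p == 'e' || *p == 'E')
    {
      q = p + 1;
      if (*q == '+' || *q == '-')
        q++;
      if (isdigit ((unsigned char) *q))
        {
          /* The exponent needs at least one digit to be accepted.  */
          while (isdigit ((unsigned char) *q))
            q++;
          p = q;
        }
    }
  return p;
}
```

For the input `-12.5e+3x' the recognizer stops at the character `x'; for `1e' it stops after `1', because an exponent marker without digits is not part of the number.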
        
        `char *IEEE_single_from_binary_string (const char *operand,
                                               int base,
                                               IEEE_float_t *result)'
        
           ['+' | '-'] [<digits less than base>] [ '.' [<digits less than base>] ]
                [ ('p' | 'P') ['+' | '-'] <decimal digits>]
        
The function can signal output exceptions as described above. There are analogous functions
        `IEEE_double_from_binary_string'
        `IEEE_quad_from_binary_string'