Wednesday, 25 November 2015

FLOATING-POINT IN BINARY?

FLOATING POINT STANDARD:
  • Defined by IEEE Std 754-1985
  • Developed in response to divergence of representations
  • Portability issues for scientific code
  • Now almost universally adopted
  • Two representations:
      • Single precision (32-bit)
      • Double precision (64-bit)

S: sign bit (0 => non-negative, 1 => negative)
Normalized significand: 1.0 ≤ |significand| < 2.0
  • The significand is the Fraction with the “1.” restored
  • A normalized number always has a leading 1 bit before the binary point, so it need not be stored explicitly (hidden bit)
Exponent: excess (biased) representation: stored exponent = actual exponent + Bias
  • Ensures the stored exponent is unsigned
  • Single precision: Bias = 127
  • Double precision: Bias = 1023
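
As a rough illustration of the fields described above, this Python sketch splits a single-precision value into S, Exponent and Fraction, then rebuilds it as (-1)^S × (1 + Fraction) × 2^(Exponent - Bias). The function names decode_single and rebuild_normalized are made up for this example. It assumes a normalized number: zero, denormals, infinities and NaN use the reserved exponent values 0 and 255 and are not handled here.

```python
import struct


def decode_single(x):
    """Split x (stored as a 32-bit float) into its sign, biased exponent and fraction fields."""
    # Pack as big-endian single precision, then reread the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                      # 1 bit
    biased_exponent = (bits >> 23) & 0xFF  # 8 bits
    fraction = bits & 0x7FFFFF             # 23 bits
    return sign, biased_exponent, fraction


def rebuild_normalized(sign, biased_exponent, fraction, bias=127, fraction_bits=23):
    """Rebuild a normalized value from its fields (illustrative helper, normalized inputs only)."""
    significand = 1 + fraction / 2**fraction_bits  # restore the hidden "1."
    return (-1) ** sign * significand * 2.0 ** (biased_exponent - bias)


# -0.75 is -1.1 (binary) x 2^-1, so the stored exponent is -1 + 127 = 126.
sign, exponent, fraction = decode_single(-0.75)
print(sign, exponent, fraction)                       # 1 126 4194304
print(rebuild_normalized(sign, exponent, fraction))   # -0.75
```

For double precision the same idea should apply with an 11-bit exponent, a 52-bit fraction and Bias = 1023.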


