Floating Point

In floating point notation, a real number is stored as two separate pieces of data:

  1. A storage location called the mantissa holds the digits of the number with the point removed.
  2. A storage location called the exponent holds the number of places that the point must be moved in the original number to place it at the left-hand side, immediately before the first 1.

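To work out the mantissa and exponent you need to:

  • Move the point so that it sits immediately to the left of the first 1, making the number a fraction
  • The digits of the number without the point are the mantissa, padded with zeros on the right to fill the available bits
  • The number of places the point was moved is the exponent: positive if it moved to the left, negative if it moved to the right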
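
The steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name to_floating_point and its parameters are made up for this example, the number is supplied as a string of binary digits, any digits that do not fit in the mantissa are dropped rather than rounded, and no check is made that the exponent fits in its field.

```python
def to_floating_point(binary, mantissa_bits=16, exponent_bits=8):
    """Split a binary string such as '-101.00011' into (sign, mantissa, exponent)."""
    sign = "1" if binary.startswith("-") else "0"
    digits = binary.lstrip("+-")
    whole, _, fraction = digits.partition(".")
    all_digits = whole + fraction            # the number without the point

    first_one = all_digits.find("1")
    if first_one == -1:                      # the number is zero
        return sign, "0" * (mantissa_bits - 1), "0" * exponent_bits

    # Places the point moves to sit just before the first 1
    # (positive = moved left, negative = moved right).
    exponent = len(whole) - first_one

    # Keep the digits from the first 1 onwards, padded with zeros on the right.
    # One bit of the mantissa field is reserved for the sign.
    width = mantissa_bits - 1
    mantissa = all_digits[first_one:][:width].ljust(width, "0")

    # Store the exponent in two's complement.
    exponent_field = format(exponent % (1 << exponent_bits), f"0{exponent_bits}b")
    return sign, mantissa, exponent_field


print(to_floating_point("0.0001001"))
# ('0', '100100000000000', '11111101')
```

The three values returned are the sign bit, the 15 remaining mantissa bits and the 8-bit exponent, in that order.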

Example 1 

How would 1101.0011 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent? 

The binary point moves 4 places to the left, giving 0.11010011

Sign would be 0

The mantissa is then 11010011 + 0000000 to make it up to 15 bits

= 110100110000000

The exponent is then 4 = 00000100 in binary

Example 2 – Negative Exponent

How would 0.0001001 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent? 

The binary point moves 3 places to the right, giving 0.1001

Sign would be 0

The mantissa is then 1001 + 00000000000 to make it up to 15 bits

= 100100000000000 

The exponent is then -3, which is stored in 8-bit two's complement

00000011   (+3 in binary)
11111100   (invert every bit)
      +1   (add 1)
11111101   would be the exponent
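
The invert-and-add-one working for a negative exponent can be sketched in Python. The helper name twos_complement is hypothetical and the 8-bit default simply matches the exponent size used in these examples.

```python
def twos_complement(value, bits=8):
    """Return value as a two's complement bit string of the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")

    positive = format(-value, f"0{bits}b")                  # e.g. 00000011
    inverted = "".join("1" if bit == "0" else "0" for bit in positive)
    return format(int(inverted, 2) + 1, f"0{bits}b")        # add 1


print(twos_complement(-3))
# 11111101
```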

 

Example 3 – Negative Mantissa

How would -101.00011 be represented in binary floating point representation using 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent? 

The binary point moves 3 places to the left, giving 0.10100011

The exponent is then 3 = 00000011 in binary

The sign is 1 because the mantissa is negative

The mantissa is then 10100011 + 0000000 to make it up to 15 bits

= 101000110000000
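
One way to check an answer like this is to turn the three fields back into a number. The sketch below assumes the layout used in these examples: a sign bit of 1 means negative, the mantissa is a fraction with the point at its left-hand side, and the exponent is held in two's complement. The function name decode is made up for this illustration.

```python
def decode(sign, mantissa, exponent):
    """Convert sign, mantissa and exponent bit strings back to a decimal value."""
    # Treat the mantissa as a fraction: 0.<mantissa bits>
    fraction = int(mantissa, 2) / (1 << len(mantissa))

    # Read the exponent as a two's complement integer.
    power = int(exponent, 2)
    if exponent[0] == "1":
        power -= 1 << len(exponent)

    value = fraction * 2.0 ** power
    return -value if sign == "1" else value


print(decode("1", "101000110000000", "00000011"))
# -5.09375
```

The result, -5.09375, is the decimal value of -101.00011 in binary (4 + 1 + 1/16 + 1/32).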

Range and Accuracy

The accuracy of a floating point number can be improved by increasing the number of bits devoted to the mantissa. The range of numbers that can be held can be increased by devoting more bits to the exponent.

Because a fixed number of bits is allocated to storing a real number, there will always be a trade-off between accuracy and range when using floating point notation: every bit given to the mantissa is a bit taken away from the exponent, and vice versa.
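
The trade-off can be illustrated with a short Python sketch that splits the same 24 bits in different ways. The function name describe_split is hypothetical, and the sketch assumes the layout used in the examples above (one mantissa bit used for the sign, exponent held in two's complement).

```python
def describe_split(mantissa_bits, exponent_bits):
    """Return (significant bits, smallest exponent, largest exponent)."""
    significant_bits = mantissa_bits - 1            # one bit is the sign
    min_exponent = -(2 ** (exponent_bits - 1))      # two's complement limits
    max_exponent = 2 ** (exponent_bits - 1) - 1
    return significant_bits, min_exponent, max_exponent


# Three ways of sharing 24 bits between mantissa and exponent.
for mantissa_bits, exponent_bits in [(20, 4), (16, 8), (12, 12)]:
    print(mantissa_bits, exponent_bits, describe_split(mantissa_bits, exponent_bits))
```

With 16 and 8 bits the number has 15 significant bits and exponents from -128 to 127. Moving 4 bits to the mantissa gives 19 significant bits but only exponents from -8 to 7, while moving 4 bits to the exponent gives exponents from -2048 to 2047 but only 11 significant bits.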