Proper Overflow and Underflow

Scaling methods
There are several implementation choices for scaling.

Two ways to scale an operand:
1.  Multiply by 2 ** n  (with the x86 fmul instruction)
2.  Use the fscale instruction
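
As a rough Java-level illustration of the two methods above (a sketch only, not the code a JIT compiler would emit): multiplying by a precomputed power of two plays the role of fmul, and Math.scalb, which adjusts the exponent directly, plays the role of fscale. The class name, method names, and the exponent -40 are hypothetical and chosen only for this example.

```java
public class ScaleDemo {
    // Method 1 analogy: multiply by a precomputed constant 2 ** n (like fmul).
    static double scaleByMultiply(double x, double scale) {
        return x * scale;
    }

    // Method 2 analogy: adjust the exponent directly (like fscale).
    static double scaleByExponent(double x, int n) {
        return Math.scalb(x, n);
    }

    public static void main(String[] args) {
        int n = -40;                        // hypothetical exponent for the demo
        double scale = Math.scalb(1.0, n);  // the constant 2 ** n, exactly representable
        double x = 3.5;

        // Scaling by a power of two is exact while the result stays in the
        // normal range, so both methods give the same value.
        System.out.println(scaleByMultiply(x, scale) == scaleByExponent(x, n)); // prints "true"
    }
}
```

The difference between the two is in cost, not in result: the multiply needs the constant available as an operand, while the exponent-adjusting form needs only the integer n.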
When to load the scale factors into FPU registers:
1.  When scaling is carried out.
2.  When a strictfp method is called     (pre-loading)
3.  When the JIT compiler is initialized.
There is a trade-off among these choices:
             Pre-loading saves memory accesses, but it keeps
             the registers occupied for a long time.
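
The three load timings can be loosely mirrored at the Java source level, as in the sketch below. This is only an analogy under stated assumptions: a real JIT compiler makes this decision for FPU registers in generated machine code, and the class name, method names, and exponent N here are hypothetical.

```java
public class ScaleLoadTiming {
    static final int N = -40; // hypothetical exponent for the demo

    // Timing 3 analogy: the scale is set up once, at initialization.
    static final double SCALE_AT_INIT = Math.scalb(1.0, N);

    // Timing 1 analogy: the scale is fetched each time scaling is carried out.
    static double sumLoadEachTime(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            double scale = Math.scalb(1.0, N); // obtained right where it is used
            sum += x * scale;
        }
        return sum;
    }

    // Timing 2 analogy: the scale is loaded on method entry and stays live
    // for the whole method (pre-loading).
    static double sumPreloaded(double[] xs) {
        final double scale = Math.scalb(1.0, N);
        double sum = 0.0;
        for (double x : xs) {
            sum += x * scale;
        }
        return sum;
    }

    // Uses the value prepared at initialization.
    static double sumInitLoaded(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x * SCALE_AT_INIT;
        }
        return sum;
    }
}
```

All three variants compute the same result. What changes is how often the scale is fetched and how long it stays resident, which is the trade-off described above.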

There are more choices...
Please see our paper.