Proper Overflow and Underflow: Scaling Methods

We have several choices when implementing scaling.

Two methods can scale an operand:
1. Multiply by 2 ** n (with the x86 fmul instruction)
2. Use the fscale instruction

There are three points at which the scales can be loaded into FPU registers:
1. When scaling is carried out
2. When a strictfp method is called (pre-loading)
3. When the JIT compiler is initialized

There is a trade-off between these choices: pre-loading saves memory accesses, but it keeps the registers occupied for a long time.

There are more choices; please see our paper.
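
The two ways of scaling an operand can be illustrated with a minimal Java sketch. This is not the x86 code a JIT compiler would emit: it only mirrors the idea at the source level, with a plain multiply standing in for fmul and Math.scalb standing in for fscale. The class name ScalingSketch, the helper powerOfTwo, and the exponent n = 10 are hypothetical choices made for illustration.

```java
final class ScalingSketch {

    // Builds 2 ** n as a double by writing the biased exponent field directly.
    // Valid only for normal exponents (-1022 <= n <= 1023).
    // Hypothetical helper, standing in for "loading a scale".
    static double powerOfTwo(int n) {
        if (n < Double.MIN_EXPONENT || n > Double.MAX_EXPONENT) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        return Double.longBitsToDouble((long) (n + 1023) << 52);
    }

    // Method 1 (fmul analogue): multiply by a scale that was prepared earlier.
    static double scaleByMultiply(double x, double scale) {
        return x * scale;
    }

    // Method 2 (fscale analogue): adjust the exponent by n in one call.
    static double scaleByScalb(double x, int n) {
        return Math.scalb(x, n);
    }

    public static void main(String[] args) {
        int n = 10;                      // assumed exponent, for illustration only
        double scale = powerOfTwo(n);    // prepared once, reused for every operand
        double[] operands = {3.0, Double.MAX_VALUE, Double.MIN_VALUE};
        for (double x : operands) {
            System.out.println(x + " -> " + scaleByMultiply(x, scale)
                    + " / " + scaleByScalb(x, n));
        }
    }
}
```

In this sketch the multiply variant needs the scale to be available as a value before use, which is where the question of when to load it arises, while the scalb variant only needs the integer exponent. Both return the same double for the same inputs, including when the result overflows or underflows.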