Small benchmark - Multiplication

Ten multiplications were repeated 10 ** 6 times. The time reported is the time taken by all 10 ** 7 operations. The effect of pre-load is small compared with the overhead introduced by the scaling itself. Scaling with the fscale instruction gave better results.
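
The shape of such a measurement can be sketched as follows. This is only an illustration under assumptions: the original benchmark code is not shown here, so the function names, the test operands, and the choice of Python are all hypothetical. `math.ldexp` stands in for fscale because both multiply a value by a power of two; interpreter overhead dominates in Python, so the timings will not reproduce the instruction-level results described above.

```python
import math
import time

REPEATS = 10 ** 6      # outer repetitions, as stated in the text
OPS_PER_REPEAT = 10    # operations per repetition, as stated in the text


def scale_by_multiply(x, exp):
    """Scale x by 2**exp using an ordinary multiplication (hypothetical helper)."""
    return x * (2.0 ** exp)


def scale_by_ldexp(x, exp):
    """Scale x by 2**exp using ldexp, a rough analogue of fscale (assumption)."""
    return math.ldexp(x, exp)


def time_ops(op, repeats=REPEATS, per_repeat=OPS_PER_REPEAT):
    """Return the seconds taken by repeats * per_repeat calls of op."""
    start = time.perf_counter()
    for _ in range(repeats):
        for _ in range(per_repeat):
            op(1.5, 3)  # arbitrary operands chosen for illustration
    return time.perf_counter() - start
```

With the default arguments, `time_ops(scale_by_multiply)` and `time_ops(scale_by_ldexp)` each time 10 ** 7 calls, which matches the operation count quoted above; comparing the two return values gives a rough idea of the relative cost of each scaling method in this setting.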