Letting the compiler decide how to optimize the multiply by five
gives it the freedom to make better choices for the best technique
for a given target machine.
For example, GCC on x86_64 produces a little bit better code:
Old-way (3 steps with a data dependency between each step):
shrq $5, %r13
leaq 1(%rbx,%r13), %rax
leaq (%rax,%rbx,4), %rbx
New-way (3 steps with no dependency between the first two steps
which can be run in parallel):
leaq (%rbx,%rbx,4), %rax # i*5
shrq $5, %r13 # perturb >>= PERTURB_SHIFT
leaq 1(%r13,%rax), %rbx # 1 + perturb + i*5