Same idea as the Linux adaptive mutex: spin a bit when
the lock can't be acquired.
max : 100 rounds
rounds to try : min(max, cached*2 + 10)
update cached: cached += (count - cached) / 8
spin with trylock on the standard lock; when count reaches
the current max, fall back to a blocking lock.
Add asm("rep; nop"); for x86_64
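
Rough sketch of the scheme above, not a final implementation.
Assumptions: pthread_mutex_t stands in for "the standard lock";
struct adaptive_lock, cpu_relax, rounds_to_try, update_cached,
adaptive_lock_acquire and adaptive_lock_release are made-up names
for illustration only.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_ADAPTIVE_ROUNDS 100   /* "max" in the notes */

/* Hypothetical wrapper: a standard lock plus the cached spin count. */
struct adaptive_lock {
    pthread_mutex_t mutex;        /* assumption: stands in for the standard lock */
    int cached;                   /* smoothed number of rounds seen so far */
};

/* Spin-wait hint; "rep; nop" is the PAUSE encoding on x86. No-op elsewhere. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("rep; nop" ::: "memory");
#endif
}

/* rounds to try: min(max, cached*2 + 10) */
static int rounds_to_try(int cached)
{
    int rounds = cached * 2 + 10;
    return rounds < MAX_ADAPTIVE_ROUNDS ? rounds : MAX_ADAPTIVE_ROUNDS;
}

/* update cached: cached += (count - cached) / 8 */
static int update_cached(int cached, int count)
{
    return cached + (count - cached) / 8;
}

static void adaptive_lock_acquire(struct adaptive_lock *l)
{
    /* Fast path: uncontended, leave the cached count alone. */
    if (pthread_mutex_trylock(&l->mutex) == 0)
        return;

    int count = 0;
    int max = rounds_to_try(l->cached);

    do {
        if (count++ >= max) {
            /* Spun long enough: block on the standard lock. */
            int rc = pthread_mutex_lock(&l->mutex);
            assert(rc == 0);
            (void)rc;
            break;
        }
        cpu_relax();
    } while (pthread_mutex_trylock(&l->mutex) != 0);

    /* The lock is held here, so updating the cached count is safe. */
    l->cached = update_cached(l->cached, count);
}

static void adaptive_lock_release(struct adaptive_lock *l)
{
    pthread_mutex_unlock(&l->mutex);
}
```

With cached = 0 the first contended acquire spins at most 10 rounds;
the limit reaches the 100-round cap once cached >= 45.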

If it performs well, extend to mutex and replace
the Linux adaptive patch (marked _NP :-).