MEDIUM: task: change the construction of the loop in process_runnable_tasks()
This patch slightly rearranges the loop to pack the locked code a little
bit, and to try to concentrate accesses to the tree together to benefit
more from the cache.
It also fixes how the loop handles the right margin : now that is guaranteed
that the retrieved nodes are filtered to only match the current thread, we
don't need to rewind every 16 entries. Instead we can rewind each time we
reach the right margin again.
With this change, we now achieve the following performance for 10 H2 conns
each containing 100 streams :
1 thread : 550kreq/s
2 thread : 644kreq/s
3 thread : 598kreq/s
1 file changed