168fc5332c7b3f43c8841a999fc40a3acef85223 - haproxy

commit	168fc5332c7b3f43c8841a999fc40a3acef85223
author	Willy Tarreau <w@1wt.eu>	Mon Mar 01 06:21:22 2021 +0100
committer	Willy Tarreau <w@1wt.eu>	Fri Mar 05 08:30:08 2021 +0100
tree	c3d3a43b36c374820d70ef6d620c9571047ceab4
parent	958ae26c3558f0a5cdcb7a92cc535f1cd1ac9a64

BUG/MINOR: mt-list: always perform a cpu_relax call on failure

On highly threaded machines it is possible to occasionally trigger the
watchdog on certain contended areas like the server's connection list.
While the mechanism inherently cannot guarantee constant progress, it
also lacks CPU relax calls, which are absolutely necessary in this
situation to let a thread finish its job.

The loop's "while (1)" was changed to use a "for" statement calling
__ha_cpu_relax() as its continuation expression. This way the "continue"
statements jump to the unique place containing the pause without
excessively inflating the code.
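
As a rough illustration of the loop shape described above, here is a
minimal sketch. It is not the code from include/haproxy/list.h:
cpu_relax_sketch(), SLOT_BUSY, slot_take() and the retry bound are all
hypothetical names introduced for this example, and the inline asm
assumes a GCC- or Clang-compatible compiler. The point is only that
every "continue" lands on the pause in the "for" continuation
expression before the next attempt.

```c
#include <stdatomic.h>

/* Hypothetical marker meaning "another thread holds this slot". */
#define SLOT_BUSY ((void *)1)

/* Per-thread counter, present only so the sketch can be observed. */
static _Thread_local unsigned long relax_calls;

/* Hypothetical stand-in for __ha_cpu_relax(): a short CPU pause. */
static inline void cpu_relax_sketch(void)
{
    relax_calls++;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("pause" ::: "memory");
#elif defined(__aarch64__)
    __asm__ volatile("isb" ::: "memory");
#else
    __asm__ volatile("" ::: "memory"); /* compiler barrier only */
#endif
}

/* Try to take the pointer stored in *slot, leaving SLOT_BUSY behind.
 * The patch's loop has no bound; max_tries exists only so that this
 * sketch terminates when exercised from a single thread.
 * Returns SLOT_BUSY if every attempt found the slot held.
 */
static void *slot_take(_Atomic(void *) *slot, unsigned int max_tries)
{
    void *v;

    for (; max_tries; max_tries--, cpu_relax_sketch()) {
        v = atomic_exchange(slot, SLOT_BUSY);
        if (v == SLOT_BUSY)
            continue; /* jumps to the continuation: pause, then retry */
        return v;     /* success: the caller now owns the slot */
    }
    return SLOT_BUSY;
}
```

With a "while (1)" loop, each failure path would need its own pause
before "continue"; placing it in the continuation expression keeps a
single call site.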

This was sufficient to definitely fix the problem on 64-core ARM Graviton2
machines. This patch should probably be backported once it's confirmed
that it also helps on many-core x86 machines, since some people are facing
contention in these environments. This patch depends on the previous commit
"REORG: atomic: reimplement pl_cpu_relax() from atomic-ops.h".

An attempt was made to first read the value before exchanging, and it
significantly degraded the performance. It's very likely that this caused
other cores to lose exclusive ownership of their cache line and slowed
down their next xchg operation.
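
The attempted code is not part of this commit, so the following is only
a guess at the two shapes being compared. SLOT_BUSY, take_xchg() and
take_read_then_xchg() are hypothetical names. Both functions return the
same results; they differ only in whether a plain read precedes the
exchange.

```c
#include <stdatomic.h>

/* Hypothetical marker meaning "another thread holds this slot". */
#define SLOT_BUSY ((void *)1)

/* Shape that was kept: exchange unconditionally and inspect the old
 * value afterwards. Returns SLOT_BUSY if the slot was already held.
 */
static void *take_xchg(_Atomic(void *) *slot)
{
    return atomic_exchange(slot, SLOT_BUSY);
}

/* Shape that was reportedly tried and dropped: read first, and only
 * exchange when the slot does not look held.
 */
static void *take_read_then_xchg(_Atomic(void *) *slot)
{
    if (atomic_load_explicit(slot, memory_order_relaxed) == SLOT_BUSY)
        return SLOT_BUSY;
    return atomic_exchange(slot, SLOT_BUSY);
}
```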

In addition it was found that MT_LIST_ADD() is significantly faster than
MT_LIST_ADDQ() under high contention, because it fails one step earlier
when conflicting with an adjacent MT_LIST_DEL(). It might be worth
switching some operations' order to favor MT_LIST_ADDQ() instead.

include/haproxy/list.h

1 file changed
