BUG/MEDIUM: pool: fix rare risk of deadlock in pool_flush()

As reported by GitHub user @JB0925 in issue #2427, there is a possible
crash in pool_flush(). The problem is that if the free_list is not empty
during the first test but is empty by the time the xchg() is performed,
for example because another thread called pool_flush() in parallel, we
place a POOL_BUSY there that is never removed, causing the next thread
to wait forever.

This was introduced in 2.5 with commit 2a4523f6f ("BUG/MAJOR: pools: fix
possible race with free() in the lockless variant"). It has probably
very rarely been detected, because:
  - pool_flush() is only called when stopping is set
  - the function does nothing if global pools are disabled, which is
    the case on most modern systems with a fast memory allocator.

It's possible to reproduce it by modifying __task_free() to call
pool_flush() on 1% of the calls instead of only when stopping.

The fix is quite simple: it consists in moving the zeroing of the
entry into the break path, after verifying that the entry was not
already busy.
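
For illustration only, here is a minimal single-threaded sketch of the
race and of the fix, written with C11 atomics. It is not the actual
HAProxy code: struct item, BUSY, free_list, race_hook, empty_list(),
detach_buggy(), detach_fixed() and head_left_busy() are all
hypothetical names, and race_hook merely stands in for a second thread
that empties the list between the first test and the exchange.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item { struct item *next; };

/* Sentinel marking the list head as locked, modeled on POOL_BUSY. */
#define BUSY ((struct item *)1)

static _Atomic(struct item *) free_list;

/* Hypothetical hook simulating a concurrent thread acting in the
 * window between the emptiness test and the exchange.
 */
static void (*race_hook)(void);

static void empty_list(void)
{
	atomic_store(&free_list, NULL);
}

/* Buggy shape: the head is only reset when the exchange returned a
 * non-NULL list, so an exchange returning NULL leaves BUSY in place.
 */
static struct item *detach_buggy(void)
{
	struct item *next = atomic_load(&free_list);

	do {
		while (next == BUSY)
			next = atomic_load(&free_list);
		if (next == NULL)
			break;
		if (race_hook)
			race_hook();
	} while ((next = atomic_exchange(&free_list, BUSY)) == BUSY);

	if (next)
		atomic_store(&free_list, NULL);
	return next;
}

/* Fixed shape: the head is reset in the break path as soon as we know
 * the exchange did not find it already busy, whatever it returned.
 */
static struct item *detach_fixed(void)
{
	struct item *next = atomic_load(&free_list);

	for (;;) {
		while (next == BUSY)
			next = atomic_load(&free_list);
		if (next == NULL)
			break;
		if (race_hook)
			race_hook();
		next = atomic_exchange(&free_list, BUSY);
		if (next != BUSY) {
			atomic_store(&free_list, NULL);
			break;
		}
	}
	return next;
}

/* Runs one detach with the simulated race; returns 1 if the head is
 * left stuck on BUSY afterwards, 0 otherwise.
 */
static int head_left_busy(struct item *(*detach)(void))
{
	static struct item a;

	atomic_store(&free_list, &a);
	race_hook = empty_list;
	detach();
	race_hook = NULL;
	return atomic_load(&free_list) == BUSY;
}
```

In this model, head_left_busy(detach_buggy) returns 1 since nothing
ever clears the BUSY marker, while head_left_busy(detach_fixed)
returns 0.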

This must be backported wherever commit 2a4523f6f is.

(cherry picked from commit b746af9990d2f7fe10d200026831b6eb10c4953f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 7bd8817e3c253e1090e0c10cc126a08538ecec7c)
[wt: no buckets in 2.8]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit d5ebde5295d560bd4915df68b930a3ed166cbc96)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 5af6c467d4add12565769ed3626702580f3d48f3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit bed012c9fcd8b4bfd69d69c4814d280ec2857fb3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>