d48ed6643b29a9f16616916742e83bf36622df00 - haproxy

commit	d48ed6643b29a9f16616916742e83bf36622df00
author	Willy Tarreau <w@1wt.eu>	Fri Oct 16 09:31:41 2020 +0200
committer	Willy Tarreau <w@1wt.eu>	Fri Oct 16 17:15:54 2020 +0200
tree	921e45b72c2b66cb8ccf7c5190354f3394e79dde
parent	61f799b8da70fdc9fbb345778ae8190a9b16fda7

MEDIUM: task: use an upgradable seek lock when scanning the wait queue

Right now when running a configuration with many global timers (e.g. many
health checks), there is a lot of contention on the global wait queue
lock because all threads queue up in front of it to scan it.

With 2000 servers checked every 10 milliseconds (200k checks per second),
after 23 seconds running on 8 threads, the lock stats were this high:

  Stats about Lock TASK_WQ:
      write lock  : 9872564
      write unlock: 9872564 (0)
      wait time for write     : 9208.409 msec
      wait time for write/lock: 932.727 nsec
      read lock   : 240367
      read unlock : 240367 (0)
      wait time for read      : 149.025 msec
      wait time for read/lock : 619.991 nsec

i.e. ~5% of the total runtime (about 9.2 s of write-lock wait out of
8 threads x 23 s = 184 s) was spent waiting on this specific lock.

With upgradable locks we don't need to work like this anymore. We
can just try to upgrade the read lock to a seek lock before scanning
the queue, then upgrade the seek lock to a write lock for each element
we want to delete there, and immediately downgrade it to a seek lock.

The benefit is twofold:
  - all other threads which need to call next_expired_task() before
    polling won't wait anymore since the seek lock is compatible with
    the read lock ;

  - all other threads competing on trying to grab this lock will fail
    on the upgrade attempt from read to seek, and will let the current
    lock owner finish collecting expired entries.
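
The scheme above can be illustrated with a minimal sketch. It is not
HAProxy's lock implementation nor the code changed in src/task.c: every
name below (toy_rws_lock, toy_try_rdtosk(), toy_wake_expired(), ...) is
hypothetical, the lock state is simply protected by a pthread mutex, all
acquisitions are non-blocking "try" operations, and the wait queue is
assumed to be a plain sorted array of expiry dates.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Toy read/seek/write lock (hypothetical, for illustration only).
 * Rules modelled: R is compatible with R and S; S excludes S and W;
 * W excludes everything. */
struct toy_rws_lock {
    pthread_mutex_t mtx;
    int readers;
    bool seeker;
    bool writer;
};
#define TOY_RWS_LOCK_INIT { PTHREAD_MUTEX_INITIALIZER, 0, false, false }

bool toy_try_rdlock(struct toy_rws_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    bool ok = !l->writer;          /* readers coexist with a seeker */
    if (ok)
        l->readers++;
    pthread_mutex_unlock(&l->mtx);
    return ok;
}

void toy_rdunlock(struct toy_rws_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    assert(l->readers > 0);
    l->readers--;
    pthread_mutex_unlock(&l->mtx);
}

/* read -> seek; on failure the caller still holds its read lock */
bool toy_try_rdtosk(struct toy_rws_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    assert(l->readers > 0);
    bool ok = !l->seeker && !l->writer;
    if (ok) {
        l->readers--;
        l->seeker = true;
    }
    pthread_mutex_unlock(&l->mtx);
    return ok;
}

/* seek -> write; only possible once all readers have left */
bool toy_try_sktowr(struct toy_rws_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    bool ok = l->seeker && l->readers == 0;
    if (ok) {
        l->seeker = false;
        l->writer = true;
    }
    pthread_mutex_unlock(&l->mtx);
    return ok;
}

void toy_wrtosk(struct toy_rws_lock *l)   /* write -> seek */
{
    pthread_mutex_lock(&l->mtx);
    l->writer = false;
    l->seeker = true;
    pthread_mutex_unlock(&l->mtx);
}

void toy_skunlock(struct toy_rws_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    l->seeker = false;
    pthread_mutex_unlock(&l->mtx);
}

/* Remove every expiry date <= now from a sorted array. Returns the
 * number of removed entries, or -1 if the lock could not be taken
 * (e.g. another thread is already scanning). */
int toy_wake_expired(struct toy_rws_lock *l, int *expiry, int *count, int now)
{
    int removed = 0;

    if (!toy_try_rdlock(l))
        return -1;
    if (!toy_try_rdtosk(l)) {      /* someone else holds the seek lock */
        toy_rdunlock(l);
        return -1;
    }
    while (*count > 0 && expiry[0] <= now) {
        while (!toy_try_sktowr(l))
            ;                      /* spin until readers have left */
        memmove(expiry, expiry + 1, (*count - 1) * sizeof(*expiry));
        (*count)--;
        removed++;
        toy_wrtosk(l);             /* let readers in again */
    }
    toy_skunlock(l);
    return removed;
}
```

In this sketch a thread whose read-to-seek upgrade fails simply gives up
and leaves the scan to the current seek lock holder, while plain readers
are only blocked during the short write-locked window around each
removal. A real lock would also have to prevent readers from starving a
pending seek-to-write upgrade, which this toy version does not attempt.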

This change alone reduced the CPU usage of wake_expired_tasks() in a
test with a very large number of servers from 2.15% to 1.04%, as
reported by perf top, and increased the health check rate by 3% (all
threads being saturated).

This is expected to help against (and possibly solve) the problem
described in issue #875.

src/task.c

1 file changed
