BUG/MINOR: task: close a tiny race in the inter-thread wakeup

__task_wakeup() takes care of a small race that exists between threads,
but the store barrier it uses is not sufficient: apparently the state
read after clearing the leaf_p pointer is sometimes stale, because a
store barrier only orders stores against each other and does not order
a later load after an earlier store. This results in missed wakeups
between threads competing at a high rate. Let's use a full barrier
instead to serialize the operations.
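
For illustration only, here is a minimal sketch of the store-then-load
pattern described above, written with C11 atomics rather than the
project's own barrier macros. The struct and the demo_* functions are
hypothetical stand-ins and are not the actual __task_wakeup() code; only
the leaf_p and state names are borrowed from the description. The point
is that when each side stores one field and then loads the other, a
full fence is needed between the store and the load on both sides so
that at least one of them observes the other's update:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a task, not the real struct task. */
struct demo_task {
    _Atomic(void *) leaf_p;   /* non-NULL while the task is queued */
    atomic_uint state;        /* pending wakeup flags */
};

/* Dequeuing side: clear leaf_p, then look at the state. */
static unsigned demo_dequeue_then_check(struct demo_task *t)
{
    atomic_store_explicit(&t->leaf_p, NULL, memory_order_relaxed);
    /* Full barrier: orders the store above before the load below.
     * A store-only barrier would not constrain the load, so it could
     * return a stale state and the wakeup would be missed.
     */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&t->state, memory_order_relaxed);
}

/* Waking side: publish the flags, then check whether still queued. */
static int demo_wake(struct demo_task *t, unsigned flags)
{
    atomic_fetch_or_explicit(&t->state, flags, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&t->leaf_p, memory_order_relaxed) != NULL;
}
```

With both fences in place, either the dequeuing side sees the new flags
or the waking side sees leaf_p already cleared, so the wakeup cannot be
lost by both at once.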

This may be backported to 1.9 though it's extremely unlikely that this
bug will ever manifest itself there.