BUG/MAJOR: hlua: improper lock usage with hlua_ctx_resume()
hlua_ctx_resume() itself can safely be used as-is in a multithreading
context because it takes care of taking the lua lock.
However, when hlua_ctx_resume() returns, the lock is released and it is
thus the caller's responsibility to ensure it owns the lock prior to
performing additional manipulations on the Lua stack. Unfortunately, since
early haproxy lua implementation, we used to do it wrong:
The most common hlua_ctx_resume() pattern we can find in the code (because
it was duplicated over and over over time) is the following:
|ret = hlua_ctx_resume()
|switch (ret) {
| case HLUA_E_OK:
| break;
| case HLUA_E_ERRMSG:
| break;
| [...]
|}
Problem is: for some of the switch cases, we still perform lua stack
manipulations. This is the case for the HLUA_E_ERRMSG for instance where
we often use lua_tostring() to retrieve last lua error message on the top
of the stack, or sometimes for the HLUA_E_OK case, when we need to perform
some lua cleanup logic once the resume ended. But all of this is done
WITHOUT the lua lock, so this means that the main lua stack could be
accessed simultaneously by concurrent threads when a script was loaded
using 'lua-load'.
While it is not critical for switch-cases dedicated to error handling,
(those are not supposed to happen very often), it can be very problematic
for stack manipulations occuring in the HLUA_E_OK case under heavy load
for instance. In this case, main lua stack corruptions will eventually
happen. This is especially true inside hlua_filter_new(), where this bug
was known to cause lua stack corruptions under load, leading to lua errors
and even crashing the process as reported by @bgrooot in GH #2467.
The fix is relatively simple, once hlua_ctx_resume() returns: we should
consider that ANY lua stack access should be lua-lock protected. If the
related lua calls may raise lua errors, then (RE)SET_SAFE_LJMP
combination should be used as usual (it allows to lock the lua stack and
catch lua exceptions at the same time), else hlua_{lock,unlock} may be
used if no exceptions are expected.
This patch should fix GH #2467.
It should be backported to all stable versions.
[ada: some ctx adj will be required for older versions as event_hdl
doesn't exist prior to 2.8 and filters were implemented in 2.5, thus
some chunks won't apply]
(cherry picked from commit 8670db7a898308455bf02315710c1204d39e4664)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit b0ff26d52bdf53f924800927ea32382bbd40924d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 258fc3fa01157e12969186a9b57ab9ab2491eb13)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 9fa313713b99aa5018071d83ec665e38d2bd0c1a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit e41a1c7fc1189cc16d3e32d1a6d782757eb27f97)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
1 file changed