BUG/MEDIUM: event_hdl: fix async data refcount issue
In _event_hdl_publish(), when publishing an event to async handler(s),
async_data is allocated only once and then relies on a refcount
logic to reuse the same data block for multiple async event handlers.
(this allows to save significant amount of memory)
Because the refcount is first set to 0, there is a small race where
the consumers could consume async data (async data refcount reaching 0)
before publishing is actually over.
The consequence is that async data may be freed by one of the consumers
while we still rely on it within _event_hdl_publish().
This was discovered by chance when stress-testing the API with multiple
async handlers registered to the same event: some of the handlers were
notified about a new event for which the event data was already freed,
resulting in invalid reads and/or segfaults.
To fix this, we first set the refcount to 1, assuming that the
publish function relies on async_data until the publish is over.
At the end of the publish, the reference to the async data is dropped.
This way, async_data is either freed by _event_hdl_publish() itself
or by one of the consumers, depending on who is the last one relying
on it.
If 68e692da0 ("MINOR: event_hdl: add event handler base api")
is being backported, then this commit should be backported with it.
diff --git a/src/event_hdl.c b/src/event_hdl.c
index 9760746..a9fa93a 100644
--- a/src/event_hdl.c
+++ b/src/event_hdl.c
@@ -173,6 +173,14 @@
state);
}
+static inline void _event_hdl_async_data_drop(struct event_hdl_async_event_data *data)
+{
+ if (HA_ATOMIC_SUB_FETCH(&data->refcount, 1) == 0) {
+ /* we were the last one holding a reference to event data - free required */
+ pool_free(pool_head_sub_event_data, data);
+ }
+}
+
void event_hdl_async_free_event(struct event_hdl_async_event *e)
{
if (unlikely(event_hdl_sub_type_equal(e->type, EVENT_HDL_SUB_END))) {
@@ -183,11 +191,8 @@
event_hdl_drop(e->sub_mgmt.this);
HA_ATOMIC_DEC(&jobs);
}
- else if (e->_data &&
- HA_ATOMIC_SUB_FETCH(&e->_data->refcount, 1) == 0) {
- /* we are the last event holding reference to event data - free required */
- pool_free(pool_head_sub_event_data, e->_data); /* data wrapper */
- }
+ else if (e->_data)
+ _event_hdl_async_data_drop(e->_data); /* data wrapper */
pool_free(pool_head_sub_event, e);
}
@@ -563,7 +568,7 @@
if (HA_ATOMIC_SUB_FETCH(&sub->refcount, 1) != 0)
return;
- /* we are the last event holding reference to event data - free required */
+ /* we were the last one holding a reference to event sub - free required */
if (sub->hdl.private_free) {
/* free private data if specified upon registration */
sub->hdl.private_free(sub->hdl.private);
@@ -746,7 +751,14 @@
/* async data assignment */
memcpy(async_data->data, data->_ptr, data->_size);
- async_data->refcount = 0; /* initialize async->refcount (first use, atomic operation not required) */
+ /* Initialize refcount, we start at 1 to prevent async
+ * data from being freed by an async handler while we
+ * still use it. We will drop the reference when the
+ * publish is over.
+ *
+ * (first use, atomic operation not required)
+ */
+ async_data->refcount = 1;
}
new_event->_data = async_data;
new_event->data = async_data->data;
@@ -766,6 +778,10 @@
} /* end async mode */
} /* end hdl should be notified */
} /* end mt_list */
+ if (async_data) {
+ /* we finished publishing, drop the reference on async data */
+ _event_hdl_async_data_drop(async_data);
+ }
if (error) {
event_hdl_report_hdl_state(ha_warning, &cur_sub->hdl, "PUBLISH", "memory error");
return 0;