OPTIM: pools: reduce local pool cache size to 512kB

Now that we support batched allocations/releases, it appears that we can
reach the same performance on H2 with shared pools and a 256kB thread-local
cache as with no shared pools, a fast allocator and a 1MB thread-local cache.
With 512kB we're up to 10% faster on highly multiplexed H2 than without the
shared cache. This was tested on a 16-core ARM machine. Thus it's time to
slightly reduce the per-thread memory cost, which may also improve the
performance on machines with smaller L2 caches. It essentially reverts
commit f587003fe ("MINOR: pools: double the local pool cache size to 1 MB").
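
For reference, the new default is 512 * 1024 = 524288 bytes per thread. The
sketch below mirrors the #ifndef guard from defaults.h to show that a value
supplied at build time (for example via a -D compiler flag) still takes
precedence. The two helper functions are hypothetical and only illustrate
the arithmetic; they are not part of HAProxy, and the total is merely the
sum of the per-thread limits, not a measured memory footprint.

```c
#include <stddef.h>

/* Same guard as in include/haproxy/defaults.h: a value passed at build
 * time (e.g. -DCONFIG_HAP_POOL_CACHE_SIZE=262144) overrides the default.
 */
#ifndef CONFIG_HAP_POOL_CACHE_SIZE
#define CONFIG_HAP_POOL_CACHE_SIZE 524288
#endif

/* Hypothetical helper (not in HAProxy): per-thread cache limit in kB. */
static inline size_t pool_cache_size_kb(void)
{
	return (size_t)CONFIG_HAP_POOL_CACHE_SIZE / 1024;
}

/* Hypothetical helper (not in HAProxy): sum of the per-thread limits,
 * in kB, for <nbthread> threads.
 */
static inline size_t pool_cache_total_kb(size_t nbthread)
{
	return nbthread * pool_cache_size_kb();
}
```

With the default value, pool_cache_size_kb() yields 512, and 16 threads may
hold up to 8192 kB in thread-local caches, versus 16384 kB previously.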
diff --git a/include/haproxy/defaults.h b/include/haproxy/defaults.h
index baa9aff..8ea94e5 100644
--- a/include/haproxy/defaults.h
+++ b/include/haproxy/defaults.h
@@ -404,7 +404,7 @@
 
 /* default per-thread pool cache size when enabled */
 #ifndef CONFIG_HAP_POOL_CACHE_SIZE
-#define CONFIG_HAP_POOL_CACHE_SIZE 1048576
+#define CONFIG_HAP_POOL_CACHE_SIZE 524288
 #endif
 
 #ifndef CONFIG_HAP_POOL_CLUSTER_SIZE