MINOR: bind-conf: support a new shards value: "by-group"

Setting "shards by-group" will create one shard per thread group. This
is often a reasonable tradeoff between a single shard, which can be
suboptimal on CPUs with many cores, and too many shards, which eat a
lot of file descriptors. It was shown to provide good results on a
224-thread machine, with a distribution that was even smoother than
the system's, since here it can take into account the number of
connections per thread in the group. Depending on how popular it
becomes, it could even become the default setting in a future version.
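
As an illustration only, a configuration along these lines would be
expected to end up with 4 shards (one per group) rather than 64 with
"by-thread". The thread and group counts, the frontend name and the
port are arbitrary, and the sketch assumes the bind line spans all
four thread groups:

    global
        nbthread 64
        thread-groups 4

    frontend fe_main
        bind :8080 shards by-group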
diff --git a/doc/configuration.txt b/doc/configuration.txt
index 1996d73..3adab34 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -15141,7 +15141,7 @@
   See https://www.rfc-editor.org/rfc/rfc9000.html#section-8.1.2 for more
   information about QUIC retry.
 
-shards <number> | by-thread
+shards <number> | by-thread | by-group
   In multi-threaded mode, on operating systems supporting multiple listeners on
   the same IP:port, this will automatically create this number of multiple
   identical listeners for the same line, all bound to a fair share of the number
@@ -15157,7 +15157,11 @@
   thread). The special "by-thread" value also creates as many shards as there
   are threads on the "bind" line. Since the system will evenly distribute the
   incoming traffic between all these shards, it is important that this number
-  is an integral divisor of the number of threads.
+  is an integral divisor of the number of threads. Alternately, the other
+  special value "by-group" will create one shard per thread group. This can
+  be useful when dealing with many threads and not wanting to create too many
+  sockets. The load distribution will be a bit less optimal but the contention
+  (especially in the system) will still be lower than with a single socket.
 
 ssl
   This setting is only available when support for OpenSSL was built in. It
diff --git a/src/cfgparse.c b/src/cfgparse.c
index aba342d..a978721 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -2972,9 +2972,11 @@
 				shards = bind_conf->settings.shards;
 				todo = thread_set_count(&bind_conf->thread_set);
 
-				/* special values: -1 = "by-thread" */
+				/* special values: -1 = "by-thread", -2 = "by-group" */
 				if (shards == -1)
 					shards = todo;
+				else if (shards == -2)
+					shards = my_popcountl(bind_conf->thread_set.grps);
 
 				/* no more shards than total threads */
 				if (shards > todo)
diff --git a/src/listener.c b/src/listener.c
index ce7f569..5fe714b 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -1854,7 +1854,7 @@
 	return 0;
 }
 
-/* parse the "shards" bind keyword. Takes an integer or "by-thread" */
+/* parse the "shards" bind keyword. Takes an integer, "by-thread", or "by-group" */
 static int bind_parse_shards(char **args, int cur_arg, struct proxy *px, struct bind_conf *conf, char **err)
 {
 	int val;
@@ -1866,6 +1866,8 @@
 
 	if (strcmp(args[cur_arg + 1], "by-thread") == 0) {
 		val = -1; /* -1 = "by-thread", will be fixed in check_config_validity() */
+	} else if (strcmp(args[cur_arg + 1], "by-group") == 0) {
+		val = -2; /* -2 = "by-group", will be fixed in check_config_validity() */
 	} else {
 		val = atol(args[cur_arg + 1]);
 		if (val < 1 || val > MAX_THREADS) {