MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD (take #2)

Provide a default value for fd_hard_limit when it is not set in the
configuration. With this patch, a specific default can also be chosen at
build time via the DEFAULT_MAXFD compile-time variable, which should be
helpful for haproxy package maintainers.

    make -j 8 TARGET=linux-glibc DEBUG=-DDEFAULT_MAXFD=50000

If haproxy is compiled without DEFAULT_MAXFD defined, the default will be set
to 1048576.

This is done to prevent the watchdog from killing the process when it is
started without any limits in its configuration or on the command line and
the hard RLIMIT_NOFILE is extremely large (~1000000000). In this case
compute_ideal_maxconn() is used to calculate maxconn and maxsock. maxsock
defines the size of the internal fdtab, which becomes extremely large as
well. When the process then simply loops over this fdtab (O(n)), it takes so
long that the watchdog does its job.

To avoid this, maxconn is now always reduced to a reasonable value, either
by an explicit global.fd-hard-limit from the configuration or by its default.
The default may be changed at build time and then overridden by
global.fd-hard-limit at runtime. An explicit global.fd-hard-limit from the
configuration, if set, always takes precedence over DEFAULT_MAXFD.
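
The precedence described above can be sketched as follows. This is only an
illustration, not the code from the patch: effective_fd_limit() is a
hypothetical helper, 'configured' stands for global.fd_hard_limit (0 when
unset) and 'build_default' stands for DEFAULT_MAXFD.

```c
/* Hypothetical helper, not part of haproxy: returns the FD budget that
 * remains once the hard limit has been applied.
 *   rlim_hard     : hard RLIMIT_NOFILE reported by the system
 *   configured    : stands for global.fd_hard_limit, 0 when unset
 *   build_default : stands for DEFAULT_MAXFD
 */
static unsigned long effective_fd_limit(unsigned long rlim_hard,
                                        unsigned long configured,
                                        unsigned long build_default)
{
	/* an explicit setting always wins over the build-time default */
	unsigned long limit = configured ? configured : build_default;

	/* system limits below the chosen limit are always respected */
	return rlim_hard > limit ? limit : rlim_hard;
}
```

With a hard RLIMIT_NOFILE of 1000000000 and nothing configured, this yields
1048576. With fd-hard-limit set to 50000 it yields 50000, and a system limit
of 4096 is left untouched.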

Must be backported to all stable versions down to and including v2.6.0.

(cherry picked from commit 41275a691839df5f8dc7cb9faa4e259fbb755d34)
[wt: the discussion around this patch came to an agreement on the list:
     https://www.mail-archive.com/haproxy@formilux.org/msg45098.html ]
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit d6c8f7d7ae18a52783febde97b91291e5e211c65)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 58ad85764e0c5e7f4bf5cf4f678cd7cb70d9454a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
diff --git a/doc/configuration.txt b/doc/configuration.txt
index 8146ddc..0dc223c 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -1566,9 +1566,14 @@
   much RAM for regular usage. The fd-hard-limit setting is provided to enforce
   a possibly lower bound to this limit. This means that it will always respect
   the system-imposed limits when they are below <number> but the specified
-  value will be used if system-imposed limits are higher. In the example below,
-  no other setting is specified and the maxconn value will automatically adapt
-  to the lower of "fd-hard-limit" and the system-imposed limit:
+  value will be used if system-imposed limits are higher. By default
+  fd-hard-limit is set to 1048576. This default can be changed via the
+  DEFAULT_MAXFD compile-time variable, which can serve as the maximum (kernel)
+  system limit when the RLIMIT_NOFILE hard limit is extremely large. Setting
+  fd-hard-limit in the global section temporarily overrides the value provided
+  via DEFAULT_MAXFD at build time. In the example below, no other setting is
+  specified and the maxconn value will automatically adapt to the lower of
+  "fd-hard-limit" and the RLIMIT_NOFILE limit:
 
       global
           # use as many FDs as possible but no more than 50000
diff --git a/include/haproxy/defaults.h b/include/haproxy/defaults.h
index 6bcdc96..b743d6a 100644
--- a/include/haproxy/defaults.h
+++ b/include/haproxy/defaults.h
@@ -289,6 +289,24 @@
 #define DEFAULT_MAXCONN 100
 #endif
 
+/* Default file descriptor limit.
+ *
+ * DEFAULT_MAXFD explicitly reduces the hard RLIMIT_NOFILE, which is used by the
+ * process as the base value to calculate the default global.maxsock, if
+ * global.maxconn and global.rlimit_memmax are not defined. This is useful when
+ * the hard nofile limit has been bumped to fs.nr_open (kernel max), which is
+ * extremely large on many modern distros and would otherwise lead to an
+ * extremely large default global.maxsock. The only way to override
+ * DEFAULT_MAXFD, if defined, is to set fd-hard-limit in the config global
+ * section. If DEFAULT_MAXFD is not set, a reasonable maximum of 1048576 will be
+ * used as the default value, which almost guarantees that a process will
+ * start correctly in any situation and will not be killed by the watchdog
+ * when it loops over the allocated fdtab.
+ */
+#ifndef DEFAULT_MAXFD
+#define DEFAULT_MAXFD 1048576
+#endif
+
 /* Define a maxconn which will be used in the master process once it re-exec to
  * the MODE_MWORKER_WAIT and won't change when SYSTEM_MAXCONN is set.
  *
diff --git a/src/haproxy.c b/src/haproxy.c
index b320b85..ffd114f 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -1395,7 +1395,19 @@
 	 *   - two FDs per connection
 	 */
 
-	if (global.fd_hard_limit && remain > global.fd_hard_limit)
+	/* on some modern distros for archs like amd64 fs.nr_open (kernel max) could
+	 * be in the order of 1 billion, and systemd since version 256~rc3-3 bumps
+	 * the hard RLIMIT_NOFILE to fs.nr_open (rlim_fd_max_at_boot). If we are
+	 * started without global.maxconn or global.rlimit_memmax_all, we risk
+	 * ending up with computed global.maxconn = ~500000000 and computed
+	 * global.maxsock = ~1000000000. So, fdtab will be needlessly huge and
+	 * the watchdog will kill the process when it tries to loop over the
+	 * fdtab (see fd_reregister_all).
+	 */
+	if (!global.fd_hard_limit)
+		global.fd_hard_limit = DEFAULT_MAXFD;
+
+	if (remain > global.fd_hard_limit)
 		remain = global.fd_hard_limit;
 
 	/* subtract listeners and checks */