MAJOR: fd/threads: Make the fdcache mostly lockless.

Create a local, per-thread fd cache for file descriptors that belong to
only one thread, and make the global fd cache mostly lockless, since the
fd cache lock can be heavily contended.
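
For illustration only, here is a minimal sketch of how an index-linked
cache list can be tested and filled without a lock. It is not the code
from this patch: the sentinel names, the array size, and the helpers
cache_init(), fd_in_cache(), global_push() and local_push() are all
hypothetical. The sentinel meanings (-1 end of list, -2 entry locked,
-3 not in any list) are an assumption inferred from the "next >= -2"
test in the cli.c hunk. The sketch only pushes at the head and never
removes entries, which sidesteps the harder parts of a real lockless
list.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_FDS 16          /* hypothetical table size */

/* Assumed sentinel values for the "next" link (not taken from the patch). */
#define LINK_END    (-1)    /* last entry of a list */
#define LINK_LOCKED (-2)    /* entry currently being linked by a thread */
#define LINK_NOT_IN (-3)    /* entry is not in any list */

struct cache_entry {
    atomic_int next;        /* index of the next fd, or a sentinel */
};

static struct cache_entry entries[MAX_FDS];
static atomic_int global_head = LINK_END;       /* shared between threads */
static _Thread_local int local_head = LINK_END; /* private to each thread */

/* Mark every entry as "not in any list" and empty the global list. */
static void cache_init(void)
{
    for (int fd = 0; fd < MAX_FDS; fd++)
        atomic_store(&entries[fd].next, LINK_NOT_IN);
    atomic_store(&global_head, LINK_END);
}

/* An fd is in a cache if its link is a valid index, END, or LOCKED. */
static int fd_in_cache(int fd)
{
    assert(fd >= 0 && fd < MAX_FDS);
    return atomic_load(&entries[fd].next) >= LINK_LOCKED;
}

/* Add fd to the shared list without a lock. Returns 1 if added, 0 if the
 * fd was already in a list or is being added by another thread. */
static int global_push(int fd)
{
    int expected = LINK_NOT_IN;

    /* Claim the entry: only one thread wins NOT_IN -> LOCKED. */
    if (!atomic_compare_exchange_strong(&entries[fd].next, &expected,
                                        LINK_LOCKED))
        return 0;

    /* Publish at the head; on CAS failure "head" is reloaded and we retry. */
    int head = atomic_load(&global_head);
    do {
        atomic_store(&entries[fd].next, head);
    } while (!atomic_compare_exchange_weak(&global_head, &head, fd));
    return 1;
}

/* Add fd to the calling thread's private list. This assumes the fd is only
 * ever touched by this thread, so no CAS is needed. */
static int local_push(int fd)
{
    if (fd_in_cache(fd))
        return 0;
    atomic_store(&entries[fd].next, local_head);
    local_head = fd;
    return 1;
}
```

With this encoding, a single comparison against -2 is enough to report
whether an fd is cached, which matches the shape of the expression used
for the CLI output in the hunk below.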
diff --git a/src/cli.c b/src/cli.c
index ed8cc5b..85d3567 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -811,7 +811,7 @@
 			     (fdt.ev & FD_POLL_IN)  ? 'I' : 'i',
 			     fdt.linger_risk ? 'L' : 'l',
 			     fdt.cloned ? 'C' : 'c',
-			     fdt.cache,
+			     fdt.fdcache_entry.next >= -2 ? 1 : 0,
 			     fdt.owner,
 			     fdt.iocb,
 			     (fdt.iocb == conn_fd_handler)  ? "conn_fd_handler" :