MEDIUM: queue: use a trylock on the server's queue

This ensures that threads trying to wake up pending connections for a
server give up early when another thread is already handling that
server's queue. The goal is to avoid needless contention on setups
with few servers.
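
For illustration only, the give-up-early pattern can be sketched outside
of HAProxy using C11 atomics. Everything named toy_* below is
hypothetical and stands in for the real server, queue and lock; the
"stop" flag is an addition of this sketch so that it returns once the
toy queue is empty, and is not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a server and its pending queue. */
struct toy_server {
    atomic_flag queue_lock; /* set while a thread owns the queue */
    atomic_int  served;     /* connections currently served */
    atomic_int  pending;    /* requests still waiting in the queue */
    int         maxconn;    /* upper bound for 'served' */
};

static void toy_server_init(struct toy_server *s, int maxconn, int pending)
{
    atomic_flag_clear(&s->queue_lock);
    atomic_init(&s->served, 0);
    atomic_init(&s->pending, pending);
    s->maxconn = maxconn;
}

/* Returns true if the lock was taken, false if someone else holds it. */
static bool toy_trylock(atomic_flag *l)
{
    return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void toy_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Dequeue as many pending requests as the server can take. A thread
 * that fails to get the lock leaves at once: the current owner will
 * re-check the condition after unlocking, before it quits.
 */
static int toy_process_queue(struct toy_server *s)
{
    int done = 0;
    bool stop = false; /* sketch-only: set when the toy queue is empty */

    while (!stop && atomic_load(&s->served) < s->maxconn) {
        if (!toy_trylock(&s->queue_lock))
            break; /* another thread is already in charge */

        while (atomic_load(&s->served) < s->maxconn) {
            if (atomic_load(&s->pending) == 0) {
                stop = true;
                break;
            }
            atomic_fetch_sub(&s->pending, 1);
            atomic_fetch_add(&s->served, 1);
            done++;
        }
        toy_unlock(&s->queue_lock);
    }
    return done;
}
```

The outer loop re-evaluates the condition after each unlock, which is
what allows competing threads to leave without waiting for the lock.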

Now with a single server and 16 threads in roundrobin we get the same
performance as with multiple servers, i.e. ~575k req/s instead of
~496k before. Leastconn sees a similar jump, from ~460k to ~560k (the
remaining difference from roundrobin being the calls to
fwlc_srv_reposition()).

The overhead of process_srv_queue() is now around 2%, down from ~20%.
diff --git a/src/queue.c b/src/queue.c
index 7b88422..f0dcd11 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -352,16 +352,23 @@
 	         (!p->srv_act &&
 	          (s == p->lbprm.fbck || (p->options & PR_O_USE_ALL_BK))));
 
-	HA_SPIN_LOCK(SERVER_LOCK, &s->queue.lock);
-	maxconn = srv_dynamic_maxconn(s);
-	while (s->served < maxconn) {
-		int ret = pendconn_process_next_strm(s, p, px_ok);
-		if (!ret)
+	/* let's repeat that under the lock on each round. Threads competing
+	 * for the same server will give up, knowing that at least one of
+	 * them will check the conditions again before quitting.
+	 */
+	while (s->served < (maxconn = srv_dynamic_maxconn(s))) {
+		if (HA_SPIN_TRYLOCK(SERVER_LOCK, &s->queue.lock) != 0)
 			break;
-		_HA_ATOMIC_INC(&s->served);
-		done++;
+
+		while (s->served < maxconn) {
+			int ret = pendconn_process_next_strm(s, p, px_ok);
+			if (!ret)
+				break;
+			_HA_ATOMIC_INC(&s->served);
+			done++;
+		}
+		HA_SPIN_UNLOCK(SERVER_LOCK, &s->queue.lock);
 	}
-	HA_SPIN_UNLOCK(SERVER_LOCK, &s->queue.lock);
 
 	if (done) {
 		_HA_ATOMIC_SUB(&p->totpend, done);