MEDIUM: tasks: collect per-task CPU time and latency

Currently, we measure for each task the cumulative time spent waiting
for the CPU and the cumulative time spent using it. Timestamps are
64-bit integers holding a nanosecond-resolution date. The measurement is
only performed when "profiling.tasks" is enabled, in which case it
consumes less than 1% extra CPU on x86_64. The cumulative processing
time and wait time are reported in "show sess".
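
The accounting described above can be sketched as follows. This is only
an illustration under stated assumptions, not the code added by this
patch: the struct task_prof type and the prof_wakeup() and
prof_account() helpers are hypothetical names, and the caller is assumed
to supply monotonic dates in nanoseconds.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-task profiling fields. */
struct task_prof {
	uint32_t calls;      /* number of times the task was run */
	uint64_t call_date;  /* wakeup date in ns, 0 when not measured */
	uint64_t lat_time;   /* total ns spent waiting for the CPU */
	uint64_t cpu_time;   /* total ns spent using the CPU */
};

/* The task becomes runnable at date <now>. The date is only recorded
 * when profiling is enabled; 0 means "nothing to measure".
 */
static inline void prof_wakeup(struct task_prof *t, int profiling, uint64_t now)
{
	t->call_date = profiling ? now : 0;
}

/* The task ran from <start> to <end>. The wait time is the delay
 * between the wakeup and the start, the CPU time is the run duration.
 */
static inline void prof_account(struct task_prof *t, uint64_t start, uint64_t end)
{
	t->calls++;
	if (!t->call_date)
		return; /* profiling was disabled at wakeup time */
	t->lat_time += start - t->call_date;
	t->cpu_time += end - start;
	t->call_date = 0;
}
```

For instance, a task woken up at 1000 ns which runs from 1500 ns to
4500 ns ends up with lat_time = 500 and cpu_time = 3000.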

The task's counters are also reset when an HTTP transaction is reset,
since the HTTP part pretends to restart on a fresh stream. This ensures
that the logs always report correct numbers for each request.
diff --git a/src/proto_http.c b/src/proto_http.c
index f7222cd..019556c 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -3809,6 +3809,12 @@
 	stream_stop_content_counters(s);
 	stream_update_time_stats(s);
 
+	/* reset the profiling counter */
+	s->task->calls     = 0;
+	s->task->cpu_time  = 0;
+	s->task->lat_time  = 0;
+	s->task->call_date = (profiling & HA_PROF_TASKS) ? now_mono_time() : 0;
+
 	s->logs.accept_date = date; /* user-visible date for logging */
 	s->logs.tv_accept = now;  /* corrected date for internal use */
 	s->logs.t_handshake = 0; /* There are no handshake in keep alive connection. */