MINOR: freq_ctr: introduce a new averaging method

While the current functions report average event counts per period, we are
also interested in average values per event. For this we use a different
method. The principle is to rely on a long tail which sums the new value
with a fraction of the previous value, resulting in a sliding window of
infinite length depending on the precision we're interested in.

The idea is that we always keep (N-1)/N of the sum and add the new sampled
value. The sum over N values can be computed with a simple program for a
constant value 1 at each iteration :

   \       N - 1              e - 1
    >  ( --------- )^x ~= N * -----
   /         N                  e
  x = 1

Note: I'm not sure how to demonstrate this but at least this is easily
verified with a simple program, the sum equals N * 0.632120 for any N
moderately large (tens to hundreds).

Inserting a constant sample value V here simply results in :

   sum = V * N * (e - 1) / e

But we don't want to integrate over a small period, but infinitely. Let's
cut the infinity in P periods of N values. Each period M is exactly the same
as period M-1 with a factor of ((N-1)/N)^N applied. A test shows that given a
large N :

     N - 1           1
  ( ------- )^N ~=  ---
       N             e

Our sum is now a sum of each factor times  :

   N*P                                     P
  ,---                                   ,---
   \         N - 1               e - 1    \     1
    >  v ( --------- )^x ~= VN * -----  *  >   ---
   /           N                   e      /    e^x
  '---                                   '---
  x = 1                                  x = 0

For P "large enough", in tests we get this :

   \     1        e
    >   --- ~=  -----
   /    e^x     e - 1
  x = 0

This simplifies the sum above :

   \         N - 1
    >  v ( --------- )^x = VN
   /           N
  x = 1

So basically by summing values and applying the last result an (N-1)/N factor
we just get N times the values over the long term, so we can recover the
constant value V by dividing by N.

A value added at the entry of the sliding window of N values will thus be
reduced to 1/e or 36.7% after N terms have been added. After a second batch,
it will only be 1/e^2, or 13.5%, and so on. So practically speaking, each
old period of N values represents only a quickly fading ratio of the global
sum :

  period    ratio
    1       36.7%
    2       13.5%
    3       4.98%
    4       1.83%
    5       0.67%
    6       0.25%
    7       0.09%
    8       0.033%
    9       0.012%
   10       0.0045%

So after 10N samples, the initial value has already faded out by a factor of
22026, which is quite fast. If the sliding window is 1024 samples wide, it
means that a sample will only count for 1/22k of its initial value after 10k
samples went after it, which results in half of the value it would represent
using an arithmetic mean. The benefit of this method is that it's very cheap
in terms of computations when N is a power of two. This is very well suited
to record response times as large values will fade out faster than with an
arithmetic mean and will depend on sample count and not time.

Demonstrating all the above assumptions with maths instead of a program is
left as an exercise for the reader.
1 file changed