2010/01/24 - Design of multi-criteria request rate shaping.

We want to be able to rate-shape traffic on multiple criteria. For instance, we
may want to support shaping requests per Host header as well as per source
address.

In order to achieve this, we will use checkpoints, one per criterion. Each of
these checkpoints will consist of a test, a rate counter and a queue.

A request reaches the checkpoint and checks the counter. If the counter is
below the limit, it is updated and the request continues. If the limit is
reached, the request attaches itself to the queue and sleeps. The sleep time
is computed from the queue status, and this computation in turn updates the
queue status.
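
A minimal sketch of such a checkpoint, with purely hypothetical names and
types (nothing below refers to existing code), could look like this :

  /* assumes one fixed accounting period and a non-zero limit */
  struct req {
          struct req *queue_next;      /* link inside a checkpoint's queue */
  };

  struct checkpoint {
          unsigned int limit;          /* max requests allowed per period */
          unsigned int counter;        /* requests accounted in this period */
          unsigned int period_ms;      /* accounting period, in milliseconds */
          unsigned int queue_len;      /* requests currently sleeping here */
          struct req *queue_head, *queue_tail;
  };

  /* Returns 0 if the request may pass immediately, otherwise the delay in
   * milliseconds it should sleep in the checkpoint's queue.
   */
  static unsigned int checkpoint_check(struct checkpoint *cp, struct req *r)
  {
          if (cp->counter < cp->limit) {
                  cp->counter++;
                  return 0;
          }

          /* limit reached : attach to the tail of the queue and derive the
           * sleep time from the number of requests already waiting.
           */
          r->queue_next = NULL;
          if (cp->queue_tail)
                  cp->queue_tail->queue_next = r;
          else
                  cp->queue_head = r;
          cp->queue_tail = r;
          cp->queue_len++;

          return cp->queue_len * cp->period_ms / cp->limit;
  }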

A task is dedicated to each queue. Its sole purpose is to be woken up when the
next queued request may wake up, to check the frequency counter, wake up as
many requests as possible and update the counter. All the woken up requests are
detached from the queue. Maybe the task dedicated to the queue can be avoided
and replaced with the queued tasks' own sleep timers, though this looks tricky.
Or maybe it's just the first request in the queue that should be responsible
for waking up the other ones, without forgetting to pass this responsibility on
to the next request if it leaves the queue.
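
Reusing the hypothetical struct checkpoint and struct req above, the task
dedicated to a queue could roughly do this :

  /* assumed to exist : resumes the request's own task so that it goes on
   * evaluating the remaining criteria.
   */
  void request_wakeup(struct req *r);

  /* Wake up as many queued requests as the counter allows, detach them,
   * and return the date (in ms) at which this task must run again, or 0
   * when the queue is empty.
   */
  static unsigned int queue_task_run(struct checkpoint *cp,
                                     unsigned int now_ms)
  {
          struct req *r;

          while (cp->queue_head && cp->counter < cp->limit) {
                  r = cp->queue_head;
                  cp->queue_head = r->queue_next;
                  if (!cp->queue_head)
                          cp->queue_tail = NULL;
                  cp->queue_len--;
                  cp->counter++;
                  request_wakeup(r);
          }
          return cp->queue_head ? now_ms + cp->period_ms : 0;
  }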

The woken up request then goes on evaluating other criteria and possibly sleeps
again on another one. In the end, the task will have waited the amount of time
required to pass all checkpoints, and all checkpoints will be able to maintain
a permanent load of exactly their limit if enough streams flow through them.

Since a request can only sleep in one queue at a time, it makes sense to use a
linked list element in each session to attach it to any queue. It could very
well be shared with the pendconn hooks which could then be part of the session.
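
For instance (hypothetical, not the current struct session) :

  /* A single queueing element embedded in the session, usable either by a
   * rate-shaping checkpoint's queue or by the pendconn mechanism, since a
   * session only ever sleeps in one queue at a time.
   */
  struct queue_elem {
          struct queue_elem *next, *prev;
  };

  struct session {
          struct queue_elem wait_queue;   /* attachment to whatever queue */
          /* ... rest of the session ... */
  };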

This mechanism could be used to rate-shape sessions and requests per backend
and per server.

When rate-shaping on dynamic criteria, such as the source IP address, we first
have to extract the data pattern, then look it up in a table very similar to
the stickiness tables, but with a frequency counter. At the checkpoint, the
pattern is looked up, the entry created or refreshed, and the frequency counter
updated and checked. Then the request either goes on or sleeps as described
above; if it sleeps, it still waits in the checkpoint's queue, but with a date
computed from the criterion's status.
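
As a sketch, still with hypothetical names, reusing the earlier types and
reducing the table side to a single lookup call :

  /* per-pattern entry, as a stickiness-like table would store it, e.g. one
   * entry per source IP address (limit assumed non-zero).
   */
  struct pattern_entry {
          unsigned int counter;        /* requests accounted in this period */
          unsigned int limit;          /* per-pattern rate limit */
          unsigned int period_ms;
          /* key, expiration, tree/list linkage, ... */
  };

  /* assumed to exist : extracts the pattern from the request and returns
   * the matching entry, creating or refreshing it as needed.
   */
  struct pattern_entry *pattern_lookup(struct checkpoint *cp, struct req *r);

  /* Returns 0 if the request may pass (and accounts for it), otherwise the
   * delay to use when attaching it to the checkpoint's queue. The delay is
   * derived from the criterion's status, not from the queue's.
   */
  static unsigned int checkpoint_check_dynamic(struct checkpoint *cp,
                                               struct req *r)
  {
          struct pattern_entry *e = pattern_lookup(cp, r);

          if (e->counter < e->limit) {
                  e->counter++;
                  return 0;
          }
          return e->period_ms / e->limit;
  }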

This means that we need 3 distinct features :

 - optional pattern extraction
 - per-pattern or per-queue frequency counter
 - time-ordered queue with a task

Based on past experience with frequency counters, it does not appear very easy
to compute exact sleep delays in advance for multiple requests. So most likely
we'll have to run per-criterion queues too, with only the head of the queue
holding a wake-up timeout.

This finally leads us to the following (roughly sketched after the list) :

 - optional pattern extraction
 - per-pattern or per-queue frequency counter
 - per-frequency counter queue
 - head of the queue serves as a global queue timer.
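
Put together, one criterion could then carry all of this (hypothetical,
reusing struct req from the first sketch) :

  struct criterion {
          /* optional pattern extraction ; NULL when shaping the queue as a
           * whole rather than per pattern.
           */
          const void *(*extract)(struct req *r);

          /* per-queue frequency counter ; per-pattern counters would live
           * in the table entries instead.
           */
          unsigned int counter;
          unsigned int limit;
          unsigned int period_ms;

          /* per-frequency-counter queue ; the head's wake-up date serves
           * as the global timer for the whole queue.
           */
          struct req *queue_head, *queue_tail;
          unsigned int next_wakeup_ms;
  };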

This brings us to a very flexible architecture :
 - 1 list of rule-based checkpoints per frontend
 - 1 list of rule-based checkpoints per backend
 - 1 list of rule-based checkpoints per server

Each of these lists holds a number of rules conditioned by ACLs, just like the
use_backend rules, except that here all rules are evaluated in turn.
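
The evaluation would then differ from the use_backend rules only in that it
does not stop at the first match. A hypothetical sketch, reusing
checkpoint_check() from above :

  /* one rule of a checkpoint list : an optional ACL condition guarding a
   * checkpoint.
   */
  struct shaping_rule {
          struct shaping_rule *next;
          int (*cond)(struct req *r);        /* ACL condition, NULL = always */
          struct checkpoint *cp;
  };

  /* Walk the list from *resume. Every matching rule is evaluated in turn ;
   * the request stops at the first checkpoint which delays it, sleeps
   * there, and resumes from the next rule once woken up. Returns the sleep
   * delay, or 0 once all checkpoints have been passed.
   */
  static unsigned int apply_shaping_rules(struct shaping_rule **resume,
                                          struct req *r)
  {
          struct shaping_rule *rule;
          unsigned int delay;

          for (rule = *resume; rule; rule = rule->next) {
                  if (rule->cond && !rule->cond(r))
                          continue;
                  delay = checkpoint_check(rule->cp, r);
                  if (delay) {
                          *resume = rule->next;  /* continue here once woken */
                          return delay;
                  }
          }
          *resume = NULL;
          return 0;
  }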

Since we might sometimes want to enable such a checkpoint without setting any
limit, just to make the counter available to ACLs (or to logging ?), we should
probably try to find a flexible way of declaring just a counter without a
queue.

These checkpoints could be of two types :
 - rate-limit (described here)
 - concurrency-limit (very similar, with a counter but no timer). This
   feature would require keeping track of all accounted criteria in a
   request so that they can be released upon request completion, as
   sketched after this list.
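
A hypothetical way of tracking the accounted criteria so that they can be
released upon completion :

  #define MAX_TRACKED_CRIT 8

  struct conc_checkpoint {
          unsigned int current;        /* currently accounted requests */
          unsigned int limit;          /* max concurrent requests */
  };

  struct req_track {
          struct conc_checkpoint *crit[MAX_TRACKED_CRIT];
          int nb_crit;
  };

  /* returns 1 if the request was accounted, 0 if it must wait or give up */
  static int conc_account(struct conc_checkpoint *cp, struct req_track *t)
  {
          if (cp->current >= cp->limit || t->nb_crit >= MAX_TRACKED_CRIT)
                  return 0;
          cp->current++;
          t->crit[t->nb_crit++] = cp;
          return 1;
  }

  /* called once upon request completion */
  static void conc_release(struct req_track *t)
  {
          while (t->nb_crit)
                  t->crit[--t->nb_crit]->current--;
  }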

It should be possible to define a maximum number of requests in the queue,
above which a 503 is returned. The same applies to the maximum delay in the
queue. We could have it per-task (currently it's the connection timeout) and
abort tasks with a 503 when the delay is exceeded.
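
Extending the first sketch with hypothetical maxqueue and maxdelay fields,
the check could become :

  /* Returns 0 if the request may pass, a positive delay (in ms) if the
   * caller should queue it as in the first sketch, or -1 if it should be
   * aborted with a 503 because the queue or the delay would be exceeded.
   */
  static int checkpoint_check_or_reject(struct checkpoint *cp, struct req *r)
  {
          unsigned int delay;

          if (cp->counter < cp->limit) {
                  cp->counter++;
                  return 0;
          }

          delay = (cp->queue_len + 1) * cp->period_ms / cp->limit;
          if (cp->queue_len >= cp->maxqueue || delay > cp->maxdelay)
                  return -1;           /* let the caller emit the 503 */

          return (int)delay;
  }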

Per-server connection concurrency could be converted to use this mechanism,
which is very similar.

The construct should be flexible enough for the counters to be checked from
ACLs. That would make it possible to reject connections or to switch to an
alternate backend when some limits are reached.
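
For instance, trivial helpers could expose a checkpoint's counters to the
ACL layer (hypothetical again) :

  /* expose a checkpoint's current usage so that ACLs may reject traffic or
   * switch to another backend above a given threshold.
   */
  static unsigned int ckpt_current_rate(const struct checkpoint *cp)
  {
          return cp->counter;          /* events seen in this period */
  }

  static unsigned int ckpt_queue_len(const struct checkpoint *cp)
  {
          return cp->queue_len;        /* requests currently queued */
  }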