                   -----------------------------------------
                          Filters Guide - version 1.7
                          ( Last update: 2016-06-21 )
                   ------------------------------------------
                          Author : Christopher Faulet
              Contact : christopher dot faulet at capflam dot org


ABSTRACT
--------

Filter support is a new feature of HAProxy 1.7. It is a way to extend HAProxy
without touching its core code and, to a certain extent, without knowing its
internals. This feature will ease contributions and reduce the impact of
changes. Another advantage is that HAProxy can be simplified by replacing some
of its parts with filters. As an example, HTTP compression is the first feature
that was moved into a filter.

This document describes how to write a filter and what you have to keep in mind
to do so. It also covers the known limits and the pitfalls to avoid.

As said, filters are still quite new. The API is not frozen and will be
updated/modified/improved/extended as needed.



SUMMARY
-------

  1. Filters introduction
  2. How to use filters
  3. How to write a new filter
     3.1. API Overview
     3.2. Defining the filter name and its configuration
     3.3. Managing the filter lifecycle
     3.4. Handling the streams activity
     3.5. Analyzing the channels activity
     3.6. Filtering the data exchanged
  4. FAQ



1. FILTERS INTRODUCTION
-----------------------

First of all, to fully understand how filters work and how to create one, it is
best to know, at least from a distance, what a proxy (frontend/backend), a
stream and a channel are in HAProxy and how these entities are linked to each
other. doc/internals/entities.pdf is a good overview.

Then, to support filters, many callbacks have been added to HAProxy at different
places, mainly around channel analyzers. Their purpose is to allow filters to
be involved in the data processing, from the stream creation/destruction to
the data forwarding. Depending on what it should do, a filter can implement all
or part of these callbacks. For now, existing callbacks are focused on
streams, but future improvements could enlarge the scope of filters. For
example, it could be useful to handle events at the connection level.

In the HAProxy configuration file, a filter is declared in a proxy section,
except default. So the configuration corresponding to a filter declaration is
attached to a specific proxy, and will be shared by all its instances. It is
opaque from the HAProxy point of view; it is the filter's responsibility to
manage it. Each filter declaration matches a unique configuration. Several
declarations of the same filter in the same proxy will be handled as different
filters by HAProxy.

A filter instance is represented by a partially opaque context (or a state)
attached to a stream and passed as an argument to callbacks. Through this
context, filter instances are stateful. Depending on whether the filter is
declared in a frontend or a backend section, its instances will be created,
respectively, when a stream is created or when a backend is selected. Their
behaviors will also be different. Only instances of filters declared in a
frontend section will be aware of the creation and the destruction of the
stream, and will take part in the channel analysis before the backend is
defined.

It is important to remember that the configuration of a filter is shared by all
its instances, while the context of an instance is owned by a unique stream.
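
The ownership rule above can be illustrated with a small, self-contained C
sketch. It is not HAProxy code: 'struct demo_conf', 'struct demo_instance',
'demo_instance_init()' and 'demo_instance_release()' are hypothetical names
that only mimic the relationship between a shared configuration and a
per-stream context.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a filter configuration: one per filter line,
 * shared by every instance created from it. */
struct demo_conf {
    const char  *name;
    unsigned int nb_instances; /* number of instances created so far */
};

/* Hypothetical stand-in for a filter instance: one per stream. */
struct demo_instance {
    struct demo_conf *conf; /* shared, never freed by an instance */
    void             *ctx;  /* private state, owned by this instance */
};

/* Binds an instance to the shared configuration and allocates its private
 * context. Returns -1 on error, else 0. */
int demo_instance_init(struct demo_instance *inst, struct demo_conf *conf)
{
    unsigned int *bytes_seen = calloc(1, sizeof(*bytes_seen));

    if (!bytes_seen)
        return -1;
    inst->conf = conf;
    inst->ctx  = bytes_seen;
    conf->nb_instances++;
    return 0;
}

/* Releases the private context only; the configuration outlives it. */
void demo_instance_release(struct demo_instance *inst)
{
    free(inst->ctx);
    inst->ctx = NULL;
}
```

Two instances initialized from the same 'struct demo_conf' point to the same
configuration but each one gets its own context, which is the property a real
filter has to preserve.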

Filters are designed to be chained. It is possible to declare several filters in
the same proxy section. The declaration order is important because filters will
be called one after the other in this order. Frontend and backend filters are
also chained, frontend ones being called first. Even if the filter processing
is serialized, each filter will behave as if it were alone (unless it was
developed to be aware of other filters). Because of this, some constraints are
imposed on filters, especially when data exchanged between the client and the
server are processed. We will discuss these constraints again when we tackle
the subject of writing a filter.



2. HOW TO USE FILTERS
---------------------

To use a filter, you must use the parameter 'filter' followed by the filter name
and, optionally, its configuration in the desired listen, frontend or backend
section. For example:

    listen test
        ...
        filter trace name TST
        ...


See doc/configuration.txt for a formal definition of the parameter 'filter'.
Note that additional parameters on the filter line must be parsed by the filter
itself.

The list of available filters is reported by 'haproxy -vv':

    $> haproxy -vv
    HA-Proxy version 1.7-dev2-3a1d4a-33 2016/03/21
    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>

    [...]

    Available filters :
        [COMP] compression
        [TRACE] trace


Multiple filter lines can be used in a proxy section to chain filters. Filters
will be called in the declaration order.

Some filters can support implicit declarations in certain circumstances
(without the filter line). This is not recommended for new features but is
useful for existing features moved into a filter, for backward compatibility
reasons. Implicit declarations are supported when there is only one filter used
on a proxy. When several filters are used, explicit declarations are mandatory.
The HTTP compression filter is one of these filters. Alone, using 'compression'
keywords is enough to use it. But when at least a second filter is used, a
filter line must be added.

    # filter line is optional
    listen t1
        bind *:80
        compression algo gzip
        compression offload
        server srv x.x.x.x:80

    # filter line is mandatory for the compression filter
    listen t2
        bind *:81
        filter trace name T2
        filter compression
        compression algo gzip
        compression offload
        server srv x.x.x.x:80




3. HOW TO WRITE A NEW FILTER
----------------------------

If you want to write a filter, there are 2 header files that you must know:

  * include/types/filters.h: This is the main header file, containing all
                             important structures you will use. It represents
                             the filter API.
  * include/proto/filters.h: This header file contains helper functions that
                             you may need to use. It also contains the internal
                             API used by HAProxy to handle filters.

To ease the integration of filters, it is better to follow some conventions:

  * Use 'flt_' prefix to name your filter (e.g: flt_http_comp or flt_trace).
  * Keep everything related to your filter in the same file.

The filter 'trace' can be used as a template to write your own filter. It is a
good start to see how filters really work.

3.1. API OVERVIEW
-----------------

Writing a filter can be summarized as writing functions and attaching them to
the existing callbacks. Available callbacks are listed in the following
structure:

    struct flt_ops {
        /*
         * Callbacks to manage the filter lifecycle
         */
        int  (*init)  (struct proxy *p, struct flt_conf *fconf);
        void (*deinit)(struct proxy *p, struct flt_conf *fconf);
        int  (*check) (struct proxy *p, struct flt_conf *fconf);

        /*
         * Stream callbacks
         */
        int  (*attach)            (struct stream *s, struct filter *f);
        int  (*stream_start)      (struct stream *s, struct filter *f);
        int  (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be);
        void (*stream_stop)       (struct stream *s, struct filter *f);
        void (*detach)            (struct stream *s, struct filter *f);

        /*
         * Channel callbacks
         */
        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
                                      struct channel *chn);
        int  (*channel_pre_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_post_analyze) (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn);

        /*
         * HTTP callbacks
         */
        int  (*http_headers)       (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_data)          (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_chunk_trailers)(struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_end)           (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_forward_data)  (struct stream *s, struct filter *f,
                                    struct http_msg *msg,
                                    unsigned int len);

        void (*http_reset)         (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        void (*http_reply)         (struct stream *s, struct filter *f,
                                    short status,
                                    const struct chunk *msg);

        /*
         * TCP callbacks
         */
        int  (*tcp_data)        (struct stream *s, struct filter *f,
                                 struct channel *chn);
        int  (*tcp_forward_data)(struct stream *s, struct filter *f,
                                 struct channel *chn,
                                 unsigned int len);
    };


We will explain in the following parts when these callbacks are called and what
they should do.

Filters are declared in proxy sections. So each proxy has an ordered list of
filters, possibly empty if no filter is used. When the configuration of a proxy
is parsed, each filter line represents an entry in this list. In the structure
'proxy', the filter configurations are stored in the field 'filter_configs',
each one of type 'struct flt_conf *':

    /*
     * Structure representing the filter configuration, attached to a proxy and
     * accessible from a filter when instantiated in a stream
     */
    struct flt_conf {
        const char     *id;   /* The filter id */
        struct flt_ops *ops;  /* The filter callbacks */
        void           *conf; /* The filter configuration */
        struct list     list; /* Next filter for the same proxy */
    };

  * 'flt_conf.id' is an identifier, defined by the filter. It can be
    NULL. HAProxy does not use this field. Filters can use it in log messages or
    as a unique identifier to check multiple declarations. It is the filter's
    responsibility to free it, if necessary.

  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
    generally allocated and filled by its parsing function (See § 3.2). It is
    the filter's responsibility to free it.

  * 'flt_conf.ops' references the callbacks implemented by the filter. This
    field must be set during the parsing phase (See § 3.2) and can be refined
    during the initialization phase (See § 3.3). If it is dynamically allocated,
    it is the filter's responsibility to free it.


The filter configuration is global and shared by all its instances. A filter
instance is created in the context of a stream and attached to this stream. In
the structure 'stream', the field 'strm_flt' is the state of all filter
instances attached to a stream:

    /*
     * Structure representing the "global" state of filters attached to a
     * stream.
     */
    struct strm_flt {
        struct list    filters;             /* List of filters attached to a stream */
        struct filter *current[2];          /* From which filter resume processing, for a specific channel.
                                             * This is used for resumable callbacks only,
                                             * If NULL, we start from the first filter.
                                             * 0: request channel, 1: response channel */
        unsigned short flags;               /* STRM_FL_* */
        unsigned char  nb_req_data_filters; /* Number of data filters registered on the request channel */
        unsigned char  nb_rsp_data_filters; /* Number of data filters registered on the response channel */
    };


Filter instances attached to a stream are stored in the field
'strm_flt.filters', each instance being of type 'struct filter *':

    /*
     * Structure representing a filter instance attached to a stream
     *
     * 2-element array fields are used to store info per channel. The first
     * index stands for the request channel, and the second one for the response
     * channel. Especially, <next> and <fwd> are offsets representing the amount
     * of data that the filter has, respectively, parsed and forwarded on a
     * channel. Filters can access these values using FLT_NXT and FLT_FWD
     * macros.
     */
    struct filter {
        struct flt_conf *config;         /* the filter's configuration */
        void            *ctx;            /* The filter context (opaque) */
        unsigned short   flags;          /* FLT_FL_* */
        unsigned int     next[2];        /* Offset, relative to buf->p, to the next
                                          * byte to parse for a specific channel
                                          * 0: request channel, 1: response channel */
        unsigned int     fwd[2];         /* Offset, relative to buf->p, to the next
                                          * byte to forward for a specific channel
                                          * 0: request channel, 1: response channel */
        unsigned int     pre_analyzers;  /* bit field indicating analyzers to
                                          * pre-process */
        unsigned int     post_analyzers; /* bit field indicating analyzers to
                                          * post-process */
        struct list      list;           /* Next filter for the same proxy/stream */
    };

  * 'filter.config' is the filter configuration previously described. All
    instances of a filter share it.

  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its
    responsibility to free it.

  * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later
    (See § 3.5).

  * 'filter.next' and 'filter.fwd' will be described later (See § 3.6).


3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
---------------------------------------------------

When you write a filter, the first thing to do is to add it to the supported
filters. To do so, you must register its name as a valid keyword on the filter
line:

    /* Declare the filter parser for "my_filter" keyword */
    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
            { "my_filter", parse_my_filter_cfg, NULL /* private data */ },
            { NULL, NULL, NULL },
        }
    };

    __attribute__((constructor))
    static void
    __my_filter_init(void)
    {
        flt_register_keywords(&flt_kws);
    }


Then you must define the internal configuration your filter will use. For
example:

    struct my_filter_config {
        struct proxy *proxy;
        char         *name;
        /* ... */
    };


You must also list all callbacks implemented by your filter. Here, we use a
global variable:

    struct flt_ops my_filter_ops = {
        .init   = my_filter_init,
        .deinit = my_filter_deinit,
        .check  = my_filter_config_check,

        /* ... */
    };


Finally, you must define the function to parse your filter configuration, here
'parse_my_filter_cfg'. This function must parse all remaining keywords on the
filter line:

    /* Return -1 on error, else 0 */
    static int
    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
                        struct flt_conf *flt_conf, char **err, void *private)
    {
        struct my_filter_config *my_conf;
        int pos = *cur_arg;

        /* Allocate the internal configuration used by the filter */
        my_conf = calloc(1, sizeof(*my_conf));
        if (!my_conf) {
            memprintf(err, "%s: out of memory", args[*cur_arg]);
            return -1;
        }
        my_conf->proxy = px;

        /* ... */

        /* Parse all keywords supported by the filter and fill the internal
         * configuration */
        pos++; /* Skip the filter name */
        while (*args[pos]) {
            if (!strcmp(args[pos], "name")) {
                if (!*args[pos + 1]) {
                    memprintf(err, "'%s' : '%s' option without value",
                              args[*cur_arg], args[pos]);
                    goto error;
                }
                my_conf->name = strdup(args[pos + 1]);
                if (!my_conf->name) {
                    memprintf(err, "%s: out of memory", args[*cur_arg]);
                    goto error;
                }
                pos += 2;
            }
            /* ... parse other keywords ... */
            else
                break; /* Unknown keyword: stop here to avoid looping forever */
        }
        *cur_arg = pos;

        /* Set callbacks supported by the filter */
        flt_conf->ops = &my_filter_ops;

        /* Last, save the internal configuration */
        flt_conf->conf = my_conf;
        return 0;

      error:
        free(my_conf->name); /* free(NULL) is a no-op */
        free(my_conf);
        return -1;
    }


WARNING: In your parsing function, you must define 'flt_conf->ops'. You must
         also parse all arguments on the filter line. This is mandatory.

In the previous example, we expect to read a filter line as follows:

    filter my_filter name MY_NAME ...


Optionally, by implementing the 'flt_ops.check' callback, you add a step to
check the internal configuration of your filter after the parsing phase, when
the HAProxy configuration is fully defined. For example:

    /* Check configuration of the filter for a specified proxy.
     * Return 1 on error, else 0. */
    static int
    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
    {
        if (px->mode != PR_MODE_HTTP) {
            Alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
            return 1;
        }

        /* ... */

        return 0;
    }



3.3. MANAGING THE FILTER LIFECYCLE
----------------------------------

Once the configuration is parsed and checked, filters are ready to be used.
There are two callbacks to manage the filter lifecycle:

  * 'flt_ops.init': It initializes the filter for a proxy. You may define this
                    callback if you need to complete your filter configuration.

  * 'flt_ops.deinit': It cleans up what the parsing function and the init
                      callback have done. This callback is useful to release
                      memory allocated for the filter configuration.

Here is an example:

    /* Initialize the filter. Returns -1 on error, else 0. */
    static int
    my_filter_init(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        /* ... */

        return 0;
    }

    /* Free resources allocated by the filter. */
    static void
    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        if (my_conf) {
            free(my_conf->name);
            /* ... */
            free(my_conf);
        }
        fconf->conf = NULL;
    }


TODO: Add callbacks to handle creation/destruction of filter instances. And
      document it.


3.4. HANDLING THE STREAMS ACTIVITY
----------------------------------

You may be interested in handling streams activity. For now, there are three
callbacks that you should define to do so:

  * 'flt_ops.stream_start': It is called when a stream is started. This callback
                            can fail by returning a negative value. It will be
                            considered as a critical error by HAProxy, which
                            will disable the listener for a short time.

  * 'flt_ops.stream_set_backend': It is called when a backend is set for a
                                  stream. This callback will be called for all
                                  filters attached to a stream (frontend and
                                  backend). Note this callback is not called if
                                  the frontend and the backend are the same.

  * 'flt_ops.stream_stop': It is called when a stream is stopped. This callback
                           always succeeds. Anyway, it is too late to return an
                           error.

For example:

    /* Called when a stream is created. Returns -1 on error, else 0. */
    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a backend is set for a stream */
    static int
    my_filter_stream_set_backend(struct stream *s, struct filter *filter,
                                 struct proxy *be)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a stream is destroyed */
    static void
    my_filter_stream_stop(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


WARNING: Handling the streams creation and destruction is only possible for
         filters defined on proxies with the frontend capability.

In addition, it is possible to handle creation and destruction of filter
instances using the following callbacks:

  * 'flt_ops.attach': It is called after a filter instance creation, when it is
                      attached to a stream. This happens when the stream is
                      started for filters defined on the stream's frontend and
                      when the backend is set for filters declared on the
                      stream's backend. It is possible to ignore the filter, if
                      needed, by returning 0. This could be useful to have
                      conditional filtering.

  * 'flt_ops.detach': It is called when a filter instance is detached from a
                      stream, before its destruction. This happens when the
                      stream is stopped for filters defined on the stream's
                      frontend and when the analysis ends for filters defined
                      on the stream's backend.

For example:

    /* Called when a filter instance is created and attached to a stream */
    static int
    my_filter_attach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        if (/* ... */)
            return 0; /* Ignore the filter here */
        return 1;
    }

    /* Called when a filter instance is detached from a stream, just before its
     * destruction */
    static void
    my_filter_detach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }
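
The effect of the 'attach' return value can be pictured with the following
self-contained sketch. It does not use the HAProxy API: 'struct demo_filter',
'struct demo_ops' and 'demo_add_filter()' are hypothetical stand-ins, and the
way the caller drops an ignored instance is an assumption about how such a
mechanism might be implemented, not a description of HAProxy internals.

```c
#include <stdlib.h>

/* Simplified stand-ins, NOT the real HAProxy structures. */
struct demo_filter;
struct demo_ops {
    /* <0: error, 0: ignore the filter for this stream, >0: keep it */
    int  (*attach)(struct demo_filter *f, int stream_is_http);
    void (*detach)(struct demo_filter *f);
};
struct demo_filter {
    const struct demo_ops *ops;
    void                  *ctx; /* per-instance state, owned by the filter */
};

/* Example policy: only filter HTTP streams. */
int http_only_attach(struct demo_filter *f, int stream_is_http)
{
    if (!stream_is_http)
        return 0;                    /* ignore the filter here */
    f->ctx = calloc(1, sizeof(int)); /* allocate the per-instance state */
    return f->ctx ? 1 : -1;
}

void http_only_detach(struct demo_filter *f)
{
    free(f->ctx);
    f->ctx = NULL;
}

const struct demo_ops http_only_ops = { http_only_attach, http_only_detach };

/* Tries to create an instance. Returns the instance when it is kept, NULL
 * when it is ignored or when an error occurred (*err is then set to 1). */
struct demo_filter *demo_add_filter(const struct demo_ops *ops,
                                    int stream_is_http, int *err)
{
    struct demo_filter *f = calloc(1, sizeof(*f));
    int ret;

    *err = 0;
    if (!f) {
        *err = 1;
        return NULL;
    }
    f->ops = ops;
    ret = ops->attach ? ops->attach(f, stream_is_http) : 1;
    if (ret <= 0) {
        free(f); /* ignored or failed: the instance is never used */
        *err = (ret < 0);
        return NULL;
    }
    return f;
}
```

With this policy, a non-HTTP stream gets no instance at all, while an HTTP
stream gets one whose context must be released in the 'detach' counterpart.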

3.5. ANALYZING THE CHANNELS ACTIVITY
------------------------------------

The main purpose of filters is to take part in the channel analysis. To do so,
there are 2 callbacks, 'flt_ops.channel_pre_analyze' and
'flt_ops.channel_post_analyze', called respectively before and after each
analyzer attached to a channel, except analyzers responsible for the data
parsing/forwarding (TCP or HTTP data). Concretely, on the request channel, these
callbacks can be called around the following analyzers:

  * tcp_inspect_request        (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
  * http_wait_for_request      (AN_REQ_WAIT_HTTP)
  * http_wait_for_request_body (AN_REQ_HTTP_BODY)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_FE)
  * process_switching_rules    (AN_REQ_SWITCHING_RULES)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_BE)
  * http_process_tarpit        (AN_REQ_HTTP_TARPIT)
  * process_server_rules       (AN_REQ_SRV_RULES)
  * http_process_request       (AN_REQ_HTTP_INNER)
  * tcp_persist_rdp_cookie     (AN_REQ_PRST_RDP_COOKIE)
  * process_sticking_rules     (AN_REQ_STICKING_RULES)

And on the response channel:

  * tcp_inspect_response    (AN_RES_INSPECT)
  * http_wait_for_response  (AN_RES_WAIT_HTTP)
  * process_store_rules     (AN_RES_STORE_RULES)
  * http_process_res_common (AN_RES_HTTP_PROCESS_BE)

Unlike the callbacks seen so far, 'flt_ops.channel_pre_analyze' can interrupt
the stream processing. So a filter can decide not to execute the analyzer that
follows and to wait for the next iteration. If there is more than one filter,
the following ones are skipped. On the next iteration, the filtering resumes
where it was stopped, i.e. on the filter that previously stopped the
processing. So it is possible for a filter to stop the stream processing on a
specific analyzer for a while before continuing. Moreover, this callback can be
called many times for the same analyzer, until it finishes its processing. For
example:

    /* Called before a processing happens on a given channel.
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* wait that a condition is verified before continuing */)
                    return 0;
                break;
            /* ... */
        }
        return 1;
    }

  * 'an_bit' is the analyzer id. All analyzers are listed in
    'include/types/channel.h'.

  * 'chn' is the channel on which the analysis is done. You can know if it is
    the request or the response channel by testing if CF_ISRESP flag is set:

        ((chn->flags & CF_ISRESP) == CF_ISRESP)


In the previous example, the stream processing is blocked before the receipt of
the HTTP request until a condition is verified.
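
The resume behavior described above can be sketched in plain C. The following
code is only a self-contained model of the idea: 'struct demo_chain' and
'demo_run_pre_analyze()' are hypothetical names, and the array-based chain is
an assumption made to keep the example short (HAProxy itself keeps filters in a
list and remembers the filter to resume from in 'strm_flt.current').

```c
#include <stddef.h>

#define DEMO_MAX_FILTERS 8

/* Returns <0 on error, 0 to wait, >0 to continue (same convention as the
 * pre-analyze callback described above). */
typedef int (*demo_pre_analyze_t)(void *ctx);

struct demo_chain {
    demo_pre_analyze_t cb[DEMO_MAX_FILTERS];  /* in declaration order */
    void              *ctx[DEMO_MAX_FILTERS];
    size_t             nb;      /* number of filters in the chain */
    size_t             current; /* filter to resume from, 0 for the first */
};

/* Runs the callbacks before one analyzer. Returns >0 when the analyzer may
 * run, 0 when a filter asked to wait, <0 on error. */
int demo_run_pre_analyze(struct demo_chain *chain)
{
    size_t i;

    for (i = chain->current; i < chain->nb; i++) {
        int ret = chain->cb[i](chain->ctx[i]);

        if (ret < 0)
            return ret;
        if (ret == 0) {
            chain->current = i; /* following filters are skipped for now */
            return 0;
        }
    }
    chain->current = 0; /* everyone agreed: start over for the next analyzer */
    return 1;
}

/* Toy callback: counts its calls and never waits. */
int demo_count_calls(void *ctx)
{
    ++*(int *)ctx;
    return 1;
}

/* Toy callback: waits on its first call, continues afterwards. */
int demo_wait_once(void *ctx)
{
    int *calls = ctx;

    return (++*calls > 1);
}
```

When the second filter of a three-filter chain waits, the third one is not
called during that iteration, and the first one is not called again when the
processing resumes.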

'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
negative value if an error occurs, any other value otherwise. It is called when
a filterable analyzer finishes its processing. So it is called only once for
the same analyzer. For example:

    /* Called after a processing happens on a given channel.
     * Returns a negative value if an error occurs, any other
     * value otherwise. */
    static int
    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct http_msg *msg;

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* A test on received headers before any other treatment */) {
                    msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req);
                    s->txn->status = 400;
                    msg->msg_state = HTTP_MSG_ERROR;
                    http_reply_and_close(s, s->txn->status,
                                         http_error_message(s, HTTP_ERR_400));
                    return -1; /* This is an error ! */
                }
                break;
            /* ... */
        }
        return 1;
    }


Pre and post analyzer callbacks of a filter are not automatically called. You
must register them explicitly on analyzers, by updating the value of the
'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits
are listed in 'include/types/channel.h'. Here is an example:

    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        /* ... */

        /* Register the pre analyzer callback on all request and response
         * analyzers */
        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);

        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
         * AN_RES_WAIT_HTTP analyzers */
        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);

        /* ... */
        return 0;
    }
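
To make the role of these bit fields more concrete, here is a minimal,
self-contained sketch of how a registration mask can gate a callback. The
'DEMO_AN_*' values, 'struct demo_reg_filter' and 'demo_call_pre_analyze()' are
hypothetical: the real analyzer bits are the AN_* constants of HAProxy and
their values are not reproduced here.

```c
/* Hypothetical analyzer bits, for illustration only. */
#define DEMO_AN_REQ_INSPECT   0x00000001u
#define DEMO_AN_REQ_WAIT_HTTP 0x00000002u
#define DEMO_AN_REQ_HTTP_BODY 0x00000004u

struct demo_reg_filter {
    unsigned int pre_analyzers;  /* analyzers to pre-process */
    unsigned int post_analyzers; /* analyzers to post-process */
    unsigned int pre_calls;      /* how many times the pre callback ran */
};

/* Calls the pre-analyze callback only if the filter registered <an_bit>.
 * Returns 1 if the callback was called, else 0. */
int demo_call_pre_analyze(struct demo_reg_filter *f, unsigned int an_bit)
{
    if (!(f->pre_analyzers & an_bit))
        return 0; /* not registered on this analyzer: nothing to do */
    f->pre_calls++;
    return 1;
}
```

A filter registered only on DEMO_AN_REQ_WAIT_HTTP is skipped for every other
analyzer, and clearing the bit again stops further calls for that analyzer.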


To surround the activity of a filter during the channel analysis, two new
analyzers have been added:

  * 'flt_start_analyze' (AN_FLT_START_FE/AN_FLT_START_BE): For a specific
    filter, this analyzer is called before any call to the
    'channel_pre_analyze' callback. From the filter point of view, it calls the
    'flt_ops.channel_start_analyze' callback.

  * 'flt_end_analyze' (AN_FLT_END): For a specific filter, this analyzer is
    called when all other analyzers have finished their processing. From the
    filter point of view, it calls the 'flt_ops.channel_end_analyze' callback.

For TCP streams, these analyzers are called only once. For HTTP streams, if the
client connection is kept alive, this happens at each request/response
roundtrip.

'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
interrupt the stream processing, like 'flt_ops.channel_pre_analyze'. Here is an
example:

    /* Called when analyze starts for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
                                struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }

    /* Called when analyze ends for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }


788Workflow on channels can be summarized as following:
789
Christopher Faulet9adb0a52016-06-21 11:50:49 +0200790 FE: Called for filters defined on the stream's frontend
791 BE: Called for filters defined on the stream's backend
792
                                 +------->---------+
                                 |        |        |
              +----------------------+    |   +----------------------+
              |  flt_ops.attach (FE) |    |   |  flt_ops.attach (BE) |
              +----------------------+    |   +----------------------+
                                 |        |        |
                                 V        |        V
            +--------------------------+  |  +------------------------------------+
            | flt_ops.stream_start (FE)|  |  | flt_ops.stream_set_backend (FE+BE) |
            +--------------------------+  |  +------------------------------------+
                                 |        |                          |
                                ...       |                         ...
                                 |        |                          |
                       +-<-- [1]          ^                          |
                       |          --+     |                          |        --+
                +------<----------+ |     |                 +--------<--------+ |
                |                 | |     |                 |                 | |
                V                 | |     |                 V                 | |
+-------------------------------+ | |     | +-------------------------------+ | |
|     flt_start_analyze (FE)    +-+ |     | |     flt_start_analyze (BE)    +-+ |
|(flt_ops.channel_start_analyze)|   | F   | |(flt_ops.channel_start_analyze)|   |
+---------------+---------------+   | R   | +-------------------------------+   |
                |                   | O   |                 |                   |
                +------<---------+  | N   ^                 +--------<-------+  | B
                |                |  | T   |                 |                |  | A
+---------------|------------+   |  | E   | +---------------|------------+   |  | C
|+--------------V-------------+  |  | N   | |+--------------V-------------+  |  | K
||+----------------------------+ |  | D   | ||+----------------------------+ |  | E
|||flt_ops.channel_pre_analyze | |  |     | |||flt_ops.channel_pre_analyze | |  | N
|||             V              | |  |     | |||             V              | |  | D
|||       analyzer (FE)        +-+  |     | |||      analyzer (FE+BE)      +-+  |
+||             V              |    |     | +||             V              |    |
 +|flt_ops.channel_post_analyze|    |     |  +|flt_ops.channel_post_analyze|    |
  +----------------------------+    |     |   +----------------------------+    |
                |                 --+     |                 |                   |
                +------------>------------+                ...                  |
                                                            |                   |
                                             [ data filtering (see below) ]     |
                                                            |                   |
                                                           ...                  |
                                                            |                   |
                                                            +--------<--------+ |
                                                            |                 | |
                                                            V                 | |
                                            +-------------------------------+ | |
                                            |    flt_end_analyze (FE+BE)    +-+ |
                                            | (flt_ops.channel_end_analyze) |   |
                                            +---------------+---------------+   |
                                                            |                 --+
                                                            V
                                                +----------------------+
                                                |  flt_ops.detach (BE) |
                                                +----------------------+
                                                            |
                        If HTTP stream, go back to [1] --<--+
                                                            |
                                                           ...
                                                            |
                                                            V
                                              +--------------------------+
                                              | flt_ops.stream_stop (FE) |
                                              +--------------------------+
                                                            |
                                                            V
                                                +----------------------+
                                                |  flt_ops.detach (FE) |
                                                +----------------------+
                                                            |
                                                            V

Zooming in on an analyzer box, we have:

                           ...
                            |
                            V
                            |
                            +-----------<-----------+
                            |                       |
          +-----------------+--------------------+  |
          |                 |                    |  |
          |                 +--------<---------+ |  |
          |                 |                  | |  |
          |                 V                  | |  |
          |    flt_ops.channel_pre_analyze  ->-+ |  ^
          |                 |                    |  |
          |                 |                    |  |
          |                 V                    |  |
          |             analyzer  --------->-----+--+
          |                 |                    |
          |                 |                    |
          |                 V                    |
          |   flt_ops.channel_post_analyze       |
          |                 |                    |
          |                 |                    |
          +-----------------+--------------------+
                            |
                            V
                           ...


3.6. FILTERING THE DATA EXCHANGED
---------------------------------

WARNING: To fully understand this part, you must be aware of how buffers work
         in HAProxy. In particular, you must be comfortable with the idea of
         circular buffers. See doc/internals/buffer-operations.txt and
         doc/internals/buffer-ops.fig for details.
         doc/internals/body-parsing.txt could also be useful.

Data filtering is an extended feature of the filters. By default, a filter does
not look into the data exchanged between the client and the server because
doing so is expensive: instead of being forwarded without any processing, each
byte needs to be buffered.

So, to enable data filtering on a channel, call the 'register_data_filter'
function from any of the previous callbacks, at any time. Conversely, to disable
it, call the 'unregister_data_filter' function. For example:

    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* Only the request channel is considered here */
        if (!(msg->chn->flags & CF_ISRESP)) {
            struct http_txn *txn = s->txn;
            struct buffer   *req = msg->chn->buf;
            struct hdr_ctx   ctx;

            /* Enable the data filtering for the request if 'X-Filter' header
             * is set to 'true'. The header lookup starts from the first
             * header (ctx.idx = 0). */
            ctx.idx = 0;
            if (http_find_header2("X-Filter", 8, req->p, &txn->hdr_idx, &ctx) &&
                ctx.vlen == 4 && memcmp(ctx.line + ctx.val, "true", 4) == 0)
                register_data_filter(s, msg->chn, filter);
        }

        return 1;
    }

Here, data filtering is enabled if the HTTP header 'X-Filter' is found and set
to 'true'.

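When a header value is compared this way, the length must be checked before the
bytes are compared, because the value is not NUL-terminated. The helper below is
a minimal, stand-alone sketch of that check. 'hdr_value_equals' is a
hypothetical name and is not part of the HAProxy API; its arguments stand for
'ctx.line + ctx.val' and 'ctx.vlen'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of HAProxy: returns 1 when the <len> bytes
 * starting at <val> are exactly the string <expected>, 0 otherwise. <val> is
 * not assumed to be NUL-terminated. */
static int hdr_value_equals(const char *val, size_t len, const char *expected)
{
    size_t elen;

    assert(val != NULL && expected != NULL);
    elen = strlen(expected);

    /* Compare the lengths first so that memcmp() never reads past <val> */
    return len == elen && memcmp(val, expected, elen) == 0;
}
```

With such a helper, a value like 'tru' or 'truely' is rejected, and only the
exact value 'true' enables the data filtering.
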
If several filters are declared, the evaluation order remains the same,
regardless of the order in which they registered to the data filtering.

Depending on the stream type, TCP or HTTP, the way to handle data filtering is
slightly different. Among other things, for HTTP streams, there are more
callbacks to help you fully handle all the steps of an HTTP transaction. But the
basics are the same. Data filtering is done in 2 stages:

  * The data parsing: At this stage, filters analyze input data on a
    channel. Once a filter has parsed some data, it cannot parse them again. At
    any time, a filter can choose not to parse all available data. So, it is
    possible for a filter to retain data for a while. Because filters are
    chained, a filter cannot parse more data than its predecessors. Thus only
    data considered as parsed by the last filter will be available to the next
    stage, the data forwarding.

  * The data forwarding: At this stage, filters decide how much data HAProxy
    can forward among those considered as parsed at the previous stage. Once a
    filter has marked data as forwardable, it cannot analyze them anymore. At
    any time, a filter can choose not to forward all parsed data. So, it is
    possible for a filter to retain data for a while. Because filters are
    chained, a filter cannot forward more data than its predecessors. Thus only
    data marked as forwardable by the last filter will actually be forwarded by
    HAProxy.

Internally, each filter owns 2 offsets, relative to 'buf->p', representing the
number of bytes already parsed in the available input data and the number of
bytes considered as forwarded. We will call these offsets 'nxt' and 'fwd',
respectively. The following macros reference these offsets:

  * FLT_NXT(flt, chn), flt_req_nxt(flt) and flt_rsp_nxt(flt)

  * FLT_FWD(flt, chn), flt_req_fwd(flt) and flt_rsp_fwd(flt)

where 'flt' is the 'struct filter' passed as argument to all callbacks and 'chn'
is the considered channel.

Using these offsets, the following operations on buffers are possible:

973
974 chn->buf->p + FLT_NXT(flt, chn) // the pointer on parsable data for
975 // the filter 'flt' on the channel 'chn'.
976 // Everything between chn->buf->p and 'nxt' offset was already parsed
977 // by the filter.
978
979 chn->buf->i - FLT_NXT(flt, chn) // the number of bytes of parsable data for
980 // the filter 'flt' on the channel 'chn'.
981
982 chn->buf->p + FLT_FWD(flt, chn) // the pointer on forwardable data for
983 // the filter 'flt' on the channel 'chn'.
984 // Everything between chn->buf->p and 'fwd' offset was already forwarded
985 // by the filter.
986
987
Note that at any time, for a filter, the 'nxt' offset is always greater than or
equal to the 'fwd' offset.

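The chaining rules of the two stages can be modelled with a few lines of
stand-alone C. This is only a simplified sketch: 'struct flt_model',
'struct chn_model' and the 'model_*' functions are hypothetical and do not exist
in HAProxy. It assumes 2 filters and an input size that does not change.

```c
#include <assert.h>
#include <stddef.h>

#define NB_FILTERS 2

/* Hypothetical model of the per-filter offsets, not HAProxy's real types */
struct flt_model {
    size_t nxt; /* bytes already parsed by this filter */
    size_t fwd; /* bytes already marked as forwardable by this filter */
};

struct chn_model {
    size_t input;                     /* bytes of available input data */
    struct flt_model flt[NB_FILTERS]; /* filters, in evaluation order */
};

/* The first filter may parse all the input, the others are bounded by what
 * their predecessor has parsed. */
static size_t parse_limit(const struct chn_model *c, int idx)
{
    return idx == 0 ? c->input : c->flt[idx - 1].nxt;
}

/* Filter <idx> asks to parse <want> more bytes; returns the bytes granted */
static size_t model_parse(struct chn_model *c, int idx, size_t want)
{
    size_t room, got;

    assert(c->flt[idx].nxt <= parse_limit(c, idx));
    room = parse_limit(c, idx) - c->flt[idx].nxt;
    got  = want < room ? want : room;
    c->flt[idx].nxt += got;
    return got;
}

/* The first filter may forward what the last filter has parsed, the others
 * are bounded by what their predecessor has forwarded. */
static size_t forward_limit(const struct chn_model *c, int idx)
{
    return idx == 0 ? c->flt[NB_FILTERS - 1].nxt : c->flt[idx - 1].fwd;
}

/* Filter <idx> asks to forward <want> more bytes; returns the bytes granted */
static size_t model_forward(struct chn_model *c, int idx, size_t want)
{
    size_t room, got;

    assert(c->flt[idx].fwd <= forward_limit(c, idx));
    room = forward_limit(c, idx) - c->flt[idx].fwd;
    got  = want < room ? want : room;
    c->flt[idx].fwd += got;
    return got;
}
```

In this model, with 100 bytes of input, if the first filter parses 60 bytes, the
second one cannot parse more than 60, and no filter can forward more than what
the last filter has parsed. The 'nxt' offset of a filter therefore never falls
below its 'fwd' offset.
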
TODO: Add a schema with buffer states when there are 2 filters that analyze
      data.


3.6.1 FILTERING DATA ON TCP STREAMS
-----------------------------------

TCP data filtering is the easy case, because HAProxy does not parse these
data. So there are only two callbacks to consider:

  * 'flt_ops.tcp_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.tcp_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

Here is an example:

    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_tcp_data(struct stream *s, struct filter *filter,
                       struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int avail = chn->buf->i - FLT_NXT(filter, chn);
        int ret   = avail;

        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
            ret = my_conf->max_parse;

        /* if available data are not completely parsed, wake up the stream to
         * be sure to not freeze it. */
        if (ret != avail)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }


    /* Returns a negative value if an error occurs, else the number of
     * forwarded bytes. */
    static int
    my_filter_tcp_forward_data(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int ret = len;

        /* Do not forward more than 'my_conf->max_forward' bytes at a time */
        if (my_conf->max_forward != 0 && ret > my_conf->max_forward)
            ret = my_conf->max_forward;

        /* if parsed data are not completely forwarded, wake up the stream to
         * be sure to not freeze it. */
        if (ret != len)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }



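To see why the stream has to be woken up when a callback does not consume
everything, the sketch below isolates the arithmetic of the limit used in the
TCP example. 'clamp_consumed' and 'calls_needed' are hypothetical helpers
written for this illustration only; they are not HAProxy functions.

```c
#include <assert.h>

/* Hypothetical helper mirroring the limit applied in the TCP example: returns
 * how many of the <avail> bytes are consumed by one call when at most <max>
 * bytes may be handled at a time (0 means no limit). */
static int clamp_consumed(int avail, int max)
{
    assert(avail >= 0 && max >= 0);
    return (max != 0 && avail > max) ? max : avail;
}

/* Number of callback invocations needed to consume <avail> bytes. Every call
 * but the last one leaves data behind, so something must trigger the next
 * call: this is the role of the wake-up in the example. */
static int calls_needed(int avail, int max)
{
    int calls = 0;

    while (avail > 0) {
        avail -= clamp_consumed(avail, max);
        calls++;
    }
    return calls;
}
```

For instance, with 1000 bytes available and a limit of 256 bytes per call, the
callback has to run 4 times (256 + 256 + 256 + 232 bytes), whereas a single call
is enough without any limit.

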
3.6.2 FILTERING DATA ON HTTP STREAMS
------------------------------------

HTTP data filtering is a bit trickier because HAProxy parses the body
structure, especially for chunked bodies. Basically, there are HTTP counterparts
to the previous callbacks:

  * 'flt_ops.http_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.http_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

But the prototype of these callbacks is slightly different. Instead of the
channel, they take the HTTP message (struct http_msg) as a parameter. You need
to be careful when you use 'http_msg.chunk_len'. This value is the number of
bytes remaining to parse in the HTTP body (or in the current chunk for chunked
messages). The HTTP parser of HAProxy uses it to compute the number of bytes
that it could consume:

    /* Available input data in the current chunk from the HAProxy point of view.
     * msg->next bytes were already parsed. Without data filtering, HAProxy
     * will consume all of it. */
    Bytes = MIN(msg->chunk_len, chn->buf->i - msg->next);


But in your filter, you need to recompute it:

    /* Available input data in the current chunk from the filter point of view.
     * 'nxt' bytes were already parsed. */
    Bytes = MIN(msg->chunk_len + msg->next, chn->buf->i) - FLT_NXT(flt, chn);


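The two computations can be checked on numbers with the stand-alone sketch
below. The function names are hypothetical, and the parameters simply stand for
'msg->chunk_len', 'msg->next', 'chn->buf->i' and FLT_NXT(flt, chn). The real
fields do not all have the same integer type; 'unsigned long' is used here only
to keep the sketch short.

```c
#include <assert.h>

#ifndef MIN
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

/* Bytes of the current chunk that HAProxy could consume (hypothetical name) */
static unsigned long avail_for_haproxy(unsigned long chunk_len,
                                       unsigned long next,
                                       unsigned long buf_i)
{
    assert(next <= buf_i);
    return MIN(chunk_len, buf_i - next);
}

/* Bytes of the current chunk that a filter whose 'nxt' offset is <nxt> can
 * still parse (hypothetical name) */
static unsigned long avail_for_filter(unsigned long chunk_len,
                                      unsigned long next,
                                      unsigned long buf_i,
                                      unsigned long nxt)
{
    /* Offset of the end of the chunk, or of the input data if the chunk is
     * not fully received yet */
    unsigned long end = MIN(chunk_len + next, buf_i);

    assert(nxt <= end);
    return end - nxt;
}
```

For instance, with 1000 bytes of input, 'msg->next' equal to 200 and 500 bytes
remaining in the chunk, HAProxy sees 500 bytes, while a filter that has already
parsed 350 bytes only has 350 bytes left to parse in this chunk.
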
In addition to these callbacks, there are three others:

  * 'flt_ops.http_headers': This callback is called just before the HTTP body
    parsing and after any processing on the request/response HTTP headers. When
    defined, this callback is always called for HTTP streams (i.e. without
    needing to register to the data filtering).

  * 'flt_ops.http_end': This callback is called when the whole HTTP
    request/response is processed. It can interrupt the stream processing. So,
    it could be used to synchronize the HTTP request with the HTTP response, for
    example:

    /* Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_http_end(struct stream *s, struct filter *filter,
                       struct http_msg *msg)
    {
        struct my_filter_ctx *my_ctx = filter->ctx;


        if (!(msg->chn->flags & CF_ISRESP)) /* The request */
            my_ctx->end_of_req = 1;
        else /* The response */
            my_ctx->end_of_rsp = 1;

        /* Both the request and the response are finished */
        if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
            return 1;

        /* Wait */
        return 0;
    }


  * 'flt_ops.http_chunk_trailers': This callback is called for chunked HTTP
    messages only, once all chunks have been parsed. HTTP trailers can be
    parsed in several passes, and this callback is called each time. The number
    of bytes parsed by HAProxy at each iteration is stored in 'msg->sol'.

Then, to finish, there are 2 informational callbacks:

  * 'flt_ops.http_reset': This callback is called when an HTTP message is
    reset. This only happens when a '100-continue' response is received. It
    could be useful to reset the filter context before receiving the true
    response.

  * 'flt_ops.http_reply': This callback is called when, at any time, HAProxy
    decides to stop the processing of an HTTP message and to send an internal
    response to the client. This mainly happens when an error or a redirect
    occurs.


3.6.3 REWRITING DATA
--------------------

The last part of data filtering, and the trickiest one, is data rewriting. For
now, the filter API does not offer many functions to handle it. There are only
functions to notify HAProxy that the data size has changed, so that it can
update the internal state of filters. It is your responsibility to update the
data itself, i.e. the buffer offsets. For an HTTP message, you must also update
the 'msg->next' and 'msg->chunk_len' values accordingly:

  * 'flt_change_next_size': This function must be called when a filter alters
    incoming data. It updates the 'nxt' offset value of all its predecessors.
    Not calling this function when a filter changes the size of incoming data
    leads to undefined behavior.

    struct buffer *buf   = msg->chn->buf;
    unsigned int   avail = MIN(msg->chunk_len + msg->next, buf->i) -
                           flt_rsp_nxt(filter);

    if (avail > 10 /* && ...Some condition... */) {
        /* Move the buffer forward to have buf->p pointing on unparsed
         * data */
        b_adv(buf, flt_rsp_nxt(filter));

        /* Skip first 10 bytes by moving the remaining ones over them. To
         * simplify this example, we consider a non-wrapping buffer */
        memmove(buf->p, buf->p + 10, avail - 10);

        /* Restore buf->p value */
        b_rew(buf, flt_rsp_nxt(filter));

        /* Now update other filters */
        flt_change_next_size(filter, msg->chn, -10);

        /* Update the buffer state */
        buf->i -= 10;

        /* And update the HTTP message state */
        msg->chunk_len -= 10;

        return (avail - 10);
    }
    else
        return 0; /* Wait for more data */


  * 'flt_change_forward_size': This function must be called when a filter
    alters parsed data. It updates the offset values ('nxt' and 'fwd') of all
    filters. Not calling this function when a filter changes the size of parsed
    data leads to undefined behavior.

    /* len is the number of bytes of forwardable data */
    struct buffer *buf = msg->chn->buf;

    if (len > 10 /* && ...Some condition... */) {
        /* Move the buffer forward to have buf->p pointing on non-forwarded
         * data */
        b_adv(buf, flt_rsp_fwd(filter));

        /* Skip first 10 bytes by moving the remaining ones over them. To
         * simplify this example, we consider a non-wrapping buffer */
        memmove(buf->p, buf->p + 10, len - 10);

        /* Restore buf->p value */
        b_rew(buf, flt_rsp_fwd(filter));

        /* Now update other filters */
        flt_change_forward_size(filter, msg->chn, -10);

        /* Update the buffer state */
        buf->i -= 10;

        /* And update the HTTP message state */
        msg->next -= 10;

        return (len - 10);
    }
    else
        return 0; /* Wait for more data */


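The byte-moving step of these examples can be tried in isolation on a plain
array. The sketch below is a minimal illustration that assumes a linear,
non-wrapping region. 'drop_head' is a hypothetical helper, not an HAProxy
function, and it does not update any buffer or filter state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes the first <n> bytes of the <len> bytes starting
 * at <data> by shifting the remaining bytes down. Returns the new length. */
static size_t drop_head(char *data, size_t len, size_t n)
{
    assert(data != NULL && n <= len);

    /* Source and destination overlap, so memmove() is required here;
     * memcpy() would be undefined behavior. */
    memmove(data, data + n, len - n);
    return len - n;
}
```

Applied to the 16 bytes "0123456789abcdef" with n set to 10, the region starts
with "abcdef" afterwards and the returned length is 6. In a real filter, the
offsets of the other filters, the buffer state and the HTTP message state still
have to be updated as described above.

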
TODO: implement everything needed to rewrite data easily. For HTTP messages,
      this requires a chunked message. Otherwise, the size of the data cannot
      be changed.




4. FAQ
------

4.1. Detect multiple declarations of the same filter
----------------------------------------------------

TODO