                   -----------------------------------------
                          Filters Guide - version 1.8
                          ( Last update: 2016-11-10 )
                   ------------------------------------------
                          Author : Christopher Faulet
              Contact : christopher dot faulet at capflam dot org


ABSTRACT
--------

The filters support is a new feature of HAProxy 1.7. It is a way to extend
HAProxy without touching its core code and, to a certain extent, without knowing
its internals. This feature will ease contributions, reducing the impact of
changes. Another advantage will be to simplify HAProxy by replacing some parts
with filters. As we will see, and as an example, the HTTP compression is the
first feature moved into a filter.

This document describes how to write a filter and what you have to keep in mind
to do so. It also talks about the known limits and the pitfalls to avoid.

As said, filters are quite new for now. The API is not frozen and will be
updated/modified/improved/extended as needed.



SUMMARY
-------

  1.    Filters introduction
  2.    How to use filters
  3.    How to write a new filter
  3.1.  API Overview
  3.2.  Defining the filter name and its configuration
  3.3.  Managing the filter lifecycle
  3.4.  Handling the streams activity
  3.5.  Analyzing the channels activity
  3.6.  Filtering the data exchanged
  4.    FAQ



1. FILTERS INTRODUCTION
-----------------------

First of all, to fully understand how filters work and how to create one, it is
best to know, at least from a distance, what a proxy (frontend/backend), a
stream and a channel are in HAProxy and how these entities are linked to each
other. doc/internals/entities.pdf is a good overview.

Then, to support filters, many callbacks have been added to HAProxy at different
places, mainly around channel analyzers. Their purpose is to allow filters to
be involved in the data processing, from the stream creation/destruction to
the data forwarding. Depending on what it should do, a filter can implement all
or part of these callbacks. For now, existing callbacks are focused on
streams. But future improvements could enlarge the filters scope. For example,
it could be useful to handle events at the connection level.

In the HAProxy configuration file, a filter is declared in a proxy section,
except defaults. So the configuration corresponding to a filter declaration is
attached to a specific proxy, and will be shared by all its instances. It is
opaque from the HAProxy point of view; it is the filter's responsibility to
manage it. Each filter declaration matches a unique configuration. Several
declarations of the same filter in the same proxy will be handled as different
filters by HAProxy.

A filter instance is represented by a partially opaque context (or a state)
attached to a stream and passed as arguments to callbacks. Through this context,
filter instances are stateful. Depending on whether the filter is declared in a
frontend or a backend section, its instances will be created, respectively,
when a stream is created or when a backend is selected. Their behaviors will
also be different. Only instances of filters declared in a frontend section will
be aware of the creation and the destruction of the stream, and will take part
in the channels analyzing before the backend is defined.

It is important to remember that the configuration of a filter is shared by all
its instances, while the context of an instance is owned by a unique stream.

Filters are designed to be chained. It is possible to declare several filters in
the same proxy section. The declaration order is important because filters will
be called one after the other respecting this order. Frontend and backend
filters are also chained, frontend ones called first. Even if the filters
processing is serialized, each filter will behave as if it was alone (unless it
was developed to be aware of other filters). For all that, some constraints are
imposed on filters, especially when data exchanged between the client and the
server are processed. We will discuss these constraints again when we tackle the
subject of writing a filter.

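The calling order described above can be sketched in a few lines of standalone
C. This is only an illustration under simplifying assumptions: filters are
reduced to their names, and 'demo_call_order' and its parameters are
hypothetical, they are not part of the HAProxy API.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only. Fills <out> with the filter
 * names in the order their callbacks would be called for a stream: all
 * filters declared on the frontend first, then the ones declared on the
 * backend, each group in declaration order.
 * Returns the number of entries written, at most <max>. */
size_t demo_call_order(const char *const *fe, size_t nfe,
                       const char *const *be, size_t nbe,
                       const char **out, size_t max)
{
    size_t n = 0, i;

    for (i = 0; i < nfe && n < max; i++)
        out[n++] = fe[i];   /* frontend filters, declaration order */
    for (i = 0; i < nbe && n < max; i++)
        out[n++] = be[i];   /* then backend filters, declaration order */
    return n;
}
```

With "trace" then "compression" declared on a frontend and a single filter
declared on the selected backend, the resulting order is trace, compression,
then the backend filter.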


2. HOW TO USE FILTERS
---------------------

To use a filter, you must use the parameter 'filter' followed by the filter name
and, optionally, its configuration in the desired listen, frontend or backend
section. For example:

    listen test
        ...
        filter trace name TST
        ...


See doc/configuration.txt for a formal definition of the parameter 'filter'.
Note that additional parameters on the filter line must be parsed by the filter
itself.

The list of available filters is reported by 'haproxy -vv':

    $> haproxy -vv
    HA-Proxy version 1.7-dev2-3a1d4a-33 2016/03/21
    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>

    [...]

    Available filters :
        [COMP] compression
        [TRACE] trace


Multiple filter lines can be used in a proxy section to chain filters. Filters
will be called in the declaration order.

Some filters can support implicit declarations in certain circumstances
(without the filter line). This is not recommended for new features but is
useful for existing ones moved into a filter, for backward compatibility
reasons. Implicit declarations are supported when there is only one filter used
on a proxy. When several filters are used, explicit declarations are mandatory.
The HTTP compression filter is one of these filters. Alone, using 'compression'
keywords is enough to use it. But when at least a second filter is used, a
filter line must be added.

    # filter line is optional
    listen t1
        bind *:80
        compression algo gzip
        compression offload
        server srv x.x.x.x:80

    # filter line is mandatory for the compression filter
    listen t2
        bind *:81
        filter trace name T2
        filter compression
        compression algo gzip
        compression offload
        server srv x.x.x.x:80




3. HOW TO WRITE A NEW FILTER
----------------------------

If you want to write a filter, there are 2 header files that you must know:

  * include/types/filters.h: This is the main header file, containing all
                             important structures you will use. It represents
                             the filter API.
  * include/proto/filters.h: This header file contains helper functions that
                             you may need to use. It also contains the internal
                             API used by HAProxy to handle filters.

To ease the filters integration, it is better to follow some conventions:

  * Use 'flt_' prefix to name your filter (e.g: flt_http_comp or flt_trace).
  * Keep everything related to your filter in the same file.

The filter 'trace' can be used as a template to write your own filter. It is a
good start to see how filters really work.

3.1. API OVERVIEW
-----------------

Writing a filter can be summarized as writing functions and attaching them to
the existing callbacks. Available callbacks are listed in the following
structure:

    struct flt_ops {
        /*
         * Callbacks to manage the filter lifecycle
         */
        int  (*init)  (struct proxy *p, struct flt_conf *fconf);
        void (*deinit)(struct proxy *p, struct flt_conf *fconf);
        int  (*check) (struct proxy *p, struct flt_conf *fconf);

        /*
         * Stream callbacks
         */
        int  (*attach)            (struct stream *s, struct filter *f);
        int  (*stream_start)      (struct stream *s, struct filter *f);
        int  (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be);
        void (*stream_stop)       (struct stream *s, struct filter *f);
        void (*detach)            (struct stream *s, struct filter *f);
        void (*check_timeouts)    (struct stream *s, struct filter *f);

        /*
         * Channel callbacks
         */
        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
                                      struct channel *chn);
        int  (*channel_pre_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_post_analyze) (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn);

        /*
         * HTTP callbacks
         */
        int  (*http_headers)       (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_data)          (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_chunk_trailers)(struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_end)           (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_forward_data)  (struct stream *s, struct filter *f,
                                    struct http_msg *msg,
                                    unsigned int len);

        void (*http_reset)         (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        void (*http_reply)         (struct stream *s, struct filter *f,
                                    short status,
                                    const struct chunk *msg);

        /*
         * TCP callbacks
         */
        int  (*tcp_data)        (struct stream *s, struct filter *f,
                                 struct channel *chn);
        int  (*tcp_forward_data)(struct stream *s, struct filter *f,
                                 struct channel *chn,
                                 unsigned int len);
    };


We will explain in the following parts when these callbacks are called and what
they should do.

Filters are declared in proxy sections. So each proxy has an ordered list of
filters, possibly empty if no filter is used. When the configuration of a proxy
is parsed, each filter line represents an entry in this list. In the structure
'proxy', the filters configurations are stored in the field 'filter_configs',
each one of type 'struct flt_conf *':

    /*
     * Structure representing the filter configuration, attached to a proxy and
     * accessible from a filter when instantiated in a stream
     */
    struct flt_conf {
        const char     *id;   /* The filter id */
        struct flt_ops *ops;  /* The filter callbacks */
        void           *conf; /* The filter configuration */
        struct list     list; /* Next filter for the same proxy */
    };

  * 'flt_conf.id' is an identifier, defined by the filter. It can be
    NULL. HAProxy does not use this field. Filters can use it in log messages or
    as a unique identifier to check multiple declarations. It is the filter's
    responsibility to free it, if necessary.

  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
    generally allocated and filled by its parsing function (See § 3.2). It is
    the filter's responsibility to free it.

  * 'flt_conf.ops' references the callbacks implemented by the filter. This
    field must be set during the parsing phase (See § 3.2) and can be refined
    during the initialization phase (See § 3.3). If it is dynamically allocated,
    it is the filter's responsibility to free it.


The filter configuration is global and shared by all its instances. A filter
instance is created in the context of a stream and attached to this stream. In
the structure 'stream', the field 'strm_flt' is the state of all filter
instances attached to a stream:

    /*
     * Structure representing the "global" state of filters attached to a
     * stream.
     */
    struct strm_flt {
        struct list    filters;             /* List of filters attached to a stream */
        struct filter *current[2];          /* From which filter resume processing, for a specific channel.
                                             * This is used for resumable callbacks only,
                                             * If NULL, we start from the first filter.
                                             * 0: request channel, 1: response channel */
        unsigned short flags;               /* STRM_FL_* */
        unsigned char  nb_req_data_filters; /* Number of data filters registered on the request channel */
        unsigned char  nb_rsp_data_filters; /* Number of data filters registered on the response channel */
    };


Filter instances attached to a stream are stored in the field
'strm_flt.filters', each instance is of type 'struct filter *':

    /*
     * Structure representing a filter instance attached to a stream
     *
     * Two-element array fields are used to store info per channel. The first
     * index stands for the request channel, and the second one for the response
     * channel. Especially, <next> and <fwd> are offsets representing the amount
     * of data that the filter has, respectively, parsed and forwarded on a
     * channel. Filters can access these values using FLT_NXT and FLT_FWD
     * macros.
     */
    struct filter {
        struct flt_conf *config;         /* the filter's configuration */
        void            *ctx;            /* The filter context (opaque) */
        unsigned short   flags;          /* FLT_FL_* */
        unsigned int     next[2];        /* Offset, relative to buf->p, to the next
                                          * byte to parse for a specific channel
                                          * 0: request channel, 1: response channel */
        unsigned int     fwd[2];         /* Offset, relative to buf->p, to the next
                                          * byte to forward for a specific channel
                                          * 0: request channel, 1: response channel */
        unsigned int     pre_analyzers;  /* bit field indicating analyzers to
                                          * pre-process */
        unsigned int     post_analyzers; /* bit field indicating analyzers to
                                          * post-process */
        struct list      list;           /* Next filter for the same proxy/stream */
    };

  * 'filter.config' is the filter configuration previously described. All
    instances of a filter share it.

  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its
    responsibility to free it.

  * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later
    (See § 3.5).

  * 'filter.next' and 'filter.fwd' will be described later (See § 3.6).


3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
---------------------------------------------------

When you write a filter, the first thing to do is to add it to the supported
filters. To do so, you must register its name as a valid keyword on the filter
line:

    /* Declare the filter parser for "my_filter" keyword */
    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
            { "my_filter", parse_my_filter_cfg, NULL /* private data */ },
            { NULL, NULL, NULL },
        }
    };

    __attribute__((constructor))
    static void
    __my_filter_init(void)
    {
        flt_register_keywords(&flt_kws);
    }


Then you must define the internal configuration your filter will use. For
example:

    struct my_filter_config {
        struct proxy *proxy;
        char         *name;
        /* ... */
    };


You also must list all callbacks implemented by your filter. Here, we use a
global variable:

    struct flt_ops my_filter_ops = {
        .init   = my_filter_init,
        .deinit = my_filter_deinit,
        .check  = my_filter_config_check,

        /* ... */
    };


Finally, you must define the function to parse your filter configuration, here
'parse_my_filter_cfg'. This function must parse all remaining keywords on the
filter line:

    /* Return -1 on error, else 0 */
    static int
    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
                        struct flt_conf *flt_conf, char **err, void *private)
    {
        struct my_filter_config *my_conf;
        int pos = *cur_arg;

        /* Allocate the internal configuration used by the filter */
        my_conf = calloc(1, sizeof(*my_conf));
        if (!my_conf) {
            memprintf(err, "%s: out of memory", args[*cur_arg]);
            return -1;
        }
        my_conf->proxy = px;

        /* ... */

        /* Parse all keywords supported by the filter and fill the internal
         * configuration */
        pos++; /* Skip the filter name */
        while (*args[pos]) {
            if (!strcmp(args[pos], "name")) {
                if (!*args[pos + 1]) {
                    memprintf(err, "'%s' : '%s' option without value",
                              args[*cur_arg], args[pos]);
                    goto error;
                }
                my_conf->name = strdup(args[pos + 1]);
                if (!my_conf->name) {
                    memprintf(err, "%s: out of memory", args[*cur_arg]);
                    goto error;
                }
                pos += 2;
            }

            /* ... parse other keywords; stop or fail on an unknown one so
             * that <pos> always advances and the loop terminates ... */
        }
        *cur_arg = pos;

        /* Set callbacks supported by the filter */
        flt_conf->ops = &my_filter_ops;

        /* Last, save the internal configuration */
        flt_conf->conf = my_conf;
        return 0;

      error:
        if (my_conf->name)
            free(my_conf->name);
        free(my_conf);
        return -1;
    }


WARNING: In your parsing function, you must define 'flt_conf->ops'. You must
         also parse all arguments on the filter line. This is mandatory.

In the previous example, we expect to read a filter line as follows:

    filter my_filter name MY_NAME ...

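The argument loop of such a parsing function can be exercised on its own. The
following standalone sketch keeps only the keyword handling, without any
HAProxy type. 'demo_parse_args' is a hypothetical name, and rejecting unknown
keywords is one possible policy chosen for this illustration, not a rule
imposed by HAProxy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified parser for the arguments following the filter
 * name on a filter line. <args> ends with an empty string and
 * <*cur_arg> is the index of the filter name.
 * On success, <*name> points to the value of the "name" keyword (or is NULL
 * if it was not set), <*cur_arg> is moved after the last parsed argument and
 * 0 is returned. -1 is returned on a missing value or an unknown keyword. */
int demo_parse_args(char **args, int *cur_arg, const char **name)
{
    int pos = *cur_arg + 1; /* Skip the filter name */

    *name = NULL;
    while (*args[pos]) {
        if (!strcmp(args[pos], "name")) {
            if (!*args[pos + 1])
                return -1;  /* "name" without value */
            *name = args[pos + 1];
            pos += 2;
        }
        else
            return -1;      /* unknown keyword: fail instead of looping */
    }
    *cur_arg = pos;
    return 0;
}
```

The important point is that every branch of the loop either advances the
position or leaves the loop, so a filter line with an unexpected keyword cannot
block the configuration parsing.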

Optionally, by implementing the 'flt_ops.check' callback, you add a step to
check the internal configuration of your filter after the parsing phase, when
the HAProxy configuration is fully defined. For example:

    /* Check configuration of my_filter for a specified proxy.
     * Return 1 on error, else 0. */
    static int
    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
    {
        if (px->mode != PR_MODE_HTTP) {
            Alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
            return 1;
        }

        /* ... */

        return 0;
    }



3.3. MANAGING THE FILTER LIFECYCLE
----------------------------------

Once the configuration is parsed and checked, filters are ready to be used.
There are two callbacks to manage the filter lifecycle:

  * 'flt_ops.init': It initializes the filter for a proxy. You may define this
                    callback if you need to complete your filter configuration.

  * 'flt_ops.deinit': It cleans up what the parsing function and the init
                      callback have done. This callback is useful to release
                      memory allocated for the filter configuration.

Here is an example:

    /* Initialize the filter. Returns -1 on error, else 0. */
    static int
    my_filter_init(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        /* ... */

        return 0;
    }

    /* Free resources allocated by my_filter. */
    static void
    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        if (my_conf) {
            free(my_conf->name);
            /* ... */
            free(my_conf);
        }
        fconf->conf = NULL;
    }


TODO: Add callbacks to handle creation/destruction of filter instances. And
      document it.


3.4. HANDLING THE STREAMS ACTIVITY
----------------------------------

You may be interested in handling the streams activity. For now, there are three
callbacks that you should define to do so:

  * 'flt_ops.stream_start': It is called when a stream is started. This callback
                            can fail by returning a negative value. It will be
                            considered as a critical error by HAProxy which
                            disables the listener for a short time.

  * 'flt_ops.stream_set_backend': It is called when a backend is set for a
                                  stream. This callback will be called for all
                                  filters attached to a stream (frontend and
                                  backend). Note this callback is not called if
                                  the frontend and the backend are the same.

  * 'flt_ops.stream_stop': It is called when a stream is stopped. This callback
                           always succeeds. Anyway, it is too late to return an
                           error.

For example:

    /* Called when a stream is created. Returns -1 on error, else 0. */
    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a backend is set for a stream */
    static int
    my_filter_stream_set_backend(struct stream *s, struct filter *filter,
                                 struct proxy *be)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a stream is destroyed */
    static void
    my_filter_stream_stop(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


WARNING: Handling the streams creation and destruction is only possible for
         filters defined on proxies with the frontend capability.

In addition, it is possible to handle creation and destruction of filter
instances using the following callbacks:

  * 'flt_ops.attach': It is called after a filter instance creation, when it is
                      attached to a stream. This happens when the stream is
                      started for filters defined on the stream's frontend and
                      when the backend is set for filters declared on the
                      stream's backend. It is possible to ignore the filter, if
                      needed, by returning 0. This could be useful to have
                      conditional filtering.

  * 'flt_ops.detach': It is called when a filter instance is detached from a
                      stream, before its destruction. This happens when the
                      stream is stopped for filters defined on the stream's
                      frontend and when the analyze ends for filters defined on
                      the stream's backend.

For example:

    /* Called when a filter instance is created and attached to a stream */
    static int
    my_filter_attach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        if (/* ... */)
            return 0; /* Ignore the filter here */
        return 1;
    }

    /* Called when a filter instance is detached from a stream, just before its
     * destruction */
    static void
    my_filter_detach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }

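The conditional filtering allowed by 'flt_ops.attach' can be pictured with a
small standalone model. Everything here is hypothetical and only mirrors the
convention stated above (returning 0 ignores the filter for the stream):
'struct demo_stream', 'demo_attach_fn' and the helper functions are not HAProxy
definitions.

```c
#include <stddef.h>

/* Hypothetical, minimal view of a stream, for this sketch only. */
struct demo_stream {
    int is_http; /* non-zero for an HTTP stream */
};

/* Hypothetical attach callback type: 0 means "ignore this filter for the
 * stream", the other value used here (1) means "keep it". */
typedef int (*demo_attach_fn)(const struct demo_stream *s);

/* A filter that is kept on every stream. */
int demo_attach_always(const struct demo_stream *s)
{
    (void)s;
    return 1;
}

/* A filter that is only kept on HTTP streams. */
int demo_attach_http_only(const struct demo_stream *s)
{
    return s->is_http ? 1 : 0;
}

/* Calls each attach callback in order and counts the filters that stay
 * attached to the stream. */
size_t demo_count_attached(const struct demo_stream *s,
                           const demo_attach_fn *attach, size_t n)
{
    size_t kept = 0, i;

    for (i = 0; i < n; i++) {
        if (attach[i](s) != 0)
            kept++;
    }
    return kept;
}
```

In this model, a TCP stream only keeps the first filter while an HTTP stream
keeps both, which is the kind of per-stream decision the attach callback makes
possible.
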
Finally, you may be interested in being notified when the stream is woken up
because of an expired timer. This gives you a chance to check your own
timeouts, if any. To do so you can use the following callback:

  * 'flt_ops.check_timeouts': It is called when a stream is woken up because
                              of an expired timer.

For example:

    /* Called when a stream is woken up because of an expired timer */
    static void
    my_filter_check_timeouts(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


3.5. ANALYZING THE CHANNELS ACTIVITY
------------------------------------

The main purpose of filters is to take part in the channels analyzing. To do so,
there are 2 callbacks, 'flt_ops.channel_pre_analyze' and
'flt_ops.channel_post_analyze', called respectively before and after each
analyzer attached to a channel, except analyzers responsible for the data
parsing/forwarding (TCP or HTTP data). Concretely, on the request channel, these
callbacks could be called before or after the following analyzers:

  * tcp_inspect_request           (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
  * http_wait_for_request         (AN_REQ_WAIT_HTTP)
  * http_wait_for_request_body    (AN_REQ_HTTP_BODY)
  * http_process_req_common       (AN_REQ_HTTP_PROCESS_FE)
  * process_switching_rules       (AN_REQ_SWITCHING_RULES)
  * http_process_req_common       (AN_REQ_HTTP_PROCESS_BE)
  * http_process_tarpit           (AN_REQ_HTTP_TARPIT)
  * process_server_rules          (AN_REQ_SRV_RULES)
  * http_process_request          (AN_REQ_HTTP_INNER)
  * tcp_persist_rdp_cookie        (AN_REQ_PRST_RDP_COOKIE)
  * process_sticking_rules        (AN_REQ_STICKING_RULES)

And on the response channel:

  * tcp_inspect_response          (AN_RES_INSPECT)
  * http_wait_for_response        (AN_RES_WAIT_HTTP)
  * process_store_rules           (AN_RES_STORE_RULES)
  * http_process_res_common       (AN_RES_HTTP_PROCESS_BE)

Unlike the other callbacks seen so far, 'flt_ops.channel_pre_analyze' can
interrupt the stream processing. So a filter can decide not to execute the
analyzer that follows and wait for the next iteration. If there is more than one
filter, the following ones are skipped. On the next iteration, the filtering
resumes where it was stopped, i.e. on the filter that has previously stopped the
processing. So it is possible for a filter to stop the stream processing on a
specific analyzer for a while before continuing. Moreover, this callback can be
called many times for the same analyzer, until it finishes its processing. For
example:

    /* Called before a processing happens on a given channel.
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* wait that a condition is verified before continuing */)
                    return 0;
                break;
            /* ... */
        }
        return 1;
    }

  * 'an_bit' is the analyzer id. All analyzers are listed in
    'include/types/channel.h'.

  * 'chn' is the channel on which the analyzing is done. You can know if it is
    the request or the response channel by testing if CF_ISRESP flag is set:

      ((chn->flags & CF_ISRESP) == CF_ISRESP)


In the previous example, the stream processing is blocked before receipt of the
HTTP request until a condition is verified.

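The resume behavior of this callback can be modeled outside HAProxy. The
sketch below is a simplified, hypothetical model: 'struct demo_chain' only
imitates the role of 'strm_flt.current' for one channel, and none of the
'demo_' names exist in HAProxy. The real implementation also has to deal with
both channels and with the analyzers themselves.

```c
#include <stddef.h>

#define DEMO_MAX_FILTERS 8

/* Hypothetical callback type following the return convention described
 * above: negative on error, 0 to wait, any other value to continue. */
typedef int (*demo_pre_analyze_fn)(void *ctx);

struct demo_chain {
    demo_pre_analyze_fn cb[DEMO_MAX_FILTERS];  /* one callback per filter */
    void               *ctx[DEMO_MAX_FILTERS]; /* per-filter context */
    size_t              count;                 /* number of filters */
    size_t              current;               /* filter to resume from */
};

/* Runs the pre-analyze callbacks of the chain for one analyzer.
 * Returns 1 if all filters let the analyzer run, 0 if a filter asked to wait
 * (the next call resumes from this filter, earlier ones are not called
 * again), and -1 if a filter reported an error. */
int demo_run_pre_analyze(struct demo_chain *c)
{
    size_t i;

    for (i = c->current; i < c->count; i++) {
        int ret = c->cb[i](c->ctx[i]);

        if (ret < 0) {
            c->current = 0;
            return -1;
        }
        if (ret == 0) {
            c->current = i; /* remember where to resume */
            return 0;
        }
    }
    c->current = 0;         /* done: start from the first filter next time */
    return 1;
}

/* Example callback: asks to wait until its counter reaches zero. */
int demo_wait_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}

/* Example callback: counts its calls and never waits. */
int demo_count_calls(void *ctx)
{
    (*(int *)ctx)++;
    return 1;
}
```

With a chain made of a counting filter, a waiting filter and another counting
filter, the first filter is called only once while the second one is waiting,
and the third one is called only when the second one finally lets the
processing continue.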
'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
negative value if an error occurs, any other value otherwise. It is called when
a filterable analyzer finishes its processing. So it is called once for the same
analyzer. For example:

    /* Called after a processing happens on a given channel.
     * Returns a negative value if an error occurs, any other
     * value otherwise. */
    static int
    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct http_msg         *msg;

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* A test on received headers before any other treatment */) {
                    msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req);
                    s->txn->status = 400;
                    msg->msg_state = HTTP_MSG_ERROR;
                    http_reply_and_close(s, s->txn->status,
                                         http_error_message(s, HTTP_ERR_400));
                    return -1; /* This is an error ! */
                }
                break;
            /* ... */
        }
        return 1;
    }


Pre and post analyzer callbacks of a filter are not automatically called. You
must register them explicitly on analyzers, updating the value of the
'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits
are listed in 'include/types/channel.h'. Here is an example:

    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        /* ... */

        /* Register the pre analyzer callback on all request and response
         * analyzers */
        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);

        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
         * AN_RES_WAIT_HTTP analyzers */
        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);

        /* ... */
        return 0;
    }


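The effect of these two bit fields can be illustrated with a standalone
sketch. The bit values below are placeholders chosen for this example only
(the real AN_* values are defined by HAProxy and differ), and 'struct
demo_filter' and the two helpers are hypothetical names.

```c
/* Placeholder analyzer bits, for this sketch only. */
#define DEMO_AN_REQ_INSPECT_FE 0x01u
#define DEMO_AN_REQ_WAIT_HTTP  0x02u
#define DEMO_AN_REQ_HTTP_BODY  0x04u

/* Hypothetical reduction of a filter instance to its two bit fields. */
struct demo_filter {
    unsigned int pre_analyzers;  /* analyzers to pre-process */
    unsigned int post_analyzers; /* analyzers to post-process */
};

/* Returns non-zero if the pre-analyze callback is registered on <an_bit>. */
int demo_wants_pre(const struct demo_filter *f, unsigned int an_bit)
{
    return (f->pre_analyzers & an_bit) != 0;
}

/* Returns non-zero if the post-analyze callback is registered on <an_bit>. */
int demo_wants_post(const struct demo_filter *f, unsigned int an_bit)
{
    return (f->post_analyzers & an_bit) != 0;
}
```

A filter that sets only DEMO_AN_REQ_WAIT_HTTP in 'post_analyzers' would see its
post-analyze callback considered for that analyzer and skipped for all the
others, whatever is set in 'pre_analyzers'.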
To surround the activity of a filter during the channel analyzing, two new
analyzers have been added:

  * 'flt_start_analyze' (AN_REQ/RES_FLT_START_FE/AN_REQ/RES_FLT_START_BE): For
    a specific filter, this analyzer is called before any call to the
    'channel_pre_analyze' callback. From the filter point of view, it calls the
    'flt_ops.channel_start_analyze' callback.

  * 'flt_end_analyze' (AN_REQ/RES_FLT_END): For a specific filter, this analyzer
    is called when all other analyzers have finished their processing. From the
    filter point of view, it calls the 'flt_ops.channel_end_analyze' callback.

For TCP streams, these analyzers are called only once. For HTTP streams, if the
client connection is kept alive, this happens at each request/response
roundtrip.

'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
interrupt the stream processing, as 'flt_ops.channel_pre_analyze' does. Here is
an example:

    /* Called when analyze starts for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
                                struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }

    /* Called when analyze ends for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }


808Workflow on channels can be summarized as following:
809
Christopher Faulet9adb0a52016-06-21 11:50:49 +0200810 FE: Called for filters defined on the stream's frontend
811 BE: Called for filters defined on the stream's backend
812
                       +----------------->-------------------+
                       |                  |                  |
            +----------------------+      |      +----------------------+
            | flt_ops.attach (FE)  |      |      | flt_ops.attach (BE)  |
            +----------------------+      |      +----------------------+
                       |                  |                  |
                       V                  |                  V
          +--------------------------+    | +------------------------------------+
          | flt_ops.stream_start (FE)|    | | flt_ops.stream_set_backend (FE+BE) |
          +--------------------------+    | +------------------------------------+
                       |                  |                  |
                      ...                 |                 ...
                       |                  |                  |
                +-<-- [1]                 ^                  |
                |                  --+    |                  |                  --+
                +------<----------+  |    |                  +--------<--------+  |
                |                 |  |    |                  |                 |  |
                V                 |  |    |                  V                 |  |
+-------------------------------+ |  |    |  +-------------------------------+ |  |
|     flt_start_analyze (FE)    +-+  |    |  |     flt_start_analyze (BE)    +-+  |
|(flt_ops.channel_start_analyze)|    | F  |  |(flt_ops.channel_start_analyze)|    |
+---------------+---------------+    | R  |  +---------------+---------------+    |
                |                    | O  |                  |                    |
                +------<---------+   | N  ^                  +--------<-------+   | B
                |                |   | T  |                  |                |   | A
+---------------|------------+   |   | E  |  +---------------|------------+   |   | C
|+--------------V-------------+  |   | N  |  |+--------------V-------------+  |   | K
||+----------------------------+ |   | D  |  ||+----------------------------+ |   | E
|||flt_ops.channel_pre_analyze | |   |    |  |||flt_ops.channel_pre_analyze | |   | N
|||             V              | |   |    |  |||             V              | |   | D
|||        analyzer (FE)       +-+   |    |  |||      analyzer (FE+BE)      +-+   |
+||             V              |     |    |  +||             V              |     |
 +|flt_ops.channel_post_analyze|     |    |   +|flt_ops.channel_post_analyze|     |
  +----------------------------+     |    |    +----------------------------+     |
                |                  --+    |                  |                    |
                +------------>------------+                 ...                   |
                                                             |                    |
                                              [ data filtering (see below) ]      |
                                                             |                    |
                                                            ...                   |
                                                             |                    |
                                                             +--------<--------+  |
                                                             |                 |  |
                                                             V                 |  |
                                             +-------------------------------+ |  |
                                             |    flt_end_analyze (FE+BE)    +-+  |
                                             | (flt_ops.channel_end_analyze) |    |
                                             +---------------+---------------+    |
                                                             |                  --+
                                                             V
                                                 +----------------------+
                                                 | flt_ops.detach (BE)  |
                                                 +----------------------+
                                                             |
                         If HTTP stream, go back to [1] --<--+
                                                             |
                                                            ...
                                                             |
                                                             V
                                               +--------------------------+
                                               | flt_ops.stream_stop (FE) |
                                               +--------------------------+
                                                             |
                                                             V
                                                 +----------------------+
                                                 | flt_ops.detach (FE)  |
                                                 +----------------------+
                                                             |
                                                             V

By zooming on an analyzer box we have:

                   ...
                    |
                    V
                    |
                    +-----------<-----------+
                    |                       |
  +-----------------+--------------------+  |
  |                 |                    |  |
  |                 +--------<---------+ |  |
  |                 |                  | |  |
  |                 V                  | |  |
  |     flt_ops.channel_pre_analyze ->-+ |  ^
  |                 |                    |  |
  |                 |                    |  |
  |                 V                    |  |
  |              analyzer --------->-----+--+
  |                 |                    |
  |                 |                    |
  |                 V                    |
  |    flt_ops.channel_post_analyze      |
  |                 |                    |
  |                 |                    |
  +-----------------+--------------------+
                    |
                    V
                   ...


3.6. FILTERING THE DATA EXCHANGED
-----------------------------------

WARNING: To fully understand this part, you must be aware of how the buffers
         work in HAProxy. In particular, you must be comfortable with the idea
         of circular buffers. See doc/internals/buffer-operations.txt and
         doc/internals/buffer-ops.fig for details.
         doc/internals/body-parsing.txt could also be useful.

An extended feature of the filters is the data filtering. By default a filter
does not look into data exchanged between the client and the server because it
is expensive. Indeed, instead of forwarding data without any processing, each
byte needs to be buffered.

So, to enable the data filtering on a channel, at any time, in one of the
previous callbacks, you should call the 'register_data_filter' function. And
conversely, to disable it, you should call the 'unregister_data_filter'
function. For example:

    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* 'msg->chn' must be the request channel */
        if (!(msg->chn->flags & CF_ISRESP)) {
            struct http_txn *txn = s->txn;
            struct buffer   *req = msg->chn->buf;
            struct hdr_ctx   ctx;

            /* Enable the data filtering for the request if 'X-Filter' header
             * is set to 'true'. The search context must start at index 0. */
            ctx.idx = 0;
            if (http_find_header2("X-Filter", 8, req->p, &txn->hdr_idx, &ctx) &&
                ctx.vlen >= 4 && memcmp(ctx.line + ctx.val, "true", 4) == 0)
                register_data_filter(s, msg->chn, filter);
        }

        return 1;
    }

Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
set to 'true'.

If several filters are declared, the evaluation order remains the same,
regardless of the order of the registrations to the data filtering.

Depending on the stream type, TCP or HTTP, the way to handle data filtering will
be slightly different. Among other things, for HTTP streams, there are more
callbacks to help you to fully handle all steps of an HTTP transaction. But the
basis is the same. The data filtering is done in 2 stages:

  * The data parsing: At this stage, filters will analyze input data on a
    channel. Once a filter has parsed some data, it cannot parse it again. At
    any time, a filter can choose to not parse all available data. So, it is
    possible for a filter to retain data for a while. Because filters are
    chained, a filter cannot parse more data than its predecessors. Thus only
    data considered as parsed by the last filter will be available to the next
    stage, the data forwarding.

  * The data forwarding: At this stage, filters will decide how much data
    HAProxy can forward among those considered as parsed at the previous
    stage. Once a filter has marked data as forwardable, it cannot analyze it
    anymore. At any time, a filter can choose to not forward all parsed
    data. So, it is possible for a filter to retain data for a while. Because
    filters are chained, a filter cannot forward more data than its
    predecessors. Thus only data marked as forwardable by the last filter will
    be actually forwarded by HAProxy.
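
As an illustration of the chaining rule described above, here is a minimal,
self-contained model of the two stages. It is only a sketch under simplifying
assumptions: 'struct model_filter', 'model_parse' and 'model_forward' are
hypothetical names that do not exist in HAProxy, and each filter is reduced to
the maximum number of bytes it accepts per pass (0 meaning "no limit"):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a data filter (not an HAProxy structure): it only
 * holds the maximum number of bytes the filter accepts to parse and to
 * forward in one pass. 0 means "no limit". */
struct model_filter {
    size_t max_parse;
    size_t max_forward;
};

/* Parsing stage: filters are evaluated in order and each one sees at most
 * what its predecessor has parsed. Returns what the last filter parsed,
 * i.e. the data made available to the forwarding stage. */
static size_t
model_parse(const struct model_filter *flts, size_t nb, size_t avail)
{
    size_t i, parsed = avail;

    for (i = 0; i < nb; i++) {
        if (flts[i].max_parse != 0 && parsed > flts[i].max_parse)
            parsed = flts[i].max_parse;
    }
    assert(parsed <= avail);
    return parsed;
}

/* Forwarding stage: same chaining rule, starting from the parsed data.
 * Returns what the last filter marked as forwardable. */
static size_t
model_forward(const struct model_filter *flts, size_t nb, size_t parsed)
{
    size_t i, fwd = parsed;

    for (i = 0; i < nb; i++) {
        if (flts[i].max_forward != 0 && fwd > flts[i].max_forward)
            fwd = flts[i].max_forward;
    }
    assert(fwd <= parsed);
    return fwd;
}
```

With three filters configured as {0, 0}, {100, 0} and {0, 40} and 500 bytes of
input, this model reports 100 bytes as parsed and 40 bytes as forwardable. The
60 remaining parsed bytes are retained until a later pass.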

Internally, filters own 2 offsets, relative to 'buf->p', representing the
number of bytes already parsed in the available input data and the number of
bytes considered as forwarded. We will call these offsets, respectively, 'nxt'
and 'fwd'. The following macros reference these offsets:

  * FLT_NXT(flt, chn), flt_req_nxt(flt) and flt_rsp_nxt(flt)

  * FLT_FWD(flt, chn), flt_req_fwd(flt) and flt_rsp_fwd(flt)

where 'flt' is the 'struct filter' passed as argument in all callbacks and 'chn'
is the considered channel.

Using these offsets, the following operations on buffers are possible:

    chn->buf->p + FLT_NXT(flt, chn) // the pointer on parsable data for
                                    // the filter 'flt' on the channel 'chn'.
    // Everything between chn->buf->p and 'nxt' offset was already parsed
    // by the filter.

    chn->buf->i - FLT_NXT(flt, chn) // the number of bytes of parsable data for
                                    // the filter 'flt' on the channel 'chn'.

    chn->buf->p + FLT_FWD(flt, chn) // the pointer on forwardable data for
                                    // the filter 'flt' on the channel 'chn'.
    // Everything between chn->buf->p and 'fwd' offset was already forwarded
    // by the filter.


Note that at any time, for a filter, 'nxt' offset is always greater than or
equal to 'fwd' offset.
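
The pointer expressions above are plain additions, so they only hold while the
addressed data do not wrap. As a hedged illustration, the following
self-contained model computes the same quantities on a circular buffer.
'struct model_buf' and the 'model_*' helpers are hypothetical stand-ins written
for this example; they are neither HAProxy's 'struct buffer' nor its buffer API
(see doc/internals/buffer-operations.txt for the real one):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a circular buffer: 'p' is the index of the first
 * input byte in 'data' and 'i' is the amount of input data. */
struct model_buf {
    char  *data;
    size_t size;
    size_t p;
    size_t i;
};

/* Pointer on the byte located 'ofs' bytes after 'p', wrapping if needed */
static char *
model_ptr(const struct model_buf *buf, size_t ofs)
{
    assert(ofs <= buf->i);
    return buf->data + (buf->p + ofs) % buf->size;
}

/* Amount of data not parsed yet by a filter, given its 'nxt' offset */
static size_t
model_parsable(const struct model_buf *buf, size_t nxt)
{
    assert(nxt <= buf->i);
    return buf->i - nxt;
}

/* Amount of data parsed but not forwarded yet. It relies on the invariant
 * stated above: 'nxt' is never lower than 'fwd'. */
static size_t
model_pending(size_t nxt, size_t fwd)
{
    assert(nxt >= fwd);
    return nxt - fwd;
}

/* Part of the parsable data that can be read before reaching the end of the
 * storage area, i.e. without wrapping. */
static size_t
model_contig(const struct model_buf *buf, size_t nxt)
{
    size_t start = (buf->p + nxt) % buf->size;
    size_t len   = model_parsable(buf, nxt);
    size_t room  = buf->size - start;

    return len < room ? len : room;
}
```

For a 16-byte storage area with 'p' at index 12 and 10 bytes of input, a filter
with 'nxt' set to 2 has 8 parsable bytes, but only 2 of them are contiguous: the
6 others start again at the beginning of the storage area.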

TODO: Add schema with buffer states when there are 2 filters that analyze data.


3.6.1 FILTERING DATA ON TCP STREAMS
-----------------------------------

The TCP data filtering is the easy case, because HAProxy does not parse these
data. So you have only two callbacks that you need to consider:

  * 'flt_ops.tcp_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.tcp_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

Here is an example:

    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_tcp_data(struct stream *s, struct filter *filter,
                       struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int avail = chn->buf->i - FLT_NXT(filter, chn);
        int ret   = avail;

        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
            ret = my_conf->max_parse;

        /* if available data are not completely parsed, wake up the stream to
         * be sure to not freeze it. */
        if (ret != avail)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }


    /* Returns a negative value if an error occurs, else the number of
     * forwarded bytes. */
    static int
    my_filter_tcp_forward_data(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int ret = len;

        /* Do not forward more than 'my_conf->max_forward' bytes at a time */
        if (my_conf->max_forward != 0 && ret > my_conf->max_forward)
            ret = my_conf->max_forward;

        /* if parsed data are not completely forwarded, wake up the stream to
         * be sure to not freeze it. */
        if (ret != len)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }



3.6.2 FILTERING DATA ON HTTP STREAMS
------------------------------------

The HTTP data filtering is a bit tricky because HAProxy will parse the body
structure, especially chunked body. So basically there is the HTTP counterpart
to the previous callbacks:

  * 'flt_ops.http_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.http_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

But the prototype for these callbacks is slightly different. Instead of having
the channel as parameter, we have the HTTP message (struct http_msg). You need
to be careful when you use 'http_msg.chunk_len' size. This value is the number
of bytes remaining to parse in the HTTP body (or the chunk for chunked
messages). The HTTP parser of HAProxy uses it to have the number of bytes that
it could consume:

    /* Available input data in the current chunk from the HAProxy point of view.
     * msg->next bytes were already parsed. Without data filtering, HAProxy
     * will consume all of it. */
    Bytes = MIN(msg->chunk_len, chn->buf->i - msg->next);


But in your filter, you need to recompute it:

    /* Available input data in the current chunk from the filter point of view.
     * 'nxt' bytes were already parsed. */
    Bytes = MIN(msg->chunk_len + msg->next, chn->buf->i) - FLT_NXT(flt, chn);
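
To see how the two formulas above differ, here is a small self-contained worked
example. It is only a sketch: 'struct model_state' is a hypothetical flattened
view of the values involved (it is not an HAProxy structure), and 'MODEL_MIN'
stands in for the MIN macro:

```c
#include <assert.h>

#define MODEL_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical flattened view of the values used by the two formulas */
struct model_state {
    unsigned int chunk_len; /* like msg->chunk_len: bytes left in the chunk */
    unsigned int next;      /* like msg->next: bytes parsed by HAProxy */
    unsigned int input;     /* like buf->i: input data in the buffer */
    unsigned int nxt;       /* like FLT_NXT(): bytes parsed by the filter */
};

/* Available data in the current chunk from the HAProxy point of view */
static unsigned int
haproxy_view(const struct model_state *st)
{
    assert(st->next <= st->input);
    return MODEL_MIN(st->chunk_len, st->input - st->next);
}

/* Available data in the current chunk from the filter point of view */
static unsigned int
filter_view(const struct model_state *st)
{
    unsigned int end = MODEL_MIN(st->chunk_len + st->next, st->input);

    assert(st->nxt <= end);
    return end - st->nxt;
}
```

With 'chunk_len' = 100, 'next' = 50, 120 bytes of input and 'nxt' = 20, HAProxy
sees 70 bytes while the filter sees 100 bytes, because the filter is 30 bytes
behind 'msg->next'. Once the filter has caught up ('nxt' = 50), both views give
70 bytes.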


In addition to these callbacks, there are three others:

  * 'flt_ops.http_headers': This callback is called just before the HTTP body
    parsing and after any processing on the request/response HTTP headers. When
    defined, this callback is always called for HTTP streams (i.e. without
    needing a registration on data filtering).

  * 'flt_ops.http_end': This callback is called when the whole HTTP
    request/response is processed. It can interrupt the stream processing. So,
    it could be used to synchronize the HTTP request with the HTTP response, for
    example:

        /* Returns a negative value if an error occurs, 0 if it needs to wait,
         * any other value otherwise. */
        static int
        my_filter_http_end(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
        {
            struct my_filter_ctx *my_ctx = filter->ctx;


            if (!(msg->chn->flags & CF_ISRESP)) /* The request */
                my_ctx->end_of_req = 1;
            else /* The response */
                my_ctx->end_of_rsp = 1;

            /* Both the request and the response are finished */
            if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
                return 1;

            /* Wait */
            return 0;
        }


  * 'flt_ops.http_chunk_trailers': This callback is called for chunked HTTP
    messages only when all chunks were parsed. HTTP trailers can be parsed in
    several passes. This callback will be called each time. The number of bytes
    parsed by HAProxy at each iteration is stored in 'msg->sol'.

Then, to finish, there are 2 informational callbacks:

  * 'flt_ops.http_reset': This callback is called when an HTTP message is
    reset. This only happens when a '100-continue' response is received. It
    could be useful to reset the filter context before receiving the true
    response.

  * 'flt_ops.http_reply': This callback is called when, at any time, HAProxy
    decides to stop the processing on an HTTP message and to send an internal
    response to the client. This mainly happens when an error or a redirect
    occurs.
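
As a hedged illustration of the 'flt_ops.http_reset' use case, the sketch below
clears the response state kept by the 'my_filter_http_end' example when an
interim response is dropped. It is written under assumptions: the callback is
assumed to receive the stream, the filter and the HTTP message like the other
HTTP callbacks and to return nothing, and the three small structures are
stand-ins declared only to make the sketch compile alone (the real definitions
come from HAProxy's headers):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for this sketch only, not the HAProxy definitions */
struct stream   { int unused; };
struct http_msg { int unused; };
struct filter   { void *ctx; };

/* Hypothetical filter context, as used by the 'http_end' example */
struct my_filter_ctx {
    int end_of_req;
    int end_of_rsp;
};

/* Assumed callback shape: informational, so no return value */
static void
my_filter_http_reset(struct stream *s, struct filter *filter,
                     struct http_msg *msg)
{
    struct my_filter_ctx *my_ctx = filter->ctx;

    (void)s;
    (void)msg;
    assert(my_ctx != NULL);

    /* The interim response is dropped: forget what was recorded for it but
     * keep the request state, the true response is still to come. */
    my_ctx->end_of_rsp = 0;
}
```

With such a callback, the synchronization done in 'my_filter_http_end' only
completes on the final response.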


3.6.3 REWRITING DATA
--------------------

The last part, and the trickiest one about the data filtering, is about the data
rewriting. For now, the filter API does not offer a lot of functions to handle
it. There are only functions to notify HAProxy that the data size has changed to
let it update internal state of filters. It is your responsibility to update
the data itself, i.e. the buffer offsets. For an HTTP message, you also must
update 'msg->next' and 'msg->chunk_len' values accordingly:

  * 'flt_change_next_size': This function must be called when a filter alters
    incoming data. It updates 'nxt' offset value of all its predecessors. Not
    calling this function when a filter changes the size of incoming data leads
    to undefined behavior.

        struct buffer *buf   = msg->chn->buf;
        unsigned int   avail = MIN(msg->chunk_len + msg->next, buf->i) -
                               flt_rsp_nxt(filter);

        if (avail > 10 && /* ...Some condition... */) {
            /* Move the buffer forward to have buf->p pointing on unparsed
             * data */
            b_adv(buf, flt_rsp_nxt(filter));

            /* Skip first 10 bytes by moving the following data over them. To
             * simplify this example, we consider a non-wrapping buffer */
            memmove(buf->p, buf->p + 10, avail - 10);

            /* Restore buf->p value */
            b_rew(buf, flt_rsp_nxt(filter));

            /* Now update other filters */
            flt_change_next_size(filter, msg->chn, -10);

            /* Update the buffer state */
            buf->i -= 10;

            /* And update the HTTP message state */
            msg->chunk_len -= 10;

            return (avail - 10);
        }
        else
            return 0; /* Wait for more data */


1206
1207 * 'flt_change_forward_size': This function must be called when a filter alter
1208 parsed data. It updates offset values ('nxt' and 'fwd') of all filters. Do
1209 not call this function when a filter change the size of parsed data leads to
1210 an undefined behavior.
1211
1212 /* len is the number of bytes of forwardable data */
1213 if (len > 10 and /* ...Some condition... */) {
1214 /* Move the buffer forward to have buf->p pointing on non-forwarded
1215 * data */
1216 b_adv(msg->chn->buf, flt_rsp_fwd(filter));
1217
1218 /* Skip first 10 bytes. To simplify this example, we consider a
1219 * non-wrapping buffer */
1220 memmove(buf->p + 10, buf->p, len - 10);
1221
1222 /* Restore buf->p value */
1223 b_rew(msg->chn->buf, flt_rsp_fwd(filter));
1224
1225 /* Now update other filters */
1226 flt_change_forward_size(filter, msg->chn, -10);
1227
1228 /* Update the buffer state */
1229 buf->i -= 10;
1230
1231 /* And update the HTTP message state */
1232 msg->next -= 10;
1233
1234 return (len - 10);
1235 }
1236 else
1237 return 0; /* Wait for more data */
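
To make the bookkeeping above more concrete, here is a self-contained model of
the first case (a filter removing bytes from its unparsed data). It is a sketch
under simplifying assumptions: the buffer is linear, the predecessors have
already parsed the removed bytes, and all the 'model_*' names are hypothetical.
In particular, 'model_change_next_size' only mimics what is described above for
'flt_change_next_size'; it is not the HAProxy function:

```c
#include <assert.h>
#include <string.h>

#define MODEL_NB_FILTERS 3

/* Hypothetical stand-in for a channel: a linear (non-wrapping) input area
 * and one 'nxt' offset per filter, stored in evaluation order. */
struct model_chn {
    char data[64];
    int  i;                      /* amount of input data, like buf->i */
    int  nxt[MODEL_NB_FILTERS];  /* like FLT_NXT() for each filter */
};

/* Mimics what is described for flt_change_next_size(): shift the 'nxt'
 * offset of all the filters evaluated before the filter number 'idx'. */
static void
model_change_next_size(struct model_chn *chn, int idx, int len)
{
    int k;

    for (k = 0; k < idx; k++)
        chn->nxt[k] += len;
}

/* The filter number 'idx' removes 'count' bytes at the beginning of its
 * unparsed data. Returns the amount of unparsed data left for it. */
static int
model_drop(struct model_chn *chn, int idx, int count)
{
    int start = chn->nxt[idx];
    int avail = chn->i - start;

    assert(count >= 0 && count <= avail);

    /* Overwrite the removed bytes with the data that follow them */
    memmove(chn->data + start, chn->data + start + count, avail - count);

    /* Predecessors have parsed 'count' bytes that no longer exist */
    model_change_next_size(chn, idx, -count);

    /* Finally, update the amount of input data */
    chn->i -= count;
    return avail - count;
}
```

For example, with 20 bytes of input "0123456789ABCDEFGHIJ", two predecessors
that parsed everything ('nxt' = 20) and a third filter with 'nxt' = 4, dropping
10 bytes leaves "0123EFGHIJ" in the buffer, 10 bytes of input, and the
predecessors' 'nxt' offsets moved back to 10.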


TODO: implement all the stuff to easily rewrite data. For HTTP messages, this
      requires a chunked message. Otherwise the size of data cannot be
      changed.




4. FAQ
------

4.1. Detect multiple declarations of the same filter
----------------------------------------------------

TODO