                   -----------------------------------------
                          Filters  Guide - version 1.7
                          ( Last update: 2016-05-11 )
                   ------------------------------------------
                          Author : Christopher Faulet
              Contact : christopher dot faulet at capflam dot org


ABSTRACT
--------

Filter support is a new feature of HAProxy 1.7. It is a way to extend HAProxy
without touching its core code and, to a certain extent, without knowing its
internals. This feature will ease contributions by reducing the impact of
changes. Another advantage will be to simplify HAProxy by replacing some parts
with filters. As an example, HTTP compression is the first feature moved into
a filter.

This document describes how to write a filter and what you have to keep in mind
to do so. It also covers the known limits and the pitfalls to avoid.

As said, filters are quite new for now. The API is not frozen and will be
updated/modified/improved/extended as needed.



SUMMARY
-------

  1. Filters introduction
  2. How to use filters
  3. How to write a new filter
    3.1. API Overview
    3.2. Defining the filter name and its configuration
    3.3. Managing the filter lifecycle
    3.4. Handling the streams creation and destruction
    3.5. Analyzing the channels activity
    3.6. Filtering the data exchanged
  4. FAQ



1. FILTERS INTRODUCTION
-----------------------

First of all, to fully understand how filters work and how to create one, it is
best to know, at least from a distance, what a proxy (frontend/backend), a
stream and a channel are in HAProxy and how these entities are linked to each
other. doc/internals/entities.pdf is a good overview.

Then, to support filters, many callbacks have been added to HAProxy at different
places, mainly around channel analyzers. Their purpose is to allow filters to
be involved in the data processing, from the stream creation/destruction to
the data forwarding. Depending on what it should do, a filter can implement all
or part of these callbacks. For now, existing callbacks are focused on
streams. But future improvements could enlarge the scope of filters. For
example, it could be useful to handle events at the connection level.

In the HAProxy configuration file, a filter is declared in a proxy section,
except defaults. So the configuration corresponding to a filter declaration is
attached to a specific proxy, and will be shared by all its instances. It is
opaque from the HAProxy point of view; it is the filter's responsibility to
manage it. Each filter declaration matches a unique configuration. Several
declarations of the same filter in the same proxy will be handled as different
filters by HAProxy.

A filter instance is represented by a partially opaque context (or a state)
attached to a stream and passed as an argument to callbacks. Through this
context, filter instances are stateful. Depending on whether the filter is
declared in a frontend or a backend section, its instances will be created,
respectively, when a stream is created or when a backend is selected. Their
behaviors will also be different. Only instances of filters declared in a
frontend section will be aware of the creation and the destruction of the
stream, and will take part in the channels analyzing before the backend is
defined.

It is important to remember that the configuration of a filter is shared by all
its instances, while the context of an instance is owned by a unique stream.

Filters are designed to be chained. It is possible to declare several filters in
the same proxy section. The declaration order is important because filters will
be called one after the other respecting this order. Frontend and backend
filters are also chained, frontend ones being called first. Even if the filters
processing is serialized, each filter will behave as if it was alone (unless it
was developed to be aware of other filters). For all that, some constraints are
imposed on filters, especially when data exchanged between the client and the
server are processed. We will discuss these constraints again when we tackle
the subject of writing a filter.
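
The chaining rule can be pictured with a small standalone model. This is only
a sketch that does not use any HAProxy type: 'struct demo_decl' and
'demo_call_order' are hypothetical names introduced for the illustration. It
assumes nothing more than what is stated above: frontend filters are called
first, then backend ones, and each group keeps its declaration order.

```c
#include <stddef.h>

/* Hypothetical stand-in for a filter declaration; not an HAProxy type. */
struct demo_decl {
    const char *id;         /* label used only by this demo */
    int         is_backend; /* 0: declared in a frontend, 1: in a backend */
};

/* Fill 'order' with pointers to the declarations in the order the filters
 * would be called: frontend filters first, then backend ones, each group
 * keeping its declaration order. Returns the number of entries written. */
static size_t demo_call_order(const struct demo_decl *decl, size_t n,
                              const struct demo_decl **order)
{
    size_t out = 0;

    for (int pass = 0; pass <= 1; pass++)
        for (size_t i = 0; i < n; i++)
            if (decl[i].is_backend == pass)
                order[out++] = &decl[i];
    return out;
}
```

With declarations listed as backend, frontend, backend, frontend, the two
frontend entries come out first and the two backend entries follow, each pair
in the order it was declared.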



2. HOW TO USE FILTERS
---------------------

To use a filter, you must use the parameter 'filter' followed by the filter name
and, optionally, its configuration in the desired listen, frontend or backend
section. For example:

    listen test
        ...
        filter trace name TST
        ...


See doc/configuration.txt for a formal definition of the parameter 'filter'.
Note that additional parameters on the filter line must be parsed by the filter
itself.

The list of available filters is reported by 'haproxy -vv':

    $> haproxy -vv
    HA-Proxy version 1.7-dev2-3a1d4a-33 2016/03/21
    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>

    [...]

    Available filters :
        [COMP] compression
        [TRACE] trace


Multiple filter lines can be used in a proxy section to chain filters. Filters
will be called in the declaration order.

Some filters can support implicit declarations in certain circumstances
(without the filter line). This is not recommended for new features but is
useful for existing ones moved into a filter, for backward compatibility
reasons. Implicit declarations are supported when there is only one filter used
on a proxy. When several filters are used, explicit declarations are mandatory.
The HTTP compression filter is one of these filters. Alone, using 'compression'
keywords is enough to use it. But when at least a second filter is used, a
filter line must be added.

    # filter line is optional
    listen t1
        bind *:80
        compression algo gzip
        compression offload
        server srv x.x.x.x:80

    # filter line is mandatory for the compression filter
    listen t2
        bind *:81
        filter trace name T2
        filter compression
        compression algo gzip
        compression offload
        server srv x.x.x.x:80




3. HOW TO WRITE A NEW FILTER
----------------------------

If you want to write a filter, there are 2 header files that you must know:

  * include/types/filters.h: This is the main header file, containing all
                             important structures you will use. It represents
                             the filter API.
  * include/proto/filters.h: This header file contains helper functions that
                             you may need to use. It also contains the internal
                             API used by HAProxy to handle filters.

To ease the filters integration, it is better to follow some conventions:

  * Use 'flt_' prefix to name your filter (e.g: flt_http_comp or flt_trace).
  * Keep everything related to your filter in the same file.

The filter 'trace' can be used as a template to write your own filter. It is a
good start to see how filters really work.

3.1. API OVERVIEW
-----------------

Writing a filter can be summarized as writing functions and attaching them to
the existing callbacks. Available callbacks are listed in the following
structure:

    struct flt_ops {
        /*
         * Callbacks to manage the filter lifecycle
         */
        int  (*init)  (struct proxy *p, struct flt_conf *fconf);
        void (*deinit)(struct proxy *p, struct flt_conf *fconf);
        int  (*check) (struct proxy *p, struct flt_conf *fconf);

        /*
         * Stream callbacks
         */
        int  (*stream_start)         (struct stream *s, struct filter *f);
        void (*stream_stop)          (struct stream *s, struct filter *f);

        /*
         * Channel callbacks
         */
        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
                                      struct channel *chn);
        int  (*channel_pre_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_post_analyze) (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn);

        /*
         * HTTP callbacks
         */
        int  (*http_headers)       (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_data)          (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_chunk_trailers)(struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_end)           (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        int  (*http_forward_data)  (struct stream *s, struct filter *f,
                                    struct http_msg *msg,
                                    unsigned int len);

        void (*http_reset)         (struct stream *s, struct filter *f,
                                    struct http_msg *msg);
        void (*http_reply)         (struct stream *s, struct filter *f,
                                    short status,
                                    const struct chunk *msg);

        /*
         * TCP callbacks
         */
        int  (*tcp_data)        (struct stream *s, struct filter *f,
                                 struct channel *chn);
        int  (*tcp_forward_data)(struct stream *s, struct filter *f,
                                 struct channel *chn,
                                 unsigned int len);
    };


The following parts explain when these callbacks are called and what they
should do.

Filters are declared in proxy sections. So each proxy has an ordered list of
filters, possibly empty if no filter is used. When the configuration of a proxy
is parsed, each filter line represents an entry in this list. In the structure
'proxy', the filters configurations are stored in the field 'filter_configs',
each one of type 'struct flt_conf *':

    /*
     * Structure representing the filter configuration, attached to a proxy and
     * accessible from a filter when instantiated in a stream
     */
    struct flt_conf {
        const char     *id;   /* The filter id */
        struct flt_ops *ops;  /* The filter callbacks */
        void           *conf; /* The filter configuration */
        struct list     list; /* Next filter for the same proxy */
    };

  * 'flt_conf.id' is an identifier, defined by the filter. It can be
    NULL. HAProxy does not use this field. Filters can use it in log messages or
    as a unique identifier to check multiple declarations. It is the filter's
    responsibility to free it, if necessary.

  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
    generally allocated and filled by its parsing function (See § 3.2). It is
    the filter's responsibility to free it.

  * 'flt_conf.ops' references the callbacks implemented by the filter. This
    field must be set during the parsing phase (See § 3.2) and can be refined
    during the initialization phase (See § 3.3). If it is dynamically allocated,
    it is the filter's responsibility to free it.


The filter configuration is global and shared by all its instances. A filter
instance is created in the context of a stream and attached to this stream. In
the structure 'stream', the field 'strm_flt' is the state of all filter
instances attached to a stream:

    /*
     * Structure representing the "global" state of filters attached to a
     * stream.
     */
    struct strm_flt {
        struct list    filters;             /* List of filters attached to a stream */
        struct filter *current[2];          /* From which filter resume processing, for a specific channel.
                                             * This is used for resumable callbacks only,
                                             * If NULL, we start from the first filter.
                                             * 0: request channel, 1: response channel */
        unsigned short flags;               /* STRM_FL_* */
        unsigned char  nb_req_data_filters; /* Number of data filters registered on the request channel */
        unsigned char  nb_rsp_data_filters; /* Number of data filters registered on the response channel */
    };


Filter instances attached to a stream are stored in the field
'strm_flt.filters', each instance is of type 'struct filter *':

    /*
     * Structure representing a filter instance attached to a stream
     *
     * 2D-Array fields are used to store info per channel. The first index
     * stands for the request channel, and the second one for the response
     * channel. Especially, <next> and <fwd> are offsets representing the
     * amount of data that the filter has, respectively, parsed and forwarded
     * on a channel. Filters can access these values using FLT_NXT and FLT_FWD
     * macros.
     */
    struct filter {
        struct flt_conf *config;         /* the filter's configuration */
        void            *ctx;            /* The filter context (opaque) */
        unsigned short   flags;          /* FLT_FL_* */
        unsigned int     next[2];        /* Offset, relative to buf->p, to the next
                                          * byte to parse for a specific channel
                                          * 0: request channel, 1: response channel */
        unsigned int     fwd[2];         /* Offset, relative to buf->p, to the next
                                          * byte to forward for a specific channel
                                          * 0: request channel, 1: response channel */
        unsigned int     pre_analyzers;  /* bit field indicating analyzers to
                                          * pre-process */
        unsigned int     post_analyzers; /* bit field indicating analyzers to
                                          * post-process */
        struct list      list;           /* Next filter for the same proxy/stream */
    };

  * 'filter.config' is the filter configuration previously described. All
    instances of a filter share it.

  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its
    responsibility to free it.

  * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later
    (See § 3.5).

  * 'filter.next' and 'filter.fwd' will be described later (See § 3.6).


3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
---------------------------------------------------

When you write a filter, the first thing to do is to add it to the supported
filters. To do so, you must register its name as a valid keyword on the filter
line:

    /* Declare the filter parser for "my_filter" keyword */
    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
            { "my_filter", parse_my_filter_cfg, NULL /* private data */ },
            { NULL, NULL, NULL },
        }
    };

    __attribute__((constructor))
    static void
    __my_filter_init(void)
    {
        flt_register_keywords(&flt_kws);
    }


Then you must define the internal configuration your filter will use. For
example:

    struct my_filter_config {
        struct proxy *proxy;
        char         *name;
        /* ... */
    };


You also must list all callbacks implemented by your filter. Here, we use a
global variable:

    struct flt_ops my_filter_ops = {
        .init   = my_filter_init,
        .deinit = my_filter_deinit,
        .check  = my_filter_config_check,

        /* ... */
    };


Finally, you must define the function to parse your filter configuration, here
'parse_my_filter_cfg'. This function must parse all remaining keywords on the
filter line:

    /* Return -1 on error, else 0 */
    static int
    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
                        struct flt_conf *flt_conf, char **err, void *private)
    {
        struct my_filter_config *my_conf;
        int pos = *cur_arg;

        /* Allocate the internal configuration used by the filter */
        my_conf = calloc(1, sizeof(*my_conf));
        if (!my_conf) {
            memprintf(err, "%s: out of memory", args[*cur_arg]);
            return -1;
        }
        my_conf->proxy = px;

        /* ... */

        /* Parse all keywords supported by the filter and fill the internal
         * configuration */
        pos++; /* Skip the filter name */
        while (*args[pos]) {
            if (!strcmp(args[pos], "name")) {
                if (!*args[pos + 1]) {
                    memprintf(err, "'%s' : '%s' option without value",
                              args[*cur_arg], args[pos]);
                    goto error;
                }
                my_conf->name = strdup(args[pos + 1]);
                if (!my_conf->name) {
                    memprintf(err, "%s: out of memory", args[*cur_arg]);
                    goto error;
                }
                pos += 2;
            }

            /* ... parse other keywords ... */
        }
        *cur_arg = pos;

        /* Set callbacks supported by the filter */
        flt_conf->ops = &my_filter_ops;

        /* Last, save the internal configuration */
        flt_conf->conf = my_conf;
        return 0;

      error:
        if (my_conf->name)
            free(my_conf->name);
        free(my_conf);
        return -1;
    }


WARNING: In your parsing function, you must define 'flt_conf->ops'. You must
         also parse all arguments on the filter line. This is mandatory.

In the previous example, we expect to read a filter line as follows:

    filter my_filter name MY_NAME ...
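
The keyword loop of such a parsing function can be tried outside HAProxy. The
block below is a minimal standalone sketch, not HAProxy code: 'struct
demo_conf' and 'demo_parse' are hypothetical names, and the argument array is
assumed to end with an empty string, as the loop condition of the example above
implies. It also shows one way to make sure every argument is consumed: an
unknown keyword is rejected instead of being left unparsed.

```c
#include <string.h>

/* Hypothetical parsing result for this demo; not an HAProxy structure. */
struct demo_conf {
    const char *name; /* value of the "name" keyword, or NULL */
};

/* Parse the arguments that follow the filter name. 'args' must end with an
 * empty string. On success, return 0 and set *cur_arg to the index of the
 * terminating empty string. Return -1 on error. */
static int demo_parse(const char **args, int *cur_arg, struct demo_conf *conf)
{
    int pos = *cur_arg + 1; /* skip the filter name */

    conf->name = NULL;
    while (*args[pos]) {
        if (!strcmp(args[pos], "name")) {
            if (!*args[pos + 1])
                return -1;  /* "name" without a value */
            conf->name = args[pos + 1];
            pos += 2;
        }
        else
            return -1;      /* unknown keyword: fail rather than loop */
    }
    *cur_arg = pos;
    return 0;
}
```

Called on { "my_filter", "name", "MY_NAME", "" } with *cur_arg set to 0, this
sketch returns 0 and leaves *cur_arg on the terminating empty string, which is
the "all arguments parsed" state required by the warning above.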


Optionally, by implementing the 'flt_ops.check' callback, you add a step to
check the internal configuration of your filter after the parsing phase, when
the HAProxy configuration is fully defined. For example:

    /* Check configuration of the filter for a specified proxy.
     * Return 1 on error, else 0. */
    static int
    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
    {
        if (px->mode != PR_MODE_HTTP) {
            Alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
            return 1;
        }

        /* ... */

        return 0;
    }



3.3. MANAGING THE FILTER LIFECYCLE
----------------------------------

Once the configuration is parsed and checked, filters are ready to be used.
There are two callbacks to manage the filter lifecycle:

  * 'flt_ops.init': It initializes the filter for a proxy. You may define this
                    callback if you need to complete your filter configuration.

  * 'flt_ops.deinit': It cleans up what the parsing function and the init
                      callback have done. This callback is useful to release
                      memory allocated for the filter configuration.

Here is an example:

    /* Initialize the filter. Returns -1 on error, else 0. */
    static int
    my_filter_init(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        /* ... */

        return 0;
    }

    /* Free resources allocated by the filter. */
    static void
    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        if (my_conf) {
            free(my_conf->name);
            /* ... */
            free(my_conf);
        }
        fconf->conf = NULL;
    }


TODO: Add callbacks to handle creation/destruction of filter instances. And
      document it.


3.4. HANDLING THE STREAMS CREATION AND DESTRUCTION
--------------------------------------------------

You may be interested in handling stream creation and destruction. If so, you
must define the following callbacks:

  * 'flt_ops.stream_start': It is called when a stream is started. This callback
                            can fail by returning a negative value. It will be
                            considered as a critical error by HAProxy, which
                            disables the listener for a short time.

  * 'flt_ops.stream_stop': It is called when a stream is stopped. This callback
                           always succeeds. Anyway, it is too late to return an
                           error.

For example:

    /* Called when a stream is created. Returns -1 on error, else 0. */
    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a stream is destroyed */
    static void
    my_filter_stream_stop(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


WARNING: Handling the streams creation and destruction is only possible for
         filters defined on proxies with the frontend capability.


3.5. ANALYZING THE CHANNELS ACTIVITY
------------------------------------

The main purpose of filters is to take part in the channels analyzing. To do so,
there are 2 callbacks, 'flt_ops.channel_pre_analyze' and
'flt_ops.channel_post_analyze', called respectively before and after each
analyzer attached to a channel, except analyzers responsible for the data
parsing/forwarding (TCP or HTTP data). Concretely, on the request channel, these
callbacks can be called around the following analyzers:

  * tcp_inspect_request        (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
  * http_wait_for_request      (AN_REQ_WAIT_HTTP)
  * http_wait_for_request_body (AN_REQ_HTTP_BODY)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_FE)
  * process_switching_rules    (AN_REQ_SWITCHING_RULES)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_BE)
  * http_process_tarpit        (AN_REQ_HTTP_TARPIT)
  * process_server_rules       (AN_REQ_SRV_RULES)
  * http_process_request       (AN_REQ_HTTP_INNER)
  * tcp_persist_rdp_cookie     (AN_REQ_PRST_RDP_COOKIE)
  * process_sticking_rules     (AN_REQ_STICKING_RULES)

And on the response channel:

  * tcp_inspect_response    (AN_RES_INSPECT)
  * http_wait_for_response  (AN_RES_WAIT_HTTP)
  * process_store_rules     (AN_RES_STORE_RULES)
  * http_process_res_common (AN_RES_HTTP_PROCESS_BE)

Unlike the other callbacks seen so far, 'flt_ops.channel_pre_analyze' can
interrupt the stream processing. So a filter can decide not to execute the
analyzer that follows and to wait for the next iteration. If there is more than
one filter, the following ones are skipped. On the next iteration, the filtering
resumes where it was stopped, i.e. on the filter that has previously stopped the
processing. So it is possible for a filter to stop the stream processing on a
specific analyzer for a while before continuing. Moreover, this callback can be
called many times for the same analyzer, until it finishes its processing. For
example:

    /* Called before a processing happens on a given channel.
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* wait that a condition is verified before continuing */)
                    return 0;
                break;
            /* ... */
        }
        return 1;
    }

  * 'an_bit' is the analyzer id. All analyzers are listed in
    'include/types/channel.h'.

  * 'chn' is the channel on which the analyzing is done. You can know if it is
    the request or the response channel by testing if CF_ISRESP flag is set:

        ((chn->flags & CF_ISRESP) == CF_ISRESP)


In the previous example, the stream processing is blocked before receipt of the
HTTP request until a condition is verified.
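
The resume behaviour described above can be modelled in a few lines. The block
below is a standalone sketch and not the HAProxy implementation: 'struct
demo_state', 'demo_pre_analyze' and 'demo_run_iteration' are hypothetical
names, and the "wait" decision is simulated with a simple counter. It only
assumes what the text states: a callback returning 0 stops the chain, and the
next iteration restarts from the filter that asked to wait.

```c
#define DEMO_NB_FILTERS 3

/* Hypothetical per-stream state for this demo. */
struct demo_state {
    int current;                      /* filter to resume from, -1 for none */
    int wait_budget[DEMO_NB_FILTERS]; /* times each filter still returns 0 */
    int calls[DEMO_NB_FILTERS];       /* number of calls seen by each filter */
};

/* Stand-in for a filter's pre-analyze callback: 0 means "wait". */
static int demo_pre_analyze(struct demo_state *st, int idx)
{
    st->calls[idx]++;
    if (st->wait_budget[idx] > 0) {
        st->wait_budget[idx]--;
        return 0;
    }
    return 1;
}

/* Run one iteration over the chain. Returns 1 if the analyzer may run,
 * 0 if a filter asked to wait. */
static int demo_run_iteration(struct demo_state *st)
{
    int i = (st->current >= 0) ? st->current : 0;

    for (; i < DEMO_NB_FILTERS; i++) {
        if (demo_pre_analyze(st, i) == 0) {
            st->current = i; /* resume here on the next iteration */
            return 0;
        }
    }
    st->current = -1;        /* all filters agreed, nothing to resume */
    return 1;
}
```

In this model, a filter placed before the waiting one is not called again when
the processing resumes, and the filters placed after it are only called once
the waiting filter lets the processing continue.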

'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
negative value if an error occurs, any other value otherwise. It is called when
a filterable analyzer finishes its processing. So it is called once for the same
analyzer. For example:

    /* Called after a processing happens on a given channel.
     * Returns a negative value if an error occurs, any other
     * value otherwise. */
    static int
    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct http_msg         *msg;

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* A test on received headers before any other treatment */) {
                    msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req);
                    s->txn->status = 400;
                    msg->msg_state = HTTP_MSG_ERROR;
                    http_reply_and_close(s, s->txn->status,
                                         http_error_message(s, HTTP_ERR_400));
                    return -1; /* This is an error ! */
                }
                break;
            /* ... */
        }
        return 1;
    }


Pre and post analyzer callbacks of a filter are not automatically called. You
must register them explicitly on analyzers, updating the value of
'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits
are listed in 'include/types/channel.h'. Here is an example:

    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        /* ... */

        /* Register the pre analyzer callback on all request and response
         * analyzers */
        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);

        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
         * AN_RES_WAIT_HTTP analyzers */
        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);

        /* ... */
        return 0;
    }
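
To see what the registration changes, here is a standalone sketch of the bit
field test. It is an illustration only: the DEMO_AN_* values, 'struct
demo_flt', 'demo_register' and 'demo_must_call' are hypothetical and do not
come from HAProxy, whose real analyzer bits are the AN_* constants. The sketch
assumes that a callback is invoked for an analyzer only when the matching bit
is set in the corresponding field.

```c
/* Hypothetical analyzer bits, for this demo only. */
#define DEMO_AN_REQ_INSPECT   0x01u
#define DEMO_AN_REQ_WAIT_HTTP 0x02u
#define DEMO_AN_RES_WAIT_HTTP 0x04u
#define DEMO_AN_ALL \
    (DEMO_AN_REQ_INSPECT | DEMO_AN_REQ_WAIT_HTTP | DEMO_AN_RES_WAIT_HTTP)

/* Hypothetical filter instance holding the two bit fields. */
struct demo_flt {
    unsigned int pre_analyzers;
    unsigned int post_analyzers;
};

/* Register the pre callback everywhere and the post callback on the two
 * "wait HTTP" analyzers only, mirroring the example above. */
static void demo_register(struct demo_flt *f)
{
    f->pre_analyzers  |= DEMO_AN_ALL;
    f->post_analyzers |= (DEMO_AN_REQ_WAIT_HTTP | DEMO_AN_RES_WAIT_HTTP);
}

/* Return non-zero if the callback selected by 'mask' must run for 'an_bit'. */
static int demo_must_call(unsigned int mask, unsigned int an_bit)
{
    return (mask & an_bit) != 0;
}
```

With this registration, the pre callback would run for every analyzer of the
demo, while the post callback would be skipped for DEMO_AN_REQ_INSPECT.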


To surround the activity of a filter during the channel analyzing, two new
analyzers have been added:

  * 'flt_start_analyze' (AN_FLT_START_FE/AN_FLT_START_BE): For a specific
    filter, this analyzer is called before any call to the
    'channel_pre_analyze' callback. From the filter point of view, it calls the
    'flt_ops.channel_start_analyze' callback.

  * 'flt_end_analyze' (AN_FLT_END): For a specific filter, this analyzer is
    called when all other analyzers have finished their processing. From the
    filter point of view, it calls the 'flt_ops.channel_end_analyze' callback.

For TCP streams, these analyzers are called only once. For HTTP streams, if the
client connection is kept alive, this happens at each request/response
roundtrip.

'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
interrupt the stream processing, as 'flt_ops.channel_pre_analyze'. Here is an
example:

    /* Called when analyze starts for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
                                struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }

    /* Called when analyze ends for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }


Workflow on channels can be summarized as follows:

                    |
         +----------+-----------+
         | flt_ops.stream_start |
         +----------+-----------+
                    |
                   ...
                    |
    [1] ------>-----+
                    |
                    V
    +-------------------------------+
    |       flt_start_analyze       |<--+
    |(flt_ops.channel_start_analyze)+---+ repeated while it waits
    +---------------+---------------+
                    |
                    V
    +-------------------------------+
    |  flt_ops.channel_pre_analyze  |<--+
    |               V               |   | repeated for each analyzer,
    |           analyzer            |   | and while one of them waits
    |               V               |   |
    | flt_ops.channel_post_analyze  +---+
    +---------------+---------------+
                    |
                    V
      [ data filtering (see below) ]
                    |
                   ...
                    |
                    V
    +-------------------------------+
    |        flt_end_analyze        |<--+
    | (flt_ops.channel_end_analyze) +---+ repeated while it waits
    +---------------+---------------+
                    |
     If HTTP stream, go back to [1]
                    |
                   ...
                    |
         +----------+-----------+
         | flt_ops.stream_stop  |
         +----------+-----------+
                    |
                    V

The 'flt_start_analyze' box and the analyzers box are run a first time for the
FRONTEND, then a second time for the BACKEND.

By zooming on an analyzer box we have:

                      ...
                       |
                       V
                       |
                       +-----------<-----------+
                       |                       |
     +-----------------+--------------------+  |
     |                 |                    |  |
     |                 +--------<---------+ |  |
     |                 |                  | |  |
     |                 V                  | |  |
     |    flt_ops.channel_pre_analyze ->--+ |  ^
     |                 |                    |  |
     |                 |                    |  |
     |                 V                    |  |
     |             analyzer --------->------+--+
     |                 |                    |
     |                 |                    |
     |                 V                    |
     |    flt_ops.channel_post_analyze      |
     |                 |                    |
     |                 |                    |
     +-----------------+--------------------+
                       |
                       V
                      ...


3.6. FILTERING THE DATA EXCHANGED
---------------------------------

WARNING: To fully understand this part, you must be aware of how the buffers
         work in HAProxy. In particular, you must be comfortable with the idea
         of circular buffers. See doc/internals/buffer-operations.txt and
         doc/internals/buffer-ops.fig for details.
         doc/internals/body-parsing.txt could also be useful.

An extended feature of the filters is the data filtering. By default a filter
does not look into data exchanged between the client and the server because it
is expensive. Indeed, instead of forwarding data without any processing, each
byte needs to be buffered.

So, to enable the data filtering on a channel, at any time, in one of the
previous callbacks, you should call the 'register_data_filter' function. And
conversely, to disable it, you should call the 'unregister_data_filter'
function. For example:

    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* The message must be on the request channel */
        if (!(msg->chn->flags & CF_ISRESP)) {
            struct http_txn *txn = s->txn;
            struct buffer   *req = msg->chn->buf;
            struct hdr_ctx   ctx;

            /* Enable the data filtering for the request if 'X-Filter' header
             * is set to 'true'. */
            ctx.idx = 0; /* start the lookup from the first header */
            if (http_find_header2("X-Filter", 8, req->p, &txn->hdr_idx, &ctx) &&
                ctx.vlen >= 4 && memcmp(ctx.line + ctx.val, "true", 4) == 0)
                register_data_filter(s, msg->chn, filter);
        }

        return 1;
    }

852Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
853set to 'true'.
854
855If several filters are declared, the evaluation order remains the same,
856regardless the order of the registrations to the data filtering.

Depending on the stream type, TCP or HTTP, the way to handle data filtering will
be slightly different. Among other things, for HTTP streams, there are more
callbacks to help you to fully handle all steps of an HTTP transaction. But the
basis is the same. The data filtering is done in 2 stages:

  * The data parsing: At this stage, filters will analyze input data on a
    channel. Once a filter has parsed some data, it cannot parse it again. At
    any time, a filter can choose to not parse all available data. So, it is
    possible for a filter to retain data for a while. Because filters are
    chained, a filter cannot parse more data than its predecessors. Thus only
    data considered as parsed by the last filter will be available to the next
    stage, the data forwarding.

  * The data forwarding: At this stage, filters will decide how much data
    HAProxy can forward among those considered as parsed at the previous
    stage. Once a filter has marked data as forwardable, it cannot analyze it
    anymore. At any time, a filter can choose to not forward all parsed
    data. So, it is possible for a filter to retain data for a while. Because
    filters are chained, a filter cannot forward more data than its
    predecessors. Thus only data marked as forwardable by the last filter will
    be actually forwarded by HAProxy.
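
The chaining rule of the parsing stage can be pictured with a small standalone
model. This is only a sketch under stated assumptions: 'toy_filter', its 'want'
field and 'toy_parse_chain' are hypothetical names used for illustration, they
are not part of the HAProxy API. Each filter parses at most what it wants, and
never more than what its predecessor has already parsed:

```c
#include <stddef.h>

/* Hypothetical stand-in for a filter. 'nxt' is the number of input bytes
 * already parsed by this filter, 'want' is the maximum number of bytes it
 * accepts to parse per pass. */
struct toy_filter {
    unsigned int nxt;
    unsigned int want;
};

/* Runs one parsing pass over a chain of 'n' filters with 'input' bytes of
 * input data. Returns the number of bytes considered as parsed by the whole
 * chain, i.e. what would be available to the forwarding stage. */
static unsigned int toy_parse_chain(struct toy_filter *flts, size_t n,
                                    unsigned int input)
{
    unsigned int limit = input; /* bytes parsed by the previous filter */
    size_t k;

    for (k = 0; k < n; k++) {
        if (flts[k].nxt < limit) {
            unsigned int avail = limit - flts[k].nxt;
            unsigned int take  = avail < flts[k].want ? avail : flts[k].want;

            flts[k].nxt += take;
        }
        /* The next filter cannot go beyond this one */
        if (flts[k].nxt < limit)
            limit = flts[k].nxt;
    }
    return limit;
}
```

With three filters accepting 100, 30 and 50 bytes per pass and 80 bytes of
input, the second filter is the bottleneck: the chain reports 30 parsed bytes
after the first pass and 60 after the second one.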

Internally, filters own 2 offsets, relative to 'buf->p', representing the
number of bytes already parsed in the available input data and the number of
bytes considered as forwarded. We will call these offsets, respectively, 'nxt'
and 'fwd'. The following macros reference these offsets:

  * FLT_NXT(flt, chn), flt_req_nxt(flt) and flt_rsp_nxt(flt)

  * FLT_FWD(flt, chn), flt_req_fwd(flt) and flt_rsp_fwd(flt)

where 'flt' is the 'struct filter' passed as argument in all callbacks and 'chn'
is the considered channel.

Using these offsets, the following operations on buffers are possible:

    chn->buf->p + FLT_NXT(flt, chn) // the pointer on parsable data for
                                    // the filter 'flt' on the channel 'chn'.
    // Everything between chn->buf->p and 'nxt' offset was already parsed
    // by the filter.

    chn->buf->i - FLT_NXT(flt, chn) // the number of bytes of parsable data for
                                    // the filter 'flt' on the channel 'chn'.

    chn->buf->p + FLT_FWD(flt, chn) // the pointer on forwardable data for
                                    // the filter 'flt' on the channel 'chn'.
    // Everything between chn->buf->p and 'fwd' offset was already forwarded
    // by the filter.


Note that at any time, for a filter, 'nxt' offset is always greater than or
equal to 'fwd' offset.
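
The offset arithmetic can be tried outside HAProxy on a flat, non-wrapping
buffer. The sketch below is an illustration only: 'toy_buf', 'toy_offsets' and
the three helpers are hypothetical stand-ins for 'struct buffer' and for the
FLT_NXT/FLT_FWD macros. The last helper assumes that, from a single filter
point of view, the parsed data not yet forwarded are the bytes between 'fwd'
and 'nxt':

```c
/* Hypothetical stand-in for the input part of a buffer */
struct toy_buf {
    const char  *p; /* start of input data, like buf->p */
    unsigned int i; /* number of input bytes, like buf->i */
};

/* Hypothetical stand-in for the two offsets owned by a filter */
struct toy_offsets {
    unsigned int nxt; /* bytes already parsed */
    unsigned int fwd; /* bytes already forwarded, never above 'nxt' */
};

/* Pointer on the first byte not yet parsed by the filter */
static const char *toy_parse_ptr(const struct toy_buf *b,
                                 const struct toy_offsets *o)
{
    return b->p + o->nxt;
}

/* Number of bytes still parsable by the filter */
static unsigned int toy_parsable(const struct toy_buf *b,
                                 const struct toy_offsets *o)
{
    return b->i - o->nxt;
}

/* Number of bytes parsed by the filter but not yet forwarded */
static unsigned int toy_pending_forward(const struct toy_offsets *o)
{
    return o->nxt - o->fwd;
}
```

For the 10 bytes "GET /index" with 'nxt' set to 4 and 'fwd' set to 0, the parse
pointer is on '/', 6 bytes remain parsable and 4 bytes wait to be forwarded.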

TODO: Add schema with buffer states when there are 2 filters that analyze data.


3.6.1 FILTERING DATA ON TCP STREAMS
-----------------------------------

The TCP data filtering is the easy case, because HAProxy does not parse this
data. So you have only two callbacks that you need to consider:

  * 'flt_ops.tcp_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.tcp_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

Here is an example:

    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_tcp_data(struct stream *s, struct filter *filter,
                       struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int avail = chn->buf->i - FLT_NXT(filter, chn);
        int ret   = avail;

        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
            ret = my_conf->max_parse;

        /* if available data are not completely parsed, wake up the stream to
         * be sure to not freeze it. */
        if (ret != avail)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }


    /* Returns a negative value if an error occurs, else the number of
     * forwarded bytes. */
    static int
    my_filter_tcp_forward_data(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int ret = len;

        /* Do not forward more than 'my_conf->max_forward' bytes at a time */
        if (my_conf->max_forward != 0 && ret > my_conf->max_forward)
            ret = my_conf->max_forward;

        /* if parsed data are not completely forwarded, wake up the stream to
         * be sure to not freeze it. */
        if (ret != len)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }
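
To see why the wakeup matters, the capping logic can be replayed on plain
integers. This is a hedged sketch, not HAProxy code: 'toy_tcp_data' and
'toy_count_passes' are hypothetical helpers, and the 'need_wakeup' flag stands
for the call to 'task_wakeup'. Each pass parses at most 'max_parse' bytes, so
the callback must be called again until nothing is left:

```c
/* Mirrors the capping logic of my_filter_tcp_data() on plain integers.
 * 'avail' is the number of unparsed bytes, 'max_parse' the configured
 * limit (0 means no limit). Sets '*need_wakeup' when some data are left. */
static int toy_tcp_data(int avail, int max_parse, int *need_wakeup)
{
    int ret = avail;

    if (max_parse != 0 && ret > max_parse)
        ret = max_parse;

    *need_wakeup = (ret != avail);
    return ret;
}

/* Calls the toy callback until 'total' bytes are parsed and returns the
 * number of passes that were needed. */
static int toy_count_passes(int total, int max_parse)
{
    int nxt    = 0; /* bytes parsed so far, like the 'nxt' offset */
    int passes = 0;
    int wakeup = 0;

    do {
        nxt += toy_tcp_data(total - nxt, max_parse, &wakeup);
        passes++;
    } while (wakeup);

    return passes;
}
```

With 2500 bytes of input and a limit of 1000 bytes per pass, three passes are
needed. Without the wakeup, the stream would stay stuck after the first one
until some unrelated event occurs.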



3.6.2 FILTERING DATA ON HTTP STREAMS
------------------------------------

The HTTP data filtering is a bit tricky because HAProxy will parse the body
structure, especially chunked body. So basically there is the HTTP counterpart
to the previous callbacks:

  * 'flt_ops.http_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.http_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

But the prototype for these callbacks is slightly different. Instead of having
the channel as parameter, we have the HTTP message (struct http_msg). You need
to be careful when you use the 'http_msg.chunk_len' value. It is the number of
bytes remaining to parse in the HTTP body (or in the current chunk for chunked
messages). The HTTP parser of HAProxy uses it to compute the number of bytes
that it could consume:

    /* Available input data in the current chunk from the HAProxy point of view.
     * msg->next bytes were already parsed. Without data filtering, HAProxy
     * will consume all of it. */
    Bytes = MIN(msg->chunk_len, chn->buf->i - msg->next);


But in your filter, you need to recompute it:

    /* Available input data in the current chunk from the filter point of view.
     * 'nxt' bytes were already parsed. */
    Bytes = MIN(msg->chunk_len + msg->next, chn->buf->i) - FLT_NXT(flt, chn);
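
The difference between the two formulas is easier to see with numbers. The
helpers below are a standalone sketch: 'toy_haproxy_avail', 'toy_filter_avail'
and 'TOY_MIN' are hypothetical names, and the arguments are plain integers
standing for 'msg->chunk_len', 'msg->next', 'chn->buf->i' and the 'nxt' offset
of the filter:

```c
#define TOY_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Available data in the current chunk from the HAProxy point of view */
static unsigned long long toy_haproxy_avail(unsigned long long chunk_len,
                                            unsigned int next,
                                            unsigned int i)
{
    return TOY_MIN(chunk_len, (unsigned long long)(i - next));
}

/* Available data in the current chunk from the filter point of view.
 * 'nxt' must not go beyond the end of the chunk nor the end of input. */
static unsigned long long toy_filter_avail(unsigned long long chunk_len,
                                           unsigned int next,
                                           unsigned int i,
                                           unsigned int nxt)
{
    return TOY_MIN(chunk_len + next, (unsigned long long)i) - nxt;
}
```

For example, with a chunk of 100 remaining bytes, 'msg->next' at 20 and 200
bytes of input, HAProxy sees 100 bytes. A filter that has already parsed 50
bytes sees only 70 of them. When 'nxt' is equal to 'msg->next', both formulas
give the same result.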


In addition to these callbacks, there are three others:

  * 'flt_ops.http_headers': This callback is called just before the HTTP body
    parsing and after any processing on the request/response HTTP headers. When
    defined, this callback is always called for HTTP streams (i.e. without the
    need to register for data filtering).

  * 'flt_ops.http_end': This callback is called when the whole HTTP
    request/response is processed. It can interrupt the stream processing. So,
    it could be used to synchronize the HTTP request with the HTTP response, for
    example:

    /* Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_http_end(struct stream *s, struct filter *filter,
                       struct http_msg *msg)
    {
        struct my_filter_ctx *my_ctx = filter->ctx;


        if (!(msg->chn->flags & CF_ISRESP)) /* The request */
            my_ctx->end_of_req = 1;
        else /* The response */
            my_ctx->end_of_rsp = 1;

        /* Both the request and the response are finished */
        if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
            return 1;

        /* Wait */
        return 0;
    }


  * 'flt_ops.http_chunk_trailers': This callback is called for chunked HTTP
    messages only when all chunks were parsed. HTTP trailers can be parsed in
    several passes. This callback will be called each time. The number of bytes
    parsed by HAProxy at each iteration is stored in 'msg->sol'.
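
Because the trailers may come in several passes, a filter that needs their
total size has to accumulate 'msg->sol' across calls. The following standalone
sketch shows the idea. It is an assumption-laden illustration: 'toy_http_msg',
'toy_trailers_ctx' and 'toy_http_chunk_trailers' are hypothetical stand-ins
for 'struct http_msg', a filter context and the real callback, and returning
1 to continue simply follows the other examples of this guide:

```c
/* Hypothetical stand-in for the part of the HTTP message used here */
struct toy_http_msg {
    unsigned int sol; /* bytes of trailers parsed during this pass */
};

/* Hypothetical filter context keeping track of the trailers */
struct toy_trailers_ctx {
    unsigned int passes;    /* number of times the callback was called */
    unsigned int total_len; /* total size of the trailers seen so far */
};

/* Accumulates the size of the trailers, one pass at a time */
static int toy_http_chunk_trailers(struct toy_trailers_ctx *ctx,
                                   const struct toy_http_msg *msg)
{
    ctx->passes++;
    ctx->total_len += msg->sol;
    return 1;
}
```

If the trailers are parsed in two passes of 12 and 7 bytes, the context ends
with 2 passes and a total of 19 bytes.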

Then, to finish, there are 2 informational callbacks:

  * 'flt_ops.http_reset': This callback is called when an HTTP message is
    reset. This only happens when a '100-continue' response is received. It
    could be useful to reset the filter context before receiving the true
    response.

  * 'flt_ops.http_reply': This callback is called when, at any time, HAProxy
    decides to stop the processing on an HTTP message and to send an internal
    response to the client. This mainly happens when an error or a redirect
    occurs.


3.6.3 REWRITING DATA
--------------------

The last part, and the trickiest one about the data filtering, is about the data
rewriting. For now, the filter API does not offer a lot of functions to handle
it. There are only functions to notify HAProxy that the data size has changed to
let it update internal state of filters. It is your responsibility to update
the data itself, i.e. the buffer offsets. For an HTTP message, you also must
update 'msg->next' and 'msg->chunk_len' values accordingly:

  * 'flt_change_next_size': This function must be called when a filter alters
    incoming data. It updates 'nxt' offset value of all its predecessors. Not
    calling this function when a filter changes the size of incoming data leads
    to an undefined behavior.

    struct buffer *buf   = msg->chn->buf;
    unsigned int   avail = MIN(msg->chunk_len + msg->next, buf->i) -
                           flt_rsp_nxt(filter);

    if (avail > 10 && /* ...Some condition... */) {
        /* Move the buffer forward to have buf->p pointing on unparsed
         * data */
        b_adv(buf, flt_rsp_nxt(filter));

        /* Skip first 10 bytes by moving the following data over them. To
         * simplify this example, we consider a non-wrapping buffer */
        memmove(buf->p, buf->p + 10, avail - 10);

        /* Restore buf->p value */
        b_rew(buf, flt_rsp_nxt(filter));

        /* Now update other filters */
        flt_change_next_size(filter, msg->chn, -10);

        /* Update the buffer state */
        buf->i -= 10;

        /* And update the HTTP message state */
        msg->chunk_len -= 10;

        return (avail - 10);
    }
    else
        return 0; /* Wait for more data */


  * 'flt_change_forward_size': This function must be called when a filter alters
    parsed data. It updates offset values ('nxt' and 'fwd') of all filters. Not
    calling this function when a filter changes the size of parsed data leads
    to an undefined behavior.

    struct buffer *buf = msg->chn->buf;

    /* len is the number of bytes of forwardable data */
    if (len > 10 && /* ...Some condition... */) {
        /* Move the buffer forward to have buf->p pointing on non-forwarded
         * data */
        b_adv(buf, flt_rsp_fwd(filter));

        /* Skip first 10 bytes by moving the following data over them. To
         * simplify this example, we consider a non-wrapping buffer */
        memmove(buf->p, buf->p + 10, len - 10);

        /* Restore buf->p value */
        b_rew(buf, flt_rsp_fwd(filter));

        /* Now update other filters */
        flt_change_forward_size(filter, msg->chn, -10);

        /* Update the buffer state */
        buf->i -= 10;

        /* And update the HTTP message state */
        msg->next -= 10;

        return (len - 10);
    }
    else
        return 0; /* Wait for more data */
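
The byte removal itself can be tested in isolation on a flat array. The helper
below is a minimal sketch under stated assumptions: 'toy_remove_bytes' is a
hypothetical function, it works on a non-wrapping buffer only, and it moves
the whole tail of the data. In a real filter it would still be necessary to
call 'flt_change_next_size' or 'flt_change_forward_size' and to update the
HTTP message state as described above:

```c
#include <string.h>

/* Removes 'n' bytes at offset 'off' from the '*len' bytes stored in 'data'
 * by moving the tail over them. Returns 1 on success and updates '*len', or
 * 0 if the range to remove is not fully inside the data. */
static int toy_remove_bytes(char *data, unsigned int *len,
                            unsigned int off, unsigned int n)
{
    if (off > *len || n > *len - off)
        return 0;

    /* The source and the destination overlap, so memmove() is required */
    memmove(data + off, data + off + n, *len - off - n);
    *len -= n;
    return 1;
}
```

Removing 3 bytes at offset 4 from the 16 bytes "0123456789ABCDEF" leaves the 13
bytes "0123789ABCDEF". The destination of 'memmove' is the start of the removed
range and its source is the first byte to keep.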


TODO: implement all the stuff to easily rewrite data. For HTTP messages, this
      requires a chunked message. Otherwise the size of the data cannot be
      changed.




4. FAQ
------

4.1. Detect multiple declarations of the same filter
----------------------------------------------------

TODO