                   -----------------------------------------
                          Filters Guide - version 2.4
                          ( Last update: 2021-02-24 )
                   ------------------------------------------
                          Author : Christopher Faulet
              Contact : christopher dot faulet at capflam dot org


ABSTRACT
--------

Filter support was introduced in HAProxy 1.7. It is a way to extend HAProxy
without touching its core code and, to a certain extent, without knowing its
internals. This feature eases contributions and reduces the impact of changes.
Another advantage is that HAProxy can be simplified by replacing some of its
parts with filters. As an example, HTTP compression was the first feature moved
into a filter.

This document describes how to write a filter and what to keep in mind when
doing so. It also covers the known limits and the pitfalls to avoid.

Filters are still a fairly recent feature. The API is not frozen and will be
updated, modified, improved and extended as needed.



SUMMARY
-------

  1. Filters introduction
  2. How to use filters
  3. How to write a new filter
    3.1. API Overview
    3.2. Defining the filter name and its configuration
    3.3. Managing the filter lifecycle
      3.3.1. Dealing with threads
    3.4. Handling the streams activity
    3.5. Analyzing the channels activity
    3.6. Filtering the data exchanged
  4. FAQ



1. FILTERS INTRODUCTION
-----------------------

First of all, to fully understand how filters work and how to create one, it is
best to know, at least from a distance, what a proxy (frontend/backend), a
stream and a channel are in HAProxy and how these entities are linked to each
other. doc/internals/entities.pdf is a good overview.

Then, to support filters, many callbacks have been added to HAProxy at
different places, mainly around channel analyzers. Their purpose is to allow
filters to be involved in the data processing, from the stream
creation/destruction to the data forwarding. Depending on what it should do, a
filter can implement all or part of these callbacks. For now, existing callbacks
are focused on streams, but future improvements could enlarge the scope of
filters. For instance, it could be useful to handle events at the connection
level.

In the HAProxy configuration file, a filter is declared in a proxy section,
except defaults. So the configuration corresponding to a filter declaration is
attached to a specific proxy and is shared by all its instances. It is opaque
from the HAProxy point of view: it is the filter's responsibility to manage it.
Each filter declaration matches a unique configuration. Several declarations of
the same filter in the same proxy are handled as different filters by HAProxy.

A filter instance is represented by a partially opaque context (or state)
attached to a stream and passed as an argument to callbacks. Through this
context, filter instances are stateful. Depending on whether the filter is
declared in a frontend or a backend section, its instances are created,
respectively, when a stream is created or when a backend is selected. Their
behaviors also differ. Only instances of filters declared in a frontend section
are aware of the creation and the destruction of the stream, and take part in
the channel analysis before the backend is defined.

It is important to remember that the configuration of a filter is shared by all
its instances, while the context of an instance is owned by a unique stream.

Filters are designed to be chained. It is possible to declare several filters in
the same proxy section. The declaration order is important because filters are
called one after the other in this order. Frontend and backend filters are also
chained, frontend ones being called first. Even if the filter processing is
serialized, each filter behaves as if it were alone (unless it was developed to
be aware of other filters). Because of this, some constraints are imposed on
filters, especially when data exchanged between the client and the server are
processed. We will come back to these constraints when we tackle the subject of
writing a filter.



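The calling order described above can be modelled in a few lines of standalone
C. This is only an illustrative sketch, not HAProxy code: 'demo_filter' and
'demo_call_order' are hypothetical names, and a filter is reduced to its name.
It shows frontend filters being placed first, then backend ones, each group in
declaration order.

```c
#include <stddef.h>

/* Hypothetical stand-in for a declared filter: only its name is modelled. */
struct demo_filter {
    const char *name;
};

/*
 * Build the call order for one stream: frontend filters in declaration
 * order, then backend filters in declaration order. Returns the number of
 * entries written to <out> (at most <max>).
 */
static size_t demo_call_order(const struct demo_filter *fe, size_t nfe,
                              const struct demo_filter *be, size_t nbe,
                              const char **out, size_t max)
{
    size_t n = 0, i;

    for (i = 0; i < nfe && n < max; i++)
        out[n++] = fe[i].name;   /* frontend filters are called first */
    for (i = 0; i < nbe && n < max; i++)
        out[n++] = be[i].name;   /* backend filters follow */
    return n;
}
```

With 'trace' and 'compression' declared on the frontend and a third filter on
the backend, the resulting order is trace, compression, then the backend one.

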
2. HOW TO USE FILTERS
---------------------

To use a filter, the 'filter' parameter should be used in the desired listen,
frontend or backend section, followed by the filter name and, optionally, its
configuration. For instance :

    listen test
        ...
        filter trace name TST
        ...


See doc/configuration.txt for a formal definition of the 'filter' parameter.
Note that additional parameters on the filter line must be parsed by the filter
itself.

The list of available filters is reported by 'haproxy -vv' :

    $> haproxy -vv
    HA-Proxy version 1.7-dev2-3a1d4a-33 2016/03/21
    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>

    [...]

    Available filters :
        [COMP] compression
        [TRACE] trace


Multiple filter lines can be used in a proxy section to chain filters. Filters
are called in the declaration order.

Some filters support implicit declarations in certain circumstances (without
the filter line). This is not recommended for new features but is useful for
existing ones moved into a filter, for backward compatibility reasons. Implicit
declarations are supported when there is only one filter used on a proxy. When
several filters are used, explicit declarations are mandatory. The HTTP
compression filter is one of these filters. Alone, using 'compression' keywords
is enough to use it. But as soon as a second filter is used, a filter line must
be added.

    # filter line is optional
    listen t1
        bind *:80
        compression algo gzip
        compression offload
        server srv x.x.x.x:80

    # filter line is mandatory for the compression filter
    listen t2
        bind *:81
        filter trace name T2
        filter compression
        compression algo gzip
        compression offload
        server srv x.x.x.x:80




3. HOW TO WRITE A NEW FILTER
----------------------------

To write a filter, there are 2 header files to explore :

  * include/haproxy/filters-t.h : This is the main header file, containing all
                                  important structures to use. It represents the
                                  filter API.

  * include/haproxy/filters.h : This header file contains helper functions that
                                may be used. It also contains the internal API
                                used by HAProxy to handle filters.

To ease the integration of filters, it is better to follow some conventions :

  * Use the 'flt_' prefix to name the filter (e.g. flt_http_comp or flt_trace).

  * Keep everything related to the filter in the same file.

The 'trace' filter can be used as a template to write a new filter. It is a
good starting point to see how filters really work.

3.1. API OVERVIEW
-----------------

Writing a filter comes down to writing functions and attaching them to the
existing callbacks. Available callbacks are listed in the following structure :

    struct flt_ops {
        /*
         * Callbacks to manage the filter lifecycle
         */
        int  (*init)             (struct proxy *p, struct flt_conf *fconf);
        void (*deinit)           (struct proxy *p, struct flt_conf *fconf);
        int  (*check)            (struct proxy *p, struct flt_conf *fconf);
        int  (*init_per_thread)  (struct proxy *p, struct flt_conf *fconf);
        void (*deinit_per_thread)(struct proxy *p, struct flt_conf *fconf);

        /*
         * Stream callbacks
         */
        int  (*attach)            (struct stream *s, struct filter *f);
        int  (*stream_start)      (struct stream *s, struct filter *f);
        int  (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be);
        void (*stream_stop)       (struct stream *s, struct filter *f);
        void (*detach)            (struct stream *s, struct filter *f);
        void (*check_timeouts)    (struct stream *s, struct filter *f);

        /*
         * Channel callbacks
         */
        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
                                      struct channel *chn);
        int  (*channel_pre_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_post_analyze) (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn);

        /*
         * HTTP callbacks
         */
        int  (*http_headers)(struct stream *s, struct filter *f,
                             struct http_msg *msg);
        int  (*http_payload)(struct stream *s, struct filter *f,
                             struct http_msg *msg, unsigned int offset,
                             unsigned int len);
        int  (*http_end)    (struct stream *s, struct filter *f,
                             struct http_msg *msg);

        void (*http_reset)  (struct stream *s, struct filter *f,
                             struct http_msg *msg);
        void (*http_reply)  (struct stream *s, struct filter *f,
                             short status,
                             const struct buffer *msg);

        /*
         * TCP callbacks
         */
        int  (*tcp_payload)(struct stream *s, struct filter *f,
                            struct channel *chn, unsigned int offset,
                            unsigned int len);
    };


The following parts explain when these callbacks are called and what they
should do.

Filters are declared in proxy sections. So each proxy has an ordered list of
filters, possibly empty if no filter is used. When the configuration of a proxy
is parsed, each filter line represents an entry in this list. In the structure
'proxy', the filter configurations are stored in the field 'filter_configs',
each one of type 'struct flt_conf *' :

    /*
     * Structure representing the filter configuration, attached to a proxy and
     * accessible from a filter when instantiated in a stream
     */
    struct flt_conf {
        const char     *id;    /* The filter id */
        struct flt_ops *ops;   /* The filter callbacks */
        void           *conf;  /* The filter configuration */
        struct list     list;  /* Next filter for the same proxy */
        unsigned int    flags; /* FLT_CFG_FL_* */
    };

  * 'flt_conf.id' is an identifier, defined by the filter. It can be
    NULL. HAProxy does not use this field. Filters can use it in log messages or
    as a unique identifier to check multiple declarations. It is the filter's
    responsibility to free it, if necessary.

  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
    generally allocated and filled by its parsing function (See § 3.2). It is
    the filter's responsibility to free it.

  * 'flt_conf.ops' references the callbacks implemented by the filter. This
    field must be set during the parsing phase (See § 3.2) and can be refined
    during the initialization phase (See § 3.3). If it is dynamically allocated,
    it is the filter's responsibility to free it.

  * 'flt_conf.flags' is a bitfield to specify the filter capabilities. For now,
    only FLT_CFG_FL_HTX may be set when a filter is able to process HTX
    streams. If not set, the filter is excluded from the HTTP filtering.


The filter configuration is global and shared by all its instances. A filter
instance is created in the context of a stream and attached to this stream. In
the structure 'stream', the field 'strm_flt' is the state of all filter
instances attached to a stream :

    /*
     * Structure representing the "global" state of filters attached to a
     * stream.
     */
    struct strm_flt {
        struct list filters;               /* List of filters attached to a stream */
        struct filter *current[2];         /* From which filter resume processing, for a specific channel.
                                            * This is used for resumable callbacks only,
                                            * If NULL, we start from the first filter.
                                            * 0: request channel, 1: response channel */
        unsigned short flags;              /* STRM_FL_* */
        unsigned char nb_req_data_filters; /* Number of data filters registered on the request channel */
        unsigned char nb_rsp_data_filters; /* Number of data filters registered on the response channel */
        unsigned long long offset[2];      /* global offset of input data already filtered for a specific channel
                                            * 0: request channel, 1: response channel */
    };


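The role of 'strm_flt.current' can be illustrated with a small standalone
model of a resumable chain. This is a sketch under stated assumptions, not the
HAProxy implementation: 'demo_chain', 'demo_chain_run' and the two example
callbacks are hypothetical, and an index replaces the 'struct filter *'
pointer. A callback returning 0 asks to wait, and the next run resumes on that
same filter instead of restarting from the first one.

```c
#include <stddef.h>

#define DEMO_MAX_FILTERS 8

/* Hypothetical callback type: <0 = error, 0 = wait, >0 = continue. */
typedef int (*demo_cb)(void *ctx);

struct demo_chain {
    demo_cb cbs[DEMO_MAX_FILTERS];   /* one callback per chained filter */
    void   *ctxs[DEMO_MAX_FILTERS];  /* private state of each filter */
    size_t  count;                   /* number of filters in the chain */
    size_t  current;                 /* where to resume; 0 = first filter */
};

/*
 * Run the chain from the resume point. Returns a negative value on error,
 * 0 if a filter asked to wait (the chain resumes on that filter next time),
 * and 1 once every filter has been called.
 */
static int demo_chain_run(struct demo_chain *c)
{
    while (c->current < c->count) {
        int ret = c->cbs[c->current](c->ctxs[c->current]);

        if (ret < 0) {
            c->current = 0;   /* error: forget the resume point */
            return ret;
        }
        if (ret == 0)
            return 0;         /* wait: keep <current> untouched */
        c->current++;
    }
    c->current = 0;           /* done: next run starts from the first filter */
    return 1;
}

/* Example callback: waits until the countdown in <ctx> reaches zero. */
static int demo_wait_cb(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}

/* Example callback: counts its calls and always continues. */
static int demo_count_cb(void *ctx)
{
    (*(int *)ctx)++;
    return 1;
}
```

In this model, filters placed before the waiting one are not called again on
resume, and filters placed after it are only called once it lets the
processing continue.

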
Filter instances attached to a stream are stored in the field
'strm_flt.filters', each instance being of type 'struct filter *' :

    /*
     * Structure representing a filter instance attached to a stream
     *
     * 2D-Array fields are used to store info per channel. The first index
     * stands for the request channel, and the second one for the response
     * channel. Especially, <next> and <fwd> are offsets representing amount of
     * data that the filter has, respectively, parsed and forwarded on a
     * channel. Filters can access these values using FLT_NXT and FLT_FWD
     * macros.
     */
    struct filter {
        struct flt_conf *config;       /* the filter's configuration */
        void *ctx;                     /* The filter context (opaque) */
        unsigned short flags;          /* FLT_FL_* */
        unsigned long long offset[2];  /* Offset of input data already filtered for a specific channel
                                        * 0: request channel, 1: response channel */
        unsigned int pre_analyzers;    /* bit field indicating analyzers to
                                        * pre-process */
        unsigned int post_analyzers;   /* bit field indicating analyzers to
                                        * post-process */
        struct list list;              /* Next filter for the same proxy/stream */
    };

  * 'filter.config' is the filter configuration previously described. All
    instances of a filter share it.

  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is
    the filter's responsibility to free it.

  * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later
    (See § 3.5).

  * 'filter.offset' will be described later (See § 3.6).


3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
---------------------------------------------------

During the filter development, the first thing to do is to add it to the
supported filters. To do so, its name must be registered as a valid keyword on
the filter line :

    /* Declare the filter parser for "my_filter" keyword */
    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
            { "my_filter", parse_my_filter_cfg, NULL /* private data */ },
            { NULL, NULL, NULL },
        }
    };
    INITCALL1(STG_REGISTER, flt_register_keywords, &flt_kws);


Then the filter internal configuration must be defined. For instance :

    struct my_filter_config {
        struct proxy *proxy;
        char         *name;
        /* ... */
    };


All callbacks implemented by the filter must then be declared. Here, a global
variable is used :

    struct flt_ops my_filter_ops = {
        .init   = my_filter_init,
        .deinit = my_filter_deinit,
        .check  = my_filter_config_check,

        /* ... */
    };


Finally, the function to parse the filter configuration must be written, here
'parse_my_filter_cfg'. This function must parse all remaining keywords on the
filter line :

    /* Return -1 on error, else 0 */
    static int
    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
                        struct flt_conf *flt_conf, char **err, void *private)
    {
        struct my_filter_config *my_conf;
        int pos = *cur_arg;

        /* Allocate the internal configuration used by the filter */
        my_conf = calloc(1, sizeof(*my_conf));
        if (!my_conf) {
            memprintf(err, "%s : out of memory", args[*cur_arg]);
            return -1;
        }
        my_conf->proxy = px;

        /* ... */

        /* Parse all keywords supported by the filter and fill the internal
         * configuration */
        pos++; /* Skip the filter name */
        while (*args[pos]) {
            if (!strcmp(args[pos], "name")) {
                if (!*args[pos + 1]) {
                    memprintf(err, "'%s' : '%s' option without value",
                              args[*cur_arg], args[pos]);
                    goto error;
                }
                my_conf->name = strdup(args[pos + 1]);
                if (!my_conf->name) {
                    memprintf(err, "%s : out of memory", args[*cur_arg]);
                    goto error;
                }
                pos += 2;
            }

            /* ... parse other keywords ... */
        }
        *cur_arg = pos;

        /* Set callbacks supported by the filter */
        flt_conf->ops = &my_filter_ops;

        /* Last, save the internal configuration */
        flt_conf->conf = my_conf;
        return 0;

      error:
        if (my_conf->name)
            free(my_conf->name);
        free(my_conf);
        return -1;
    }


WARNING : In this parsing function, 'flt_conf->ops' must be initialized. All
          arguments of the filter line must also be parsed. This is mandatory.

In the previous example, the filter line should be read as follows :

    filter my_filter name MY_NAME ...


Optionally, by implementing the 'flt_ops.check' callback, an extra step is added
to check the internal configuration of the filter after the parsing phase, when
the HAProxy configuration is fully defined. For instance :

    /* Check configuration of 'my_filter' for a specified proxy.
     * Return 1 on error, else 0. */
    static int
    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
    {
        if (px->mode != PR_MODE_HTTP) {
            ha_alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
            return 1;
        }

        /* ... */

        return 0;
    }



3.3. MANAGING THE FILTER LIFECYCLE
----------------------------------

Once the configuration is parsed and checked, filters are ready to be used.
There are two main callbacks to manage the filter lifecycle :

  * 'flt_ops.init' : It initializes the filter for a proxy. This callback may be
                     defined to finish the filter configuration.

  * 'flt_ops.deinit' : It cleans up what the parsing function and the init
                       callback have done. This callback is useful to release
                       memory allocated for the filter configuration.

Here is an example :

    /* Initialize the filter. Returns -1 on error, else 0. */
    static int
    my_filter_init(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        /* ... */

        return 0;
    }

    /* Free resources allocated by the filter. */
    static void
    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        if (my_conf) {
            free(my_conf->name);
            /* ... */
            free(my_conf);
        }
        fconf->conf = NULL;
    }


3.3.1. DEALING WITH THREADS
---------------------------

When HAProxy is compiled with thread support and started with more than one
thread (global.nbthread > 1), it is possible to manage the filter per thread
with the following callbacks :

  * 'flt_ops.init_per_thread': It initializes the filter for each thread. It
                               works the same way as 'flt_ops.init' but in the
                               context of a thread. This callback is called
                               after the thread creation.

  * 'flt_ops.deinit_per_thread': It cleans up what the init_per_thread callback
                                 has done. It is called in the context of a
                                 thread, before exiting it.

It is the filter's responsibility to deal with concurrency. The check, init and
deinit callbacks are called on the main thread. All others are called on a
"worker" thread (not always the same one). It is also the filter's
responsibility to know if HAProxy is started with more than one thread. If it is
started with one thread (or compiled without thread support), the per-thread
callbacks are silently ignored (in this case, global.nbthread is always equal
to one).


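As an illustration, here is a minimal sketch of a pair of per-thread callbacks
that manage one scratch buffer per thread. It is not taken from an existing
filter. The 'struct proxy' and 'struct flt_conf' declarations are stand-ins so
that the sketch compiles alone (a real filter gets them from
include/haproxy/filters-t.h), 'my_filter_scratch' and 'MY_FILTER_SCRATCH_SIZE'
are hypothetical, C11 '_Thread_local' is used for thread-local storage, and the
return convention is assumed to be the same as for 'flt_ops.init'.

```c
#include <stdlib.h>

/* Stand-in declarations so the sketch compiles on its own. */
struct proxy;
struct flt_conf {
    void *conf;
};

#define MY_FILTER_SCRATCH_SIZE 1024   /* hypothetical size */

/* One scratch buffer per thread, never shared between threads. */
static _Thread_local char *my_filter_scratch;

/* Called in the context of each thread. Assumed to return -1 on error,
 * else 0, like the init callback. */
static int
my_filter_init_per_thread(struct proxy *px, struct flt_conf *fconf)
{
    (void)px;
    (void)fconf;

    my_filter_scratch = malloc(MY_FILTER_SCRATCH_SIZE);
    return my_filter_scratch ? 0 : -1;
}

/* Called in the context of each thread, before exiting it. Releases only
 * what this thread allocated. */
static void
my_filter_deinit_per_thread(struct proxy *px, struct flt_conf *fconf)
{
    (void)px;
    (void)fconf;

    free(my_filter_scratch);
    my_filter_scratch = NULL;
}
```

Because the buffer is thread-local, the other callbacks can use it without
locking, as long as they do not keep a pointer to it across threads.

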
3.4. HANDLING THE STREAMS ACTIVITY
----------------------------------

It may be useful to handle the stream activity. For now, there are three
callbacks that can be defined to do so :

  * 'flt_ops.stream_start' : It is called when a stream is started. This
                             callback can fail by returning a negative value. It
                             will be considered as a critical error by HAProxy,
                             which disables the listener for a short time.

  * 'flt_ops.stream_set_backend' : It is called when a backend is set for a
                                   stream. This callback is called for all
                                   filters attached to a stream (frontend and
                                   backend). Note this callback is not called if
                                   the frontend and the backend are the same.

  * 'flt_ops.stream_stop' : It is called when a stream is stopped. This callback
                            always succeeds. Anyway, it is too late to return an
                            error.

For instance :

    /* Called when a stream is created. Returns -1 on error, else 0. */
    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a backend is set for a stream */
    static int
    my_filter_stream_set_backend(struct stream *s, struct filter *filter,
                                 struct proxy *be)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a stream is destroyed */
    static void
    my_filter_stream_stop(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


WARNING : Handling the streams creation and destruction is only possible for
          filters defined on proxies with the frontend capability.

In addition, it is possible to handle creation and destruction of filter
instances using the following callbacks :

  * 'flt_ops.attach' : It is called after a filter instance creation, when it is
                       attached to a stream. This happens when the stream is
                       started for filters defined on the stream's frontend and
                       when the backend is set for filters declared on the
                       stream's backend. It is possible to ignore the filter, if
                       needed, by returning 0. This could be useful to have
                       conditional filtering.

  * 'flt_ops.detach' : It is called when a filter instance is detached from a
                       stream, before its destruction. This happens when the
                       stream is stopped for filters defined on the stream's
                       frontend and when the analysis ends for filters defined
                       on the stream's backend.

For instance :

    /* Called when a filter instance is created and attached to a stream */
    static int
    my_filter_attach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        if (/* ... */)
            return 0; /* Ignore the filter here */
        return 1;
    }

    /* Called when a filter instance is detached from a stream, just before its
     * destruction */
    static void
    my_filter_detach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }

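Since 'filter.ctx' is owned by the filter, these two callbacks are a natural
place to allocate and release a per-instance context. The following standalone
sketch shows one possible pattern. 'struct my_filter_ctx' and its field are
hypothetical, the 'struct stream' and 'struct filter' declarations are reduced
stand-ins so that the code compiles alone, and returning a negative value on
allocation failure is an assumption about how errors are reported.

```c
#include <stdlib.h>

/* Stand-in declarations so the sketch compiles on its own; a real filter
 * uses the definitions from include/haproxy/filters-t.h. */
struct stream;
struct filter {
    void *ctx;   /* The filter context (opaque) */
};

/* Hypothetical per-instance state, one per stream. */
struct my_filter_ctx {
    unsigned long long bytes_seen;
};

/* Allocate the instance context when the filter is attached to a stream.
 * Returns 1 to keep the filter; a negative value is assumed to report an
 * error. */
static int
my_filter_attach(struct stream *s, struct filter *filter)
{
    struct my_filter_ctx *ctx = calloc(1, sizeof(*ctx));

    (void)s;
    if (!ctx)
        return -1;
    filter->ctx = ctx;
    return 1;
}

/* Release the instance context just before the instance is destroyed. */
static void
my_filter_detach(struct stream *s, struct filter *filter)
{
    (void)s;
    free(filter->ctx);
    filter->ctx = NULL;
}
```

The shared configuration ('filter.config') is left untouched here: it belongs
to the proxy and is released by the deinit callback, not by an instance.
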
Finally, it may be useful to notify the filter when the stream is woken up
because of an expired timer. This gives it a chance to check some internal
timeouts, if any. To do so, the following callback must be used :

  * 'flt_ops.check_timeouts' : It is called when a stream is woken up because of
                               an expired timer.

For instance :

    /* Called when a stream is woken up because of an expired timer */
    static void
    my_filter_check_timeouts(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


3.5. ANALYZING THE CHANNELS ACTIVITY
------------------------------------

The main purpose of filters is to take part in the channel analysis. To do so,
there are 2 callbacks, 'flt_ops.channel_pre_analyze' and
'flt_ops.channel_post_analyze', called respectively before and after each
analyzer attached to a channel, except analyzers responsible for the data
forwarding (TCP or HTTP). Concretely, on the request channel, these callbacks
can be called for the following analyzers :

  * tcp_inspect_request        (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
  * http_wait_for_request      (AN_REQ_WAIT_HTTP)
  * http_wait_for_request_body (AN_REQ_HTTP_BODY)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_FE)
  * process_switching_rules    (AN_REQ_SWITCHING_RULES)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_BE)
  * http_process_tarpit        (AN_REQ_HTTP_TARPIT)
  * process_server_rules       (AN_REQ_SRV_RULES)
  * http_process_request       (AN_REQ_HTTP_INNER)
  * tcp_persist_rdp_cookie     (AN_REQ_PRST_RDP_COOKIE)
  * process_sticking_rules     (AN_REQ_STICKING_RULES)

And on the response channel :

  * tcp_inspect_response    (AN_RES_INSPECT)
  * http_wait_for_response  (AN_RES_WAIT_HTTP)
  * process_store_rules     (AN_RES_STORE_RULES)
  * http_process_res_common (AN_RES_HTTP_PROCESS_BE)

Unlike the callbacks seen so far, 'flt_ops.channel_pre_analyze' can interrupt
the stream processing. A filter can thus decide not to execute the analyzer
that follows and to wait for the next iteration. If there is more than one
filter, the following ones are skipped. On the next iteration, the filtering
resumes where it was stopped, i.e. on the filter that previously stopped the
processing. So it is possible for a filter to stop the stream processing on a
specific analyzer for a while before continuing. Moreover, this callback can be
called many times for the same analyzer, until it finishes its processing. For
instance :

    /* Called before a processing happens on a given channel.
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* wait that a condition is verified before continuing */)
                    return 0;
                break;
            /* ... */
        }
        return 1;
    }

  * 'an_bit' is the analyzer id. All analyzers are listed in
    'include/haproxy/channel-t.h'.

  * 'chn' is the channel on which the analyzing is done. It is possible to
    determine if it is the request or the response channel by testing if the
    CF_ISRESP flag is set :

      ((chn->flags & CF_ISRESP) == CF_ISRESP)


In the previous example, the stream processing is blocked before the analyzer
waiting for the HTTP request (AN_REQ_WAIT_HTTP) until a condition is verified.

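The resume behaviour described above can be modelled outside HAProxy. The
following standalone sketch is not HAProxy code: 'struct chain_filter',
'mock_pre_analyze()', 'run_pre_analyze()' and the 'current' index are
hypothetical names, only meant to show how a chain stops on the first filter
returning 0 and restarts from that same filter on the next iteration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a filter: 'waits' is the number of times its
 * pre-analyze callback still has to return 0 before letting the stream go
 * on, 'calls' counts how many times the callback was invoked. */
struct chain_filter {
    int waits;
    int calls;
};

/* Model of a pre-analyze callback: 0 to wait, a positive value to continue
 * (a real callback may also return a negative value on error). */
static int mock_pre_analyze(struct chain_filter *f)
{
    f->calls++;
    if (f->waits > 0) {
        f->waits--;
        return 0;
    }
    return 1;
}

/* Runs the callbacks of 'nb' filters for one analyzer, starting at
 * '*current'. If a filter does not return a positive value, its return value
 * is reported as is and '*current' still designates it: the next iteration
 * resumes on this filter and the following ones are skipped for now. Returns
 * 1 once every filter agreed, after resetting '*current' for the next
 * analyzer. */
static int run_pre_analyze(struct chain_filter *filters, size_t nb,
                           size_t *current)
{
    while (*current < nb) {
        int ret = mock_pre_analyze(&filters[*current]);

        if (ret <= 0)
            return ret;
        (*current)++;
    }
    *current = 0;
    return 1;
}
```

With three filters where only the second one needs to wait twice, the first
two iterations return 0 without ever calling the third filter, and the third
iteration runs the second and the third filters before returning 1.
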
'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
negative value if an error occurs, any other value otherwise. It is called when
a filterable analyzer finishes its processing, so only once per analyzer. For
instance :

    /* Called after a processing happens on a given channel.
     * Returns a negative value if an error occurs, any other
     * value otherwise. */
    static int
    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct http_msg *msg;

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* A test on received headers before any other treatment */) {
                    msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req);
                    s->txn->status = 400;
                    msg->msg_state = HTTP_MSG_ERROR;
                    http_reply_and_close(s, s->txn->status, http_error_message(s));
                    return -1; /* This is an error ! */
                }
                break;
            /* ... */
        }
        return 1;
    }


Pre and post analyzer callbacks of a filter are not automatically called. They
must be registered explicitly on analyzers, by updating the value of the
'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits
are listed in 'include/haproxy/channel-t.h'. Here is an example :

    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        /* ... */

        /* Register the pre analyzer callback on all request and response
         * analyzers */
        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);

        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
         * AN_RES_WAIT_HTTP analyzers */
        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);

        /* ... */
        return 0;
    }

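The effect of these bit fields can be pictured with the standalone sketch
below. It is only a model under stated assumptions: the 'MOCK_AN_*' values are
made up (the real AN_* bits differ), and 'struct reg_filter',
'must_call_pre()' and 'must_call_post()' are hypothetical names.

```c
/* Made-up analyzer bits for the illustration; the real AN_* values differ. */
#define MOCK_AN_REQ_INSPECT    0x01u
#define MOCK_AN_REQ_WAIT_HTTP  0x02u
#define MOCK_AN_REQ_HTTP_BODY  0x04u
#define MOCK_AN_REQ_ALL        0x07u

/* Hypothetical reduction of a filter to its two registration bit fields. */
struct reg_filter {
    unsigned int pre_analyzers;
    unsigned int post_analyzers;
};

/* Returns non-zero if the pre-analyze callback must run for 'an_bit'. */
static int must_call_pre(const struct reg_filter *f, unsigned int an_bit)
{
    return (f->pre_analyzers & an_bit) != 0;
}

/* Returns non-zero if the post-analyze callback must run for 'an_bit'. */
static int must_call_post(const struct reg_filter *f, unsigned int an_bit)
{
    return (f->post_analyzers & an_bit) != 0;
}
```

In this model, a registration is a bitwise OR on the field and a bit can be
cleared later with '&= ~bit', which is why the two fields can be updated
independently for each analyzer.
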

To surround activity of a filter during the channel analyzing, two new analyzers
have been added :

  * 'flt_start_analyze' (AN_REQ/RES_FLT_START_FE and AN_REQ/RES_FLT_START_BE) :
    For a specific filter, this analyzer is called before any call to the
    'channel_pre_analyze' callback. From the filter point of view, it calls the
    'flt_ops.channel_start_analyze' callback.

  * 'flt_end_analyze' (AN_REQ/RES_FLT_END) : For a specific filter, this
    analyzer is called when all other analyzers have finished their
    processing. From the filter point of view, it calls the
    'flt_ops.channel_end_analyze' callback.

These analyzers are called only once per stream.

'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
interrupt the stream processing, like 'flt_ops.channel_pre_analyze'. Here is an
example :

    /* Called when analyze starts for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
                                struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }

    /* Called when analyze ends for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }


Workflow on channels can be summarized as follows :

  FE: Called for filters defined on the stream's frontend
  BE: Called for filters defined on the stream's backend

                +------------------------>----------------------+
                |                         |                     |
     +----------------------+             |          +----------------------+
     | flt_ops.attach (FE)  |             |          | flt_ops.attach (BE)  |
     +----------------------+             |          +----------------------+
                |                         |                     |
                V                         |                     V
   +--------------------------+           |  +------------------------------------+
   | flt_ops.stream_start (FE)|           |  | flt_ops.stream_set_backend (FE+BE) |
   +--------------------------+           |  +------------------------------------+
                |                         |                     |
               ...                        |                    ...
                |                         |                     |
                |                         ^                     |
                |                   --+   |                     |                   --+
                +------<----------+   |   |                     +--------<--------+   |
                |                 |   |   |                     |                 |   |
                V                 |   |   |                     V                 |   |
+-------------------------------+ |   |   |     +-------------------------------+ |   |
|    flt_start_analyze (FE)     +-+   |   |     |    flt_start_analyze (BE)     +-+   |
|(flt_ops.channel_start_analyze)|     | F |     |(flt_ops.channel_start_analyze)|     |
+---------------+---------------+     | R |     +-------------------------------+     |
                |                     | O |                     |                     |
                +------<---------+    | N ^                     +--------<-------+    | B
                |                |    | T |                     |                |    | A
+---------------|------------+   |    | E |     +---------------|------------+   |    | C
|+--------------V-------------+  |    | N |     |+--------------V-------------+  |    | K
||+----------------------------+ |    | D |     ||+----------------------------+ |    | E
|||flt_ops.channel_pre_analyze | |    |   |     |||flt_ops.channel_pre_analyze | |    | N
|||             V              | |    |   |     |||             V              | |    | D
|||       analyzer (FE)        +-+    |   |     |||      analyzer (FE+BE)      +-+    |
+||             V              |      |   |     +||             V              |      |
 +|flt_ops.channel_post_analyze|      |   |      +|flt_ops.channel_post_analyze|      |
  +----------------------------+      |   |       +----------------------------+      |
                |                   --+   |                     |                     |
                +------------>------------+                    ...                    |
                                                                |                     |
                                                 [ data filtering (see below) ]       |
                                                                |                     |
                                                               ...                    |
                                                                |                     |
                                                                +--------<--------+   |
                                                                |                 |   |
                                                                V                 |   |
                                                +-------------------------------+ |   |
                                                |    flt_end_analyze (FE+BE)    +-+   |
                                                | (flt_ops.channel_end_analyze) |     |
                                                +---------------+---------------+     |
                                                                |                   --+
                                                                V
                                                     +----------------------+
                                                     | flt_ops.detach (BE)  |
                                                     +----------------------+
                                                                |
                                                                V
                                                   +--------------------------+
                                                   | flt_ops.stream_stop (FE) |
                                                   +--------------------------+
                                                                |
                                                                V
                                                     +----------------------+
                                                     | flt_ops.detach (FE)  |
                                                     +----------------------+
                                                                |
                                                                V

Zooming in on an analyzer box gives :

                     ...
                      |
                      V
                      |
                      +-----------<-----------+
                      |                       |
    +-----------------+--------------------+  |
    |                 |                    |  |
    |                 +--------<---------+ |  |
    |                 |                  | |  |
    |                 V                  | |  |
    |    flt_ops.channel_pre_analyze  ->-+ |  ^
    |                 |                    |  |
    |                 |                    |  |
    |                 V                    |  |
    |             analyzer  --------->-----+--+
    |                 |                    |
    |                 |                    |
    |                 V                    |
    |   flt_ops.channel_post_analyze       |
    |                 |                    |
    |                 |                    |
    +-----------------+--------------------+
                      |
                      V
                     ...


3.6. FILTERING THE DATA EXCHANGED
---------------------------------

WARNING : To fully understand this part, it is important to be aware of how the
          buffers work in HAProxy. For the HTTP part, it is also important to
          understand how data are parsed and structured, and how the internal
          representation, called HTX, works. See doc/internals/buffer-api.txt
          and doc/internals/htx-api.txt for details.

An extended feature of the filters is the data filtering. By default a filter
does not look into data exchanged between the client and the server because it
is expensive. Indeed, instead of forwarding data without any processing, each
byte needs to be buffered.

So, to enable the data filtering on a channel, at any time, in one of the
previous callbacks, the 'register_data_filter' function must be called.
Conversely, to disable it, the 'unregister_data_filter' function must be
called. For instance :

    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* 'msg->chn' must be the request channel */
        if (!(msg->chn->flags & CF_ISRESP)) {
            struct htx *htx;
            struct ist hdr;
            struct http_hdr_ctx ctx;

            htx = htxbuf(&msg->chn->buf);

            /* Enable the data filtering for the request if 'X-Filter' header
             * is set to 'true'. */
            hdr = ist("X-Filter");
            ctx.blk = NULL;
            if (http_find_header(htx, hdr, &ctx, 0) &&
                ctx.value.len >= 4 && memcmp(ctx.value.ptr, "true", 4) == 0)
                register_data_filter(s, msg->chn, filter);
        }

        return 1;
    }

Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
set to 'true'.

If several filters are declared, the evaluation order remains the same,
regardless of the order of the registrations to the data filtering. Data
registrations must be performed before the data forwarding step. However, a
filter may be unregistered from the data filtering at any time.

Depending on the stream type, TCP or HTTP, the way to handle data filtering is
different. HTTP data are structured while TCP data are raw. And there are more
callbacks for HTTP streams to fully handle all steps of an HTTP transaction. But
the main part is the same. The data filtering is performed in one callback,
called in a loop on input data starting at a specific offset for a given
length. Data analyzed by a filter are considered as forwarded from its point of
view. Because filters are chained, a filter never analyzes more data than its
predecessors. Thus only data analyzed by the last filter are effectively
forwarded. This means that, at any time, any filter may choose not to analyze
all available data (available from its point of view), blocking the data
forwarding.

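The chaining rule above can be modelled in a few lines of standalone C. This
is an illustrative sketch, not HAProxy code: 'struct data_filter', its 'limit'
field and 'chain_payload()' are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-filter state: 'limit' is the most bytes this filter
 * accepts to analyze per call (0 means no limit), 'analyzed' is what it
 * consumed during the last call. */
struct data_filter {
    unsigned int limit;
    unsigned int analyzed;
};

/* Passes 'avail' bytes of input through 'nb' chained filters. Each filter
 * only sees what its predecessor analyzed, so the value left after the last
 * filter is what can effectively be forwarded. */
static unsigned int chain_payload(struct data_filter *filters, size_t nb,
                                  unsigned int avail)
{
    size_t i;

    for (i = 0; i < nb; i++) {
        unsigned int take = avail;

        if (filters[i].limit != 0 && take > filters[i].limit)
            take = filters[i].limit;
        filters[i].analyzed = take;
        avail = take;
    }
    return avail;
}
```

With 500 bytes available and a second filter limited to 100 bytes, the third
filter is only offered 100 bytes, and 100 bytes is also all that gets
forwarded, whatever the first filter analyzed.
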
Internally, filters own 2 offsets representing the number of bytes already
analyzed in the available input data, one per channel. There is also a pair of
offsets at the stream level, in the strm_flt object, representing the total
number of bytes already forwarded. These offsets may be retrieved and updated
using the following macros :

  * FLT_OFF(flt, chn)

  * FLT_STRM_OFF(s, chn)

where 'flt' is the 'struct filter' passed as argument in all callbacks, 's' the
filtered stream and 'chn' is the considered channel. However, there is no reason
for a filter to use these macros or take care of these offsets.


3.6.1 FILTERING DATA ON TCP STREAMS
-----------------------------------

Data filtering for TCP streams is the easy case, because HAProxy does not
parse these data. Data are stored raw in the buffer. So there is only one
callback to consider :

  * 'flt_ops.tcp_payload' : This callback is called when input data are
    available. If not defined, all available data will be considered as analyzed
    and forwarded from the filter point of view.

This callback is called only if the filter is registered to analyze TCP
data. Here is an example :

    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_tcp_payload(struct stream *s, struct filter *filter,
                          struct channel *chn, unsigned int offset,
                          unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int ret = len;

        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
            ret = my_conf->max_parse;

        /* if available data are not completely parsed, wake up the stream to
         * be sure to not freeze it. The best is probably to set a
         * chn->analyse_exp timer */
        if (ret != len)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }

But it is important to note that tunnelled data of an HTTP stream may also be
filtered via this callback. Tunnelled data are data exchanged after an HTTP
tunnel is established between the client and the server, via an HTTP CONNECT or
via a protocol upgrade. In this case, the data are structured. Of course, to do
so, the filter must be able to parse HTX data and must have the FLT_CFG_FL_HTX
flag set. At any time, the IS_HTX_STRM() macro may be used on the stream to know
if it is an HTX stream or a TCP stream.


3.6.2 FILTERING DATA ON HTTP STREAMS
------------------------------------

The HTTP data filtering is a bit more complex because, in HAProxy, HTTP data
are structured and represented in an internal format, called HTX. So basically
there is the HTTP counterpart to the previous callback :

  * 'flt_ops.http_payload' : This callback is called when input data are
    available. If not defined, all available data will be considered as analyzed
    and forwarded for the filter.

But the prototype of this callback is slightly different. Instead of having
the channel as parameter, we have the HTTP message (struct http_msg). This
callback is called only if the filter is registered to analyze HTTP data. Here
is an example :

    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_http_payload(struct stream *s, struct filter *filter,
                           struct http_msg *msg, unsigned int offset,
                           unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct htx *htx = htxbuf(&msg->chn->buf);
        struct htx_ret htxret = htx_find_offset(htx, offset);
        struct htx_blk *blk;

        blk = htxret.blk;
        offset = htxret.ret;
        for (; blk; blk = htx_get_next_blk(htx, blk)) {
            enum htx_blk_type type = htx_get_blk_type(blk);

            if (type == HTX_BLK_UNUSED)
                continue;
            else if (type == HTX_BLK_DATA) {
                /* filter data */
            }
            else
                break;
        }

        return len;
    }

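The offset/length walk used in such a callback can also be tried outside
HAProxy. The sketch below is a simplified model, not the HTX API: 'struct
blk', 'enum blk_type' and 'count_data_bytes()' are hypothetical, and unused
blocks are assumed to carry no bytes.

```c
#include <stddef.h>

/* Hypothetical block types: unused, payload data, or anything else */
enum blk_type { BLK_UNUSED, BLK_DATA, BLK_OTHER };

/* Hypothetical block: a type and a size in bytes (0 for unused blocks) */
struct blk {
    enum blk_type type;
    unsigned int size;
};

/* Counts the payload bytes found in at most 'len' bytes starting at 'offset'
 * in the 'nb' blocks of 'blks'. Unused blocks are skipped and the walk stops
 * on the first block which is neither unused nor data. */
static unsigned int count_data_bytes(const struct blk *blks, size_t nb,
                                     unsigned int offset, unsigned int len)
{
    unsigned int total = 0;
    size_t i = 0;

    /* Skip the blocks entirely located before 'offset'. After this loop,
     * 'offset' is relative to the start of block 'i'. */
    while (i < nb && offset >= blks[i].size) {
        offset -= blks[i].size;
        i++;
    }

    /* Only the first visited block may be entered in its middle, so
     * 'offset' is reset each time the walk moves to the next block. */
    for (; i < nb && len > 0; i++, offset = 0) {
        unsigned int sz;

        if (blks[i].type == BLK_UNUSED)
            continue;
        if (blks[i].type != BLK_DATA)
            break;

        sz = blks[i].size - offset;
        if (sz > len)
            sz = len;
        total += sz;
        len -= sz;
    }
    return total;
}
```

The first loop plays the role of the offset lookup: it finds the block holding
the requested offset and turns it into an offset relative to that block.
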
In addition, there are two other callbacks :

  * 'flt_ops.http_headers' : This callback is called just before the HTTP body
    forwarding and after any processing on the request/response HTTP
    headers. When defined, this callback is always called for HTTP streams
    (i.e. without needing a registration on data filtering).
    Here is an example :

    /* Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct htx *htx = htxbuf(&msg->chn->buf);
        struct htx_sl *sl = http_get_stline(htx);
        int32_t pos;

        for (pos = htx_get_first(htx); pos != -1; pos = htx_get_next(htx, pos)) {
            struct htx_blk *blk = htx_get_blk(htx, pos);
            enum htx_blk_type type = htx_get_blk_type(blk);
            struct ist n, v;

            if (type == HTX_BLK_EOH)
                break;
            if (type != HTX_BLK_HDR)
                continue;

            n = htx_get_blk_name(htx, blk);
            v = htx_get_blk_value(htx, blk);
            /* Do something on the header name/value */
        }

        return 1;
    }

  * 'flt_ops.http_end' : This callback is called when the whole HTTP message has
    been processed. It may interrupt the stream processing. So, it could be used
    to synchronize the HTTP request with the HTTP response, for instance :

    /* Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_http_end(struct stream *s, struct filter *filter,
                       struct http_msg *msg)
    {
        struct my_filter_ctx *my_ctx = filter->ctx;

        if (!(msg->chn->flags & CF_ISRESP)) /* The request */
            my_ctx->end_of_req = 1;
        else /* The response */
            my_ctx->end_of_rsp = 1;

        /* Both the request and the response are finished */
        if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
            return 1;

        /* Wait */
        return 0;
    }

Then, to finish, there are 2 informational callbacks :

  * 'flt_ops.http_reset' : This callback is called when an HTTP message is
    reset. This happens either when a 1xx informational response is received, or
    if we're retrying to send the request to the server after it failed. It
    could be useful to reset the filter context before receiving the true
    response.
    By checking s->txn->status, it is possible to know why this callback is
    called. If it's a 1xx, we're called because of an informational
    message. Otherwise, it is an L7 retry.

  * 'flt_ops.http_reply' : This callback is called when, at any time, HAProxy
    decides to stop the processing on an HTTP message and to send an internal
    response to the client. This mainly happens when an error or a redirect
    occurs.

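The test on the status described for 'flt_ops.http_reset' can be sketched as
follows. This is standalone C with hypothetical names ('enum reset_reason',
'reset_reason_from_status()'); in a filter, the status would be read from
s->txn->status.

```c
/* Hypothetical reasons for which an HTTP message may be reset */
enum reset_reason {
    RESET_INFORMATIONAL, /* a 1xx informational response was received */
    RESET_L7_RETRY       /* the request is sent again to the server */
};

/* Deduces the reset reason from the HTTP status: any 1xx status means an
 * informational response, anything else is considered as a L7 retry. */
static enum reset_reason reset_reason_from_status(int status)
{
    if (status >= 100 && status < 200)
        return RESET_INFORMATIONAL;
    return RESET_L7_RETRY;
}
```

A filter keeping per-message state could use such a helper to decide how much
of its context must be cleaned before the true response is received.
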

3.6.3 REWRITING DATA
--------------------

The last part, and the trickiest one about the data filtering, is about the data
rewriting. For now, the filter API does not offer a lot of functions to handle
it. It only provides a way to notify HAProxy that the data size has changed, to
let it update the internal state of filters. It is the developer's
responsibility to update the data itself and to report the change, i.e. to keep
the offsets up to date, using the following function :

  * 'flt_update_offsets()' : This function must be called when a filter alters
    incoming data. It updates offsets of the stream and of all filters
    preceding the calling one. Not calling this function when a filter changes
    the size of incoming data leads to undefined behavior.

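The bookkeeping described above can be pictured with the following standalone
model. It does not reproduce HAProxy's implementation: 'struct off_filter',
'struct off_stream' and 'model_update_offsets()' are hypothetical, and the
filters are simply assumed to be stored in their evaluation order.

```c
#include <stddef.h>

/* Hypothetical per-filter offset, for one channel */
struct off_filter {
    long offset;
};

/* Hypothetical stream-level state: its own offset and the filters chain */
struct off_stream {
    long offset;
    struct off_filter *filters;
    size_t nb;
};

/* Reports that the filter at index 'caller' changed the data size by 'delta'
 * bytes (negative if data were removed). The offsets of the stream and of
 * all filters placed before the caller are shifted by the same amount, the
 * caller and the filters after it are left untouched. */
static void model_update_offsets(struct off_stream *s, size_t caller,
                                 long delta)
{
    size_t i;

    for (i = 0; i < caller && i < s->nb; i++)
        s->filters[i].offset += delta;
    s->offset += delta;
}
```

In this model, forgetting the call leaves the preceding filters with offsets
that no longer match the rewritten data, which illustrates why the doc warns
against skipping it.
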
A good example of a filter changing the data size is the HTTP compression
filter.