| 1 | ----------------------------------------- |
| 2 | Filters Guide - version 1.9 |
| 3 | ( Last update: 2017-07-27 ) |
| 4 | ------------------------------------------ |
| 5 | Author : Christopher Faulet |
| 6 | Contact : christopher dot faulet at capflam dot org |
| 7 | |
| 8 | |
| 9 | ABSTRACT |
| 10 | -------- |
| 11 | |
| 12 | Filter support is a new feature of HAProxy 1.7. It is a way to extend |
| 13 | HAProxy without touching its core code and, to a certain extent, without knowing |
| 14 | its internals. This feature will ease contributions by reducing the impact of |
| 15 | changes. Another advantage will be to simplify HAProxy by replacing some parts |
| 16 | with filters. As we will see, HTTP compression is the first feature moved into |
| 17 | a filter. |
| 18 | |
| 19 | This document describes how to write a filter and what you have to keep in mind |
| 20 | to do so. It also talks about the known limits and the pitfalls to avoid. |
| 21 | |
| 22 | As said, filters are still quite new. The API is not frozen and will be |
| 23 | updated/modified/improved/extended as needed. |
| 24 | |
| 25 | |
| 26 | |
| 27 | SUMMARY |
| 28 | ------- |
| 29 | |
| 30 | 1. Filters introduction |
| 31 | 2. How to use filters |
| 32 | 3. How to write a new filter |
| 33 | 3.1. API Overview |
| 34 | 3.2. Defining the filter name and its configuration |
| 35 | 3.3. Managing the filter lifecycle |
| 36 | 3.3.1. Dealing with threads |
| 37 | 3.4. Handling the streams activity |
| 38 | 3.5. Analyzing the channels activity |
| 39 | 3.6. Filtering the data exchanged |
| 40 | 4. FAQ |
| 41 | |
| 42 | |
| 43 | |
| 44 | 1. FILTERS INTRODUCTION |
| 45 | ----------------------- |
| 46 | |
| 47 | First of all, to fully understand how filters work and how to create one, it is |
| 48 | best to know, at least from a distance, what a proxy (frontend/backend), a |
| 49 | stream and a channel are in HAProxy and how these entities are linked to each |
| 50 | other. doc/internals/entities.pdf is a good overview. |
| 51 | |
| 52 | Then, to support filters, many callbacks have been added to HAProxy at different |
| 53 | places, mainly around channel analyzers. Their purpose is to allow filters to |
| 54 | be involved in the data processing, from the stream creation/destruction to |
| 55 | the data forwarding. Depending on what it should do, a filter can implement all |
| 56 | or part of these callbacks. For now, existing callbacks are focused on |
| 57 | streams. But future improvements could enlarge the scope of filters. For example, |
| 58 | it could be useful to handle events at the connection level. |
| 59 | |
| 60 | In the HAProxy configuration file, a filter is declared in a proxy section, except |
| 61 | 'defaults'. So the configuration corresponding to a filter declaration is attached |
| 62 | to a specific proxy, and will be shared by all its instances. It is opaque from |
| 63 | the HAProxy point of view; it is the filter's responsibility to manage it. Each |
| 64 | filter declaration matches a unique configuration. Several declarations of |
| 65 | the same filter in the same proxy will be handled as different filters by |
| 66 | HAProxy. |
| 67 | |
| 68 | A filter instance is represented by a partially opaque context (or a state) |
| 69 | attached to a stream and passed as arguments to callbacks. Through this context, |
| 70 | filter instances are stateful. Depending on whether the filter is declared in a |
| 71 | frontend or a backend section, its instances will be created, respectively, when |
| 72 | a stream is created or when a backend is selected. Their behaviors will also be |
| 73 | different. Only instances of filters declared in a frontend section will be |
| 74 | aware of the creation and the destruction of the stream, and will take part in |
| 75 | the channels analyzing before the backend is defined. |
| 76 | |
| 77 | It is important to remember that the configuration of a filter is shared by all |
| 78 | its instances, while the context of an instance is owned by a unique stream. |
| 79 | |
| 80 | Filters are designed to be chained. It is possible to declare several filters in |
| 81 | the same proxy section. The declaration order is important because filters will |
| 82 | be called one after the other respecting this order. Frontend and backend |
| 83 | filters are also chained, frontend ones called first. Even if the filters |
| 84 | processing is serialized, each filter will behave as if it were alone (unless it |
| 85 | was developed to be aware of other filters). For all that, some constraints are |
| 86 | imposed on filters, especially when data exchanged between the client and the |
| 87 | server are processed. We will discuss these constraints again when we tackle |
| 88 | the subject of writing a filter. |
| 89 | |
| 90 | |
| 91 | |
| 92 | 2. HOW TO USE FILTERS |
| 93 | --------------------- |
| 94 | |
| 95 | To use a filter, you must use the parameter 'filter' followed by the filter name |
| 96 | and, optionally, its configuration in the desired listen, frontend or backend |
| 97 | section. For example: |
| 98 | |
| 99 | listen test |
| 100 | ... |
| 101 | filter trace name TST |
| 102 | ... |
| 103 | |
| 104 | |
| 105 | See doc/configuration.txt for a formal definition of the parameter 'filter'. |
| 106 | Note that additional parameters on the filter line must be parsed by the filter |
| 107 | itself. |
| 108 | |
| 109 | The list of available filters is reported by 'haproxy -vv': |
| 110 | |
| 111 | $> haproxy -vv |
| 112 | HA-Proxy version 1.7-dev2-3a1d4a-33 2016/03/21 |
| 113 | Copyright 2000-2016 Willy Tarreau <willy@haproxy.org> |
| 114 | |
| 115 | [...] |
| 116 | |
| 117 | Available filters : |
| 118 | [COMP] compression |
| 119 | [TRACE] trace |
| 120 | |
| 121 | |
| 122 | Multiple filter lines can be used in a proxy section to chain filters. Filters |
| 123 | will be called in the declaration order. |
| 124 | |
| 125 | Some filters can support implicit declarations in certain circumstances |
| 126 | (without the filter line). This is not recommended for new features but is |
| 127 | useful for existing ones moved into a filter, for backward compatibility |
| 128 | reasons. Implicit declarations are supported when there is only one filter used |
| 129 | on a proxy. When several filters are used, explicit declarations are mandatory. |
| 130 | The HTTP compression filter is one of these filters. Alone, using 'compression' |
| 131 | keywords is enough to use it. But when at least a second filter is used, a |
| 132 | filter line must be added. |
| 133 | |
| 134 | # filter line is optional |
| 135 | listen t1 |
| 136 | bind *:80 |
| 137 | compression algo gzip |
| 138 | compression offload |
| 139 | server srv x.x.x.x:80 |
| 140 | |
| 141 | # filter line is mandatory for the compression filter |
| 142 | listen t2 |
| 143 | bind *:81 |
| 144 | filter trace name T2 |
| 145 | filter compression |
| 146 | compression algo gzip |
| 147 | compression offload |
| 148 | server srv x.x.x.x:80 |
| 149 | |
| 150 | |
| 151 | |
| 152 | |
| 153 | 3. HOW TO WRITE A NEW FILTER |
| 154 | ---------------------------- |
| 155 | |
| 156 | If you want to write a filter, there are 2 header files that you must know: |
| 157 | |
| 158 | * include/types/filters.h: This is the main header file, containing all |
| 159 | important structures you will use. It represents |
| 160 | the filter API. |
| 161 | * include/proto/filters.h: This header file contains helper functions that |
| 162 | you may need to use. It also contains the internal |
| 163 | API used by HAProxy to handle filters. |
| 164 | |
| 165 | To ease the filters integration, it is better to follow some conventions: |
| 166 | |
| 167 | * Use the 'flt_' prefix to name your filter (e.g. flt_http_comp or flt_trace). |
| 168 | * Keep everything related to your filter in the same file. |
| 169 | |
| 170 | The filter 'trace' can be used as a template to write your own filter. It is a |
| 171 | good start to see how filters really work. |
| 172 | |
| 173 | 3.1. API OVERVIEW |
| 174 | ----------------- |
| 175 | |
| 176 | Writing a filter comes down to writing functions and attaching them to the |
| 177 | existing callbacks. Available callbacks are listed in the following structure: |
| 178 | |
| 179 | struct flt_ops { |
| 180 | /* |
| 181 | * Callbacks to manage the filter lifecycle |
| 182 | */ |
| 183 | int (*init) (struct proxy *p, struct flt_conf *fconf); |
| 184 | void (*deinit) (struct proxy *p, struct flt_conf *fconf); |
| 185 | int (*check) (struct proxy *p, struct flt_conf *fconf); |
| 186 | int (*init_per_thread) (struct proxy *p, struct flt_conf *fconf); |
| 187 | void (*deinit_per_thread)(struct proxy *p, struct flt_conf *fconf); |
| 188 | |
| 189 | /* |
| 190 | * Stream callbacks |
| 191 | */ |
| 192 | int (*attach) (struct stream *s, struct filter *f); |
| 193 | int (*stream_start) (struct stream *s, struct filter *f); |
| 194 | int (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be); |
| 195 | void (*stream_stop) (struct stream *s, struct filter *f); |
| 196 | void (*detach) (struct stream *s, struct filter *f); |
| 197 | void (*check_timeouts) (struct stream *s, struct filter *f); |
| 198 | |
| 199 | /* |
| 200 | * Channel callbacks |
| 201 | */ |
| 202 | int (*channel_start_analyze)(struct stream *s, struct filter *f, |
| 203 | struct channel *chn); |
| 204 | int (*channel_pre_analyze) (struct stream *s, struct filter *f, |
| 205 | struct channel *chn, |
| 206 | unsigned int an_bit); |
| 207 | int (*channel_post_analyze) (struct stream *s, struct filter *f, |
| 208 | struct channel *chn, |
| 209 | unsigned int an_bit); |
| 210 | int (*channel_end_analyze) (struct stream *s, struct filter *f, |
| 211 | struct channel *chn); |
| 212 | |
| 213 | /* |
| 214 | * HTTP callbacks |
| 215 | */ |
| 216 | int (*http_headers) (struct stream *s, struct filter *f, |
| 217 | struct http_msg *msg); |
| 218 | int (*http_data) (struct stream *s, struct filter *f, |
| 219 | struct http_msg *msg); |
| 220 | int (*http_chunk_trailers)(struct stream *s, struct filter *f, |
| 221 | struct http_msg *msg); |
| 222 | int (*http_end) (struct stream *s, struct filter *f, |
| 223 | struct http_msg *msg); |
| 224 | int (*http_forward_data) (struct stream *s, struct filter *f, |
| 225 | struct http_msg *msg, |
| 226 | unsigned int len); |
| 227 | |
| 228 | void (*http_reset) (struct stream *s, struct filter *f, |
| 229 | struct http_msg *msg); |
| 230 | void (*http_reply) (struct stream *s, struct filter *f, |
| 231 | short status, |
| 232 | const struct buffer *msg); |
| 233 | |
| 234 | /* |
| 235 | * TCP callbacks |
| 236 | */ |
| 237 | int (*tcp_data) (struct stream *s, struct filter *f, |
| 238 | struct channel *chn); |
| 239 | int (*tcp_forward_data)(struct stream *s, struct filter *f, |
| 240 | struct channel *chn, |
| 241 | unsigned int len); |
| 242 | }; |
| 243 | |
| 244 | |
| 245 | We will explain in the following parts when these callbacks are called and what |
| 246 | they should do. |
| 247 | |
| 248 | Filters are declared in proxy sections. So each proxy has an ordered list of |
| 249 | filters, possibly empty if no filter is used. When the configuration of a proxy |
| 250 | is parsed, each filter line represents an entry in this list. In the structure |
| 251 | 'proxy', the filter configurations are stored in the field 'filter_configs', |
| 252 | each one of type 'struct flt_conf *': |
| 253 | |
| 254 | /* |
| 255 | * Structure representing the filter configuration, attached to a proxy and |
| 256 | * accessible from a filter when instantiated in a stream |
| 257 | */ |
| 258 | struct flt_conf { |
| 259 | const char *id; /* The filter id */ |
| 260 | struct flt_ops *ops; /* The filter callbacks */ |
| 261 | void *conf; /* The filter configuration */ |
| 262 | struct list list; /* Next filter for the same proxy */ |
| 263 | }; |
| 264 | |
| 265 | * 'flt_conf.id' is an identifier, defined by the filter. It can be |
| 266 | NULL. HAProxy does not use this field. Filters can use it in log messages or |
| 267 | as a unique identifier to check multiple declarations. It is the filter's |
| 268 | responsibility to free it, if necessary. |
| 269 | |
| 270 | * 'flt_conf.conf' is opaque. It is the internal configuration of a filter, |
| 271 | generally allocated and filled by its parsing function (See § 3.2). It is |
| 272 | the filter responsibility to free it. |
| 273 | |
| 274 | * 'flt_conf.ops' references the callbacks implemented by the filter. This |
| 275 | field must be set during the parsing phase (See § 3.2) and can be refined |
| 276 | during the initialization phase (See § 3.3). If it is dynamically allocated, |
| 277 | it is the filter's responsibility to free it. |
| 278 | |
| 279 | |
| 280 | The filter configuration is global and shared by all its instances. A filter |
| 281 | instance is created in the context of a stream and attached to this stream. In |
| 282 | the structure 'stream', the field 'strm_flt' is the state of all filter |
| 283 | instances attached to a stream: |
| 284 | |
| 285 | /* |
| 286 | * Structure representing the "global" state of filters attached to a |
| 287 | * stream. |
| 288 | */ |
| 289 | struct strm_flt { |
| 290 | struct list filters; /* List of filters attached to a stream */ |
| 291 | struct filter *current[2]; /* From which filter resume processing, for a specific channel. |
| 292 | * This is used for resumable callbacks only, |
| 293 | * If NULL, we start from the first filter. |
| 294 | * 0: request channel, 1: response channel */ |
| 295 | unsigned short flags; /* STRM_FL_* */ |
| 296 | unsigned char nb_req_data_filters; /* Number of data filters registered on the request channel */ |
| 297 | unsigned char nb_rsp_data_filters; /* Number of data filters registered on the response channel */ |
| 298 | }; |
| 299 | |
| 300 | |
| 301 | Filter instances attached to a stream are stored in the field |
| 302 | 'strm_flt.filters', each instance is of type 'struct filter *': |
| 303 | |
| 304 | /* |
| 305 | * Structure representing a filter instance attached to a stream |
| 306 | * |
| 307 | * Two-element array fields are used to store info per channel. The first |
| 308 | * index stands for the request channel, and the second one for the response |
| 309 | * channel. In particular, <next> and <fwd> are offsets representing the |
| 310 | * amount of data that the filter has, respectively, parsed and forwarded on |
| 311 | * a channel. Filters can access these values using the FLT_NXT and FLT_FWD |
| 312 | * macros. |
| 313 | */ |
| 314 | struct filter { |
| 315 | struct flt_conf *config; /* the filter's configuration */ |
| 316 | void *ctx; /* The filter context (opaque) */ |
| 317 | unsigned short flags; /* FLT_FL_* */ |
| 318 | unsigned int next[2]; /* Offset, relative to buf->p, to the next |
| 319 | * byte to parse for a specific channel |
| 320 | * 0: request channel, 1: response channel */ |
| 321 | unsigned int fwd[2]; /* Offset, relative to buf->p, to the next |
| 322 | * byte to forward for a specific channel |
| 323 | * 0: request channel, 1: response channel */ |
| 324 | unsigned int pre_analyzers; /* bit field indicating analyzers to |
| 325 | * pre-process */ |
| 326 | unsigned int post_analyzers; /* bit field indicating analyzers to |
| 327 | * post-process */ |
| 328 | struct list list; /* Next filter for the same proxy/stream */ |
| 329 | }; |
| 330 | |
| 331 | * 'filter.config' is the filter configuration previously described. All |
| 332 | instances of a filter share it. |
| 333 | |
| 334 | * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its |
| 335 | responsibility to free it. |
| 336 | |
| 337 | * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later |
| 338 | (See § 3.5). |
| 339 | |
| 340 | * 'filter.next' and 'filter.fwd' will be described later (See § 3.6). |
| 341 | |
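The split between the shared configuration and the per-stream context can be
sketched as follows. This is a minimal, self-contained illustration and not
HAProxy code: the structures are reduced stand-ins for the ones quoted above,
FLT_CONF is redefined locally on the assumption that it expands to
'(flt)->config->conf', and 'my_filter_config', 'my_filter_ctx' and
'my_filter_touch' are hypothetical names.

```c
#include <stdio.h>

/* Simplified stand-ins for the HAProxy structures quoted above. Only the
 * fields used in this sketch are kept. */
struct flt_conf {
    const char *id;    /* The filter id */
    void       *conf;  /* The filter configuration (opaque) */
};

struct filter {
    struct flt_conf *config; /* Shared by all instances of the filter */
    void            *ctx;    /* Owned by one instance, i.e. one stream */
};

/* Local redefinition, assumed to match the helper used in this guide */
#define FLT_CONF(flt) ((flt)->config->conf)

/* Hypothetical filter-specific types */
struct my_filter_config { const char *name; };
struct my_filter_ctx    { unsigned int nb_calls; };

/* Hypothetical helper: count one callback invocation for this instance
 * and return the updated per-stream counter. */
static unsigned int my_filter_touch(struct filter *f)
{
    struct my_filter_config *my_conf = FLT_CONF(f); /* shared by instances */
    struct my_filter_ctx    *my_ctx  = f->ctx;      /* private to a stream */

    my_ctx->nb_calls++;
    printf("[%s] call #%u\n", my_conf->name, my_ctx->nb_calls);
    return my_ctx->nb_calls;
}
```

Two instances pointing to the same 'flt_conf' see the same name but keep
separate counters, which is the property to keep in mind when deciding where
a piece of state should live.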
| 342 | |
| 343 | 3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION |
| 344 | --------------------------------------------------- |
| 345 | |
| 346 | When you write a filter, the first thing to do is to add it to the supported |
| 347 | filters. To do so, you must register its name as a valid keyword on the filter |
| 348 | line: |
| 349 | |
| 350 | /* Declare the filter parser for "my_filter" keyword */ |
| 351 | static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, { |
| 352 | { "my_filter", parse_my_filter_cfg, NULL /* private data */ }, |
| 353 | { NULL, NULL, NULL }, |
| 354 | } |
| 355 | }; |
| 356 | INITCALL1(STG_REGISTER, flt_register_keywords, &flt_kws); |
| 357 | |
| 358 | |
| 359 | Then you must define the internal configuration your filter will use. For |
| 360 | example: |
| 361 | |
| 362 | struct my_filter_config { |
| 363 | struct proxy *proxy; |
| 364 | char *name; |
| 365 | /* ... */ |
| 366 | }; |
| 367 | |
| 368 | |
| 369 | You also must list all callbacks implemented by your filter. Here, we use a |
| 370 | global variable: |
| 371 | |
| 372 | struct flt_ops my_filter_ops = { |
| 373 | .init = my_filter_init, |
| 374 | .deinit = my_filter_deinit, |
| 375 | .check = my_filter_config_check, |
| 376 | |
| 377 | /* ... */ |
| 378 | }; |
| 379 | |
| 380 | |
| 381 | Finally, you must define the function to parse your filter configuration, here |
| 382 | 'parse_my_filter_cfg'. This function must parse all remaining keywords on the |
| 383 | filter line: |
| 384 | |
| 385 | /* Return -1 on error, else 0 */ |
| 386 | static int |
| 387 | parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px, |
| 388 | struct flt_conf *flt_conf, char **err, void *private) |
| 389 | { |
| 390 | struct my_filter_config *my_conf; |
| 391 | int pos = *cur_arg; |
| 392 | |
| 393 | /* Allocate the internal configuration used by the filter */ |
| 394 | my_conf = calloc(1, sizeof(*my_conf)); |
| 395 | if (!my_conf) { |
| 396 | memprintf(err, "%s: out of memory", args[*cur_arg]); |
| 397 | return -1; |
| 398 | } |
| 399 | my_conf->proxy = px; |
| 400 | |
| 401 | /* ... */ |
| 402 | |
| 403 | /* Parse all keywords supported by the filter and fill the internal |
| 404 | * configuration */ |
| 405 | pos++; /* Skip the filter name */ |
| 406 | while (*args[pos]) { |
| 407 | if (!strcmp(args[pos], "name")) { |
| 408 | if (!*args[pos + 1]) { |
| 409 | memprintf(err, "'%s' : '%s' option without value", |
| 410 | args[*cur_arg], args[pos]); |
| 411 | goto error; |
| 412 | } |
| 413 | my_conf->name = strdup(args[pos + 1]); |
| 414 | if (!my_conf->name) { |
| 415 | memprintf(err, "%s: out of memory", args[*cur_arg]); |
| 416 | goto error; |
| 417 | } |
| 418 | pos += 2; |
| 419 | } |
| 420 | |
| 421 | /* ... parse other keywords ... */ |
| 422 | } |
| 423 | *cur_arg = pos; |
| 424 | |
| 425 | /* Set callbacks supported by the filter */ |
| 426 | flt_conf->ops = &my_filter_ops; |
| 427 | |
| 428 | /* Last, save the internal configuration */ |
| 429 | flt_conf->conf = my_conf; |
| 430 | return 0; |
| 431 | |
| 432 | error: |
| 433 | if (my_conf->name) |
| 434 | free(my_conf->name); |
| 435 | free(my_conf); |
| 436 | return -1; |
| 437 | } |
| 438 | |
| 439 | |
| 440 | WARNING: In your parsing function, you must define 'flt_conf->ops'. You must |
| 441 | also parse all arguments on the filter line. This is mandatory. |
| 442 | |
| 443 | In the previous example, we expect to read a filter line as follows: |
| 444 | |
| 445 | filter my_filter name MY_NAME ... |
| 446 | |
| 447 | |
| 448 | Optionally, by implementing the 'flt_ops.check' callback, you add a step to |
| 449 | check the internal configuration of your filter after the parsing phase, when |
| 450 | the HAProxy configuration is fully defined. For example: |
| 451 | |
| 452 | /* Check configuration of the filter for a specified proxy. |
| 453 | * Return 1 on error, else 0. */ |
| 454 | static int |
| 455 | my_filter_config_check(struct proxy *px, struct flt_conf *my_conf) |
| 456 | { |
| 457 | if (px->mode != PR_MODE_HTTP) { |
| 458 | ha_alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n"); |
| 459 | return 1; |
| 460 | } |
| 461 | |
| 462 | /* ... */ |
| 463 | |
| 464 | return 0; |
| 465 | } |
| 466 | |
| 467 | |
| 468 | |
| 469 | 3.3. MANAGING THE FILTER LIFECYCLE |
| 470 | ---------------------------------- |
| 471 | |
| 472 | Once the configuration is parsed and checked, filters are ready to be used. There |
| 473 | are two main callbacks to manage the filter lifecycle: |
| 474 | |
| 475 | * 'flt_ops.init': It initializes the filter for a proxy. You may define this |
| 476 | callback if you need to complete your filter configuration. |
| 477 | |
| 478 | * 'flt_ops.deinit': It cleans up what the parsing function and the init |
| 479 | callback have done. This callback is useful to release |
| 480 | memory allocated for the filter configuration. |
| 481 | |
| 482 | Here is an example: |
| 483 | |
| 484 | /* Initialize the filter. Returns -1 on error, else 0. */ |
| 485 | static int |
| 486 | my_filter_init(struct proxy *px, struct flt_conf *fconf) |
| 487 | { |
| 488 | struct my_filter_config *my_conf = fconf->conf; |
| 489 | |
| 490 | /* ... */ |
| 491 | |
| 492 | return 0; |
| 493 | } |
| 494 | |
| 495 | /* Free resources allocated by the filter. */ |
| 496 | static void |
| 497 | my_filter_deinit(struct proxy *px, struct flt_conf *fconf) |
| 498 | { |
| 499 | struct my_filter_config *my_conf = fconf->conf; |
| 500 | |
| 501 | if (my_conf) { |
| 502 | free(my_conf->name); |
| 503 | /* ... */ |
| 504 | free(my_conf); |
| 505 | } |
| 506 | fconf->conf = NULL; |
| 507 | } |
| 508 | |
| 509 | |
| 510 | TODO: Add callbacks to handle creation/destruction of filter instances. And |
| 511 | document it. |
| 512 | |
| 513 | |
| 514 | 3.3.1. DEALING WITH THREADS |
| 515 | --------------------------- |
| 516 | |
| 517 | When HAProxy is compiled with the threads support and started with more than one |
| 518 | thread (global.nbthread > 1), then it is possible to manage the filter per |
| 519 | thread with the following callbacks: |
| 520 | |
| 521 | * 'flt_ops.init_per_thread': It initializes the filter for each thread. It |
| 522 | works the same way as 'flt_ops.init' but in the |
| 523 | context of a thread. This callback is called |
| 524 | after the thread creation. |
| 525 | |
| 526 | * 'flt_ops.deinit_per_thread': It cleans up what the init_per_thread callback |
| 527 | has done. It is called in the context of a |
| 528 | thread, before exiting it. |
| 529 | |
| 530 | It is the filter's responsibility to deal with concurrency. The check, init and |
| 531 | deinit callbacks are called on the main thread. All others are called on a |
| 532 | "worker" thread (not always the same). It is also the filter's responsibility |
| 533 | to know if HAProxy is started with more than one thread. If it is started with |
| 534 | one thread (or compiled without the threads support), the per-thread callbacks |
| 535 | will be silently ignored (in this case, global.nbthread will always be equal to one). |
| 536 | |
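As an illustration of the per-thread callbacks, here is a minimal sketch that
gives each thread its own scratch buffer. It is self-contained and does not
use HAProxy headers: 'struct flt_conf' is a reduced stand-in,
MY_FILTER_SCRATCH_SIZE and 'my_filter_scratch' are hypothetical, and the C11
'_Thread_local' keyword is used here in place of whatever thread-local helper
HAProxy itself provides. The return convention (-1 on error, else 0) is
assumed to follow 'flt_ops.init'.

```c
#include <stdlib.h>

struct proxy;                        /* opaque in this sketch */
struct flt_conf { void *conf; };     /* simplified stand-in */

#define MY_FILTER_SCRATCH_SIZE 1024  /* hypothetical buffer size */

/* One scratch buffer per thread, so callbacks running on different
 * threads never share it and need no lock to use it. */
static _Thread_local char *my_filter_scratch;

/* Called in the context of each thread, after its creation.
 * Returns -1 on error, else 0 (assumed to mirror 'flt_ops.init'). */
static int my_filter_init_per_thread(struct proxy *px, struct flt_conf *fconf)
{
    (void)px;
    (void)fconf;
    my_filter_scratch = malloc(MY_FILTER_SCRATCH_SIZE);
    return my_filter_scratch ? 0 : -1;
}

/* Called in the context of each thread, before it exits. */
static void my_filter_deinit_per_thread(struct proxy *px, struct flt_conf *fconf)
{
    (void)px;
    (void)fconf;
    free(my_filter_scratch);
    my_filter_scratch = NULL;
}
```

Anything stored in the shared configuration, on the other hand, can be reached
from several threads at once and still has to be protected by the filter.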
| 537 | |
| 538 | 3.4. HANDLING THE STREAMS ACTIVITY |
| 539 | ---------------------------------- |
| 540 | |
| 541 | You may be interested in handling streams activity. For now, there are three |
| 542 | callbacks that you should define to do so: |
| 543 | |
| 544 | * 'flt_ops.stream_start': It is called when a stream is started. This callback |
| 545 | can fail by returning a negative value. It will be |
| 546 | considered as a critical error by HAProxy which |
| 547 | disables the listener for a short time. |
| 548 | |
| 549 | * 'flt_ops.stream_set_backend': It is called when a backend is set for a |
| 550 | stream. This callback will be called for all |
| 551 | filters attached to a stream (frontend and |
| 552 | backend). Note this callback is not called if |
| 553 | the frontend and the backend are the same. |
| 554 | |
| 555 | * 'flt_ops.stream_stop': It is called when a stream is stopped. This callback |
| 556 | always succeeds. Anyway, it is too late to return an |
| 557 | error. |
| 558 | |
| 559 | For example: |
| 560 | |
| 561 | /* Called when a stream is created. Returns -1 on error, else 0. */ |
| 562 | static int |
| 563 | my_filter_stream_start(struct stream *s, struct filter *filter) |
| 564 | { |
| 565 | struct my_filter_config *my_conf = FLT_CONF(filter); |
| 566 | |
| 567 | /* ... */ |
| 568 | |
| 569 | return 0; |
| 570 | } |
| 571 | |
| 572 | /* Called when a backend is set for a stream */ |
| 573 | static int |
| 574 | my_filter_stream_set_backend(struct stream *s, struct filter *filter, |
| 575 | struct proxy *be) |
| 576 | { |
| 577 | struct my_filter_config *my_conf = FLT_CONF(filter); |
| 578 | |
| 579 | /* ... */ |
| 580 | |
| 581 | return 0; |
| 582 | } |
| 583 | |
| 584 | /* Called when a stream is destroyed */ |
| 585 | static void |
| 586 | my_filter_stream_stop(struct stream *s, struct filter *filter) |
| 587 | { |
| 588 | struct my_filter_config *my_conf = FLT_CONF(filter); |
| 589 | |
| 590 | /* ... */ |
| 591 | } |
| 592 | |
| 593 | |
| 594 | WARNING: Handling the streams creation and destruction is only possible for |
| 595 | filters defined on proxies with the frontend capability. |
| 596 | |
| 597 | In addition, it is possible to handle creation and destruction of filter |
| 598 | instances using the following callbacks: |
| 599 | |
| 600 | * 'flt_ops.attach': It is called after a filter instance creation, when it is |
| 601 | attached to a stream. This happens when the stream is |
| 602 | started for filters defined on the stream's frontend and |
| 603 | when the backend is set for filters declared on the |
| 604 | stream's backend. It is possible to ignore the filter, if |
| 605 | needed, by returning 0. This could be useful to have |
| 606 | conditional filtering. |
| 607 | |
| 608 | * 'flt_ops.detach': It is called when a filter instance is detached from a |
| 609 | stream, before its destruction. This happens when the |
| 610 | stream is stopped for filters defined on the stream's |
| 611 | frontend and when the analysis ends for filters defined on |
| 612 | the stream's backend. |
| 613 | |
| 614 | For example: |
| 615 | |
| 616 | /* Called when a filter instance is created and attached to a stream */ |
| 617 | static int |
| 618 | my_filter_attach(struct stream *s, struct filter *filter) |
| 619 | { |
| 620 | struct my_filter_config *my_conf = FLT_CONF(filter); |
| 621 | |
| 622 | if (/* ... */) |
| 623 | return 0; /* Ignore the filter here */ |
| 624 | return 1; |
| 625 | } |
| 626 | |
| 627 | /* Called when a filter instance is detached from a stream, just before its |
| 628 | * destruction */ |
| 629 | static void |
| 630 | my_filter_detach(struct stream *s, struct filter *filter) |
| 631 | { |
| 632 | struct my_filter_config *my_conf = FLT_CONF(filter); |
| 633 | |
| 634 | /* ... */ |
| 635 | } |
| 636 | |
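The attach/detach pair is also a natural place to allocate and release the
per-stream context. The following self-contained sketch shows one way to do
it, under stated assumptions: the structures are reduced stand-ins, the
'is_http' flag on 'struct stream' and the 'my_filter_ctx' type are
hypothetical, and treating a negative return value as an error is an
assumption (the text above only documents 0 as "ignore the filter").

```c
#include <stdlib.h>

/* Simplified stand-ins. The 'is_http' flag is hypothetical and only
 * serves as a condition for this sketch. */
struct stream   { int is_http; };
struct flt_conf { void *conf; };
struct filter   { struct flt_conf *config; void *ctx; };

/* Hypothetical per-stream state */
struct my_filter_ctx { unsigned long bytes_seen; };

/* Called when the instance is attached to a stream. Returns 0 to ignore
 * the filter for this stream, 1 to keep it, and (by assumption) -1 on
 * error. */
static int my_filter_attach(struct stream *s, struct filter *filter)
{
    struct my_filter_ctx *ctx;

    if (!s->is_http)
        return 0; /* Conditional filtering: skip this stream */

    ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return -1;
    filter->ctx = ctx;
    return 1;
}

/* Called when the instance is detached, just before its destruction.
 * The context is owned by the filter, so it is released here. */
static void my_filter_detach(struct stream *s, struct filter *filter)
{
    (void)s;
    free(filter->ctx);
    filter->ctx = NULL;
}
```

Allocating in attach rather than in 'flt_ops.stream_start' has the advantage
of working for filters declared on a backend as well, since stream creation
is only visible to frontend filters.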
| 637 | Finally, you may be interested in being notified when the stream is woken up |
| 638 | because of an expired timer. This gives you a chance to check your own |
| 639 | timeouts, if any. To do so you can use the following callback: |
| 640 | |
| 641 | * 'flt_ops.check_timeouts': It is called when a stream is woken up because |
| 642 | of an expired timer. |
| 643 | |
| 644 | For example: |
| 645 | |
| 646 | /* Called when a stream is woken up because of an expired timer */ |
| 647 | static void |
| 648 | my_filter_check_timeouts(struct stream *s, struct filter *filter) |
| 649 | { |
| 650 | struct my_filter_config *my_conf = FLT_CONF(filter); |
| 651 | |
| 652 | /* ... */ |
| 653 | } |
| 654 | |
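What "checking your own timeouts" could look like is sketched below. This is
only an illustration with hypothetical names: 'my_now_ms' stands in for the
current time in milliseconds (HAProxy has its own clock and tick helpers,
which are not used here), and 'my_filter_ctx' is an invented per-stream
context holding one deadline.

```c
#include <stddef.h>

struct stream;                 /* opaque in this sketch */
struct filter { void *ctx; };  /* simplified stand-in */

/* Hypothetical per-stream state holding a single deadline */
struct my_filter_ctx {
    unsigned int deadline_ms;  /* 0 means no timer is armed */
    int          timed_out;    /* set once the deadline has passed */
};

/* Hypothetical clock, in milliseconds */
static unsigned int my_now_ms;

/* Called when a stream is woken up because of an expired timer */
static void my_filter_check_timeouts(struct stream *s, struct filter *filter)
{
    struct my_filter_ctx *ctx = filter->ctx;

    (void)s;
    /* The signed difference keeps the comparison meaningful when the
     * millisecond counter wraps around. */
    if (ctx->deadline_ms && (int)(my_now_ms - ctx->deadline_ms) >= 0) {
        ctx->timed_out = 1;
        ctx->deadline_ms = 0;  /* disarm the timer */
    }
}
```

The callback returns nothing, so it can only record the expiration. The filter
then has to act on it from another callback.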
| 655 | |
| 656 | 3.5. ANALYZING THE CHANNELS ACTIVITY |
| 657 | ------------------------------------ |
| 658 | |
| 659 | The main purpose of filters is to take part in the channels analyzing. To do so, |
| 660 | there are 2 callbacks, 'flt_ops.channel_pre_analyze' and |
| 661 | 'flt_ops.channel_post_analyze', called respectively before and after each |
| 662 | analyzer attached to a channel, except analyzers responsible for the data |
| 663 | parsing/forwarding (TCP or HTTP data). Concretely, on the request channel, these |
| 664 | callbacks could be called before the following analyzers: |
| 665 | |
| 666 | * tcp_inspect_request (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE) |
| 667 | * http_wait_for_request (AN_REQ_WAIT_HTTP) |
| 668 | * http_wait_for_request_body (AN_REQ_HTTP_BODY) |
| 669 | * http_process_req_common (AN_REQ_HTTP_PROCESS_FE) |
| 670 | * process_switching_rules (AN_REQ_SWITCHING_RULES) |
| 671 | * http_process_req_common (AN_REQ_HTTP_PROCESS_BE) |
| 672 | * http_process_tarpit (AN_REQ_HTTP_TARPIT) |
| 673 | * process_server_rules (AN_REQ_SRV_RULES) |
| 674 | * http_process_request (AN_REQ_HTTP_INNER) |
| 675 | * tcp_persist_rdp_cookie (AN_REQ_PRST_RDP_COOKIE) |
| 676 | * process_sticking_rules (AN_REQ_STICKING_RULES) |
Christopher Faulet | c3fe533 | 2016-04-07 15:30:10 +0200 | [diff] [blame] | 677 | |
| 678 | And on the response channel: |
| 679 | |
| 680 | * tcp_inspect_response (AN_RES_INSPECT) |
| 681 | * http_wait_for_response (AN_RES_WAIT_HTTP) |
| 682 | * process_store_rules (AN_RES_STORE_RULES) |
| 683 | * http_process_res_common (AN_RES_HTTP_PROCESS_BE) |
Christopher Faulet | c3fe533 | 2016-04-07 15:30:10 +0200 | [diff] [blame] | 684 | |
Unlike the callbacks seen so far, 'flt_ops.channel_pre_analyze' can interrupt
the stream processing. So a filter can decide not to execute the analyzer that
follows and to wait for the next iteration. If there is more than one filter,
the following ones are skipped. On the next iteration, the filtering resumes
where it was stopped, i.e. on the filter that previously stopped the
processing. So it is possible for a filter to stop the stream processing on a
specific analyzer for a while before continuing. Moreover, this callback can be
called many times for the same analyzer, until it finishes its processing. For
example:

    /* Called before a processing happens on a given channel.
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* wait that a condition is verified before continuing */)
                    return 0;
                break;
            /* ... */
        }
        return 1;
    }

  * 'an_bit' is the analyzer id. All analyzers are listed in
    'include/types/channel.h'.

  * 'chn' is the channel on which the analyzing is done. You can know if it is
    the request or the response channel by testing if CF_ISRESP flag is set:

        ((chn->flags & CF_ISRESP) == CF_ISRESP)


In the previous example, the stream processing is blocked before receipt of the
HTTP request until a condition is verified.
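
The resume behaviour described above can be modelled outside of HAProxy. The
following stand-alone sketch is only an illustration under stated assumptions:
'struct chain', 'chain_pre_analyze' and the two sample callbacks are
hypothetical names, and the real dispatching code in HAProxy is not shown
here. It only demonstrates the rule that a filter returning 0 stops the chain
and that the next iteration restarts on that same filter.

```c
#include <stddef.h>

#define NB_FILTERS 3

/* Hypothetical pre-analyze callback: returns a negative value on error,
 * 0 to wait, any other value to continue. */
typedef int (*pre_analyze_cb)(void *ctx);

struct chain {
    pre_analyze_cb cb[NB_FILTERS];  /* one callback per filter, in order */
    void          *ctx[NB_FILTERS]; /* private context of each filter */
    size_t         current;         /* filter on which we stopped */
};

/* Calls the callbacks in order, starting from the filter that stopped the
 * previous iteration. Returns 1 when every filter lets the analyzer run,
 * 0 when a filter asks to wait, -1 on error. */
int chain_pre_analyze(struct chain *c)
{
    while (c->current < NB_FILTERS) {
        int ret = c->cb[c->current](c->ctx[c->current]);

        if (ret < 0)
            return -1;
        if (ret == 0)
            return 0; /* the following filters are skipped for now */
        c->current++;
    }
    c->current = 0; /* ready for the next analyzer */
    return 1;
}

/* Sample callback: counts its calls and always continues. */
int cb_continue(void *ctx)
{
    int *calls = ctx;

    *calls += 1;
    return 1;
}

/* Sample callback: counts its calls and waits during the first two. */
int cb_wait_twice(void *ctx)
{
    int *calls = ctx;

    *calls += 1;
    return (*calls > 2) ? 1 : 0;
}
```

With the chain (cb_continue, cb_wait_twice, cb_continue), the first filter is
called once only, the second one three times, and the third one is reached
only on the third iteration.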

'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
negative value if an error occurs, any other value otherwise. It is called when
a filterable analyzer finishes its processing, so it is called only once for a
given analyzer. For example:

    /* Called after a processing happens on a given channel.
     * Returns a negative value if an error occurs, any other
     * value otherwise. */
    static int
    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct http_msg         *msg;

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* A test on received headers before any other treatment */) {
                    msg = ((chn->flags & CF_ISRESP)
                           ? &s->txn->rsp : &s->txn->req);
                    s->txn->status = 400;
                    msg->msg_state = HTTP_MSG_ERROR;
                    http_reply_and_close(s, s->txn->status,
                                         http_error_message(s, HTTP_ERR_400));
                    return -1; /* This is an error ! */
                }
                break;
            /* ... */
        }
        return 1;
    }


Pre and post analyzer callbacks of a filter are not automatically called. You
must register them explicitly on analyzers, updating the value of
'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits
are listed in 'include/types/channel.h'. Here is an example:

    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        /* ... */

        /* Register the pre analyzer callback on all request and response
         * analyzers */
        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);

        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
         * AN_RES_WAIT_HTTP analyzers */
        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);

        /* ... */
        return 0;
    }
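
To make the role of these bit fields more concrete, here is a small stand-alone
sketch of how a registration mask can be tested against an analyzer bit. All
'MY_AN_*' values, 'struct my_filter' and the two helpers are hypothetical and
exist only for this example. The real analyzer bits and the real
'struct filter' are defined by HAProxy.

```c
/* Hypothetical analyzer bits, with arbitrary values. */
#define MY_AN_REQ_INSPECT   0x01u
#define MY_AN_REQ_WAIT_HTTP 0x02u
#define MY_AN_REQ_HTTP_BODY 0x04u
#define MY_AN_REQ_ALL       0x07u

/* Reduced model of a filter: only the two registration masks. */
struct my_filter {
    unsigned int pre_analyzers;
    unsigned int post_analyzers;
};

/* Returns 1 if the pre analyzer callback is registered on <an_bit>. */
int wants_pre(const struct my_filter *f, unsigned int an_bit)
{
    return (f->pre_analyzers & an_bit) != 0;
}

/* Returns 1 if the post analyzer callback is registered on <an_bit>. */
int wants_post(const struct my_filter *f, unsigned int an_bit)
{
    return (f->post_analyzers & an_bit) != 0;
}
```

A filter that sets 'pre_analyzers' to MY_AN_REQ_ALL and 'post_analyzers' to
MY_AN_REQ_WAIT_HTTP would thus be selected before every analyzer of this model,
but after MY_AN_REQ_WAIT_HTTP only.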


To surround activity of a filter during the channel analyzing, two new analyzers
have been added:

  * 'flt_start_analyze' (AN_REQ/RES_FLT_START_FE and AN_REQ/RES_FLT_START_BE):
    For a specific filter, this analyzer is called before any call to the
    'flt_ops.channel_pre_analyze' callback. From the filter point of view, it
    calls the 'flt_ops.channel_start_analyze' callback.

  * 'flt_end_analyze' (AN_REQ/RES_FLT_END): For a specific filter, this analyzer
    is called when all other analyzers have finished their processing. From the
    filter point of view, it calls the 'flt_ops.channel_end_analyze' callback.

For TCP streams, these analyzers are called only once. For HTTP streams, if the
client connection is kept alive, this happens at each request/response round
trip.

'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
interrupt the stream processing, like 'flt_ops.channel_pre_analyze'. Here is an
example:

    /* Called when analyze starts for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
                                struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }

    /* Called when analyze ends for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }


Workflow on channels can be summarized as follows:
| 830 | |
Christopher Faulet | 9adb0a5 | 2016-06-21 11:50:49 +0200 | [diff] [blame] | 831 | FE: Called for filters defined on the stream's frontend |
| 832 | BE: Called for filters defined on the stream's backend |
| 833 | |
| 834 | +------->---------+ |
| 835 | | | | |
| 836 | +----------------------+ | +----------------------+ |
| 837 | | flt_ops.attach (FE) | | | flt_ops.attach (BE) | |
| 838 | +----------------------+ | +----------------------+ |
| 839 | | | | |
| 840 | V | V |
| 841 | +--------------------------+ | +------------------------------------+ |
| 842 | | flt_ops.stream_start (FE)| | | flt_ops.stream_set_backend (FE+BE) | |
| 843 | +--------------------------+ | +------------------------------------+ |
| 844 | | | | |
| 845 | ... | ... |
| 846 | | | | |
| 847 | +-<-- [1] ^ | |
| 848 | | --+ | | --+ |
| 849 | +------<----------+ | | +--------<--------+ | |
| 850 | | | | | | | | |
| 851 | V | | | V | | |
| 852 | +-------------------------------+ | | | +-------------------------------+ | | |
| 853 | | flt_start_analyze (FE) +-+ | | | flt_start_analyze (BE) +-+ | |
| 854 | |(flt_ops.channel_start_analyze)| | F | |(flt_ops.channel_start_analyze)| | |
| 855 | +---------------+---------------+ | R | +-------------------------------+ | |
| 856 | | | O | | | |
| 857 | +------<---------+ | N ^ +--------<-------+ | B |
| 858 | | | | T | | | | A |
| 859 | +---------------|------------+ | | E | +---------------|------------+ | | C |
| 860 | |+--------------V-------------+ | | N | |+--------------V-------------+ | | K |
| 861 | ||+----------------------------+ | | D | ||+----------------------------+ | | E |
| 862 | |||flt_ops.channel_pre_analyze | | | | |||flt_ops.channel_pre_analyze | | | N |
| 863 | ||| V | | | | ||| V | | | D |
| 864 | ||| analyzer (FE) +-+ | | ||| analyzer (FE+BE) +-+ | |
| 865 | +|| V | | | +|| V | | |
| 866 | +|flt_ops.channel_post_analyze| | | +|flt_ops.channel_post_analyze| | |
| 867 | +----------------------------+ | | +----------------------------+ | |
| 868 | | --+ | | | |
| 869 | +------------>------------+ ... | |
| 870 | | | |
| 871 | [ data filtering (see below) ] | |
| 872 | | | |
| 873 | ... | |
| 874 | | | |
| 875 | +--------<--------+ | |
| 876 | | | | |
| 877 | V | | |
| 878 | +-------------------------------+ | | |
| 879 | | flt_end_analyze (FE+BE) +-+ | |
| 880 | | (flt_ops.channel_end_analyze) | | |
| 881 | +---------------+---------------+ | |
| 882 | | --+ |
| 883 | V |
| 884 | +----------------------+ |
| 885 | | flt_ops.detach (BE) | |
| 886 | +----------------------+ |
| 887 | | |
| 888 | If HTTP stream, go back to [1] --<--+ |
| 889 | | |
| 890 | ... |
| 891 | | |
| 892 | V |
| 893 | +--------------------------+ |
| 894 | | flt_ops.stream_stop (FE) | |
| 895 | +--------------------------+ |
| 896 | | |
| 897 | V |
| 898 | +----------------------+ |
| 899 | | flt_ops.detach (FE) | |
| 900 | +----------------------+ |
| 901 | | |
| 902 | V |

Zooming in on an analyzer box, we have:

                     ...
                      |
                      V
                      |
                      +-----------<-----------+
                      |                       |
    +-----------------+--------------------+  |
    |                 |                    |  |
    |                 +--------<---------+ |  |
    |                 |                  | |  |
    |                 V                  | |  |
    |    flt_ops.channel_pre_analyze  ->-+ |  ^
    |                 |                    |  |
    |                 |                    |  |
    |                 V                    |  |
    |             analyzer  --------->-----+--+
    |                 |                    |
    |                 |                    |
    |                 V                    |
    |    flt_ops.channel_post_analyze      |
    |                 |                    |
    |                 |                    |
    +-----------------+--------------------+
                      |
                      V
                     ...


3.6. FILTERING THE DATA EXCHANGED
---------------------------------

WARNING: To fully understand this part, you must be aware of how the buffers
         work in HAProxy. In particular, you must be comfortable with the idea
         of circular buffers. See doc/internals/buffer-operations.txt and
         doc/internals/buffer-ops.fig for details.
         doc/internals/body-parsing.txt could also be useful.

An extended feature of the filters is the data filtering. By default a filter
does not look into data exchanged between the client and the server because it
is expensive. Indeed, instead of forwarding data without any processing, each
byte needs to be buffered.

So, to enable the data filtering on a channel, at any time, in one of the
previous callbacks, you should call the 'register_data_filter' function. And
conversely, to disable it, you should call the 'unregister_data_filter'
function. For example:

    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* 'msg->chn' must be the request channel */
        if (!(msg->chn->flags & CF_ISRESP)) {
            struct http_txn *txn = s->txn;
            struct buffer   *req = msg->chn->buf;
            struct hdr_ctx   ctx;

            /* Enable the data filtering for the request if 'X-Filter' header
             * is set to 'true'. The search starts from the first header. */
            ctx.idx = 0;
            if (http_find_header2("X-Filter", 8, req->p, &txn->hdr_idx, &ctx) &&
                ctx.vlen >= 4 && memcmp(ctx.line + ctx.val, "true", 4) == 0)
                register_data_filter(s, msg->chn, filter);
        }

        return 1;
    }

Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
set to 'true'.

If several filters are declared, the evaluation order remains the same,
regardless of the order of the registrations to the data filtering.

Depending on the stream type, TCP or HTTP, the way to handle data filtering will
be slightly different. Among other things, for HTTP streams, there are more
callbacks to help you to fully handle all steps of an HTTP transaction. But the
basis is the same. The data filtering is done in 2 stages:

  * The data parsing: At this stage, filters will analyze input data on a
    channel. Once a filter has parsed some data, it cannot parse it again. At
    any time, a filter can choose to not parse all available data. So, it is
    possible for a filter to retain data for a while. Because filters are
    chained, a filter cannot parse more data than its predecessors. Thus only
    data considered as parsed by the last filter will be available to the next
    stage, the data forwarding.

  * The data forwarding: At this stage, filters will decide how much data
    HAProxy can forward among those considered as parsed at the previous
    stage. Once a filter has marked data as forwardable, it cannot analyze it
    anymore. At any time, a filter can choose to not forward all parsed
    data. So, it is possible for a filter to retain data for a while. Because
    filters are chained, a filter cannot forward more data than its
    predecessors. Thus only data marked as forwardable by the last filter will
    be actually forwarded by HAProxy.
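
The chaining rule of the parsing stage can be sketched with a small stand-alone
model. It is only an illustration: 'struct flt_model', its 'max_parse' limit
and 'parse_pass' are hypothetical names chosen for this example, and the code
does not reproduce HAProxy's implementation. Each filter owns a 'nxt' counter
and can never go beyond the 'nxt' counter of its predecessor.

```c
#include <assert.h>

#define NB_FILTERS 3

/* Reduced model of a filter taking part in the data parsing. */
struct flt_model {
    unsigned int nxt;       /* bytes already parsed by this filter */
    unsigned int max_parse; /* max bytes parsed per pass, 0 = no limit */
};

/* Runs one parsing pass over <input> bytes of input data. Returns the
 * number of bytes considered as parsed by the last filter, i.e. what is
 * available to the forwarding stage. */
unsigned int parse_pass(struct flt_model *flt, unsigned int input)
{
    unsigned int limit = input; /* the first filter sees all the input */
    int i;

    for (i = 0; i < NB_FILTERS; i++) {
        unsigned int take;

        /* a filter never parsed more than its predecessor */
        assert(flt[i].nxt <= limit);

        take = limit - flt[i].nxt;
        if (flt[i].max_parse != 0 && take > flt[i].max_parse)
            take = flt[i].max_parse;
        flt[i].nxt += take;

        /* the next filter cannot parse more than this one */
        limit = flt[i].nxt;
    }
    return limit;
}
```

With 10 bytes of input and a second filter limited to 4 bytes per pass, three
passes are needed before the last filter has parsed everything: the value
returned is successively 4, 8 and 10. The same reasoning applies to the
forwarding stage with the 'fwd' counters.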

Internally, filters own 2 offsets, relative to 'buf->p', representing the
number of bytes already parsed in the available input data and the number of
bytes considered as forwarded. We will call these offsets, respectively, 'nxt'
and 'fwd'. The following macros reference these offsets:

  * FLT_NXT(flt, chn), flt_req_nxt(flt) and flt_rsp_nxt(flt)

  * FLT_FWD(flt, chn), flt_req_fwd(flt) and flt_rsp_fwd(flt)

where 'flt' is the 'struct filter' passed as argument in all callbacks and 'chn'
is the considered channel.

Using these offsets, the following operations on buffers are possible:

    chn->buf->p + FLT_NXT(flt, chn) // the pointer on parsable data for
                                    // the filter 'flt' on the channel 'chn'.
                                    // Everything between chn->buf->p and 'nxt'
                                    // offset was already parsed by the filter.

    chn->buf->i - FLT_NXT(flt, chn) // the number of bytes of parsable data for
                                    // the filter 'flt' on the channel 'chn'.

    chn->buf->p + FLT_FWD(flt, chn) // the pointer on forwardable data for
                                    // the filter 'flt' on the channel 'chn'.
                                    // Everything between chn->buf->p and 'fwd'
                                    // offset was already forwarded by the
                                    // filter.


Note that at any time, for a filter, 'nxt' offset is always greater than or
equal to 'fwd' offset.

TODO: Add schema with buffer states when there are 2 filters that analyze data.
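
Pending such a schema, the offset arithmetic can be tried on a stand-alone
model. This sketch assumes a non-wrapping buffer, and 'struct buf_model',
'struct flt_offsets' and the helpers below are hypothetical names, not HAProxy
types or macros. The 'forwardable_len' helper is an interpretation of the text
above: it counts the bytes already parsed by the filter but not yet forwarded
by it.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced model of a non-wrapping buffer: <p> points on input data and
 * <i> is the number of input bytes. */
struct buf_model {
    const char *p;
    size_t      i;
};

/* Offsets owned by one filter on one channel, relative to <p>. */
struct flt_offsets {
    size_t nxt; /* bytes already parsed */
    size_t fwd; /* bytes considered as forwarded */
};

/* Pointer on the first byte not parsed yet by the filter. */
const char *parsable_ptr(const struct buf_model *b,
                         const struct flt_offsets *o)
{
    return b->p + o->nxt;
}

/* Number of bytes the filter can still parse. */
size_t parsable_len(const struct buf_model *b, const struct flt_offsets *o)
{
    assert(o->fwd <= o->nxt && o->nxt <= b->i);
    return b->i - o->nxt;
}

/* Pointer on the first byte not forwarded yet by the filter. */
const char *forwardable_ptr(const struct buf_model *b,
                            const struct flt_offsets *o)
{
    return b->p + o->fwd;
}

/* Number of bytes parsed by the filter but not forwarded yet. */
size_t forwardable_len(const struct flt_offsets *o)
{
    assert(o->fwd <= o->nxt);
    return o->nxt - o->fwd;
}
```

For instance, with 10 bytes of input, 'nxt' set to 6 and 'fwd' set to 2, the
filter can still parse 4 bytes starting at offset 6, and 4 parsed bytes
starting at offset 2 are waiting to be forwarded.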


3.6.1 FILTERING DATA ON TCP STREAMS
-----------------------------------

The TCP data filtering is the easy case, because HAProxy does not parse these
data. So you have only two callbacks that you need to consider:

  * 'flt_ops.tcp_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.tcp_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

Here is an example:

    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_tcp_data(struct stream *s, struct filter *filter,
                       struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int avail = chn->buf->i - FLT_NXT(filter, chn);
        int ret   = avail;

        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
            ret = my_conf->max_parse;

        /* if available data are not completely parsed, wake up the stream to
         * be sure to not freeze it. */
        if (ret != avail)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }


    /* Returns a negative value if an error occurs, else the number of
     * forwarded bytes. */
    static int
    my_filter_tcp_forward_data(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int ret = len;

        /* Do not forward more than 'my_conf->max_forward' bytes at a time */
        if (my_conf->max_forward != 0 && ret > my_conf->max_forward)
            ret = my_conf->max_forward;

        /* if parsed data are not completely forwarded, wake up the stream to
         * be sure to not freeze it. */
        if (ret != len)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }
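
The effect of such a limit can be checked with a stand-alone sketch. The
functions 'cap_bytes' and 'passes_to_drain' are hypothetical helpers written
for this example only. They show why the stream must be woken up again when a
callback does not consume everything: each pass of the loop stands for one
more call to the callback.

```c
/* Returns <avail> limited to <max> bytes. A <max> of 0 means no limit,
 * as for 'max_parse' and 'max_forward' in the example above. */
unsigned int cap_bytes(unsigned int avail, unsigned int max)
{
    if (max != 0 && avail > max)
        return max;
    return avail;
}

/* Returns the number of calls needed to consume <avail> bytes when each
 * call consumes at most <max> bytes. */
unsigned int passes_to_drain(unsigned int avail, unsigned int max)
{
    unsigned int passes = 0;

    while (avail > 0) {
        avail -= cap_bytes(avail, max);
        passes++;
    }
    return passes;
}
```

For 100 available bytes and a limit of 30 bytes, 4 calls are required. Without
the explicit wakeup, nothing would guarantee that the calls following the first
one happen if no new data arrive.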



3.6.2 FILTERING DATA ON HTTP STREAMS
------------------------------------

The HTTP data filtering is a bit tricky because HAProxy will parse the body
structure, especially chunked body. So basically there are the HTTP
counterparts of the previous callbacks:

  * 'flt_ops.http_data': This callback is called when unparsed data are
    available. If not defined, all available data will be considered as parsed
    for the filter.

  * 'flt_ops.http_forward_data': This callback is called when parsed data are
    available. If not defined, all parsed data will be considered as forwarded
    for the filter.

But the prototype for these callbacks is slightly different. Instead of having
the channel as parameter, we have the HTTP message (struct http_msg). You need
to be careful when you use 'http_msg.chunk_len' size. This value is the number
of bytes remaining to parse in the HTTP body (or the chunk for chunked
messages). The HTTP parser of HAProxy uses it to have the number of bytes that
it could consume:

    /* Available input data in the current chunk from the HAProxy point of view.
     * msg->next bytes were already parsed. Without data filtering, HAProxy
     * will consume all of it. */
    bytes = MIN(msg->chunk_len, chn->buf->i - msg->next);


But in your filter, you need to recompute it:

    /* Available input data in the current chunk from the filter point of view.
     * 'nxt' bytes were already parsed. */
    bytes = MIN(msg->chunk_len + msg->next, chn->buf->i) - FLT_NXT(flt, chn);
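
The two formulas can be compared on numbers with the following stand-alone
sketch. The functions 'haproxy_avail' and 'filter_avail' are hypothetical
helpers that simply take plain integers in place of 'msg->chunk_len',
'msg->next', 'chn->buf->i' and the 'nxt' offset of the filter.

```c
#ifndef MIN
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#endif

/* Available input data in the current chunk from the HAProxy point of
 * view: <next> bytes out of <i> were already parsed. */
unsigned int haproxy_avail(unsigned int chunk_len, unsigned int next,
                           unsigned int i)
{
    return MIN(chunk_len, i - next);
}

/* Available input data in the current chunk from the filter point of
 * view: <nxt> bytes out of <i> were already parsed by the filter. */
unsigned int filter_avail(unsigned int chunk_len, unsigned int next,
                          unsigned int i, unsigned int nxt)
{
    return MIN(chunk_len + next, i) - nxt;
}
```

With chunk_len = 100, next = 20 and i = 80, HAProxy sees 60 bytes. A filter
whose 'nxt' offset is also 20 sees the same 60 bytes, but a filter that has
parsed only 5 bytes so far sees 75 bytes, because it is 15 bytes behind
'msg->next'.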


In addition to these callbacks, there are three others:

  * 'flt_ops.http_headers': This callback is called just before the HTTP body
    parsing and after any processing on the request/response HTTP headers. When
    defined, this callback is always called for HTTP streams (i.e. without
    needing to register for data filtering).

  * 'flt_ops.http_end': This callback is called when the whole HTTP
    request/response is processed. It can interrupt the stream processing. So,
    it could be used to synchronize the HTTP request with the HTTP response, for
    example:

        /* Returns a negative value if an error occurs, 0 if it needs to wait,
         * any other value otherwise. */
        static int
        my_filter_http_end(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
        {
            struct my_filter_ctx *my_ctx = filter->ctx;


            if (!(msg->chn->flags & CF_ISRESP)) /* The request */
                my_ctx->end_of_req = 1;
            else /* The response */
                my_ctx->end_of_rsp = 1;

            /* Both the request and the response are finished */
            if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
                return 1;

            /* Wait */
            return 0;
        }


  * 'flt_ops.http_chunk_trailers': This callback is called for chunked HTTP
    messages only when all chunks were parsed. HTTP trailers can be parsed in
    several passes. This callback will be called each time. The number of bytes
    parsed by HAProxy at each iteration is stored in 'msg->sol'.

Then, to finish, there are 2 informational callbacks:

  * 'flt_ops.http_reset': This callback is called when an HTTP message is
    reset. This only happens when a '100-continue' response is received. It
    could be useful to reset the filter context before receiving the true
    response.

  * 'flt_ops.http_reply': This callback is called when, at any time, HAProxy
    decides to stop the processing on an HTTP message and to send an internal
    response to the client. This mainly happens when an error or a redirect
    occurs.


3.6.3 REWRITING DATA
--------------------

The last part, and the trickiest one about the data filtering, is about the data
rewriting. For now, the filter API does not offer a lot of functions to handle
it. There are only functions to notify HAProxy that the data size has changed to
let it update internal state of filters. It is your responsibility to update
data itself, i.e. the buffer offsets. For an HTTP message, you also must update
'msg->next' and 'msg->chunk_len' values accordingly:

  * 'flt_change_next_size': This function must be called when a filter alters
    incoming data. It updates 'nxt' offset value of all its predecessors.
    Failing to call this function when a filter changes the size of incoming
    data leads to undefined behavior.

        struct buffer *buf   = msg->chn->buf;
        unsigned int   avail = MIN(msg->chunk_len + msg->next, buf->i) -
                               flt_rsp_nxt(filter);

        if (avail > 10 && /* ...Some condition... */) {
            /* Move the buffer forward to have buf->p pointing on unparsed
             * data */
            c_adv(msg->chn, flt_rsp_nxt(filter));

            /* Remove the first 10 bytes by moving the remaining ones down.
             * To simplify this example, we consider a non-wrapping buffer */
            memmove(buf->p, buf->p + 10, avail - 10);

            /* Restore buf->p value */
            c_rew(msg->chn, flt_rsp_nxt(filter));

            /* Now update other filters */
            flt_change_next_size(filter, msg->chn, -10);

            /* Update the buffer state */
            buf->i -= 10;

            /* And update the HTTP message state */
            msg->chunk_len -= 10;

            return (avail - 10);
        }
        else
            return 0; /* Wait for more data */


  * 'flt_change_forward_size': This function must be called when a filter alters
    parsed data. It updates offset values ('nxt' and 'fwd') of all filters.
    Failing to call this function when a filter changes the size of parsed data
    leads to undefined behavior.

        /* len is the number of bytes of forwardable data */
        if (len > 10 && /* ...Some condition... */) {
            struct buffer *buf = msg->chn->buf;

            /* Move the buffer forward to have buf->p pointing on non-forwarded
             * data */
            c_adv(msg->chn, flt_rsp_fwd(filter));

            /* Remove the first 10 bytes by moving the remaining ones down.
             * To simplify this example, we consider a non-wrapping buffer */
            memmove(buf->p, buf->p + 10, len - 10);

            /* Restore buf->p value */
            c_rew(msg->chn, flt_rsp_fwd(filter));

            /* Now update other filters */
            flt_change_forward_size(filter, msg->chn, -10);

            /* Update the buffer state */
            buf->i -= 10;

            /* And update the HTTP message state */
            msg->next -= 10;

            return (len - 10);
        }
        else
            return 0; /* Wait for more data */
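
The buffer manipulation itself, independently of HAProxy, can be sketched as
follows. This is a stand-alone model on a plain, non-wrapping array:
'remove_bytes' is a hypothetical helper, and the notification of the other
filters ('flt_change_next_size' or 'flt_change_forward_size') is deliberately
left out. The value returned is the size change that such a notification would
carry.

```c
#include <assert.h>
#include <string.h>

/* Removes <n> bytes at offset <off> from a linear buffer <data> holding
 * <*len> bytes. The bytes located after the removed area are moved down
 * and <*len> is updated. Returns the size change (a negative value). */
int remove_bytes(char *data, size_t *len, size_t off, size_t n)
{
    assert(off + n <= *len);

    /* source and destination overlap, so memmove() is required */
    memmove(data + off, data + off + n, *len - off - n);
    *len -= n;
    return -(int)n;
}
```

Removing the first 10 bytes of a 16-byte buffer leaves the last 6 bytes at the
beginning of the buffer and returns -10, which matches the '-10' passed to the
notification functions in the examples above.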


TODO: implement all the stuff to easily rewrite data. For HTTP messages, this
      requires a chunked message. Otherwise, the size of data cannot be
      changed.




4. FAQ
------

4.1. Detect multiple declarations of the same filter
----------------------------------------------------

TODO