DOC: filters: Add filters documentation The configuration documentation has been updated. Doc about the filter line has been added and a new chapter (§. 9) has been created to list and document supported filters (for now, flt_trace and flt_http_comp). The developer documentation about filters has also been added. This is a "pre" version. Incoming changes in the filter API will require an update. This documentation requires a deeper review and some TODOs need to be completed.

commit: c3fe5330be4c2d15ea50ae3a2d01e9287461d13c
author: Christopher Faulet <cfaulet@qualys.com> Thu Apr 07 15:30:10 2016 +0200
committer: Willy Tarreau <w@1wt.eu> Thu Apr 21 07:01:41 2016 +0200
tree: 2e7fe9079d3f853a6bce3c42096d0322abceeec2
parent: 00e818aa587b254453b4c5be7358c72069b972c4
diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7a3fff1..5a815a1 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt

@@ -103,6 +103,10 @@
 8.8.      Capturing HTTP headers
 8.9.      Examples of logs
 
+9.    Supported filters
+9.1.      Trace
+9.2.      HTTP compression
+
 
 1. Quick reminder about HTTP
 ----------------------------
@@ -1724,6 +1728,7 @@
 -- keyword -------------------------- defaults - frontend - listen -- backend -
 errorloc303                               X          X         X         X
 force-persist                             -          X         X         X
+filter                                    -          X         X         X
 fullconn                                  X          -         X         X
 grace                                     X          X         X         X
 hash-type                                 X          -         X         X
@@ -2616,6 +2621,7 @@
         compression algo gzip
         compression type text/html text/plain
 
+
 contimeout <timeout> (deprecated)
   Set the maximum time to wait for a connection attempt to a server to succeed.
   May be used in sections :   defaults | frontend | listen | backend
@@ -3174,6 +3180,38 @@
   See also : "option redispatch", "ignore-persist", "persist",
              and section 7 about ACL usage.
 
+
+filter <name> [param*]
+  Add the filter <name> in the filter list attached to the proxy.
+  May be used in sections :   defaults | frontend | listen | backend
+                                 no    |   yes    |   yes  |   yes
+  Arguments :
+    <name>     is the name of the filter. Officially supported filters are
+               referenced in section 9.
+
+    <param*>   is a list of parameters accepted by the filter <name>. The
+               parsing of these parameters is the responsibility of the
+               filter. Please refer to the documentation of the corresponding
+               filter (section 9) for all details on the supported parameters.
+
+  Multiple occurrences of the filter line can be used for the same proxy. The
+  same filter can be referenced many times if needed.
+
+  Example:
+      listen
+        bind *:80
+
+        filter trace name BEFORE-HTTP-COMP
+        filter compression
+        filter trace name AFTER-HTTP-COMP
+
+        compression algo gzip
+        compression offload
+
+        server srv1 192.168.0.1:80
+
+  See also : section 9.
+
 
 fullconn <conns>
   Specify at what backend load the servers will reach their maxconn
@@ -15476,6 +15514,59 @@
        connection because of too many already established.
 
 
+9. Supported filters
+--------------------
+
+Here are listed officially supported filters with the list of parameters they
+accept. Depending on compile options, some of these filters might be
+unavailable. The list of available filters is reported in haproxy -vv.
+
+See also : "filter"
+
+9.1. Trace
+----------
+
+filter trace [name <name>] [random-parsing] [random-forwarding]
+
+  Arguments:
+    <name>               is an arbitrary name that will be reported in
+                         messages. If no name is provided, "TRACE" is used.
+
+    <random-parsing>     enables the random parsing of data exchanged between
+                         the client and the server. By default, this filter
+                         parses all available data. With this parameter, it
+                         only parses a random amount of the available data.
+
+    <random-forwarding>  enables the random forwarding of parsed data. By
+                         default, this filter forwards all previously parsed
+                         data. With this parameter, it only forwards a random
+                         amount of the parsed data.
+
+This filter can be used as a base to develop new filters. It defines all
+callbacks and prints a message on the standard error stream (stderr) with
+useful information for all of them. It may be useful to debug the activity of
+other filters or, quite simply, HAProxy's activity.
+
+Using <random-parsing> and/or <random-forwarding> parameters is a good way to
+test the behavior of a filter that parses data exchanged between a client and
+a server by adding some latencies in the processing.
+
+
+9.2. HTTP compression
+---------------------
+
+filter compression
+
+The HTTP compression has been moved into a filter in HAProxy 1.7. The
+"compression" keyword must still be used to enable and configure the HTTP
+compression. And when no other filter is used, it is enough. But it is
+mandatory to explicitly use a filter line to enable the HTTP compression when
+two or more filters are used for the same listener/frontend/backend. This is
+required so that the evaluation order of the filters is known.
+
+See also : "compression"
+
+
 /*
  * Local variables:
  *  fill-column: 79

diff --git a/doc/internals/filters.txt b/doc/internals/filters.txt
new file mode 100644
index 0000000..7101343
--- /dev/null
+++ b/doc/internals/filters.txt

@@ -0,0 +1,1058 @@
+                   -----------------------------------------
+                          Filters Guide - version 0.1
+                          ( Last update: 2016-04-18 )
+                   ------------------------------------------
+                          Author : Christopher Faulet
+              Contact : christopher dot faulet at capflam dot org
+
+
+ABSTRACT
+--------
+
+The filters support is a new feature of HAProxy 1.7. It is a way to extend
+HAProxy without touching its core code and, to a certain extent, without
+knowing its internals. This feature will ease contributions, reducing impact
+of changes. Another advantage will be to simplify HAProxy by replacing some
+parts by filters. As we will see, and as an example, the HTTP compression is
+the first feature moved into a filter.
+
+This document describes how to write a filter and what you have to keep in mind
+to do so. It also talks about the known limits and the pitfalls to avoid.
+
+As said, filters are quite new for now. The API is not frozen and will be
+updated/modified/improved/extended as needed.
+
+
+
+SUMMARY
+-------
+
+  1.    Filters introduction
+  2.    How to use filters
+  3.    How to write a new filter
+  3.1.      API Overview
+  3.2.      Defining the filter name and its configuration
+  3.3.      Managing the filter lifecycle
+  3.4.      Handling the streams creation and destruction
+  3.5.      Analyzing the channels activity
+  3.6.      Filtering the data exchanged
+  4.    FAQ
+
+
+
+1. FILTERS INTRODUCTION
+-----------------------
+
+First of all, to fully understand how filters work and how to create one, it is
+best to know, at least from a distance, what is a proxy (frontend/backend), a
+stream and a channel in HAProxy and how these entities are linked to each other.
+doc/internals/entities.pdf is a good overview.
+
+Then, to support filters, many callbacks have been added to HAProxy at different
+places, mainly around channel analyzers. Their purpose is to allow filters to
+be involved in the data processing, from the stream creation/destruction to
+the data forwarding. Depending on what it should do, a filter can implement all
+or part of these callbacks. For now, existing callbacks are focused on
+streams. But future improvements could enlarge filters scope. For example, it
+could be useful to handle events at the connection level.
+
+In HAProxy configuration file, a filter is declared in a proxy section, except
+defaults. So the configuration corresponding to a filter declaration is
+attached to a specific proxy, and will be shared by all its instances. It is
+opaque from the HAProxy point of view; it is the filter's responsibility to
+manage it. Each filter declaration matches a unique configuration. Several
+declarations of the same filter in the same proxy will be handled as different
+filters by HAProxy.
+
+A filter instance is represented by a partially opaque context (or a state)
+attached to a stream and passed as arguments to callbacks. Through this context,
+filter instances are stateful. Depending on whether the filter is declared in a
+frontend or a backend section, its instances will be created, respectively,
+when a stream is created or when a backend is selected. Their behaviors will
+also be different. Only instances of filters declared in a frontend section
+will be aware of the creation and the destruction of the stream, and will take
+part in the channels analyzing before the backend is defined.
+
+It is important to remember the configuration of a filter is shared by all its
+instances, while the context of an instance is owned by a unique stream.
+
+Filters are designed to be chained. It is possible to declare several filters in
+the same proxy section. The declaration order is important because filters will
+be called one after the other respecting this order. Frontend and backend
+filters are also chained, frontend ones called first. Even if the filters
+processing is serialized, each filter will behave as if it were alone (unless
+it was developed to be aware of other filters). For all that, some constraints
+are imposed on filters, especially when data exchanged between the client and
+the server are processed. We will discuss these constraints again when we
+tackle the subject of writing a filter.
+
+
+
+2. HOW TO USE FILTERS
+---------------------
+
+To use a filter, you must use the parameter 'filter' followed by the filter name
+and, optionally, its configuration in the desired listen, frontend or backend
+section. For example:
+
+    listen test
+        ...
+        filter trace name TST
+        ...
+
+
+See doc/configuration.txt for a formal definition of the parameter 'filter'.
+Note that additional parameters on the filter line must be parsed by the filter
+itself.
+
+The list of available filters is reported by 'haproxy -vv':
+
+    $> haproxy -vv
+    HA-Proxy version 1.7-dev2-3a1d4a-33 2016/03/21
+    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>
+
+    [...]
+
+    Available filters :
+            [COMP] compression
+            [TRACE] trace
+
+
+Multiple filter lines can be used in a proxy section to chain filters. Filters
+will be called in the declaration order.
+
+Some filters can support implicit declarations in certain circumstances
+(without the filter line). This is not recommended for new features but is
+useful for existing ones moved into a filter, for backward compatibility
+reasons. Implicit declarations are supported when there is only one filter used
+on a proxy. When several filters are used, explicit declarations are mandatory.
+The HTTP compression filter is one of these filters. Alone, using 'compression'
+keywords is enough to use it. But when at least a second filter is used, a
+filter line must be added.
+
+    # filter line is optional
+    listen t1
+        bind *:80
+        compression algo gzip
+        compression offload
+        server srv x.x.x.x:80
+
+    # filter line is mandatory for the compression filter
+    listen t2
+        bind *:81
+        filter trace name T2
+        filter compression
+        compression algo gzip
+        compression offload
+        server srv x.x.x.x:80
+
+
+
+
+3. HOW TO WRITE A NEW FILTER
+----------------------------
+
+If you want to write a filter, there are 2 header files that you must know:
+
+  * include/types/filters.h: This is the main header file, containing all
+                             important structures you will use. It represents
+                             the filter API.
+  * include/proto/filters.h: This header file contains helper functions that
+                             you may need to use. It also contains the internal
+                             API used by HAProxy to handle filters.
+
+To ease the filters integration, it is better to follow some conventions:
+
+  * Use 'flt_' prefix to name your filter (e.g: flt_http_comp or flt_trace).
+  * Keep everything related to your filter in the same file.
+
+The filter 'trace' can be used as a template to write your own filter. It is a
+good start to see how filters really work.
+
+3.1. API OVERVIEW
+-----------------
+
+Writing a filter comes down to writing functions and attaching them to the
+existing callbacks. Available callbacks are listed in the following structure:
+
+    struct flt_ops {
+        /*
+         * Callbacks to manage the filter lifecycle
+         */
+        int  (*init)  (struct proxy *p, struct flt_conf *fconf);
+        void (*deinit)(struct proxy *p, struct flt_conf *fconf);
+        int  (*check) (struct proxy *p, struct flt_conf *fconf);
+
+        /*
+         * Stream callbacks
+         */
+        int  (*stream_start)     (struct stream *s, struct filter *f);
+        void (*stream_stop)      (struct stream *s, struct filter *f);
+
+        /*
+         * Channel callbacks
+         */
+        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
+                                      struct channel *chn);
+        int  (*channel_analyze)      (struct stream *s, struct filter *f,
+                                      struct channel *chn,
+                                      unsigned int an_bit);
+        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
+                                      struct channel *chn);
+
+        /*
+         * HTTP callbacks
+         */
+        int  (*http_data)          (struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+        int  (*http_chunk_trailers)(struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+        int  (*http_end)           (struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+        int  (*http_forward_data)  (struct stream *s, struct filter *f,
+                                    struct http_msg *msg,
+                                    unsigned int len);
+
+        void (*http_reset)         (struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+        void (*http_reply)         (struct stream *s, struct filter *f,
+                                    short status,
+                                    const struct chunk *msg);
+
+        /*
+         * TCP callbacks
+         */
+        int  (*tcp_data)        (struct stream *s, struct filter *f,
+                                 struct channel *chn);
+        int  (*tcp_forward_data)(struct stream *s, struct filter *f,
+                                 struct channel *chn,
+                                 unsigned int len);
+    };
+
+
+We will explain in the following parts when these callbacks are called and what
+they should do.
+
+Filters are declared in proxy sections. So each proxy has an ordered list of
+filters, possibly empty if no filter is used. When the configuration of a proxy
+is parsed, each filter line represents an entry in this list. In the structure
+'proxy', the filters configurations are stored in the field 'filter_configs',
+each one of type 'struct flt_conf *':
+
+    /*
+     * Structure representing the filter configuration, attached to a proxy and
+     * accessible from a filter when instantiated in a stream
+     */
+    struct flt_conf {
+        const char     *id;   /* The filter id */
+        struct flt_ops *ops;  /* The filter callbacks */
+        void           *conf; /* The filter configuration */
+        struct list     list; /* Next filter for the same proxy */
+    };
+
+  * 'flt_conf.id' is an identifier, defined by the filter. It can be
+    NULL. HAProxy does not use this field. Filters can use it in log messages or
+    as a unique identifier to check multiple declarations. It is the filter
+    responsibility to free it, if necessary.
+
+  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
+    generally allocated and filled by its parsing function (See § 3.2). It is
+    the filter responsibility to free it.
+
+  * 'flt_conf.ops' references the callbacks implemented by the filter. This
+    field must be set during the parsing phase (See § 3.2) and can be refined
+    during the initialization phase (See § 3.3). If it is dynamically allocated,
+    it is the filter responsibility to free it.
+
+
+The filter configuration is global and shared by all its instances. A filter
+instance is created in the context of a stream and attached to this stream. In
+the structure 'stream', the field 'strm_flt' is the state of all filter
+instances attached to a stream:
+
+    /*
+     * Structure representing the "global" state of filters attached to a
+     * stream.
+     */
+    struct strm_flt {
+        struct list    filters;             /* List of filters attached to a stream */
+        struct filter *current[2];          /* From which filter resume processing, for a specific channel.
+                                             * This is used for resumable callbacks only,
+                                             * If NULL, we start from the first filter.
+                                             * 0: request channel, 1: response channel */
+        unsigned short flags;               /* STRM_FL_* */
+        unsigned char  nb_req_data_filters; /* Number of data filters registered on the request channel */
+        unsigned char  nb_rsp_data_filters; /* Number of data filters registered on the response channel */
+    };
+
+
+Filter instances attached to a stream are stored in the field
+'strm_flt.filters', each instance is of type 'struct filter *':
+
+    /*
+     * Structure representing a filter instance attached to a stream
+     *
+     * Two-element array fields are used to store info per channel. The first
+     * element stands for the request channel, and the second one for the
+     * response channel. Especially, <next> and <fwd> are offsets representing
+     * the amount of data that the filter has, respectively, parsed and
+     * forwarded on a channel. Filters can access these values using FLT_NXT
+     * and FLT_FWD macros.
+     */
+    struct filter {
+        struct flt_conf *config; /* the filter's configuration */
+        void           *ctx;     /* The filter context (opaque) */
+        unsigned short  flags;   /* FLT_FL_* */
+        unsigned int    next[2]; /* Offset, relative to buf->p, to the next
+                                  * byte to parse for a specific channel
+                                  * 0: request channel, 1: response channel */
+        unsigned int    fwd[2];  /* Offset, relative to buf->p, to the next
+                                  * byte to forward for a specific channel
+                                  * 0: request channel, 1: response channel */
+        struct list     list;    /* Next filter for the same proxy/stream */
+    };
+
+  * 'filter.config' is the filter configuration previously described. All
+    instances of a filter share it.
+
+  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its
+    responsibility to free it.
+
+  * 'filter.next' and 'filter.fwd' will be described later (See § 3.6).
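
The per-channel convention used by 'strm_flt.current', 'filter.next' and
'filter.fwd' (index 0 for the request channel, index 1 for the response channel)
can be illustrated with a small standalone sketch. Everything prefixed with
'demo_' or 'DEMO_' is hypothetical and exists only for this illustration: the
flag value, the reduced structures and the helpers are assumptions, not the
HAProxy definitions, and the real FLT_NXT and FLT_FWD macros are not reproduced
here.

```c
#include <assert.h>

/* Hypothetical flag marking a response channel (value chosen for the sketch) */
#define DEMO_CF_ISRESP 0x1u

/* Reduced stand-ins for 'struct channel' and 'struct filter' */
struct demo_channel {
    unsigned int flags;
};

struct demo_filter {
    unsigned int next[2]; /* 0: request channel, 1: response channel */
    unsigned int fwd[2];  /* 0: request channel, 1: response channel */
};

/* Returns 0 for the request channel and 1 for the response channel */
static unsigned int demo_chn_idx(const struct demo_channel *chn)
{
    assert(chn);
    return (chn->flags & DEMO_CF_ISRESP) ? 1u : 0u;
}

/* Returns the slot of <next> matching the channel direction */
static unsigned int *demo_next(struct demo_filter *f,
                               const struct demo_channel *chn)
{
    assert(f);
    return &f->next[demo_chn_idx(chn)];
}

/* Returns the slot of <fwd> matching the channel direction */
static unsigned int *demo_fwd(struct demo_filter *f,
                              const struct demo_channel *chn)
{
    assert(f);
    return &f->fwd[demo_chn_idx(chn)];
}
```

With such helpers, a callback that only receives a channel pointer never has to
hardcode an index: the same code path serves both directions, and the two slots
evolve independently.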
+
+
+3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
+---------------------------------------------------
+
+When you write a filter, the first thing to do is to add it in the supported
+filters. To do so, you must register its name as a valid keyword on the filter
+line:
+
+    /* Declare the filter parser for "my_filter" keyword */
+    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
+            { "my_filter", parse_my_filter_cfg },
+            { NULL, NULL },
+        }
+    };
+
+    __attribute__((constructor))
+    static void
+    __my_filter_init(void)
+    {
+        flt_register_keywords(&flt_kws);
+    }
+
+
+Then you must define the internal configuration your filter will use. For
+example:
+
+    struct my_filter_config {
+        struct proxy *proxy;
+        char         *name;
+        /* ... */
+    };
+
+
+You also must list all callbacks implemented by your filter. Here, we use a
+global variable:
+
+    struct flt_ops my_filter_ops = {
+        .init   = my_filter_init,
+        .deinit = my_filter_deinit,
+        .check  = my_filter_config_check,
+
+        /* ... */
+    };
+
+
+Finally, you must define the function to parse your filter configuration, here
+'parse_my_filter_cfg'. This function must parse all remaining keywords on the
+filter line:
+
+    /* Return -1 on error, else 0 */
+    static int
+    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
+                        struct flt_conf *flt_conf, char **err)
+    {
+        struct my_filter_config *my_conf;
+        int pos = *cur_arg;
+
+        /* Allocate the internal configuration used by the filter */
+        my_conf = calloc(1, sizeof(*my_conf));
+        if (!my_conf) {
+            memprintf(err, "%s: out of memory", args[*cur_arg]);
+            return -1;
+        }
+        my_conf->proxy = px;
+
+        /* ... */
+
+        /* Parse all keywords supported by the filter and fill the internal
+         * configuration */
+        pos++; /* Skip the filter name */
+        while (*args[pos]) {
+            if (!strcmp(args[pos], "name")) {
+                if (!*args[pos + 1]) {
+                    memprintf(err, "'%s' : '%s' option without value",
+                              args[*cur_arg], args[pos]);
+                    goto error;
+                }
+                my_conf->name = strdup(args[pos + 1]);
+                if (!my_conf->name) {
+                    memprintf(err, "%s: out of memory", args[*cur_arg]);
+                    goto error;
+                }
+                pos += 2;
+            }
+
+            /* ... parse other keywords ... */
+        }
+        *cur_arg = pos;
+
+        /* Set callbacks supported by the filter */
+        flt_conf->ops  = &my_filter_ops;
+
+        /* Last, save the internal configuration */
+        flt_conf->conf = my_conf;
+        return 0;
+
+      error:
+        if (my_conf->name)
+            free(my_conf->name);
+        free(my_conf);
+        return -1;
+    }
+
+
+WARNING: In your parsing function, you must define 'flt_conf->ops'. You must
+         also parse all arguments on the filter line. This is mandatory.
+
+In the previous example, we expect to read a filter line as follows:
+
+    filter my_filter name MY_NAME ...
+
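
Since the parsing function must consume every argument on the filter line, it
helps to reject unknown keywords explicitly instead of looping on them. The
following standalone sketch shows one way to do it. It assumes, as in the
example above, that the argument list ends with an empty string. All 'demo_'
names and the 'verbose' keyword are hypothetical, and a plain buffer replaces
'memprintf' so that the sketch builds outside of HAProxy.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the internal configuration of a filter */
struct demo_conf {
    char *name;
    int   verbose;
};

/* Portable string duplication (returns NULL when out of memory) */
static char *demo_strdup(const char *s)
{
    size_t len  = strlen(s) + 1;
    char  *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Parses the keywords following the filter name at args[*cur_arg].
 * Returns -1 on error, else 0. On success, *cur_arg is moved past the last
 * consumed argument. On error, <err> holds a message. */
static int demo_parse(char **args, int *cur_arg, struct demo_conf *conf,
                      char *err, size_t errlen)
{
    int pos;

    assert(args && cur_arg && conf && err);
    conf->name    = NULL;
    conf->verbose = 0;

    pos = *cur_arg + 1; /* Skip the filter name */
    while (*args[pos]) {
        if (!strcmp(args[pos], "name")) {
            if (!*args[pos + 1]) {
                snprintf(err, errlen, "'%s' : '%s' option without value",
                         args[*cur_arg], args[pos]);
                goto error;
            }
            free(conf->name); /* A repeated keyword replaces the old value */
            conf->name = demo_strdup(args[pos + 1]);
            if (!conf->name) {
                snprintf(err, errlen, "%s: out of memory", args[*cur_arg]);
                goto error;
            }
            pos += 2;
        }
        else if (!strcmp(args[pos], "verbose")) {
            conf->verbose = 1;
            pos++;
        }
        else {
            /* Never leave an argument unparsed: fail on unknown keywords */
            snprintf(err, errlen, "'%s' : unknown keyword '%s'",
                     args[*cur_arg], args[pos]);
            goto error;
        }
    }
    *cur_arg = pos;
    return 0;

  error:
    free(conf->name);
    conf->name = NULL;
    return -1;
}
```

The final 'else' branch is what guarantees that the loop always terminates,
either by reaching the end of the line or by reporting the first keyword the
filter does not know.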
+
+Optionally, by implementing the 'flt_ops.check' callback, you add a step to
+check the internal configuration of your filter after the parsing phase, when
+the HAProxy configuration is fully defined. For example:
+
+    /* Check the configuration of the filter for a specified proxy.
+     * Return 1 on error, else 0. */
+    static int
+    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
+    {
+        if (px->mode != PR_MODE_HTTP) {
+            Alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
+            return 1;
+        }
+
+        /* ... */
+
+        return 0;
+    }
+
+
+
+3.3. MANAGING THE FILTER LIFECYCLE
+----------------------------------
+
+Once the configuration is parsed and checked, filters are ready to be used.
+There are two callbacks to manage the filter lifecycle:
+
+  * 'flt_ops.init': It initializes the filter for a proxy. You may define this
+                    callback if you need to complete your filter configuration.
+
+  * 'flt_ops.deinit': It cleans up what the parsing function and the init
+                      callback have done. This callback is useful to release
+                      memory allocated for the filter configuration.
+
+Here is an example:
+
+    /* Initialize the filter. Returns -1 on error, else 0. */
+    static int
+    my_filter_init(struct proxy *px, struct flt_conf *fconf)
+    {
+        struct my_filter_config *my_conf = fconf->conf;
+
+        /* ... */
+
+        return 0;
+    }
+
+    /* Free resources allocated by the filter. */
+    static void
+    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
+    {
+        struct my_filter_config *my_conf = fconf->conf;
+
+        if (my_conf) {
+            free(my_conf->name);
+            /* ... */
+            free(my_conf);
+        }
+        fconf->conf = NULL;
+    }
+
+
+TODO: Add callbacks to handle creation/destruction of filter instances. And
+      document it.
+
+
+3.4. HANDLING THE STREAMS CREATION AND DESTRUCTION
+--------------------------------------------------
+
+You may be interested in handling stream creation and destruction. If so, you
+must define the following callbacks:
+
+  * 'flt_ops.stream_start': It is called when a stream is started. This callback
+                            can fail by returning a negative value. It will be
+                            considered as a critical error by HAProxy, which
+                            disables the listener for a short time.
+
+  * 'flt_ops.stream_stop': It is called when a stream is stopped. This callback
+                           always succeeds. Anyway, it is too late to return an
+                           error.
+
+For example:
+
+    /* Called when a stream is created. Returns -1 on error, else 0. */
+    static int
+    my_filter_stream_start(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+
+        return 0;
+    }
+
+    /* Called when a stream is destroyed */
+    static void
+    my_filter_stream_stop(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+    }
+
+
+WARNING: Handling the streams creation and destruction is only possible for
+         filters defined on proxies with the frontend capability.
+
+
+3.5. ANALYZING THE CHANNELS ACTIVITY
+------------------------------------
+
+The main purpose of filters is to take part in the channels analyzing. To do so,
+there is a callback, 'flt_ops.channel_analyze', called before each analyzer
+attached to a channel, except analyzers responsible for the data
+parsing/forwarding (TCP data or HTTP body). Concretely, on the request channel,
+'flt_ops.channel_analyze' could be called before following analyzers:
+
+  * tcp_inspect_request        (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
+  * http_wait_for_request      (AN_REQ_WAIT_HTTP)
+  * http_wait_for_request_body (AN_REQ_HTTP_BODY)
+  * http_process_req_common    (AN_REQ_HTTP_PROCESS_FE)
+  * process_switching_rules    (AN_REQ_SWITCHING_RULES)
+  * http_process_req_common    (AN_REQ_HTTP_PROCESS_BE)
+  * http_process_tarpit        (AN_REQ_HTTP_TARPIT)
+  * process_server_rules       (AN_REQ_SRV_RULES)
+  * http_process_request       (AN_REQ_HTTP_INNER)
+  * tcp_persist_rdp_cookie     (AN_REQ_PRST_RDP_COOKIE)
+  * process_sticking_rules     (AN_REQ_STICKING_RULES)
+  * flt_analyze_http_headers   (AN_FLT_HTTP_HDRS)
+
+And on the response channel:
+
+  * tcp_inspect_response     (AN_RES_INSPECT)
+  * http_wait_for_response   (AN_RES_WAIT_HTTP)
+  * process_store_rules      (AN_RES_STORE_RULES)
+  * http_process_res_common  (AN_RES_HTTP_PROCESS_BE)
+  * flt_analyze_http_headers (AN_FLT_HTTP_HDRS)
+
+Note that 'flt_analyze_http_headers' (AN_FLT_HTTP_HDRS) is a new analyzer. It
+has been added to let filters analyze HTTP headers after all processing, just
+before the data parsing/forwarding.
+
+Unlike the callbacks seen so far, 'flt_ops.channel_analyze' can interrupt the
+stream processing. So a filter can decide not to execute the analyzer that
+follows and wait for the next iteration. If there is more than one filter, the
+following ones are skipped. On the next iteration, the filtering resumes
+where it was stopped, i.e. on the filter that has previously stopped the
+processing. So it is possible for a filter to stop the stream processing for a
+while before continuing. For example:
+
+    /* Called before a processing happens on a given channel.
+     * Returns a negative value if an error occurs, 0 if it needs to wait,
+     * any other value otherwise. */
+    static int
+    my_filter_chn_analyze(struct stream *s, struct filter *filter,
+                          struct channel *chn, unsigned an_bit)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        switch (an_bit) {
+            case AN_REQ_WAIT_HTTP:
+                if (/* wait that a condition is verified before continuing */)
+                    return 0;
+                break;
+            /* ... */
+        }
+        return 1;
+    }
+
+  * 'an_bit' is the analyzer id. All analyzers are listed in
+    'include/types/channel.h'.
+
+  * 'chn' is the channel on which the analyzing is done. You can know if it is
+    the request or the response channel by testing if CF_ISRESP flag is set:
+
+      │ ((chn->flags & CF_ISRESP) == CF_ISRESP)
+
+
+In the previous example, the stream processing is blocked before receipt of the
+HTTP request until a condition is verified.
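
The way a chain of filters is interrupted and later resumed can be modeled
with a few lines of standalone C. This is only a toy model written for this
guide, not the HAProxy implementation: the 'demo_' names are hypothetical, and
an index replaces the 'strm_flt.current' pointer. It keeps the return
convention described above (negative on error, 0 to wait, any other value to
continue).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FILTERS 8

/* Hypothetical callback: <0 on error, 0 to wait, >0 to continue */
typedef int (*demo_analyze_cb)(void *ctx);

struct demo_chain {
    demo_analyze_cb cb[DEMO_MAX_FILTERS];  /* Filters, in declaration order */
    void           *ctx[DEMO_MAX_FILTERS]; /* One opaque context per filter */
    size_t          count;                 /* Number of filters in the chain */
    size_t          current;               /* Filter to resume from */
};

/* Calls the filters in declaration order, starting from the resume point.
 * Returns 1 when all filters agreed to run the analyzer, 0 when a filter
 * asked to wait (it will be called again first), -1 on error. */
static int demo_chain_run(struct demo_chain *c)
{
    assert(c && c->count <= DEMO_MAX_FILTERS);

    while (c->current < c->count) {
        int ret = c->cb[c->current](c->ctx[c->current]);

        if (ret < 0)
            return -1;
        if (ret == 0)
            return 0; /* Following filters are skipped for this iteration */
        c->current++;
    }
    c->current = 0; /* Next analyzer starts again from the first filter */
    return 1;
}

/* Sample filter: counts its calls and never blocks */
static int demo_pass(void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return 1;
}

/* Sample filter: asks to wait on its first call, then continues */
static int demo_wait_once(void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return (*calls > 1);
}
```

With the chain 'demo_pass', 'demo_wait_once', 'demo_pass', a first run stops on
the second filter without calling the third one. A second run calls the second
filter again, without calling the first one a second time, then reaches the
third one.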
+
+To surround activity of a filter during the channel analyzing, two new analyzers
+have been added:
+
+  * 'flt_start_analyze' (AN_FLT_START_FE/AN_FLT_START_BE): For a specific
+    filter, this analyzer is called before any call to the 'channel_analyze'
+    callback. From the filter point of view, it calls the
+    'flt_ops.channel_start_analyze' callback.
+
+  * 'flt_end_analyze' (AN_FLT_END): For a specific filter, this analyzer is
+    called when all other analyzers have finished their processing. From the
+    filter point of view, it calls the 'flt_ops.channel_end_analyze' callback.
+
+For TCP streams, these analyzers are called only once. For HTTP streams, if the
+client connection is kept alive, this happens at each request/response roundtrip.
+
+'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
+interrupt the stream processing, like 'flt_ops.channel_analyze'. Here is an
+example:
+
+    /* Called when analyze starts for a given channel
+     * Returns a negative value if an error occurs, 0 if it needs to wait,
+     * any other value otherwise. */
+    static int
+    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
+                                struct channel *chn)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... TODO ... */
+
+        return 1;
+    }
+
+    /* Called when analyze ends for a given channel
+     * Returns a negative value if an error occurs, 0 if it needs to wait,
+     * any other value otherwise. */
+    static int
+    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
+                              struct channel *chn)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... TODO ... */
+
+        return 1;
+    }
+
+
+Workflow on channels can be summarized as follows:
+
+                |
+     +----------+-----------+
+     | flt_ops.stream_start |
+     +----------+-----------+
+                |
+               ...
+                |
+                +-<-- [1]                 +------->---------+
+                |                 --+     |                 |                 --+
+                +------<----------+ |     |                 +--------<--------+ |
+                |                 | |     |                 |                 | |
+                V                 | |     |                 V                 | |
++-------------------------------+ | |     | +-------------------------------+ | |
+|      flt_start_analyze        +-+ |     | |      flt_start_analyze        +-+ |
+|(flt_ops.channel_start_analyze)|   | F   | |(flt_ops.channel_start_analyze)|   |
++---------------+---------------+   | R   | +---------------+---------------+   |
+                |                   | O   |                 |                   |
+                +------<--------+   | N   ^                 +--------<-------+  | B
+                |               |   | T   |                 |                |  | A
++---------------+----------+    |   | E   | +---------------+----------+     |  | C
+|+--------------V-----------+   |   | N   | |+--------------V-----------+    |  | K
+||+--------------------------+  |   | D   | ||+--------------------------+   |  | E
+|||  flt_ops.channel_analyze |  |   |     | |||  flt_ops.channel_analyze |   |  | N
++||             V            +--+   |     | +||             V            +---+  | D
+ +|          analyzer        |      |     |  +|          analyzer        |      |
+  +-------------+------------+      |     |   +-------------+------------+      |
+                |                 --+     |                 |                   |
+                +------------>------------+                ...                  |
+                                                            |                   |
+                                                [ data filtering (see below) ]  |
+                                                            |                   |
+                                                           ...                  |
+                                                            |                   |
+                                                            +--------<--------+ |
+                                                            |                 | |
+                                                            V                 | |
+                                            +-------------------------------+ | |
+                                            |       flt_end_analyze         +-+ |
+                                            | (flt_ops.channel_end_analyze) |   |
+                                            +---------------+---------------+   |
+                                                            |                 --+
+                        If HTTP stream, go back to [1] --<--+
+                                                            |
+                                                           ...
+                                                            |
+                                                 +----------+-----------+
+                                                 | flt_ops.stream_stop  |
+                                                 +----------+-----------+
+                                                            |
+                                                            V
+
+
+TODO: Add pre/post analyzer callbacks with a mask. So, this part will be
+      massively refactored very soon.
+
+
+ 3.6. FILTERING THE DATA EXCHANGED
+-----------------------------------
+
+WARNING: To fully understand this part, you must be aware of how the buffers
+         work in HAProxy. In particular, you must be comfortable with the idea
+         of circular buffers. See doc/internals/buffer-operations.txt and
+         doc/internals/buffer-ops.fig for details.
+         doc/internals/body-parsing.txt could also be useful.
+
+An extended feature of the filters is the data filtering. By default, a filter
+does not look into the data exchanged between the client and the server because
+it is expensive. Indeed, instead of forwarding data without any processing, each
+byte needs to be buffered.
+
+So, to enable the data filtering on a channel, at any time, in one of the
+previous callbacks, you should call the 'register_data_filter' function. And
+conversely, to disable it, you should call the 'unregister_data_filter'
+function. For example:
+
+    static int
+    my_filter_chn_analyze(struct stream *s, struct filter *filter,
+                          struct channel *chn, unsigned an_bit)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* 'chn' must be the request channel */
+        if (!(chn->flags & CF_ISRESP) && an_bit == AN_FLT_HTTP_HDRS) {
+            struct http_txn *txn = s->txn;
+            struct http_msg *msg = &txn->req;
+            struct buffer   *req = msg->chn->buf;
+            struct hdr_ctx   ctx;
+
+            /* Start the header lookup from the first header */
+            ctx.idx = 0;
+
+            /* Enable the data filtering for the request if 'X-Filter' header
+             * is set to 'true'. */
+            if (http_find_header2("X-Filter", 8, req->p, &txn->hdr_idx, &ctx) &&
+                ctx.vlen >= 4 && memcmp(ctx.line + ctx.val, "true", 4) == 0)
+                register_data_filter(s, chn, filter);
+        }
+
+        return 1;
+    }
+
+Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
+set to 'true'.
+
+If several filters are declared, the evaluation order remains the same,
+regardless of the order of the registrations to the data filtering.
+
+Depending on the stream type, TCP or HTTP, the way to handle data filtering will
+be slightly different. Among other things, for HTTP streams, there are more
+callbacks to help you to fully handle all steps of an HTTP transaction. But the
+basis is the same. The data filtering is done in 2 stages:
+
+  * The data parsing: At this stage, filters will analyze input data on a
+    channel. Once a filter has parsed some data, it cannot parse it again. At
+    any time, a filter can choose to not parse all available data. So, it is
+    possible for a filter to retain data for a while. Because filters are
+    chained, a filter cannot parse more data than its predecessors. Thus only
+    data considered as parsed by the last filter will be available to the next
+    stage, the data forwarding.
+
+  * The data forwarding: At this stage, filters will decide how much data
+    HAProxy can forward among those considered as parsed at the previous
+    stage. Once a filter has marked data as forwardable, it cannot analyze it
+    anymore. At any time, a filter can choose to not forward all parsed
+    data. So, it is possible for a filter to retain data for a while. Because
+    filters are chained, a filter cannot forward more data than its
+    predecessors. Thus only data marked as forwardable by the last filter will
+    be actually forwarded by HAProxy.
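+
+The chaining rule of the parsing stage can be pictured with a small
+stand-alone model. This is only a sketch under simplifying assumptions: it does
+not use the HAProxy API, 'toy_chain' and 'toy_parse_pass' are hypothetical
+names, the amount of input never shrinks between two passes, and each filter
+is reduced to a per-pass limit on the number of bytes it accepts to parse.
+
```c
#include <stddef.h>

#define TOY_MAX_FILTERS 8

/* Hypothetical model of a filter chain: 'nxt[k]' is the number of input bytes
 * already parsed by filter k and 'limit[k]' is the maximum number of bytes it
 * accepts to parse per pass (0 means no limit). */
struct toy_chain {
    size_t count;
    unsigned int nxt[TOY_MAX_FILTERS];
    unsigned int limit[TOY_MAX_FILTERS];
};

/* Runs one parsing pass over 'input' bytes of available data. Filter k only
 * sees what filter k-1 has already parsed. Returns the number of bytes parsed
 * by the last filter, i.e. what the forwarding stage may consider. */
static unsigned int
toy_parse_pass(struct toy_chain *c, unsigned int input)
{
    unsigned int visible = input; /* data visible to the current filter */
    size_t k;

    for (k = 0; k < c->count; k++) {
        unsigned int avail = visible - c->nxt[k];

        if (c->limit[k] != 0 && avail > c->limit[k])
            avail = c->limit[k];
        c->nxt[k] += avail;
        visible = c->nxt[k];
    }
    return visible;
}
```
+
+With two filters where only the second one is limited to 100 bytes per pass,
+250 bytes of input need three passes before the last filter has parsed
+everything. In the meantime, the retained data is not visible to the
+forwarding stage.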
+
+Internally, filters own 2 offsets, relative to 'buf->p', representing the
+number of bytes already parsed in the available input data and the number of
+bytes considered as forwarded. We will call these offsets, respectively, 'nxt'
+and 'fwd'. The following macros reference these offsets:
+
+  * FLT_NXT(flt, chn), flt_req_nxt(flt) and flt_rsp_nxt(flt)
+
+  * FLT_FWD(flt, chn), flt_req_fwd(flt) and flt_rsp_fwd(flt)
+
+where 'flt' is the 'struct filter' passed as argument in all callbacks and 'chn'
+is the considered channel.
+
+Using these offsets, the following operations on buffers are possible:
+
+    chn->buf->p + FLT_NXT(flt, chn) // the pointer on parsable data for
+                                    // the filter 'flt' on the channel 'chn'.
+    // Everything between chn->buf->p and 'nxt' offset was already parsed
+    // by the filter.
+
+    chn->buf->i - FLT_NXT(flt, chn) // the number of bytes of parsable data for
+                                    // the filter 'flt' on the channel 'chn'.
+
+    chn->buf->p + FLT_FWD(flt, chn) // the pointer on forwardable data for
+                                    // the filter 'flt' on the channel 'chn'.
+    // Everything between chn->buf->p and 'fwd' offset was already forwarded
+    // by the filter.
+
+
+Note that at any time, for a filter, the 'nxt' offset is always greater than or
+equal to the 'fwd' offset.
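+
+The role of the two offsets can be sketched on a linear buffer. The code below
+is a stand-alone illustration, not HAProxy code: 'toy_view' and its helpers are
+hypothetical, 'data' plays the role of 'buf->p' and 'len' the role of 'buf->i'.
+It also assumes that the data a filter may forward is what lies between its
+'fwd' and 'nxt' offsets.
+
```c
/* Hypothetical, linear (non-wrapping) view of a channel buffer as seen by one
 * filter. */
struct toy_view {
    const char *data; /* start of the input data, like 'buf->p' */
    unsigned int len; /* number of input bytes, like 'buf->i' */
    unsigned int nxt; /* bytes already parsed by the filter */
    unsigned int fwd; /* bytes already forwarded by the filter */
};

/* Start of the data that the filter has not parsed yet. */
static const char *
toy_parsable(const struct toy_view *v)
{
    return v->data + v->nxt;
}

/* Number of bytes that the filter has not parsed yet. */
static unsigned int
toy_parsable_len(const struct toy_view *v)
{
    return v->len - v->nxt;
}

/* Start of the data already parsed but not forwarded yet. */
static const char *
toy_forwardable(const struct toy_view *v)
{
    return v->data + v->fwd;
}

/* Number of parsed bytes not forwarded yet. It cannot be negative because
 * 'nxt' is never lower than 'fwd'. */
static unsigned int
toy_forwardable_len(const struct toy_view *v)
{
    return v->nxt - v->fwd;
}
```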
+
+TODO: Add schema with buffer states when there are 2 filters that analyze data.
+
+
+3.6.1 FILTERING DATA ON TCP STREAMS
+-----------------------------------
+
+The TCP data filtering is the easy case, because HAProxy does not parse these
+data. So you have only two callbacks that you need to consider:
+
+  * 'flt_ops.tcp_data': This callback is called when unparsed data are
+    available. If not defined, all available data will be considered as parsed
+    for the filter.
+
+  * 'flt_ops.tcp_forward_data': This callback is called when parsed data are
+    available.  If not defined, all parsed data will be considered as forwarded
+    for the filter.
+
+Here is an example:
+
+    /* Returns a negative value if an error occurs, else the number of
+     * consumed bytes. */
+    static int
+    my_filter_tcp_data(struct stream *s, struct filter *filter,
+                       struct channel *chn)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+        int avail = chn->buf->i - FLT_NXT(filter, chn);
+        int ret   = avail;
+
+        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
+        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
+            ret = my_conf->max_parse;
+
+        /* if available data are not completely parsed, wake up the stream to
+         * be sure to not freeze it. */
+        if (ret != avail)
+            task_wakeup(s->task, TASK_WOKEN_MSG);
+        return ret;
+    }
+
+
+    /* Returns a negative value if an error occurs, else the number of
+     * forwarded bytes. */
+    static int
+    my_filter_tcp_forward_data(struct stream *s, struct filter *filter,
+                               struct channel *chn, unsigned int len)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+        int ret = len;
+
+        /* Do not forward more than 'my_conf->max_forward' bytes at a time */
+        if (my_conf->max_forward != 0 && ret > my_conf->max_forward)
+            ret = my_conf->max_forward;
+
+        /* if parsed data are not completely forwarded, wake up the stream to
+         * be sure to not freeze it. */
+        if (ret != len)
+            task_wakeup(s->task, TASK_WOKEN_MSG);
+        return ret;
+    }
+
+
+
+3.6.2 FILTERING DATA ON HTTP STREAMS
+------------------------------------
+
+The HTTP data filtering is a bit tricky because HAProxy will parse the body
+structure, especially chunked body. So basically there is the HTTP counterpart
+to the previous callbacks:
+
+  * 'flt_ops.http_data': This callback is called when unparsed data are
+    available. If not defined, all available data will be considered as parsed
+    for the filter.
+
+  * 'flt_ops.http_forward_data': This callback is called when parsed data are
+    available.  If not defined, all parsed data will be considered as forwarded
+    for the filter.
+
+But the prototype for these callbacks is slightly different. Instead of having
+the channel as parameter, we have the HTTP message (struct http_msg). You need
+to be careful when you use 'http_msg.chunk_len' size. This value is the number
+of bytes remaining to parse in the HTTP body (or the chunk for chunked
+messages). The HTTP parser of HAProxy uses it to have the number of bytes that
+it could consume:
+
+    /* Available input data in the current chunk from the HAProxy point of view.
+     * msg->next bytes were already parsed. Without data filtering, HAProxy
+     * will consume all of it. */
+    Bytes = MIN(msg->chunk_len, chn->buf->i - msg->next);
+
+
+But in your filter, you need to recompute it:
+
+    /* Available input data in the current chunk from the filter point of view.
+     * 'nxt' bytes were already parsed. */
+    Bytes = MIN(msg->chunk_len + msg->next, chn->buf->i) - FLT_NXT(flt, chn);
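+
+As a worked example of these two formulas, here is a small stand-alone sketch.
+The function names are hypothetical and plain integers replace the HAProxy
+structures: 'chunk_len' and 'next' stand for the 'http_msg' fields, 'input' for
+'chn->buf->i' and 'nxt' for the offset of the filter.
+
```c
/* Smallest of two values, as the MIN() macro used above. */
static unsigned int
toy_min(unsigned int a, unsigned int b)
{
    return (a < b) ? a : b;
}

/* Bytes that HAProxy could consume in the current chunk: 'chunk_len' bytes
 * remain in the chunk and 'next' bytes out of 'input' were already parsed. */
static unsigned int
toy_haproxy_avail(unsigned int chunk_len, unsigned int next,
                  unsigned int input)
{
    return toy_min(chunk_len, input - next);
}

/* Same quantity recomputed for a filter that has already parsed 'nxt'
 * bytes. */
static unsigned int
toy_filter_avail(unsigned int chunk_len, unsigned int next,
                 unsigned int input, unsigned int nxt)
{
    return toy_min(chunk_len + next, input) - nxt;
}
```
+
+With 30 bytes left in the chunk, 20 bytes already parsed by HAProxy and 80
+bytes of input, HAProxy could consume MIN(30, 60) = 30 bytes. A filter whose
+'nxt' offset is also 20 gets the same result, MIN(50, 80) - 20 = 30, while a
+filter that has already parsed 35 bytes only has 15 bytes left in this chunk.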
+
+
+In addition to these callbacks, there are two others:
+
+  * 'flt_ops.http_end': This callback is called when the whole HTTP
+    request/response is processed. It can interrupt the stream processing. So,
+    it could be used to synchronize the HTTP request with the HTTP response, for
+    example:
+
+        /* Returns a negative value if an error occurs, 0 if it needs to wait,
+         * any other value otherwise. */
+        static int
+        my_filter_http_end(struct stream *s, struct filter *filter,
+                           struct http_msg *msg)
+        {
+            struct my_filter_ctx *my_ctx = filter->ctx;
+
+            if (!(msg->chn->flags & CF_ISRESP)) /* The request */
+                my_ctx->end_of_req = 1;
+            else /* The response */
+                my_ctx->end_of_rsp = 1;
+
+            /* Both the request and the response are finished */
+            if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
+                return 1;
+
+            /* Wait */
+            return 0;
+        }
+
+
+  * 'flt_ops.http_chunk_trailers': This callback is called for chunked HTTP
+    messages only, once all chunks have been parsed. HTTP trailers can be
+    parsed in several passes, and this callback is called at each pass. The
+    number of bytes parsed by HAProxy at each iteration is stored in 'msg->sol'.
+
+Then, to finish, there are 2 informational callbacks:
+
+  * 'flt_ops.http_reset': This callback is called when an HTTP message is
+    reset. This only happens when a '100-continue' response is received. It
+    could be useful to reset the filter context before receiving the true
+    response.
+
+  * 'flt_ops.http_reply': This callback is called when, at any time, HAProxy
+    decides to stop the processing on an HTTP message and to send an internal
+    response to the client. This mainly happens when an error or a redirect
+    occurs.
+
+
+3.6.3 REWRITING DATA
+--------------------
+
+The last part, and the trickiest one about the data filtering, is about the data
+rewriting. For now, the filter API does not offer a lot of functions to handle
+it. There are only functions to notify HAProxy that the data size has changed to
+let it update the internal state of filters. It is your responsibility to update
+the data itself, i.e. the buffer offsets. For an HTTP message, you must also
+update 'msg->next' and 'msg->chunk_len' values accordingly:
+
+  * 'flt_change_next_size': This function must be called when a filter alters
+    incoming data. It updates the 'nxt' offset value of all its predecessors.
+    Failing to call this function when a filter changes the size of incoming
+    data leads to undefined behavior.
+
+        struct buffer *buf   = msg->chn->buf;
+        unsigned int   avail = MIN(msg->chunk_len + msg->next, buf->i) -
+                               flt_rsp_nxt(filter);
+
+        if (avail > 10 && /* ...Some condition... */) {
+            /* Move the buffer forward to have buf->p pointing on unparsed
+             * data */
+            b_adv(buf, flt_rsp_nxt(filter));
+
+            /* Skip first 10 bytes by moving the following data over them. To
+             * simplify this example, we consider a non-wrapping buffer */
+            memmove(buf->p, buf->p + 10, avail - 10);
+
+            /* Restore buf->p value */
+            b_rew(buf, flt_rsp_nxt(filter));
+
+            /* Now update other filters */
+            flt_change_next_size(filter, msg->chn, -10);
+
+            /* Update the buffer state */
+            buf->i -= 10;
+
+            /* And update the HTTP message state */
+            msg->chunk_len -= 10;
+
+            return (avail - 10);
+        }
+        else
+            return 0; /* Wait for more data */
+
+
+  * 'flt_change_forward_size': This function must be called when a filter alters
+    parsed data. It updates the offset values ('nxt' and 'fwd') of all filters.
+    Failing to call this function when a filter changes the size of parsed data
+    leads to undefined behavior.
+
+        /* len is the number of bytes of forwardable data */
+        struct buffer *buf = msg->chn->buf;
+
+        if (len > 10 && /* ...Some condition... */) {
+            /* Move the buffer forward to have buf->p pointing on non-forwarded
+             * data */
+            b_adv(buf, flt_rsp_fwd(filter));
+
+            /* Skip first 10 bytes by moving the following data over them. To
+             * simplify this example, we consider a non-wrapping buffer */
+            memmove(buf->p, buf->p + 10, len - 10);
+
+            /* Restore buf->p value */
+            b_rew(buf, flt_rsp_fwd(filter));
+
+            /* Now update other filters */
+            flt_change_forward_size(filter, msg->chn, -10);
+
+            /* Update the buffer state */
+            buf->i -= 10;
+
+            /* And update the HTTP message state */
+            msg->next -= 10;
+
+            return (len - 10);
+        }
+        else
+            return 0; /* Wait for more data */
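+
+To make the bookkeeping behind a size change more concrete, here is a
+stand-alone model of the first case (a filter removing incoming data). It is
+only a sketch and does not use the HAProxy API: 'toy_buf' and
+'toy_remove_next' are hypothetical, the buffer is linear, and the adjustment
+of the predecessors mirrors what the text above says about
+'flt_change_next_size'.
+
```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FILTERS 8

/* Hypothetical linear buffer shared by a chain of filters. */
struct toy_buf {
    char data[64];
    unsigned int len;                  /* number of input bytes, like 'buf->i' */
    size_t count;                      /* number of filters in the chain */
    unsigned int nxt[TOY_MAX_FILTERS]; /* 'nxt' offset of each filter */
};

/* Removes 'n' bytes located at the 'nxt' offset of filter 'k', then shifts the
 * 'nxt' offset of each of its predecessors by the same amount. The caller must
 * ensure that the removed bytes were already parsed by all predecessors, i.e.
 * nxt[j] >= nxt[k] + n for every j < k, and that they fit in the buffer. */
static void
toy_remove_next(struct toy_buf *b, size_t k, unsigned int n)
{
    char *pos = b->data + b->nxt[k];
    unsigned int tail = b->len - b->nxt[k] - n; /* bytes after the removed area */
    size_t j;

    /* Close the gap by moving the tail over the removed bytes */
    memmove(pos, pos + n, tail);
    b->len -= n;

    /* Predecessors have parsed past the removed area: their offsets shrink */
    for (j = 0; j < k; j++)
        b->nxt[j] -= n;
}
```
+
+For example, with 16 bytes of input, a first filter that has parsed 12 bytes
+and a second one that has parsed 4 bytes, removing 3 bytes at the offset of the
+second filter leaves 13 bytes of input and moves the offset of the first filter
+back to 9. The offset of the second filter is left untouched.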
+
+
+TODO: implement all the stuff to easily rewrite data. For HTTP messages, this
+      requires a chunked message. Otherwise, the size of the data cannot be
+      changed.
+
+
+
+
+4. FAQ
+------
+
+4.1. Detect multiple declarations of the same filter
+----------------------------------------------------
+
+TODO