blob: a2024bc9d2dbf37d7ba48b51ff7f9ff6f5fc87ba [file] [log] [blame]
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +01001/*
2 * Elastic Binary Trees - generic macros and structures.
Willy Tarreau1fb6c872008-05-16 19:48:20 +02003 * Version 4.0
Willy Tarreau70bcfb72008-01-27 02:21:53 +01004 * (C) 2002-2008 - Willy Tarreau <w@1wt.eu>
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +01005 *
6 * This program is free software; you can redistribute it and/or modify
7 * it under the terms of the GNU General Public License as published by
8 * the Free Software Foundation; either version 2 of the License, or
9 * (at your option) any later version.
10 *
11 * This program is distributed in the hope that it will be useful,
12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
14 * GNU General Public License for more details.
15 *
16 * You should have received a copy of the GNU General Public License
17 * along with this program; if not, write to the Free Software
18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
19 *
20 *
21 * Short history :
22 *
23 * 2007/09/28: full support for the duplicates tree => v3
24 * 2007/07/08: merge back cleanups from kernel version.
25 * 2007/07/01: merge into Linux Kernel (try 1).
26 * 2007/05/27: version 2: compact everything into one single struct
27 * 2007/05/18: adapted the structure to support embedded nodes
28 * 2007/05/13: adapted to mempools v2.
29 */
30
31
32
33/*
34 General idea:
35 -------------
36 In a radix binary tree, we may have up to 2N-1 nodes for N keys if all of
37 them are leaves. If we find a way to differentiate intermediate nodes (later
38 called "nodes") and final nodes (later called "leaves"), and we associate
39 them by two, it is possible to build sort of a self-contained radix tree with
40 intermediate nodes always present. It will not be as cheap as the ultree for
41 optimal cases as shown below, but the optimal case almost never happens :
42
43 Eg, to store 8, 10, 12, 13, 14 :
44
45 ultree this theorical tree
46
47 8 8
48 / \ / \
49 10 12 10 12
50 / \ / \
51 13 14 12 14
52 / \
53 12 13
54
55 Note that on real-world tests (with a scheduler), is was verified that the
56 case with data on an intermediate node never happens. This is because the
57 data spectrum is too large for such coincidences to happen. It would require
58 for instance that a task has its expiration time at an exact second, with
59 other tasks sharing that second. This is too rare to try to optimize for it.
60
61 What is interesting is that the node will only be added above the leaf when
62 necessary, which implies that it will always remain somewhere above it. So
63 both the leaf and the node can share the exact value of the leaf, because
64 when going down the node, the bit mask will be applied to comparisons. So we
65 are tempted to have one single key shared between the node and the leaf.
66
67 The bit only serves the nodes, and the dups only serve the leaves. So we can
68 put a lot of information in common. This results in one single entity with
69 two branch pointers and two parent pointers, one for the node part, and one
70 for the leaf part :
71
72 node's leaf's
73 parent parent
74 | |
75 [node] [leaf]
76 / \
77 left right
78 branch branch
79
80 The node may very well refer to its leaf counterpart in one of its branches,
81 indicating that its own leaf is just below it :
82
83 node's
84 parent
85 |
86 [node]
87 / \
88 left [leaf]
89 branch
90
91 Adding keys in such a tree simply consists in inserting nodes between
92 other nodes and/or leaves :
93
94 [root]
95 |
96 [node2]
97 / \
98 [leaf1] [node3]
99 / \
100 [leaf2] [leaf3]
101
102 On this diagram, we notice that [node2] and [leaf2] have been pulled away
103 from each other due to the insertion of [node3], just as if there would be
104 an elastic between both parts. This elastic-like behaviour gave its name to
105 the tree : "Elastic Binary Tree", or "EBtree". The entity which associates a
106 node part and a leaf part will be called an "EB node".
107
108 We also notice on the diagram that there is a root entity required to attach
109 the tree. It only contains two branches and there is nothing above it. This
110 is an "EB root". Some will note that [leaf1] has no [node1]. One property of
111 the EBtree is that all nodes have their branches filled, and that if a node
112 has only one branch, it does not need to exist. Here, [leaf1] was added
113 below [root] and did not need any node.
114
115 An EB node contains :
116 - a pointer to the node's parent (node_p)
117 - a pointer to the leaf's parent (leaf_p)
118 - two branches pointing to lower nodes or leaves (branches)
119 - a bit position (bit)
120 - an optional key.
121
122 The key here is optional because it's used only during insertion, in order
123 to classify the nodes. Nothing else in the tree structure requires knowledge
124 of the key. This makes it possible to write type-agnostic primitives for
125 everything, and type-specific insertion primitives. This has led to consider
126 two types of EB nodes. The type-agnostic ones will serve as a header for the
127 other ones, and will simply be called "struct eb_node". The other ones will
128 have their type indicated in the structure name. Eg: "struct eb32_node" for
129 nodes carrying 32 bit keys.
130
131 We will also node that the two branches in a node serve exactly the same
132 purpose as an EB root. For this reason, a "struct eb_root" will be used as
133 well inside the struct eb_node. In order to ease pointer manipulation and
134 ROOT detection when walking upwards, all the pointers inside an eb_node will
135 point to the eb_root part of the referenced EB nodes, relying on the same
136 principle as the linked lists in Linux.
137
138 Another important point to note, is that when walking inside a tree, it is
139 very convenient to know where a node is attached in its parent, and what
140 type of branch it has below it (leaf or node). In order to simplify the
141 operations and to speed up the processing, it was decided in this specific
142 implementation to use the lowest bit from the pointer to designate the side
143 of the upper pointers (left/right) and the type of a branch (leaf/node).
144 This practise is not mandatory by design, but an implementation-specific
145 optimisation permitted on all platforms on which data must be aligned. All
146 known 32 bit platforms align their integers and pointers to 32 bits, leaving
147 the two lower bits unused. So, we say that the pointers are "tagged". And
148 since they designate pointers to root parts, we simply call them
149 "tagged root pointers", or "eb_troot" in the code.
150
151 Duplicate keys are stored in a special manner. When inserting a key, if
152 the same one is found, then an incremental binary tree is built at this
153 place from these keys. This ensures that no special case has to be written
154 to handle duplicates when walking through the tree or when deleting entries.
155 It also guarantees that duplicates will be walked in the exact same order
156 they were inserted. This is very important when trying to achieve fair
157 processing distribution for instance.
158
159 Algorithmic complexity can be derived from 3 variables :
160 - the number of possible different keys in the tree : P
161 - the number of entries in the tree : N
162 - the number of duplicates for one key : D
163
164 Note that this tree is deliberately NOT balanced. For this reason, the worst
165 case may happen with a small tree (eg: 32 distinct keys of one bit). BUT,
166 the operations required to manage such data are so much cheap that they make
167 it worth using it even under such conditions. For instance, a balanced tree
168 may require only 6 levels to store those 32 keys when this tree will
169 require 32. But if per-level operations are 5 times cheaper, it wins.
170
171 Minimal, Maximal and Average times are specified in number of operations.
172 Minimal is given for best condition, Maximal for worst condition, and the
173 average is reported for a tree containing random keys. An operation
174 generally consists in jumping from one node to the other.
175
176 Complexity :
177 - lookup : min=1, max=log(P), avg=log(N)
178 - insertion from root : min=1, max=log(P), avg=log(N)
179 - insertion of dups : min=1, max=log(D), avg=log(D)/2 after lookup
180 - deletion : min=1, max=1, avg=1
181 - prev/next : min=1, max=log(P), avg=2 :
182 N/2 nodes need 1 hop => 1*N/2
183 N/4 nodes need 2 hops => 2*N/4
184 N/8 nodes need 3 hops => 3*N/8
185 ...
186 N/x nodes need log(x) hops => log2(x)*N/x
187 Total cost for all N nodes : sum[i=1..N](log2(i)*N/i) = N*sum[i=1..N](log2(i)/i)
188 Average cost across N nodes = total / N = sum[i=1..N](log2(i)/i) = 2
189
190 This design is currently limited to only two branches per node. Most of the
191 tree descent algorithm would be compatible with more branches (eg: 4, to cut
192 the height in half), but this would probably require more complex operations
193 and the deletion algorithm would be problematic.
194
195 Useful properties :
196 - a node is always added above the leaf it is tied to, and never can get
197 below nor in another branch. This implies that leaves directly attached
198 to the root do not use their node part, which is indicated by a NULL
199 value in node_p. This also enhances the cache efficiency when walking
200 down the tree, because when the leaf is reached, its node part will
201 already have been visited (unless it's the first leaf in the tree).
202
203 - pointers to lower nodes or leaves are stored in "branch" pointers. Only
204 the root node may have a NULL in either branch, it is not possible for
205 other branches. Since the nodes are attached to the left branch of the
206 root, it is not possible to see a NULL left branch when walking up a
207 tree. Thus, an empty tree is immediately identified by a NULL left
208 branch at the root. Conversely, the one and only way to identify the
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200209 root node is to check that it right branch is NULL. Note that the
210 NULL pointer may have a few low-order bits set.
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100211
212 - a node connected to its own leaf will have branch[0|1] pointing to
213 itself, and leaf_p pointing to itself.
214
215 - a node can never have node_p pointing to itself.
216
217 - a node is linked in a tree if and only if it has a non-null leaf_p.
218
219 - a node can never have both branches equal, except for the root which can
220 have them both NULL.
221
222 - deletion only applies to leaves. When a leaf is deleted, its parent must
223 be released too (unless it's the root), and its sibling must attach to
224 the grand-parent, replacing the parent. Also, when a leaf is deleted,
225 the node tied to this leaf will be removed and must be released too. If
226 this node is different from the leaf's parent, the freshly released
227 leaf's parent will be used to replace the node which must go. A released
228 node will never be used anymore, so there's no point in tracking it.
229
230 - the bit index in a node indicates the bit position in the key which is
231 represented by the branches. That means that a node with (bit == 0) is
232 just above two leaves. Negative bit values are used to build a duplicate
233 tree. The first node above two identical leaves gets (bit == -1). This
234 value logarithmically decreases as the duplicate tree grows. During
235 duplicate insertion, a node is inserted above the highest bit value (the
236 lowest absolute value) in the tree during the right-sided walk. If bit
237 -1 is not encountered (highest < -1), we insert above last leaf.
238 Otherwise, we insert above the node with the highest value which was not
239 equal to the one of its parent + 1.
240
241 - the "eb_next" primitive walks from left to right, which means from lower
242 to higher keys. It returns duplicates in the order they were inserted.
243 The "eb_first" primitive returns the left-most entry.
244
245 - the "eb_prev" primitive walks from right to left, which means from
246 higher to lower keys. It returns duplicates in the opposite order they
247 were inserted. The "eb_last" primitive returns the right-most entry.
248
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200249 - a tree which has 1 in the lower bit of its root's right branch is a
250 tree with unique nodes. This means that when a node is inserted with
251 a key which already exists will not be inserted, and the previous
252 entry will be returned.
253
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100254 */
255
Willy Tarreauf56fd8a2007-11-19 18:43:04 +0100256#ifndef _COMMON_EBTREE_H
257#define _COMMON_EBTREE_H
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100258
259#include <stdlib.h>
Willy Tarreaube68e6a2007-11-21 10:34:15 +0100260#include <common/config.h>
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100261
262/* Note: we never need to run fls on null keys, so we can optimize the fls
263 * function by removing a conditional jump.
264 */
265#if defined(__i386__)
266static inline int flsnz(int x)
267{
268 int r;
269 __asm__("bsrl %1,%0\n"
270 : "=r" (r) : "rm" (x));
271 return r+1;
272}
273#else
274// returns 1 to 32 for 1<<0 to 1<<31. Undefined for 0.
275#define flsnz(___a) ({ \
276 register int ___x, ___bits = 0; \
277 ___x = (___a); \
278 if (___x & 0xffff0000) { ___x &= 0xffff0000; ___bits += 16;} \
279 if (___x & 0xff00ff00) { ___x &= 0xff00ff00; ___bits += 8;} \
280 if (___x & 0xf0f0f0f0) { ___x &= 0xf0f0f0f0; ___bits += 4;} \
281 if (___x & 0xcccccccc) { ___x &= 0xcccccccc; ___bits += 2;} \
282 if (___x & 0xaaaaaaaa) { ___x &= 0xaaaaaaaa; ___bits += 1;} \
283 ___bits + 1; \
284 })
285#endif
286
287static inline int fls64(unsigned long long x)
288{
289 unsigned int h;
290 unsigned int bits = 32;
291
292 h = x >> 32;
293 if (!h) {
294 h = x;
295 bits = 0;
296 }
297 return flsnz(h) + bits;
298}
299
300#define fls_auto(x) ((sizeof(x) > 4) ? fls64(x) : flsnz(x))
301
302/* Linux-like "container_of". It returns a pointer to the structure of type
303 * <type> which has its member <name> stored at address <ptr>.
304 */
305#ifndef container_of
306#define container_of(ptr, type, name) ((type *)(((void *)(ptr)) - ((long)&((type *)0)->name)))
307#endif
308
309/*
310 * Gcc >= 3 provides the ability for the program to give hints to the compiler
311 * about what branch of an if is most likely to be taken. This helps the
312 * compiler produce the most compact critical paths, which is generally better
313 * for the cache and to reduce the number of jumps. Be very careful not to use
314 * this in inline functions, because the code reordering it causes very often
315 * has a negative impact on the calling functions.
316 */
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100317#if !defined(likely)
318#if __GNUC__ < 3
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100319#define __builtin_expect(x,y) (x)
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100320#define likely(x) (x)
321#define unlikely(x) (x)
322#elif __GNUC__ < 4
323/* gcc 3.x does the best job at this */
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100324#define likely(x) (__builtin_expect((x) != 0, 1))
325#define unlikely(x) (__builtin_expect((x) != 0, 0))
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100326#else
327/* GCC 4.x is stupid, it performs the comparison then compares it to 1,
328 * so we cheat in a dirty way to prevent it from doing this. This will
329 * only work with ints and booleans though.
330 */
331#define likely(x) (x)
Willy Tarreau75875a72008-07-06 15:18:50 +0200332#define unlikely(x) (__builtin_expect((unsigned long)(x), 0))
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100333#endif
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100334#endif
335
Willy Tarreau75cf17e2008-08-29 15:48:49 +0200336/* By default, gcc does not inline large chunks of code, but we want it to
337 * respect our choices.
338 */
339#if !defined(forceinline)
340#if __GNUC__ < 3
341#define forceinline inline
342#else
343#define forceinline inline __attribute__((always_inline))
344#endif
345#endif
346
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100347/* Support passing function parameters in registers. For this, the
348 * CONFIG_EBTREE_REGPARM macro has to be set to the maximal number of registers
349 * allowed. Some functions have intentionally received a regparm lower than
350 * their parameter count, it is in order to avoid register clobbering where
351 * they are called.
352 */
353#ifndef REGPRM1
354#if CONFIG_EBTREE_REGPARM >= 1
355#define REGPRM1 __attribute__((regparm(1)))
356#else
357#define REGPRM1
358#endif
359#endif
360
361#ifndef REGPRM2
362#if CONFIG_EBTREE_REGPARM >= 2
363#define REGPRM2 __attribute__((regparm(2)))
364#else
365#define REGPRM2 REGPRM1
366#endif
367#endif
368
369#ifndef REGPRM3
370#if CONFIG_EBTREE_REGPARM >= 3
371#define REGPRM3 __attribute__((regparm(3)))
372#else
373#define REGPRM3 REGPRM2
374#endif
375#endif
376
377/* Number of bits per node, and number of leaves per node */
378#define EB_NODE_BITS 1
379#define EB_NODE_BRANCHES (1 << EB_NODE_BITS)
380#define EB_NODE_BRANCH_MASK (EB_NODE_BRANCHES - 1)
381
382/* Be careful not to tweak those values. The walking code is optimized for NULL
383 * detection on the assumption that the following values are intact.
384 */
385#define EB_LEFT 0
386#define EB_RGHT 1
387#define EB_LEAF 0
388#define EB_NODE 1
389
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200390/* Tags to set in root->b[EB_RGHT] :
391 * - EB_NORMAL is a normal tree which stores duplicate keys.
392 * - EB_UNIQUE is a tree which stores unique keys.
393 */
394#define EB_NORMAL 0
395#define EB_UNIQUE 1
396
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100397/* This is the same as an eb_node pointer, except that the lower bit embeds
398 * a tag. See eb_dotag()/eb_untag()/eb_gettag(). This tag has two meanings :
399 * - 0=left, 1=right to designate the parent's branch for leaf_p/node_p
400 * - 0=link, 1=leaf to designate the branch's type for branch[]
401 */
402typedef void eb_troot_t;
403
404/* The eb_root connects the node which contains it, to two nodes below it, one
405 * of which may be the same node. At the top of the tree, we use an eb_root
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200406 * too, which always has its right branch NULL (+/1 low-order bits).
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100407 */
408struct eb_root {
409 eb_troot_t *b[EB_NODE_BRANCHES]; /* left and right branches */
410};
411
412/* The eb_node contains the two parts, one for the leaf, which always exists,
413 * and one for the node, which remains unused in the very first node inserted
414 * into the tree. This structure is 20 bytes per node on 32-bit machines. Do
415 * not change the order, benchmarks have shown that it's optimal this way.
416 */
417struct eb_node {
418 struct eb_root branches; /* branches, must be at the beginning */
419 eb_troot_t *node_p; /* link node's parent */
420 eb_troot_t *leaf_p; /* leaf node's parent */
421 int bit; /* link's bit position. */
422};
423
424/* Return the structure of type <type> whose member <member> points to <ptr> */
425#define eb_entry(ptr, type, member) container_of(ptr, type, member)
426
427/* The root of a tree is an eb_root initialized with both pointers NULL.
428 * During its life, only the left pointer will change. The right one will
429 * always remain NULL, which is the way we detect it.
430 */
431#define EB_ROOT \
432 (struct eb_root) { \
433 .b = {[0] = NULL, [1] = NULL }, \
434 }
435
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200436#define EB_ROOT_UNIQUE \
437 (struct eb_root) { \
438 .b = {[0] = NULL, [1] = (void *)1 }, \
439 }
440
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100441#define EB_TREE_HEAD(name) \
442 struct eb_root name = EB_ROOT
443
444
445/***************************************\
446 * Private functions. Not for end-user *
447\***************************************/
448
449/* Converts a root pointer to its equivalent eb_troot_t pointer,
450 * ready to be stored in ->branch[], leaf_p or node_p. NULL is not
451 * conserved. To be used with EB_LEAF, EB_NODE, EB_LEFT or EB_RGHT in <tag>.
452 */
453static inline eb_troot_t *eb_dotag(const struct eb_root *root, const int tag)
454{
455 return (eb_troot_t *)((void *)root + tag);
456}
457
458/* Converts an eb_troot_t pointer pointer to its equivalent eb_root pointer,
459 * for use with pointers from ->branch[], leaf_p or node_p. NULL is conserved
460 * as long as the tree is not corrupted. To be used with EB_LEAF, EB_NODE,
461 * EB_LEFT or EB_RGHT in <tag>.
462 */
463static inline struct eb_root *eb_untag(const eb_troot_t *troot, const int tag)
464{
465 return (struct eb_root *)((void *)troot - tag);
466}
467
468/* returns the tag associated with an eb_troot_t pointer */
469static inline int eb_gettag(eb_troot_t *troot)
470{
471 return (unsigned long)troot & 1;
472}
473
474/* Converts a root pointer to its equivalent eb_troot_t pointer and clears the
475 * tag, no matter what its value was.
476 */
477static inline struct eb_root *eb_clrtag(const eb_troot_t *troot)
478{
479 return (struct eb_root *)((unsigned long)troot & ~1UL);
480}
481
482/* Returns a pointer to the eb_node holding <root> */
483static inline struct eb_node *eb_root_to_node(struct eb_root *root)
484{
485 return container_of(root, struct eb_node, branches);
486}
487
488/* Walks down starting at root pointer <start>, and always walking on side
489 * <side>. It either returns the node hosting the first leaf on that side,
490 * or NULL if no leaf is found. <start> may either be NULL or a branch pointer.
491 * The pointer to the leaf (or NULL) is returned.
492 */
493static inline struct eb_node *eb_walk_down(eb_troot_t *start, unsigned int side)
494{
495 /* A NULL pointer on an empty tree root will be returned as-is */
496 while (eb_gettag(start) == EB_NODE)
497 start = (eb_untag(start, EB_NODE))->b[side];
498 /* NULL is left untouched (root==eb_node, EB_LEAF==0) */
499 return eb_root_to_node(eb_untag(start, EB_LEAF));
500}
501
502/* This function is used to build a tree of duplicates by adding a new node to
503 * a subtree of at least 2 entries. It will probably never be needed inlined,
504 * and it is not for end-user.
505 */
Willy Tarreau75cf17e2008-08-29 15:48:49 +0200506static forceinline struct eb_node *
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100507__eb_insert_dup(struct eb_node *sub, struct eb_node *new)
508{
509 struct eb_node *head = sub;
510
511 struct eb_troot *new_left = eb_dotag(&new->branches, EB_LEFT);
512 struct eb_troot *new_rght = eb_dotag(&new->branches, EB_RGHT);
513 struct eb_troot *new_leaf = eb_dotag(&new->branches, EB_LEAF);
514
515 /* first, identify the deepest hole on the right branch */
516 while (eb_gettag(head->branches.b[EB_RGHT]) != EB_LEAF) {
517 struct eb_node *last = head;
518 head = container_of(eb_untag(head->branches.b[EB_RGHT], EB_NODE),
519 struct eb_node, branches);
520 if (head->bit > last->bit + 1)
521 sub = head; /* there's a hole here */
522 }
523
524 /* Here we have a leaf attached to (head)->b[EB_RGHT] */
525 if (head->bit < -1) {
526 /* A hole exists just before the leaf, we insert there */
527 new->bit = -1;
528 sub = container_of(eb_untag(head->branches.b[EB_RGHT], EB_LEAF),
529 struct eb_node, branches);
530 head->branches.b[EB_RGHT] = eb_dotag(&new->branches, EB_NODE);
531
532 new->node_p = sub->leaf_p;
533 new->leaf_p = new_rght;
534 sub->leaf_p = new_left;
535 new->branches.b[EB_LEFT] = eb_dotag(&sub->branches, EB_LEAF);
536 new->branches.b[EB_RGHT] = new_leaf;
537 return new;
538 } else {
539 int side;
540 /* No hole was found before a leaf. We have to insert above
541 * <sub>. Note that we cannot be certain that <sub> is attached
542 * to the right of its parent, as this is only true if <sub>
543 * is inside the dup tree, not at the head.
544 */
545 new->bit = sub->bit - 1; /* install at the lowest level */
546 side = eb_gettag(sub->node_p);
547 head = container_of(eb_untag(sub->node_p, side), struct eb_node, branches);
548 head->branches.b[side] = eb_dotag(&new->branches, EB_NODE);
549
550 new->node_p = sub->node_p;
551 new->leaf_p = new_rght;
552 sub->node_p = new_left;
553 new->branches.b[EB_LEFT] = eb_dotag(&sub->branches, EB_NODE);
554 new->branches.b[EB_RGHT] = new_leaf;
555 return new;
556 }
557}
558
559
560/**************************************\
561 * Public functions, for the end-user *
562\**************************************/
563
564/* Return the first leaf in the tree starting at <root>, or NULL if none */
565static inline struct eb_node *eb_first(struct eb_root *root)
566{
567 return eb_walk_down(root->b[0], EB_LEFT);
568}
569
570/* Return the last leaf in the tree starting at <root>, or NULL if none */
571static inline struct eb_node *eb_last(struct eb_root *root)
572{
573 return eb_walk_down(root->b[0], EB_RGHT);
574}
575
576/* Return previous leaf node before an existing leaf node, or NULL if none. */
577static inline struct eb_node *eb_prev(struct eb_node *node)
578{
579 eb_troot_t *t = node->leaf_p;
580
581 while (eb_gettag(t) == EB_LEFT) {
582 /* Walking up from left branch. We must ensure that we never
583 * walk beyond root.
584 */
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200585 if (unlikely(eb_clrtag((eb_untag(t, EB_LEFT))->b[EB_RGHT]) == NULL))
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100586 return NULL;
587 t = (eb_root_to_node(eb_untag(t, EB_LEFT)))->node_p;
588 }
589 /* Note that <t> cannot be NULL at this stage */
590 t = (eb_untag(t, EB_RGHT))->b[EB_LEFT];
591 return eb_walk_down(t, EB_RGHT);
592}
593
594/* Return next leaf node after an existing leaf node, or NULL if none. */
595static inline struct eb_node *eb_next(struct eb_node *node)
596{
597 eb_troot_t *t = node->leaf_p;
598
599 while (eb_gettag(t) != EB_LEFT)
600 /* Walking up from right branch, so we cannot be below root */
601 t = (eb_root_to_node(eb_untag(t, EB_RGHT)))->node_p;
602
603 /* Note that <t> cannot be NULL at this stage */
604 t = (eb_untag(t, EB_LEFT))->b[EB_RGHT];
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200605 if (eb_clrtag(t) == NULL)
606 return NULL;
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100607 return eb_walk_down(t, EB_LEFT);
608}
609
610/* Return previous leaf node before an existing leaf node, skipping duplicates,
611 * or NULL if none. */
612static inline struct eb_node *eb_prev_unique(struct eb_node *node)
613{
614 eb_troot_t *t = node->leaf_p;
615
616 while (1) {
617 if (eb_gettag(t) != EB_LEFT) {
618 node = eb_root_to_node(eb_untag(t, EB_RGHT));
619 /* if we're right and not in duplicates, stop here */
620 if (node->bit >= 0)
621 break;
622 t = node->node_p;
623 }
624 else {
625 /* Walking up from left branch. We must ensure that we never
626 * walk beyond root.
627 */
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200628 if (unlikely(eb_clrtag((eb_untag(t, EB_LEFT))->b[EB_RGHT]) == NULL))
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100629 return NULL;
630 t = (eb_root_to_node(eb_untag(t, EB_LEFT)))->node_p;
631 }
632 }
633 /* Note that <t> cannot be NULL at this stage */
634 t = (eb_untag(t, EB_RGHT))->b[EB_LEFT];
635 return eb_walk_down(t, EB_RGHT);
636}
637
638/* Return next leaf node after an existing leaf node, skipping duplicates, or
639 * NULL if none.
640 */
641static inline struct eb_node *eb_next_unique(struct eb_node *node)
642{
643 eb_troot_t *t = node->leaf_p;
644
645 while (1) {
646 if (eb_gettag(t) == EB_LEFT) {
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200647 if (unlikely(eb_clrtag((eb_untag(t, EB_LEFT))->b[EB_RGHT]) == NULL))
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100648 return NULL; /* we reached root */
649 node = eb_root_to_node(eb_untag(t, EB_LEFT));
650 /* if we're left and not in duplicates, stop here */
651 if (node->bit >= 0)
652 break;
653 t = node->node_p;
654 }
655 else {
656 /* Walking up from right branch, so we cannot be below root */
657 t = (eb_root_to_node(eb_untag(t, EB_RGHT)))->node_p;
658 }
659 }
660
661 /* Note that <t> cannot be NULL at this stage */
662 t = (eb_untag(t, EB_LEFT))->b[EB_RGHT];
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200663 if (eb_clrtag(t) == NULL)
664 return NULL;
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100665 return eb_walk_down(t, EB_LEFT);
666}
667
668
669/* Removes a leaf node from the tree if it was still in it. Marks the node
670 * as unlinked.
671 */
Willy Tarreau75cf17e2008-08-29 15:48:49 +0200672static forceinline void __eb_delete(struct eb_node *node)
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100673{
674 __label__ delete_unlink;
675 unsigned int pside, gpside, sibtype;
676 struct eb_node *parent;
677 struct eb_root *gparent;
678
679 if (!node->leaf_p)
680 return;
681
682 /* we need the parent, our side, and the grand parent */
683 pside = eb_gettag(node->leaf_p);
684 parent = eb_root_to_node(eb_untag(node->leaf_p, pside));
685
686 /* We likely have to release the parent link, unless it's the root,
687 * in which case we only set our branch to NULL. Note that we can
688 * only be attached to the root by its left branch.
689 */
690
Willy Tarreau1fb6c872008-05-16 19:48:20 +0200691 if (eb_clrtag(parent->branches.b[EB_RGHT]) == NULL) {
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100692 /* we're just below the root, it's trivial. */
693 parent->branches.b[EB_LEFT] = NULL;
694 goto delete_unlink;
695 }
696
697 /* To release our parent, we have to identify our sibling, and reparent
698 * it directly to/from the grand parent. Note that the sibling can
699 * either be a link or a leaf.
700 */
701
702 gpside = eb_gettag(parent->node_p);
703 gparent = eb_untag(parent->node_p, gpside);
704
705 gparent->b[gpside] = parent->branches.b[!pside];
706 sibtype = eb_gettag(gparent->b[gpside]);
707
708 if (sibtype == EB_LEAF) {
709 eb_root_to_node(eb_untag(gparent->b[gpside], EB_LEAF))->leaf_p =
710 eb_dotag(gparent, gpside);
711 } else {
712 eb_root_to_node(eb_untag(gparent->b[gpside], EB_NODE))->node_p =
713 eb_dotag(gparent, gpside);
714 }
715 /* Mark the parent unused. Note that we do not check if the parent is
716 * our own node, but that's not a problem because if it is, it will be
717 * marked unused at the same time, which we'll use below to know we can
718 * safely remove it.
719 */
720 parent->node_p = NULL;
721
722 /* The parent node has been detached, and is currently unused. It may
723 * belong to another node, so we cannot remove it that way. Also, our
724 * own node part might still be used. so we can use this spare node
725 * to replace ours if needed.
726 */
727
728 /* If our link part is unused, we can safely exit now */
729 if (!node->node_p)
730 goto delete_unlink;
731
732 /* From now on, <node> and <parent> are necessarily different, and the
733 * <node>'s node part is in use. By definition, <parent> is at least
734 * below <node>, so keeping its key for the bit string is OK.
735 */
736
737 parent->node_p = node->node_p;
738 parent->branches = node->branches;
739 parent->bit = node->bit;
740
741 /* We must now update the new node's parent... */
742 gpside = eb_gettag(parent->node_p);
743 gparent = eb_untag(parent->node_p, gpside);
744 gparent->b[gpside] = eb_dotag(&parent->branches, EB_NODE);
745
746 /* ... and its branches */
747 for (pside = 0; pside <= 1; pside++) {
748 if (eb_gettag(parent->branches.b[pside]) == EB_NODE) {
749 eb_root_to_node(eb_untag(parent->branches.b[pside], EB_NODE))->node_p =
750 eb_dotag(&parent->branches, pside);
751 } else {
752 eb_root_to_node(eb_untag(parent->branches.b[pside], EB_LEAF))->leaf_p =
753 eb_dotag(&parent->branches, pside);
754 }
755 }
756 delete_unlink:
757 /* Now the node has been completely unlinked */
758 node->leaf_p = NULL;
759 return; /* tree is not empty yet */
760}
761
762/* These functions are declared in ebtree.c */
763void eb_delete(struct eb_node *node);
764REGPRM1 struct eb_node *eb_insert_dup(struct eb_node *sub, struct eb_node *new);
765
Willy Tarreauf56fd8a2007-11-19 18:43:04 +0100766#endif /* _COMMON_EBTREE_H */
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100767
768/*
769 * Local variables:
770 * c-indent-level: 8
771 * c-basic-offset: 8
772 * End:
773 */