blob: 1a5ce86501ca18fc3398fefa29dc120f343f3200 [file] [log] [blame]
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +01001/*
2 * Elastic Binary Trees - generic macros and structures.
Willy Tarreau70bcfb72008-01-27 02:21:53 +01003 * (C) 2002-2008 - Willy Tarreau <w@1wt.eu>
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +01004 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
18 *
19 *
20 * Short history :
21 *
22 * 2007/09/28: full support for the duplicates tree => v3
23 * 2007/07/08: merge back cleanups from kernel version.
24 * 2007/07/01: merge into Linux Kernel (try 1).
25 * 2007/05/27: version 2: compact everything into one single struct
26 * 2007/05/18: adapted the structure to support embedded nodes
27 * 2007/05/13: adapted to mempools v2.
28 */
29
30
31
32/*
33 General idea:
34 -------------
35 In a radix binary tree, we may have up to 2N-1 nodes for N keys if all of
36 them are leaves. If we find a way to differentiate intermediate nodes (later
37 called "nodes") and final nodes (later called "leaves"), and we associate
38 them by two, it is possible to build sort of a self-contained radix tree with
39 intermediate nodes always present. It will not be as cheap as the ultree for
40 optimal cases as shown below, but the optimal case almost never happens :
41
42 Eg, to store 8, 10, 12, 13, 14 :
43
44 ultree this theorical tree
45
46 8 8
47 / \ / \
48 10 12 10 12
49 / \ / \
50 13 14 12 14
51 / \
52 12 13
53
54 Note that on real-world tests (with a scheduler), is was verified that the
55 case with data on an intermediate node never happens. This is because the
56 data spectrum is too large for such coincidences to happen. It would require
57 for instance that a task has its expiration time at an exact second, with
58 other tasks sharing that second. This is too rare to try to optimize for it.
59
60 What is interesting is that the node will only be added above the leaf when
61 necessary, which implies that it will always remain somewhere above it. So
62 both the leaf and the node can share the exact value of the leaf, because
63 when going down the node, the bit mask will be applied to comparisons. So we
64 are tempted to have one single key shared between the node and the leaf.
65
66 The bit only serves the nodes, and the dups only serve the leaves. So we can
67 put a lot of information in common. This results in one single entity with
68 two branch pointers and two parent pointers, one for the node part, and one
69 for the leaf part :
70
71 node's leaf's
72 parent parent
73 | |
74 [node] [leaf]
75 / \
76 left right
77 branch branch
78
79 The node may very well refer to its leaf counterpart in one of its branches,
80 indicating that its own leaf is just below it :
81
82 node's
83 parent
84 |
85 [node]
86 / \
87 left [leaf]
88 branch
89
90 Adding keys in such a tree simply consists in inserting nodes between
91 other nodes and/or leaves :
92
93 [root]
94 |
95 [node2]
96 / \
97 [leaf1] [node3]
98 / \
99 [leaf2] [leaf3]
100
101 On this diagram, we notice that [node2] and [leaf2] have been pulled away
102 from each other due to the insertion of [node3], just as if there would be
103 an elastic between both parts. This elastic-like behaviour gave its name to
104 the tree : "Elastic Binary Tree", or "EBtree". The entity which associates a
105 node part and a leaf part will be called an "EB node".
106
107 We also notice on the diagram that there is a root entity required to attach
108 the tree. It only contains two branches and there is nothing above it. This
109 is an "EB root". Some will note that [leaf1] has no [node1]. One property of
110 the EBtree is that all nodes have their branches filled, and that if a node
111 has only one branch, it does not need to exist. Here, [leaf1] was added
112 below [root] and did not need any node.
113
114 An EB node contains :
115 - a pointer to the node's parent (node_p)
116 - a pointer to the leaf's parent (leaf_p)
117 - two branches pointing to lower nodes or leaves (branches)
118 - a bit position (bit)
119 - an optional key.
120
121 The key here is optional because it's used only during insertion, in order
122 to classify the nodes. Nothing else in the tree structure requires knowledge
123 of the key. This makes it possible to write type-agnostic primitives for
124 everything, and type-specific insertion primitives. This has led to consider
125 two types of EB nodes. The type-agnostic ones will serve as a header for the
126 other ones, and will simply be called "struct eb_node". The other ones will
127 have their type indicated in the structure name. Eg: "struct eb32_node" for
128 nodes carrying 32 bit keys.
129
130 We will also node that the two branches in a node serve exactly the same
131 purpose as an EB root. For this reason, a "struct eb_root" will be used as
132 well inside the struct eb_node. In order to ease pointer manipulation and
133 ROOT detection when walking upwards, all the pointers inside an eb_node will
134 point to the eb_root part of the referenced EB nodes, relying on the same
135 principle as the linked lists in Linux.
136
137 Another important point to note, is that when walking inside a tree, it is
138 very convenient to know where a node is attached in its parent, and what
139 type of branch it has below it (leaf or node). In order to simplify the
140 operations and to speed up the processing, it was decided in this specific
141 implementation to use the lowest bit from the pointer to designate the side
142 of the upper pointers (left/right) and the type of a branch (leaf/node).
143 This practise is not mandatory by design, but an implementation-specific
144 optimisation permitted on all platforms on which data must be aligned. All
145 known 32 bit platforms align their integers and pointers to 32 bits, leaving
146 the two lower bits unused. So, we say that the pointers are "tagged". And
147 since they designate pointers to root parts, we simply call them
148 "tagged root pointers", or "eb_troot" in the code.
149
150 Duplicate keys are stored in a special manner. When inserting a key, if
151 the same one is found, then an incremental binary tree is built at this
152 place from these keys. This ensures that no special case has to be written
153 to handle duplicates when walking through the tree or when deleting entries.
154 It also guarantees that duplicates will be walked in the exact same order
155 they were inserted. This is very important when trying to achieve fair
156 processing distribution for instance.
157
158 Algorithmic complexity can be derived from 3 variables :
159 - the number of possible different keys in the tree : P
160 - the number of entries in the tree : N
161 - the number of duplicates for one key : D
162
163 Note that this tree is deliberately NOT balanced. For this reason, the worst
164 case may happen with a small tree (eg: 32 distinct keys of one bit). BUT,
165 the operations required to manage such data are so much cheap that they make
166 it worth using it even under such conditions. For instance, a balanced tree
167 may require only 6 levels to store those 32 keys when this tree will
168 require 32. But if per-level operations are 5 times cheaper, it wins.
169
170 Minimal, Maximal and Average times are specified in number of operations.
171 Minimal is given for best condition, Maximal for worst condition, and the
172 average is reported for a tree containing random keys. An operation
173 generally consists in jumping from one node to the other.
174
175 Complexity :
176 - lookup : min=1, max=log(P), avg=log(N)
177 - insertion from root : min=1, max=log(P), avg=log(N)
178 - insertion of dups : min=1, max=log(D), avg=log(D)/2 after lookup
179 - deletion : min=1, max=1, avg=1
180 - prev/next : min=1, max=log(P), avg=2 :
181 N/2 nodes need 1 hop => 1*N/2
182 N/4 nodes need 2 hops => 2*N/4
183 N/8 nodes need 3 hops => 3*N/8
184 ...
185 N/x nodes need log(x) hops => log2(x)*N/x
186 Total cost for all N nodes : sum[i=1..N](log2(i)*N/i) = N*sum[i=1..N](log2(i)/i)
187 Average cost across N nodes = total / N = sum[i=1..N](log2(i)/i) = 2
188
189 This design is currently limited to only two branches per node. Most of the
190 tree descent algorithm would be compatible with more branches (eg: 4, to cut
191 the height in half), but this would probably require more complex operations
192 and the deletion algorithm would be problematic.
193
194 Useful properties :
195 - a node is always added above the leaf it is tied to, and never can get
196 below nor in another branch. This implies that leaves directly attached
197 to the root do not use their node part, which is indicated by a NULL
198 value in node_p. This also enhances the cache efficiency when walking
199 down the tree, because when the leaf is reached, its node part will
200 already have been visited (unless it's the first leaf in the tree).
201
202 - pointers to lower nodes or leaves are stored in "branch" pointers. Only
203 the root node may have a NULL in either branch, it is not possible for
204 other branches. Since the nodes are attached to the left branch of the
205 root, it is not possible to see a NULL left branch when walking up a
206 tree. Thus, an empty tree is immediately identified by a NULL left
207 branch at the root. Conversely, the one and only way to identify the
208 root node is to check that it right branch is NULL.
209
210 - a node connected to its own leaf will have branch[0|1] pointing to
211 itself, and leaf_p pointing to itself.
212
213 - a node can never have node_p pointing to itself.
214
215 - a node is linked in a tree if and only if it has a non-null leaf_p.
216
217 - a node can never have both branches equal, except for the root which can
218 have them both NULL.
219
220 - deletion only applies to leaves. When a leaf is deleted, its parent must
221 be released too (unless it's the root), and its sibling must attach to
222 the grand-parent, replacing the parent. Also, when a leaf is deleted,
223 the node tied to this leaf will be removed and must be released too. If
224 this node is different from the leaf's parent, the freshly released
225 leaf's parent will be used to replace the node which must go. A released
226 node will never be used anymore, so there's no point in tracking it.
227
228 - the bit index in a node indicates the bit position in the key which is
229 represented by the branches. That means that a node with (bit == 0) is
230 just above two leaves. Negative bit values are used to build a duplicate
231 tree. The first node above two identical leaves gets (bit == -1). This
232 value logarithmically decreases as the duplicate tree grows. During
233 duplicate insertion, a node is inserted above the highest bit value (the
234 lowest absolute value) in the tree during the right-sided walk. If bit
235 -1 is not encountered (highest < -1), we insert above last leaf.
236 Otherwise, we insert above the node with the highest value which was not
237 equal to the one of its parent + 1.
238
239 - the "eb_next" primitive walks from left to right, which means from lower
240 to higher keys. It returns duplicates in the order they were inserted.
241 The "eb_first" primitive returns the left-most entry.
242
243 - the "eb_prev" primitive walks from right to left, which means from
244 higher to lower keys. It returns duplicates in the opposite order they
245 were inserted. The "eb_last" primitive returns the right-most entry.
246
247 */
248
Willy Tarreauf56fd8a2007-11-19 18:43:04 +0100249#ifndef _COMMON_EBTREE_H
250#define _COMMON_EBTREE_H
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100251
252#include <stdlib.h>
Willy Tarreaube68e6a2007-11-21 10:34:15 +0100253#include <common/config.h>
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100254
255/* Note: we never need to run fls on null keys, so we can optimize the fls
256 * function by removing a conditional jump.
257 */
258#if defined(__i386__)
259static inline int flsnz(int x)
260{
261 int r;
262 __asm__("bsrl %1,%0\n"
263 : "=r" (r) : "rm" (x));
264 return r+1;
265}
266#else
267// returns 1 to 32 for 1<<0 to 1<<31. Undefined for 0.
268#define flsnz(___a) ({ \
269 register int ___x, ___bits = 0; \
270 ___x = (___a); \
271 if (___x & 0xffff0000) { ___x &= 0xffff0000; ___bits += 16;} \
272 if (___x & 0xff00ff00) { ___x &= 0xff00ff00; ___bits += 8;} \
273 if (___x & 0xf0f0f0f0) { ___x &= 0xf0f0f0f0; ___bits += 4;} \
274 if (___x & 0xcccccccc) { ___x &= 0xcccccccc; ___bits += 2;} \
275 if (___x & 0xaaaaaaaa) { ___x &= 0xaaaaaaaa; ___bits += 1;} \
276 ___bits + 1; \
277 })
278#endif
279
280static inline int fls64(unsigned long long x)
281{
282 unsigned int h;
283 unsigned int bits = 32;
284
285 h = x >> 32;
286 if (!h) {
287 h = x;
288 bits = 0;
289 }
290 return flsnz(h) + bits;
291}
292
293#define fls_auto(x) ((sizeof(x) > 4) ? fls64(x) : flsnz(x))
294
295/* Linux-like "container_of". It returns a pointer to the structure of type
296 * <type> which has its member <name> stored at address <ptr>.
297 */
298#ifndef container_of
299#define container_of(ptr, type, name) ((type *)(((void *)(ptr)) - ((long)&((type *)0)->name)))
300#endif
301
302/*
303 * Gcc >= 3 provides the ability for the program to give hints to the compiler
304 * about what branch of an if is most likely to be taken. This helps the
305 * compiler produce the most compact critical paths, which is generally better
306 * for the cache and to reduce the number of jumps. Be very careful not to use
307 * this in inline functions, because the code reordering it causes very often
308 * has a negative impact on the calling functions.
309 */
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100310#if !defined(likely)
311#if __GNUC__ < 3
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100312#define __builtin_expect(x,y) (x)
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100313#define likely(x) (x)
314#define unlikely(x) (x)
315#elif __GNUC__ < 4
316/* gcc 3.x does the best job at this */
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100317#define likely(x) (__builtin_expect((x) != 0, 1))
318#define unlikely(x) (__builtin_expect((x) != 0, 0))
Willy Tarreau70bcfb72008-01-27 02:21:53 +0100319#else
320/* GCC 4.x is stupid, it performs the comparison then compares it to 1,
321 * so we cheat in a dirty way to prevent it from doing this. This will
322 * only work with ints and booleans though.
323 */
324#define likely(x) (x)
325#define unlikely(x) (__builtin_expect((x), 0))
326#endif
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100327#endif
328
329/* Support passing function parameters in registers. For this, the
330 * CONFIG_EBTREE_REGPARM macro has to be set to the maximal number of registers
331 * allowed. Some functions have intentionally received a regparm lower than
332 * their parameter count, it is in order to avoid register clobbering where
333 * they are called.
334 */
335#ifndef REGPRM1
336#if CONFIG_EBTREE_REGPARM >= 1
337#define REGPRM1 __attribute__((regparm(1)))
338#else
339#define REGPRM1
340#endif
341#endif
342
343#ifndef REGPRM2
344#if CONFIG_EBTREE_REGPARM >= 2
345#define REGPRM2 __attribute__((regparm(2)))
346#else
347#define REGPRM2 REGPRM1
348#endif
349#endif
350
351#ifndef REGPRM3
352#if CONFIG_EBTREE_REGPARM >= 3
353#define REGPRM3 __attribute__((regparm(3)))
354#else
355#define REGPRM3 REGPRM2
356#endif
357#endif
358
359/* Number of bits per node, and number of leaves per node */
360#define EB_NODE_BITS 1
361#define EB_NODE_BRANCHES (1 << EB_NODE_BITS)
362#define EB_NODE_BRANCH_MASK (EB_NODE_BRANCHES - 1)
363
364/* Be careful not to tweak those values. The walking code is optimized for NULL
365 * detection on the assumption that the following values are intact.
366 */
367#define EB_LEFT 0
368#define EB_RGHT 1
369#define EB_LEAF 0
370#define EB_NODE 1
371
372/* This is the same as an eb_node pointer, except that the lower bit embeds
373 * a tag. See eb_dotag()/eb_untag()/eb_gettag(). This tag has two meanings :
374 * - 0=left, 1=right to designate the parent's branch for leaf_p/node_p
375 * - 0=link, 1=leaf to designate the branch's type for branch[]
376 */
377typedef void eb_troot_t;
378
379/* The eb_root connects the node which contains it, to two nodes below it, one
380 * of which may be the same node. At the top of the tree, we use an eb_root
381 * too, which always has its right branch NULL.
382 */
383struct eb_root {
384 eb_troot_t *b[EB_NODE_BRANCHES]; /* left and right branches */
385};
386
387/* The eb_node contains the two parts, one for the leaf, which always exists,
388 * and one for the node, which remains unused in the very first node inserted
389 * into the tree. This structure is 20 bytes per node on 32-bit machines. Do
390 * not change the order, benchmarks have shown that it's optimal this way.
391 */
392struct eb_node {
393 struct eb_root branches; /* branches, must be at the beginning */
394 eb_troot_t *node_p; /* link node's parent */
395 eb_troot_t *leaf_p; /* leaf node's parent */
396 int bit; /* link's bit position. */
397};
398
399/* Return the structure of type <type> whose member <member> points to <ptr> */
400#define eb_entry(ptr, type, member) container_of(ptr, type, member)
401
402/* The root of a tree is an eb_root initialized with both pointers NULL.
403 * During its life, only the left pointer will change. The right one will
404 * always remain NULL, which is the way we detect it.
405 */
406#define EB_ROOT \
407 (struct eb_root) { \
408 .b = {[0] = NULL, [1] = NULL }, \
409 }
410
411#define EB_TREE_HEAD(name) \
412 struct eb_root name = EB_ROOT
413
414
415/***************************************\
416 * Private functions. Not for end-user *
417\***************************************/
418
419/* Converts a root pointer to its equivalent eb_troot_t pointer,
420 * ready to be stored in ->branch[], leaf_p or node_p. NULL is not
421 * conserved. To be used with EB_LEAF, EB_NODE, EB_LEFT or EB_RGHT in <tag>.
422 */
423static inline eb_troot_t *eb_dotag(const struct eb_root *root, const int tag)
424{
425 return (eb_troot_t *)((void *)root + tag);
426}
427
428/* Converts an eb_troot_t pointer pointer to its equivalent eb_root pointer,
429 * for use with pointers from ->branch[], leaf_p or node_p. NULL is conserved
430 * as long as the tree is not corrupted. To be used with EB_LEAF, EB_NODE,
431 * EB_LEFT or EB_RGHT in <tag>.
432 */
433static inline struct eb_root *eb_untag(const eb_troot_t *troot, const int tag)
434{
435 return (struct eb_root *)((void *)troot - tag);
436}
437
438/* returns the tag associated with an eb_troot_t pointer */
439static inline int eb_gettag(eb_troot_t *troot)
440{
441 return (unsigned long)troot & 1;
442}
443
444/* Converts a root pointer to its equivalent eb_troot_t pointer and clears the
445 * tag, no matter what its value was.
446 */
447static inline struct eb_root *eb_clrtag(const eb_troot_t *troot)
448{
449 return (struct eb_root *)((unsigned long)troot & ~1UL);
450}
451
452/* Returns a pointer to the eb_node holding <root> */
453static inline struct eb_node *eb_root_to_node(struct eb_root *root)
454{
455 return container_of(root, struct eb_node, branches);
456}
457
458/* Walks down starting at root pointer <start>, and always walking on side
459 * <side>. It either returns the node hosting the first leaf on that side,
460 * or NULL if no leaf is found. <start> may either be NULL or a branch pointer.
461 * The pointer to the leaf (or NULL) is returned.
462 */
463static inline struct eb_node *eb_walk_down(eb_troot_t *start, unsigned int side)
464{
465 /* A NULL pointer on an empty tree root will be returned as-is */
466 while (eb_gettag(start) == EB_NODE)
467 start = (eb_untag(start, EB_NODE))->b[side];
468 /* NULL is left untouched (root==eb_node, EB_LEAF==0) */
469 return eb_root_to_node(eb_untag(start, EB_LEAF));
470}
471
472/* This function is used to build a tree of duplicates by adding a new node to
473 * a subtree of at least 2 entries. It will probably never be needed inlined,
474 * and it is not for end-user.
475 */
476static inline struct eb_node *
477__eb_insert_dup(struct eb_node *sub, struct eb_node *new)
478{
479 struct eb_node *head = sub;
480
481 struct eb_troot *new_left = eb_dotag(&new->branches, EB_LEFT);
482 struct eb_troot *new_rght = eb_dotag(&new->branches, EB_RGHT);
483 struct eb_troot *new_leaf = eb_dotag(&new->branches, EB_LEAF);
484
485 /* first, identify the deepest hole on the right branch */
486 while (eb_gettag(head->branches.b[EB_RGHT]) != EB_LEAF) {
487 struct eb_node *last = head;
488 head = container_of(eb_untag(head->branches.b[EB_RGHT], EB_NODE),
489 struct eb_node, branches);
490 if (head->bit > last->bit + 1)
491 sub = head; /* there's a hole here */
492 }
493
494 /* Here we have a leaf attached to (head)->b[EB_RGHT] */
495 if (head->bit < -1) {
496 /* A hole exists just before the leaf, we insert there */
497 new->bit = -1;
498 sub = container_of(eb_untag(head->branches.b[EB_RGHT], EB_LEAF),
499 struct eb_node, branches);
500 head->branches.b[EB_RGHT] = eb_dotag(&new->branches, EB_NODE);
501
502 new->node_p = sub->leaf_p;
503 new->leaf_p = new_rght;
504 sub->leaf_p = new_left;
505 new->branches.b[EB_LEFT] = eb_dotag(&sub->branches, EB_LEAF);
506 new->branches.b[EB_RGHT] = new_leaf;
507 return new;
508 } else {
509 int side;
510 /* No hole was found before a leaf. We have to insert above
511 * <sub>. Note that we cannot be certain that <sub> is attached
512 * to the right of its parent, as this is only true if <sub>
513 * is inside the dup tree, not at the head.
514 */
515 new->bit = sub->bit - 1; /* install at the lowest level */
516 side = eb_gettag(sub->node_p);
517 head = container_of(eb_untag(sub->node_p, side), struct eb_node, branches);
518 head->branches.b[side] = eb_dotag(&new->branches, EB_NODE);
519
520 new->node_p = sub->node_p;
521 new->leaf_p = new_rght;
522 sub->node_p = new_left;
523 new->branches.b[EB_LEFT] = eb_dotag(&sub->branches, EB_NODE);
524 new->branches.b[EB_RGHT] = new_leaf;
525 return new;
526 }
527}
528
529
530/**************************************\
531 * Public functions, for the end-user *
532\**************************************/
533
534/* Return the first leaf in the tree starting at <root>, or NULL if none */
535static inline struct eb_node *eb_first(struct eb_root *root)
536{
537 return eb_walk_down(root->b[0], EB_LEFT);
538}
539
540/* Return the last leaf in the tree starting at <root>, or NULL if none */
541static inline struct eb_node *eb_last(struct eb_root *root)
542{
543 return eb_walk_down(root->b[0], EB_RGHT);
544}
545
546/* Return previous leaf node before an existing leaf node, or NULL if none. */
547static inline struct eb_node *eb_prev(struct eb_node *node)
548{
549 eb_troot_t *t = node->leaf_p;
550
551 while (eb_gettag(t) == EB_LEFT) {
552 /* Walking up from left branch. We must ensure that we never
553 * walk beyond root.
554 */
555 if (unlikely((eb_untag(t, EB_LEFT))->b[EB_RGHT] == NULL))
556 return NULL;
557 t = (eb_root_to_node(eb_untag(t, EB_LEFT)))->node_p;
558 }
559 /* Note that <t> cannot be NULL at this stage */
560 t = (eb_untag(t, EB_RGHT))->b[EB_LEFT];
561 return eb_walk_down(t, EB_RGHT);
562}
563
564/* Return next leaf node after an existing leaf node, or NULL if none. */
565static inline struct eb_node *eb_next(struct eb_node *node)
566{
567 eb_troot_t *t = node->leaf_p;
568
569 while (eb_gettag(t) != EB_LEFT)
570 /* Walking up from right branch, so we cannot be below root */
571 t = (eb_root_to_node(eb_untag(t, EB_RGHT)))->node_p;
572
573 /* Note that <t> cannot be NULL at this stage */
574 t = (eb_untag(t, EB_LEFT))->b[EB_RGHT];
575 return eb_walk_down(t, EB_LEFT);
576}
577
578/* Return previous leaf node before an existing leaf node, skipping duplicates,
579 * or NULL if none. */
580static inline struct eb_node *eb_prev_unique(struct eb_node *node)
581{
582 eb_troot_t *t = node->leaf_p;
583
584 while (1) {
585 if (eb_gettag(t) != EB_LEFT) {
586 node = eb_root_to_node(eb_untag(t, EB_RGHT));
587 /* if we're right and not in duplicates, stop here */
588 if (node->bit >= 0)
589 break;
590 t = node->node_p;
591 }
592 else {
593 /* Walking up from left branch. We must ensure that we never
594 * walk beyond root.
595 */
596 if (unlikely((eb_untag(t, EB_LEFT))->b[EB_RGHT] == NULL))
597 return NULL;
598 t = (eb_root_to_node(eb_untag(t, EB_LEFT)))->node_p;
599 }
600 }
601 /* Note that <t> cannot be NULL at this stage */
602 t = (eb_untag(t, EB_RGHT))->b[EB_LEFT];
603 return eb_walk_down(t, EB_RGHT);
604}
605
606/* Return next leaf node after an existing leaf node, skipping duplicates, or
607 * NULL if none.
608 */
609static inline struct eb_node *eb_next_unique(struct eb_node *node)
610{
611 eb_troot_t *t = node->leaf_p;
612
613 while (1) {
614 if (eb_gettag(t) == EB_LEFT) {
615 if (unlikely((eb_untag(t, EB_LEFT))->b[EB_RGHT] == NULL))
616 return NULL; /* we reached root */
617 node = eb_root_to_node(eb_untag(t, EB_LEFT));
618 /* if we're left and not in duplicates, stop here */
619 if (node->bit >= 0)
620 break;
621 t = node->node_p;
622 }
623 else {
624 /* Walking up from right branch, so we cannot be below root */
625 t = (eb_root_to_node(eb_untag(t, EB_RGHT)))->node_p;
626 }
627 }
628
629 /* Note that <t> cannot be NULL at this stage */
630 t = (eb_untag(t, EB_LEFT))->b[EB_RGHT];
631 return eb_walk_down(t, EB_LEFT);
632}
633
634
635/* Removes a leaf node from the tree if it was still in it. Marks the node
636 * as unlinked.
637 */
638static inline void __eb_delete(struct eb_node *node)
639{
640 __label__ delete_unlink;
641 unsigned int pside, gpside, sibtype;
642 struct eb_node *parent;
643 struct eb_root *gparent;
644
645 if (!node->leaf_p)
646 return;
647
648 /* we need the parent, our side, and the grand parent */
649 pside = eb_gettag(node->leaf_p);
650 parent = eb_root_to_node(eb_untag(node->leaf_p, pside));
651
652 /* We likely have to release the parent link, unless it's the root,
653 * in which case we only set our branch to NULL. Note that we can
654 * only be attached to the root by its left branch.
655 */
656
657 if (parent->branches.b[EB_RGHT] == NULL) {
658 /* we're just below the root, it's trivial. */
659 parent->branches.b[EB_LEFT] = NULL;
660 goto delete_unlink;
661 }
662
663 /* To release our parent, we have to identify our sibling, and reparent
664 * it directly to/from the grand parent. Note that the sibling can
665 * either be a link or a leaf.
666 */
667
668 gpside = eb_gettag(parent->node_p);
669 gparent = eb_untag(parent->node_p, gpside);
670
671 gparent->b[gpside] = parent->branches.b[!pside];
672 sibtype = eb_gettag(gparent->b[gpside]);
673
674 if (sibtype == EB_LEAF) {
675 eb_root_to_node(eb_untag(gparent->b[gpside], EB_LEAF))->leaf_p =
676 eb_dotag(gparent, gpside);
677 } else {
678 eb_root_to_node(eb_untag(gparent->b[gpside], EB_NODE))->node_p =
679 eb_dotag(gparent, gpside);
680 }
681 /* Mark the parent unused. Note that we do not check if the parent is
682 * our own node, but that's not a problem because if it is, it will be
683 * marked unused at the same time, which we'll use below to know we can
684 * safely remove it.
685 */
686 parent->node_p = NULL;
687
688 /* The parent node has been detached, and is currently unused. It may
689 * belong to another node, so we cannot remove it that way. Also, our
690 * own node part might still be used. so we can use this spare node
691 * to replace ours if needed.
692 */
693
694 /* If our link part is unused, we can safely exit now */
695 if (!node->node_p)
696 goto delete_unlink;
697
698 /* From now on, <node> and <parent> are necessarily different, and the
699 * <node>'s node part is in use. By definition, <parent> is at least
700 * below <node>, so keeping its key for the bit string is OK.
701 */
702
703 parent->node_p = node->node_p;
704 parent->branches = node->branches;
705 parent->bit = node->bit;
706
707 /* We must now update the new node's parent... */
708 gpside = eb_gettag(parent->node_p);
709 gparent = eb_untag(parent->node_p, gpside);
710 gparent->b[gpside] = eb_dotag(&parent->branches, EB_NODE);
711
712 /* ... and its branches */
713 for (pside = 0; pside <= 1; pside++) {
714 if (eb_gettag(parent->branches.b[pside]) == EB_NODE) {
715 eb_root_to_node(eb_untag(parent->branches.b[pside], EB_NODE))->node_p =
716 eb_dotag(&parent->branches, pside);
717 } else {
718 eb_root_to_node(eb_untag(parent->branches.b[pside], EB_LEAF))->leaf_p =
719 eb_dotag(&parent->branches, pside);
720 }
721 }
722 delete_unlink:
723 /* Now the node has been completely unlinked */
724 node->leaf_p = NULL;
725 return; /* tree is not empty yet */
726}
727
728/* These functions are declared in ebtree.c */
729void eb_delete(struct eb_node *node);
730REGPRM1 struct eb_node *eb_insert_dup(struct eb_node *sub, struct eb_node *new);
731
Willy Tarreauf56fd8a2007-11-19 18:43:04 +0100732#endif /* _COMMON_EBTREE_H */
Willy Tarreaue6d2e4d2007-11-15 23:56:17 +0100733
734/*
735 * Local variables:
736 * c-indent-level: 8
737 * c-basic-offset: 8
738 * End:
739 */