AVL Tree Source Code Library
|
URI: |
http://herbert.gandraxa.com/herbert/avl_lib.asp |
|
Link template: |
<a href="http://herbert.gandraxa.com/herbert/avl_lib.asp">AVL Tree Source Code Library</a> |
|
Link symbols: |
|
AVL Tree Source Code Library
|
URI: |
http://herbert.gandraxa.com/herbert/avl_lib.asp |
|
Link template: |
<a href="http://herbert.gandraxa.com/herbert/avl_lib.asp">AVL Tree Source Code Library</a> |
|
Link symbols: |
|
Home »
Linoleum Source Code »
AVL Tree »
Source Code Library
This is the source code of the AVL Tree library, implemented in the
Linoleum programming language.
(Balanced Binary AVL Trees "AVL.txt" 1.4- Copyright 2006-2008 Herbert Glarner) (*****************************************************************************) (=========================================================================) ( Copyright Herbert Glarner [2006, 2007, 2008] Mail: herbert.glarner@bluewin.ch Free usage granted for all purposes, including commercial, provided that above credits are given. Suggestions for performance improvement welcome. Though thoroughly tested for trees of up to 20 million nodes, no liability for any damage usage may cause. ) (=========================================================================) (=========================================================================) (Implements an AVL Tree in RAM.) (The dedicated RAM can be dynamic or static. It grows from top towards the bottom as nodes are added. When nodes are deleted the used space shrinks again, leaving ZERO memory fragmentation. The RAM area to hold the tree is organized as follows: +-----------------------+ <-- variable [avl tree top] | Topmost node | +-----------------------+ | Node | +-----------------------+ | ... | The root node can be anywhere in this area: +-----------------------+ address is in the variable [avl tree root]. | Root | +-----------------------+ New nodes cause [avl tree extent] to go down. | ... | If [avl tree extent] would be underflowed, +-----------------------+ an out of memory error is returned. | Node | +-----------------------+ Deleted nodes move [avl tree extent] upwards. | Bottommost node | +-----------------------+ <-- variable [avl tree extent] | Empty memory | | | This area can be used to hold data associated | | with the nodes' DATA field. [avl tree extent] | | is managed from within the library and tells | | up to which address data can be used. When | | using this area, don't expect it to contain | | Nulls: this is the lib user's responsibility. | | +-----------------------+ <-- variable [avl tree bottom] A single node consists of 6 units. It is organized as follows: +0 +1 +2 +3 +4 +5 +-----+------+--------+------+-------+-------+ | Key | Data | Parent | Left | Right | Flags | +-----+------+--------+------+-------+-------+ Key is the node's 32 bit key, i.e. a number between 0 and 2^32-2. Note, that the key -1 = ffff ffffh is not allowed. Data can contain any arbitrary 32 bit data: it is not examined nor modified in whatever way within the library, it is just the payload of the node. In most cases this will be an address to a record of some sort, though if 32 bit are enough can also contain the bare payload. Parent, Left and Right contain addresses between [avl tree top] and [avl tree extent]. If a node has no left child, it's Left field will be -1 = ffff ffffh. The same is true for the right child. All nodes have a Parent field except for the root node, which will have -1 in it. The root node's address, and thus the start of the AVL tree, can be accessed via the variable [avl tree root]. Note, that the node being the parent changes very frequently: don't store its address in your program. Flags contain information important to the maintenance of the tree. Never modify the bits for which AVL FLAG XXX constants are defined, see below. The other bits are freely usable, if there indeed is a need for additional payload data. However, recommendation is to leave that field alone. Note, that the DSA library using this AVL library does make use of these free bits. In total, only 6 library variables are used. Four were mentioned above: [avl tree top], [avl tree bottom], [avl tree extent], [avl tree root]. They control boundaries of the designated RAM area. The other two are: [avl tree error] all public library routines terminate with 'ok' if there was no error. If they return with 'failed', check this variable for the error cause. The error codes are given below in the constants AVL ERR XXX. /Always/ check if a public routine returns with 'ok' ! [avl tree status] Leave this unit alone: it contains important information about the current tree maintenance, using individual bits. Changing any data within this unit almost certainly guarantees to crash your application. ) (=========================================================================) (=========================================================================) (Routines summary: Name of routine Short description) (-------------------------------------------------------------------------) ( PUBLIC ROUTINES: Avl Init Do this first to tell about memory Avl Add Node Inserting a new node into the AVL tree Avl Remove Node Removing an existing node from the tree Avl Find Node Returns the data associated with a node INTERNAL ROUTINES: Avl Locate Node Also stores internal traversal info Avl Rebalance Dont call directly, crash guaranteed Avl Successor Finds the successor of a given node Avl Weight Left Lighter Making left subtree shorter Avl Weight Right Lighter Making right subtree shorter Of course one can use also Internal Routines from the outside world, however, with the exception of 'Avl Locate Node' they are of not much practical use when used isolated. In any case, read the assumptions in the comment header of those routines if you do make use of them. ) (=========================================================================) (=========================================================================) (Memory requirements: 6 units per node, plus 6 units for variables in total: {NumNodes+1}*6 [However, the number of nodes does not need to be known in advance: any memory area can be provided to be managed by the library, and it even can coexist with user data, as long as the tree extent is respected. This is used this way by the DSA library.] ) (=========================================================================) (=========================================================================) (Version history: 10 Jun 2006: 1.0 released 22 Jun 2006: 1.1 released. Makes sure, that the unused 28 bits of the FLAGS field may be used as additional userdata. Prior to this update, the whole FLAGS field was transferred to a node which was to inherit a former node's flags. This included the 28 unused bits. However, if the user wishes to store userdata into those, then these 28 bits must not be transferred. 16 Nov 2006: 1.2 released. Bug fix. When the root was removed, and the root had a right child, and that right child had no left child, then tree rebalancing was not done correctly. 29 Nov 2006: 1.3 released. Optimizations. Mainly eliminating lots of conditional jumps by taking advantage of branch prediction mechanisms. Some Karnaugh Maps replace a series of chained conditional jumps. Eliminated some Write/Write- and Write/Read-stalls. 18 Jan 2008: 1.4 released. Bug fix. Jeff Marrison noticed a bug which could lead to a crash: when removing a node, in case I or II, the left child of the removed node was given the new parent in any case, regardless of its existence. ) (=========================================================================) (*****************************************************************************) "libraries" (*****************************************************************************) "directors" program name = { AVL }; unit = 32; (*****************************************************************************) "constants" (Convention: "AVL ...", UPPER CASE) (=========================================================================) (Error codes) (-------------------------------------------------------------------------) AVL ERR NONE = 0; (No error) (-------------------------------------------------------------------------) (From 'Avl Init') AVL ERR MEMORY BOUNDS = 1; (Upper RAM Address < Lower RAM Address) AVL ERR MEMORY SIZE = 2; (Not even space for a single node) (-------------------------------------------------------------------------) (From 'Avl Add Node') AVL ERR NO INIT = 3; (Init /must/ be run first) AVL ERR INVALID KEY = 4; (The key ffff ffffh is not allowed) AVL ERR KEY EXISTS = 5; (Key must be /unique/ 32bit data) AVL ERR OUT OF MEMORY = 6; (In Init provide more memory) (-------------------------------------------------------------------------) (From 'Avl Remove Node') ( AVL ERR NO INIT = 3;) (Init /must/ be run first) AVL ERR EMPTY TREE = 7; (There is no node to delete) AVL ERR NO SUCH KEY = 8; (Node with given key does not exist) (-------------------------------------------------------------------------) (From 'Avl Find Node') ( AVL ERR NO INIT = 3;) (Init /must/ be run first) ( AVL ERR EMPTY TREE = 7;) (There is no node to search for) ( AVL ERR NO SUCH KEY = 8;) (Node with given key does not exist) (=========================================================================) (=========================================================================) (Node record) (-------------------------------------------------------------------------) (Each node consists of a fixed length record of 6 units.) AVL NODE SIZE = 6; (The number of units as follows) (TECHNICAL NOTE: Although all Pentiums at date of writing have a cache line size of 32 bytes, it does /not/ improve performance to set above size to 8 units = 32 bytes. This is a tree structure, requiring several accesses to different locations: it is better to have smaller entries without any gaps than to align all entries at cache line boundaries, because overall cache fetches will be reduced. For additional informations see the 'DSA' library, routine 'Dsa Cache Alignment'.) (-------------------------------------------------------------------------) (The offsets of the fields within a node. Do /not/ change their order. Additional fields can be added, though, if also AVL NODE SIZE is increased accordingly.) AVL KEY = 0; (node's key) AVL DATA = 1; (data associated with key) AVL PARENT = 2; (address of parent if any, or -1) AVL LEFT = 3; (address of left child if any, or -1) AVL RIGHT = 4; (address of right child if any, or -1) AVL FLAGS = 5; (bitmask, may contain additional data) (Balance bits. A heavy unbalance causes a balancing operation) AVL FLAG UNBALANCED = 0000 0001h; (Set = not absolutely balanced) AVL FLAG UNB LEFT = 0000 0002h; (Set = unbalance is to the left) AVL FLAG UNB HEAVY = 0000 0004h; (Set = unbalance is heavy) (Combinations of above 3 flags, used as shorcuts in the lib.) AVL FLAG UNB MASK = 0000 0007h; (All balance flags) AVL FLAG LEFT HEAVY = 0000 0007h; AVL FLAG LEFT BALANCED = 0000 0003h; AVL FLAG ABS BALANCED = 0000 0000h; AVL FLAG RIGHT BALANCED = 0000 0001h; AVL FLAG RIGHT HEAVY = 0000 0005h; (Last traversal direction, important for 'Avl Add Node') AVL FLAG TRAV LEFT = 0000 0008h; (Set = traversed left subtree) (*****************************************************************) (UPDATE 1.1: All effectively used flags in the FLAGS field.) AVL FLAG ALL = 0000 000fh; (*****************************************************************) (The other 28 bits are free to be used as additional payload data. However, it is recommended to leave that whole field as it: Accidentally modifying any of the 4 defined bits will crash your application or provide unexpected results.) (=========================================================================) (=========================================================================) (Status flags) (-------------------------------------------------------------------------) (The variable [avl status] will hold some flags about the current status of the tree) AVL STATUS INIT = 0000 0001h; (set when 'Avl Init' was successful) AVL STATUS ROOT = 0000 0002h; (set as soon as a root exists) AVL STATUS FOUND = 0000 0004h; (set when a searched node was found) AVL STATUS LEFT = 0000 0008h; (set when potential left child) AVL STATUS LCOP = 0000 0010h; (set when left child of parent) (Theoretically, the other bits within the [avl status] variable are free to be used. However, usage is discouraged.) (=========================================================================) (*****************************************************************************) "variables" (Convention: "avl ...", lower case) (=========================================================================) (Error condition as per AVL ERR XXX constants.) (-------------------------------------------------------------------------) avl error = 0; (=========================================================================) (=========================================================================) (Some flags about the current status of the library. The bits are given in the AVL STATUS XXX constants.) (-------------------------------------------------------------------------) avl status = 0; (=========================================================================) (=========================================================================) (The tree's nodes must be stored somewhere. These variables tell what is stored where.) (-------------------------------------------------------------------------) (This variable has the TOP of the tree's address, from where it continues to grow /downwards/ to allow implementation as a Heap in concunjuction with data growing upwards. Once allocated, this may not be changed anymore, or we would not find the tree any longer. This address is set with the routine 'Avl Init Tree', which will set this to the provided top of memory area.) avl tree top = 0; (exclusive the given address) (-------------------------------------------------------------------------) (This variable has the bottom of the memory area down to which the tree may continue to grow without causing an error. Such, a bidirectional memory area can esily be maintained, with the associated data growing upwards while the tree grows downwards.) avl tree bottom = 0; (inclusive the given address) (-------------------------------------------------------------------------) (Somewhere in the tree there exists a node which is the root. Initially, when the first node is added, this is simply the topmost node within the tree's heap, i.e. it is at [avl tree top]-AVL NODE SIZE. However, during balancing the root node can be relocated and literally become one of any existing nodes. The /address/ of this node can be found here. Just after 'Avl Init' this value is identical with [avl tree top] and a flag indicating the absence of a root is set. Don't store this address in your application: the root changes frequently!) avl tree root = 0; (-------------------------------------------------------------------------) (Somewhere in the memory there exists a node which currently is the bottommost node. Note, that this does not tell anything about its sorting order, it may even be the root. It merely tells about the node's position in memory. The next node to be added will find its place at [avl tree extent]+AVL NODE SIZE. Initially, this is identical to the top of the memory area provided by 'Avl Init Tree'. If a node is deleted anywhere between [avl tree top] and [avl tree extent], both inclusive, then this currently bottommost node /physically/ takes the deleted node's place, ensuring an efficient memory usage of the tree, avoiding any memory fragmentation. [avl tree extent] then is set to [avl tree extent] plus AVL NODE SIZE, i.e. closer to the top, making the tree shorter.) avl tree extent = 0; (=========================================================================) (*****************************************************************************) "workspace" (Convention: "avl ...", lower case) (None) (*****************************************************************************) "programme" (Convention: "Avl ...", Mixed case) (=========================================================================) (This is a library only, it can not be run stand-alone. See "AVL Test.txt" for a test setup.) (-------------------------------------------------------------------------) end; (=========================================================================) (*****************************************************************************) "Avl Init" (=========================================================================) (Initializes an AVL tree. Provided, that the memory area is suitable, this will always work, even when a tree already exists. An existing tree easily can be re-initialized, i.e. logically 'deleted' and prepared for further re-usage by calling 'Avl Init' again.) (-------------------------------------------------------------------------) (Input: D: Top of RAM E: Bottom of RAM: Output: - All registers unchanged. Can cause errors. If the routine fails, the error code is in [avl error]. ) (=========================================================================) (ENTRY) A -->; (=========================================================================) (VERIFICATION) (Either this routine exits with an error, in which case the tree is invalid, or the routine succeeds and provides a successful initialization flag in the status word.) [avl status] = 0; (Not initialized yet, all flags clear) (-------------------------------------------------------------------------) (Check, that the upper memory address is larger than the lower address: we can not deal with negative memory.) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? D < E -> Avl Init Bounds Not Ok; (-------------------------------------------------------------------------) (Check, that we have at least the space for a single node. If we don't, we can't even create a root.) A = D; (Top of RAM) A - E; (minus Bottom of RAM = Size in units) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? A < AVL NODE SIZE -> Avl Init Size Not Ok; (=========================================================================) (PROCESS) (Store the physical top and bottom of the memory area being managed by this library.) [avl tree top] = D; (Library does not access this and above) [avl tree bottom] = E; (Library may access this, but not below) (Initially, no root exists. This remains the case, until the first node has been added. The same is true for the physically bottommost node.) [avl tree root] = D; (No root exists = empty tree) [avl tree extent] = D; (No nodes exist, thus also no bottommost) (Internal flag to tell that the tree was successfully initialized. All other flags are cleared.) [avl status] = AVL STATUS INIT; (=========================================================================) (EXIT) "Avl Init Exit OK" <-- A; [avl error] = AVL ERR NONE; end; "Avl Init Exit Failed" <-- A; fail; (=========================================================================) (UPDATE 1.3 START. Error handlers moved down from inline handling for Branch Prediction Optimization.) "Avl Init Bounds Not Ok" [avl error] = AVL ERR MEMORY BOUNDS; -> Avl Init Exit Failed; "Avl Init Size Not Ok" [avl error] = AVL ERR MEMORY SIZE; -> Avl Init Exit Failed; (UPDATE 1.3 END) (=========================================================================) (*****************************************************************************) "Avl Add Node" (=========================================================================) (Adding a new node to the AVL tree.) (-------------------------------------------------------------------------) (Input: E: Key of the node. The key is a 32 bit integer. Keys must be unique. Key -1 = FFFF FFFFh is not allowed. D: 32 bit Data associated with the key. This can be any value; in most cases it will be a pointer to a data structure of some sort. Output: - All registers unchanged. May cause errors. NOTE: After insertion into the tree the new node's address, if required, can easily be retrieved by the caller from variable [avl tree extent]. WARNING: Use variable [avl tree extent] with care and as a general rule DO NOT STORE such addresses for later reusage, as deletions will cause nodes to /physically/ change addresses.) (=========================================================================) (ENTRY) A -->; B -->; C -->; (UPDATE 1.3. Need additional register to work with Karnaugh Maps.) D -->; (=========================================================================) (VERIFICATION) (Verify, that successfully initialized with 'Avl Init'. If not, leave with an error: we would likely crash if we operated with nonsensical pointers.) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? [avl status] - AVL STATUS INIT -> Avl Add Node Init Not Ok; (-------------------------------------------------------------------------) (Don't feed the key -1 = ffff ffffh: It is the only unallowed 32 bit key.) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? E = minus 1 -> Avl Add Node Key Not Ok; (=========================================================================) (PROCESS) (If no root exists yet, the new node will become the root.) (UPDATE 1.3. Branch prediction optimization. It is a very rare case that no root exists, thus we fall through when one exists.) ? [avl status] - AVL STATUS ROOT -> Avl Add Node No Root Exists; (-> Avl Add Node Root Exists;) (-------------------------------------------------------------------------) ("Avl Add Node Root Exists") (There already exist nodes. Search the node, which will become the future parent of this new node. Should a node with this key already exist, the flag AVL STATUS FOUND will be set in the status word after execution of the routine 'Avl Locate Node'. We are not allowed to have ambigeous keys, and thus leave with an error in that case.) (E = E;) (Key as per input) => Avl Locate Node; (B has found node or pot. parent) (Terminates with 'leave': do not query 'ok' or 'failed'!) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? [avl status] + AVL STATUS FOUND -> Avl Add Node Key Exists; (-------------------------------------------------------------------------) ("Avl Add Node Is New") (A node with key E does not exist yet.) (In B is the designated parent of the new node. The status word flag AVL STATUS LEFT tells, if the new node has to become that parent's left or right child. If set, it's B's left child, else B's right child. Get address of B's appropriate left/right field into C.) (*************************************************************************) (UPDATE 1.3 START. Code flow optimization. Instead if unconditionally branching in 50% of all cases and conditionally in the other 50%, we only branch conditionally in 50% of all cases.) (Was:) (C = B;) (Address of parent) (? [avl status] - AVL STATUS LEFT -> Avl Add Node Branch Right;) (C + AVL LEFT;) (Address of parent's left subtree) (-> Avl Add Node Branch Ok;) ("Avl Add Node Branch Right") (C + AVL RIGHT;) (Address of parent's right subtree) ("Avl Add Node Branch Ok") (New:) C = B; (Address of parent) (UPDATE 1.3. Instruction scheduling optimization. Moved up from below.) A = [avl tree extent]; (Moved up from next code section) C + AVL LEFT; (Assume left parent's subtree) (UPDATE 1.3. Instruction scheduling optimization. Moved up from below.) A - AVL NODE SIZE; (Moved up from next code section) ? [avl status] + AVL STATUS LEFT -> Avl Add Node Branch Ok; C + AVL RIGHT minus AVL LEFT; (Admit right parent's subtree) "Avl Add Node Branch Ok" (UPDATE 1.3 END) (*************************************************************************) (In C we have the address of the 'left' or 'right' field of the future parent B, to which the new node must be linked. The address of the new node is just below the current physical bottommost node. However, should that position be below the assigned bottom of memory, we have no more memory within the designated RAM and exit with an error.) (UPDATE 1.3. Instruction scheduling optimization. The both following instructions were moved up into the previous code section.) (A = [avl tree extent];) (A - AVL NODE SIZE;) (Address of new node) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? A < [avl tree bottom] -> Avl Add Node Not Enough Memory; (-------------------------------------------------------------------------) (The future location is in A. It was made sure, that we do not underflow the current bottom of memory.) [avl tree extent] = A; (Address of new bottommost node) (Now make the new node. A becomes child of B.) [A plus AVL KEY] = E; (Key as per input) [A plus AVL DATA] = D; (Data as per input) [A plus AVL PARENT] = B; (B is A's parent) [A plus AVL LEFT] = minus 1; (No children yet) [A plus AVL RIGHT] = minus 1; (No children yet) [A plus AVL FLAGS] = 0; (Initially balanced) (Inform B about its new child A. This can be either the left or the right child. The address of the appropriate left or right field was already evaluated into C.) [C] = A; (Parent B knows about child A) (The added node requires rebalancing of the ancestors. A itself, just having been inserted, is balanced as it has no children yet. But its ancestors change their balance.) (-> Avl Add Node Check Balance;) (-------------------------------------------------------------------------) (Loop start. Work all the way back up the traversed ancestors, up to and including the root, unless the loop is quit as per below situations. Testing for root is at the end of the loop. Even with many nodes in the region of millions the loop is typically not iterated more than about 3 times, as an exit condition typically will be met pretty soon.) "Avl Add Node Check Balance" (Loop, jumped to from below) A = [A plus AVL PARENT]; (Get node's parent) B = [A plus AVL FLAGS]; (From that node's flags word...) B & AVL FLAG UNB MASK; (...extract balance flags) (What branch did we take from there? Flag AVL FLAG TRAV LEFT tells us. Since AVL trees are balanced, there is no preference for left or right and we arbitrarily fall-through for left and jump for right.) ? [A plus AVL FLAGS] - AVL FLAG TRAV LEFT -> Avl Add Node Trav Right; (-> Avl Add Node Trav Left;) (-------------------------------------------------------------------------) ("Avl Add Node Trav Left") (About half go here) (We followed the left branch. The insertion makes this subtree heavier to the left. B currently can have 3 states: Left-balanced 011b becomes left-heavy 111b absolutely balanced 000b becomes left-balanced 011b Right-balanced 001b becomes absolutely balanced 000b ) (*************************************************************************) (UPDATE 1.3 START. Code flow optimization. What followed was a series of conditional branches to handle each case individually. A Karnaugh Map can handle above without any branches. Input were two variables: D was 1 when the middle bit AVL FLAG UNB LEFT was set, and B was 1 when the right bit AVL FLAG UNBALANCED was set.) (-------------------------------------------------------------------------) (Obsolete old code due to Karnaugh Map.) (? B = AVL FLAG LEFT BALANCED -> Avl Add Node Left Heavy; ? B = AVL FLAG ABS BALANCED -> Avl Add Node Left Balanced; ? B = AVL FLAG RIGHT BALANCED -> Avl Add Node Abs Balanced;) (-------------------------------------------------------------------------) D = B; B & 1b; (Second Karnaugh variable B) D > 1; (First Karnaugh variable D) (-------------------------------------------------------------------------) (The Karnaugh Map for the resulting 3 binary digits is as follows: Left Digit: D Middle digit: |B+D Right digit: |B+D Note that above is Boolean Algebra notation, '+' means OR, and '|' is NOT. The following 7 instructions execute in 5.5 cycles, much better than the series of conditional jumps.) (U ) (Previous SHR always in U) B # 1b; ( V ) (|B) B | D; (U ) (|B+D) C = B; (U WRS) (Copying, we have the same expression for 2 digits) B < 1; (U ) (Middle digit) D < 2; (U ) (D already ok, just shift in place for left digit) B + C; ( V ) (Middle and right digit) B + D; (U ) (All 3 digits) ( V ) (Next JMP is Pairable in V) (-------------------------------------------------------------------------) (This is the point where we would jump to 'Avl Add Node New Balance' before the update, only to separate into the same two branches again after filling in the new flags which are in B by now. It is better to insert the flags in both actual branches and then continue without a new branching.) (Masking in the new value as per corresponding Karnaugh Map.) [A plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; [A plus AVL FLAGS] | B; (New balance is set now) (-------------------------------------------------------------------------) (The new branch 'Avl Add Node Left Resp' is integrated directly: we are already where we want to be.) (If the new node did not cause the tree to become higher, we break the balance update loop. We need to check if the left or the right subtree was responsible for the height increase to determine this.) (If Left was /not/ responsible for a height increase, this won't propagate further up, the loop can be quit after this update. Left was not responsible, if the left height is smaller or equal to the right height. This is the case for the balance flags 101b, 001b, 000b, but not for the flags 111b and 011b. In other words: if the middle flag AVL FLAG UNB LEFT is not set, then we can break the loop. This can be compared directly. Typically we will fall-through the following conditional branch.) ? [A plus AVL FLAGS] - AVL FLAG UNB LEFT -> Avl Add Node Exit OK; -> Avl Add Node Chk Heavy; (-------------------------------------------------------------------------) (UPDATE 1.3 END) (*************************************************************************) "Avl Add Node Trav Right" (About half go here) (We followed the right branch. The insertion makes this subtree heavier to the right. B currently can have 3 states: Left-balanced 011b becomes absolutely balanced 000b absolutely balanced 000b becomes right-balanced 001b Right-balanced 001b becomes right-heavy 101b ) (*************************************************************************) (UPDATE 1.3 START. Code flow optimization. What followed was a series of conditional branches to handle each case individually. A Karnaugh Map can handle above without any branches. Input were two variables: D was 1 when the middle bit AVL FLAG UNB LEFT was set, and B was 1 when the right bit AVL FLAG UNBALANCED was set.) (-------------------------------------------------------------------------) (Obsolete old code due to Karnaugh Map.) (? B = AVL FLAG LEFT BALANCED -> Avl Add Node Abs Balanced; ? B = AVL FLAG ABS BALANCED -> Avl Add Node Right Balanced; ? B = AVL FLAG RIGHT BALANCED -> Avl Add Node Right Heavy;) (-------------------------------------------------------------------------) D = B; B & 1b; (Second Karnaugh variable B) D > 1; (First Karnaugh variable D) (-------------------------------------------------------------------------) (The Karnaugh Map for the resulting 3 binary digits is as follows: Left Digit: |DB Middle digit: 0 Right digit: |D Note that above is Boolean Algebra notation, factor = AND, and '|' is NOT. The following 4 instructions execute in 3.5 cycles, even better than the Karnaugh Map for the left subtree.) (U ) (Previous SHR always in U) D # 1b; (U WWS) B & D; (U WRS) B < 2; (U WWS) B + D; (U WWS) ( V ) (Next AND is Pairable in V) (-------------------------------------------------------------------------) (This is the point where we would jump to 'Avl Add Node New Balance' before the update, only to separate into the same two branches again after filling in the new flags which are in B by now. It is better to insert the flags in both actual branches and then continue without a new branching.) (Masking in the new value as per corresponding Karnaugh Map.) [A plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; [A plus AVL FLAGS] | B; (New balance is set now) (-------------------------------------------------------------------------) (The new branch 'Avl Add Node Left Resp' is integrated directly: we are already where we want to be.) (If the new node did not cause the tree to become higher, we break the balance update loop. We need to check if the left or the right subtree was responsible for the height increase to determine this.) (If Right was /not/ responsible for a height increase, this won't propagate further up, the loop can be left after this update. Right was not responsible, if the right height is smaller or equal to the left height. This is the case for the balance flags 111b, 011b, 000b, but not for 101b and 001b. In other words, we can break the loop, when the two rightmost flags X=AVL FLAG UNB LEFT and Y=AVL FLAG UNBALANCED evaluate to the Boolean Algebra term |C+B being true, with X being the middle flag and Y the rightmost flag.) C = B; (The three flags A X Y) C & AVL FLAG UNB LEFT plus AVL FLAG UNBALANCED; (X Y) C # AVL FLAG UNBALANCED; (X !Y) (If C is not 0, either X or !Y is true and we quit. Typically we will fall-through the following conditional branch, though.) ? C != 0 -> Avl Add Node Exit OK; (Jump if either set) (-> Avl Add Node Chk Heavy;) (Fall-through) (UPDATE 1.3 END) (*************************************************************************) (*************************************************************************) (UPDATE 1.3. Obsolete old code with many conditional branches and labels removed: the code was replaced by above Karnaugh Maps.) (*************************************************************************) (-------------------------------------------------------------------------) "Avl Add Node Chk Heavy" (Left and right meet here unless exited) (If the flag AVL FLAG UNB HEAVY is set, we need to rebalance by executing the appropriate rotations. This can be done by calling 'Avl Rebalance'.) (*************************************************************************) (UPDATE 1.3. Codeflow optimization. Typically, AVL FLAG UNB HEAVY will not be set, i.e. the Node will not be heavy, in the region of about 80% of all cases. Thus we reorganize the former branch to fall through if the flag is clear.) ? [A plus AVL FLAGS] + AVL FLAG UNB HEAVY -> Avl Add Node Is Heavy; (-------------------------------------------------------------------------) ("Avl Add Node Node Not Heavy") (Usual case, approx 80%) (If this was not the root, continue to check further up the tree. Typically it is really rare to travel all the way up to the root: usually we leave somewhen earlier, as soon as a node was rebalanced. We take the branch if it is the root, otherwise we continue.) ? A = [avl tree root] -> Avl Add Node Exit Ok; (Was root) -> Avl Add Node Check Balance; (Continue to check rebalancing) (-------------------------------------------------------------------------) "Avl Add Node Is Heavy" (Minority case, approx 20%) => Avl Rebalance; (Can not cause errors) (-> Avl Add Node Exit OK;) (Rebalancing breaks updates further up) (UPDATE 1.3 END) (*************************************************************************) (=========================================================================) (EXIT) (UPDATE 1.3. Codeflow comment. There are 3 main cases jumping to this exit label: breaking the rebalancing loop for left, for right, and after issuing a rebalance operation, in which case we fall-through to here from the previous section. All these cases are almost equally distributed. One minority case is processed the root, which after completion also jumps to this label.) "Avl Add Node Exit Ok" (UPDATE 1.3. Need additional register to work with Karnaugh Maps.) <-- D; <-- C; <-- B; <-- A; [avl error] = AVL ERR NONE; end; (-------------------------------------------------------------------------) "Avl Add Node Exit Failed" (UPDATE 1.3. Need additional register to work with Karnaugh Maps.) <-- D; <-- C; <-- B; <-- A; fail; (=========================================================================) (UPDATE 1.3 START. Branch prediction optimization. This code section was moved down from above, as it reflects the rare case that no root exists.) "Avl Add Node No Root Exists" (No root exists, create just below Top of Memory) [avl tree root] - AVL NODE SIZE; (Physically topmost node) A = [avl tree root]; [A plus AVL KEY] = E; (Key as per input) [A plus AVL DATA] = D; (Data as per input) [A plus AVL PARENT] = minus 1; (The root never has a parent) [A plus AVL LEFT] = minus 1; (Initially no children) [A plus AVL RIGHT] = minus 1; (Initially no children) [A plus AVL FLAGS] = 0; (Just 1 node = completely balanced) (A root exists now. Set the general flag reflecting that.) [avl status] | AVL STATUS ROOT; (The root is the so far only node, thus it also is the bottommost physical node.) [avl tree extent] = [avl tree root]; (Bottommost node) -> Avl Add Node Exit OK; (Done. Target is above.) (UPDATE 1.3 END) (=========================================================================) (UPDATE 1.3 START. Error handlers moved down from inline handling for Branch Prediction Optimization.) "Avl Add Node Init Not Ok" [avl error] = AVL ERR NO INIT; -> Avl Add Node Exit Failed; "Avl Add Node Key Not Ok" [avl error] = AVL ERR INVALID KEY; -> Avl Add Node Exit Failed; "Avl Add Node Key Exists" [avl error] = AVL ERR KEY EXISTS; -> Avl Add Node Exit Failed; "Avl Add Node Not Enough Memory" [avl error] = AVL ERR OUT OF MEMORY; -> Avl Add Node Exit Failed; (UPDATE 1.3 END) (=========================================================================) (*****************************************************************************) "Avl Remove Node" (=========================================================================) (Removing an existing node from the AVL tree. In particular note the remarks given in the 'variables' section for variable 'avl tree extent'.) (-------------------------------------------------------------------------) (Input: E: Key of the node. The key is a 32 bit integer. Output: - All registers unchanged. May return errors, check [avl error] after 'failed'. It may contain: AVL ERR NO INIT: The library was not initalized with 'Avl Init'. AVL ERR EMPTY TREE: There are no nodes at all in the tree. AVL ERR NO SUCH KEY: A node with the requested key does not exist. ) (=========================================================================) (ENTRY) A -->; B -->; C -->; D -->; E -->; (=========================================================================) (VERFICATION) (Verify, that successfully initialized with 'Avl Init'. If not, leave with an error: we would likely crash if we operated with nonsensical pointers.) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? [avl status] - AVL STATUS INIT -> Avl Remove Node Init Not Ok; (-------------------------------------------------------------------------) (If the tree is empty, there is no node to be deleted. The tree is empty if there is no root. There is no root, if the status bit AVL STATUS ROOT is clear. This bit is clear just after 'Avl Init' before the insertion of the first node, and also when the last node has been deleted.) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? [avl status] - AVL STATUS ROOT -> Avl Remove Node No Root; (-------------------------------------------------------------------------) (There already exist nodes. Search the node to be removed. When the node can be found, the flag AVL STATUS FOUND will be set in the status word. If this flag is not set, we leave with an error: only existing nodes can be removed.) (E = E;) (Key as per input) => Avl Locate Node; (B has found node or pot. parent) (Terminates with 'leave': do not query 'ok' or 'failed'!) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? [avl status] - AVL STATUS FOUND -> Avl Remove Node Not Existing; (=========================================================================) (PROCESS) ("Avl Remove Node Exists") (A node with key E does exist, its address is in B. An existing node can be removed. How exactly the node is removed from the AVL tree depends from the children it has. In any case are relinks neccessary on B's parent. Get the address of B's parent into E.) E = [B plus AVL PARENT]; (-1 if B is root) (Dinstinguish three cases 1 to 3 as per children of B.) C = [B plus AVL LEFT]; (Get address of B's left child) D = [B plus AVL RIGHT]; (Get address of B's right child) (UPDATE 1.3. Branch prediction optimization. Usually a node has two children: a typical ratio to have a right child in large trees is about 60%. Therefore we fall through the conditional branch if there is a right child.) ? D = minus 1 -> Avl Remove Node Case One; (-> Avl Remove Node Has Right Child;) (-------------------------------------------------------------------------) (UPDATE 1.3. Moved up from further below to take advantage of fall-through.) ("Avl Remove Node Has Right Child") (B has a right child D. We distinguish the two remaining cases. If D has a left child A, that is the hardest case 3. If D has no left child A, this is case 2. Typically, with 40% case 2 is less frequent than case 3 with 60%, when considering only these 2 cases. To take advantage of branch prediction we fall through into case 3 here and branch on case 2.) A = [D plus AVL LEFT]; ? A = minus 1 -> Avl Remove Node Case Two; -> Avl Remove Node Case Three; (-------------------------------------------------------------------------) (UPDATE 1.3. Code flow comment: to further optimize code flow we would directly fall through to case 3 here, saving above conditional jump which is executed in approx 36% of all cases. However, for once I won't do this to maintain readability of the source: the following 3 code sections deal with each case in numerical order.) (-------------------------------------------------------------------------) "Avl Remove Node Case One" (Approx 39% of all 3 cases) (B has no right child D, including no children at all. This is the trivial case, although the branch appears lengthy due to the handling of a potentially new empty tree. The Parent's child field leading to B is set to B's left child C, if it has one, or to -1, if not.) (If B was the root, indicated by E being -1, then the replacing node, i.e. B's left child C, becomes the new root.) (UPDATE 1.3. Branch prediction optimization. Typically it is rare that B is the root: we use fall-through when it is not and only jump if it is. This requires that the originally following code fragment handling the root situation is moved down to the end of this code section.) ? E = minus 1 -> Avl Remove Node Case One Check Empty; (UPDATE 1.4. Make sure, that C really exists. Thanks to Jeff Marrison for spotting this bug.) ? C = minus 1 -> Avl Remove Node Case One No Left Child; (B was not the root. C becomes E's new child in place of former B. First make C's new parent to be E.) [C plus AVL PARENT] = E; (C has new parent E) "Avl Remove Node Case One No Left Child" (-> Avl Remove Node Case One Not New Root;) ("Avl Remove Node Case One Not New Root") (B may have been E's left or right child. The way we took to locate B was registered in B's ancestors' flags, also for parent E. If E's flag AVL FLAG TRAV LEFT is set, then B is E's left child, else its right child. The children of B's left child C were not touched in any way, thus C's balance is unmodified. However, E, former parent of B and new parent of C, is lighter on the side on which C now is, i.e., E changes its balance: it becomes lighter on the left or right side, resp.) ? [E plus AVL FLAGS] - AVL FLAG TRAV LEFT -> Avl Remove Node Case One Relink Right; ("Avl Remove Node Case One Relink Left") (B was E's left child. C takes that place now.) [E plus AVL LEFT] = C; (E has new left child C) (This makes E lighter on the left side.) => Avl Weight Left Lighter; (Can't cause errors) (Terminates with 'leave': do not query 'ok' or 'failed'!) -> Avl Remove Node Relinked; "Avl Remove Node Case One Relink Right" (B was E's right child. C takes that place now.) [E plus AVL RIGHT] = C; (E has new right child C) (This makes E lighter on the right side.) => Avl Weight Right Lighter; (Can't cause errors) (Terminates with 'leave': do not query 'ok' or 'failed'!) -> Avl Remove Node Relinked; (*************************************************************************) (UPDATE 1.3 START. Code flow optimization. The remaining instructions within this code section were moved down from above. They handle the rare case that B is the root.) "Avl Remove Node Case One Check Empty" (B was the root, C is the designated new root. If there is no C, then the tree becomes empty.) ? C != minus 1 -> Avl Remove Node Case One New Root; (Tree becomes empty now.) [avl status] # AVL STATUS ROOT; (There is no root = empty tree) [avl tree root] = [avl tree top]; (State like 'Avl Init') (We don't need to fill any gap, instead jump directly to compacting the tree by moving [avl tree extent] upwards, in this case matching [avl tree top]. Since there are no nodes, no tree balances need to be adjusted and no tree rotaions need to be executed.) -> Avl Remove Node Compact; "Avl Remove Node Case One New Root" (B was the root, C is the designated new root. Simply making C the new root detaches all connections with B: B can not be found anymore.) [C plus AVL PARENT] = minus 1; (C has no parent any longer) [avl tree root] = C; (Register the new root) (Because C's children have not been touched in any way, its balance is unchanged. Since it is the root now, no other nodes' balances can have been affected. However, the gap resulting from removing the old root still needs to be filled.) -> Avl Remove Node Relinked; (UPDATE 1.3 END) (*************************************************************************) (-------------------------------------------------------------------------) "Avl Remove Node Case Two" (Approx 25% of all 3 cases) (The node to be deleted is in B. B's left child is in C, B's right child in D, B's parent in E. E may be -1, then B was the root. C may be -1. D exists, but D's left child A does not exist, or we would not be here.) (D is moved into B's place. B's former left subtree C, if any, is attached as new left subtree of D.) [D plus AVL LEFT] = C; (D has new left subtree C) (UPDATE 1.4. Make sure, that C really exists. Thanks to Jeff Marrison for spotting this bug.) ? C = minus 1 -> Avl Remove Node Case Two No Left Child; [C plus AVL PARENT] = D; (C's new parent is D) "Avl Remove Node Case Two No Left Child" (If B was the root, D becomes the new root now.) (UPDATE 1.3. Branch prediction optimization. Typically it is rare that B is the root: we use fall-through when it is not and only jump if it is. This requires that the originally following code fragment handling the root situation is moved down to the end of this code section.) ? E = minus 1 -> Avl Remove Node Case Two New Root; (-> Avl Remove Node Case Two No New Root;) ("Avl Remove Node Case Two No New Root") (B was not the root. D becomes child of E now, replacing B.) [D plus AVL PARENT] = E; (D's new parent is E) (B could have been E's left or right child. E's flag AVL FLAG TRAV LEFT tells us which it is.) ? [E plus AVL FLAGS] - AVL FLAG TRAV LEFT -> Avl Remove Node Case Two Relink Right; (-> Avl Remove Node Case Two Relink Left;) ("Avl Remove Node Case Two Relink Left") [E plus AVL LEFT] = D; (D becomes new left child of E) -> Avl Remove Node Case Two Rebalance; "Avl Remove Node Case Two Relink Right" [E plus AVL RIGHT] = D; (D becomes new right child of E) (-> Avl Remove Node Case Two Rebalance;) "Avl Remove Node Case Two Rebalance" (Also jumped to for new root below) (D was moved up to B's place. B's former balance depended from its left subtree, which was not changed, and its right subtree, whose root was D. Since D moved up, D is one lighter on the right side now than B used to be. This is no different for D having become a left or right child of E.) (*************************************************************************) (UPDATE 1.1: For the case that the caller wishes to use the unused FLAGS bits for additional user data, we may not transfer the whole FLAGS field, but may only mask in the relevant bits effectively used as flags.) (Was: [D plus AVL FLAGS] = [B plus AVL FLAGS]; Replaced by:) C -->; C = [B plus AVL FLAGS]; C & AVL FLAG ALL; [D plus AVL FLAGS] & ffff ffffh minus AVL FLAG ALL; [D plus AVL FLAGS] | C; <-- C; (*************************************************************************) E = D; => Avl Weight Right Lighter; -> Avl Remove Node Relinked; (*************************************************************************) (UPDATE 1.3 START. Code flow optimization. The remaining instructions within this code section were moved down from above. They handle the rare case that B is the root.) "Avl Remove Node Case Two New Root" (D becomes new root) [D plus AVL PARENT] = minus 1; (D has no parent any longer) [avl tree root] = D; (Register new root) (*************************************************************************) (UPDATE 1.2 START: When Case Two applies /and/ the node to be removed is the root, we jumped prematurely to label 'Avl Remove Node Relinked'. This is wrong: we still need to rebalance the tree, as it becomes lighter on the right side. Also rechecked Case One: the error does not apply there, because the left child of the removed node simply becomes the root then, and since there is no right child in that case, no further implications arise. In Case Three rebalancing is already done /before/ the 'new root situation' is handled, thus was always correct.) (Was: -> Avl Remove Node Relinked; Replaced by:) -> Avl Remove Node Case Two Rebalance; (Branch target is above) (UPDATE 1.2 END) (*************************************************************************) (UPDATE 1.3 END) (*************************************************************************) (-------------------------------------------------------------------------) "Avl Remove Node Case Three" (Approx 36% of all 3 cases) (The node to be deleted is in B. B's left child is in C, B's right child in D, B's parent in E. E may be -1, then B was the root. C may be -1. D and also D's left child A both exist, or we would not be here.) (This is the most complicated case of the three, where B's right child D has a left child A.) (We need an additional register, temporarily save the node to be deleted onto the stack.) B -->; (We find B's inorder successor, i.e. the node which has the next-higher key, and store its address into A [1]. Then we can replace B with new A. We know, that new A has no left subtree, because it is B's immediate successor; if there was a left subtree, then that one would be B's immediate successor, so A can not have a left child. Now, because A does not have a left child, we can simply replace A with its right child, thus detaching A. That detached A then takes B's place. [1] In A was D's left child, but we do not need it anylonger and can overwrite it with the routine's result. However, the successor can happen to be the left child of D itself, in which case A still has the same address as before.) (Get B's inorder successor into A. We don't need its old content anymore.) (B = B;) (The node for which we want succ.) => Avl Successor; (B's successor is in A) (Terminates with 'leave': do not query 'ok' or 'failed'!) (We don't need to check if there really was a successor: case three is defined partially by the existence of a right child of B, which in any case is larger and thus the first potential successor.) (Detach A from tree, i.e. link A's parent with A's right child such, that A's right child becomes A's parent's left child. A can't have a left child or that one would have been the successor instead of A. Get A's right child into C. We don't need the old content of C anymore. If there is no right child it will be -1. Also get A's parent into B. The original B containing the node to be deleted was saved onto the stack. If the successor A was the left child of D and thus the closest possible, then B and D are identical.) C = [A plus AVL RIGHT]; (A's right child: May be -1) B = [A plus AVL PARENT]; (A's parent: Can't be -1, may be D) [B plus AVL LEFT] = C; (Whether C exists or not) (UPDATE 1.3. Branch prediction optimization. If there is no right child, we can't give it a new parent link. Typically this is the case for about 5/6 of all nodes of case 3, i.e. only a minority will have such a child. Thus we reorganize such, that the more common case of having no child falls through the conditional branch, at the expense of moving a very short code fragment down to this code segment's end and after execution resuming by unconditionally branching up to 'Avl Remove Node Case Three Succ No Right' in order to proceed.) ? C != minus 1 -> Avl Remove Node Case Three Succ Has Right; (-> Avl Remove Node Case Three Succ No Right;) "Avl Remove Node Case Three Succ No Right" (Also jumped to from below) (Detached A's right child C may or may not exist. If C exists, it can't have any children, because that would have caused A to be rightheavy and a former rotation would have fixed the situation. If C exists, though, it replaces detached A, making the tree for ancestors of detached A shorter. If it does not exist, A is removed without replacement, making the tree for ancestors of removed A shorter as well. Thus, removal of A in /any/ case leads to a lighter left side for A's ancestor. This ancestor of A is still in B and may be identical with D, depending on how far away the successor was placed from D.) E -->; (Routine expects argument in E) E = B; (Ancestor of detached A) => Avl Weight Left Lighter; (May restructure whole tree) (Terminates with 'leave': do not query 'ok' or 'failed'!) <-- E; (Above rebalancing of ancestors of A may have caused B and D to point to wrong nodes now, namely if rotations were executed; in fact, a large deal of the tree may have been reorganized by multiple rotations, including the tree root and both its subtrees. However, B as ancestor of detached A is not needed anymore, but C and D, the children of the node to be removed, and also E, its parent all must be refetched. Since the node to be removed is on the top of stack, we need to get it back first: it is restored into B.) (Retrieve temporarily saved node to remove and determine new relatives.) <-- B; (Restore deleted node B) C = [B plus AVL LEFT]; (Possibly new left child of B) D = [B plus AVL RIGHT]; (Possibly new right child of B) (UPDATE 1.3. Instruction scheduling optimization.) [A plus AVL LEFT] = C; (Moved up from below) E = [B plus AVL PARENT]; (Possibly new parent of B) (The node to be removed B is now replaced with the detached successor A. This means that C becomes A's left child and D becomes A's right child.) (UPDATE 1.3. Instruction scheduling optimization. We have a Write/Write- stall in the 3rd and 4th line below, and since we have 2 reads of A and 2 writes of A we can't get rid of it: some kind of stall would remain. So we moved the 3rd line up to be processed earlier.) [C plus AVL PARENT] = A; [D plus AVL PARENT] = A; ([A plus AVL LEFT] = C;) (Moved up to above) [A plus AVL RIGHT] = D; (Because the balances of the ancestors of formerly detached and now reintegrated A already were updated, encomprising also the now removed node B, the reintegrated node A simply inherits B's already up-to-date balance. Even if B was relinked during balance update, its address is available in B again and the flags still apply, because A is its direct successor. No further adjustments are needed.) (************************************************************************) (UPDATE 1.1 START: For the case that the caller wishes to use the unused FLAGS bits for additional user data, we may not transfer the whole FLAGS field, but may only mask in the relevant bits effectively used as flags.) (Was: [A plus AVL FLAGS] = [B plus AVL FLAGS]; Replaced by:) C -->; C = [B plus AVL FLAGS]; C & AVL FLAG ALL; [A plus AVL FLAGS] & ffff ffffh minus AVL FLAG ALL; [A plus AVL FLAGS] | C; <-- C; (UPDATE 1.1 END) (************************************************************************) (E becomes A's new parent, even when E is not existing.) [A plus AVL PARENT] = E; (If B was the root, then A becomes the root now. B was the root if its parent E is -1.) (UPDATE 1.3. Branch prediction optimization. It typically is rare that we remove the root. Therefore we reschedule such, that non-root nodes fall through. This makes it neccessary to move down the code fragment handling the root-case.) ? E = minus 1 -> Avl Remove Node Case Three Was Root; (-> Avl Remove Node Case Three Not Root;) ("Avl Remove Node Case Three Not Root") (B was E's left or right child. That place is taken by A now. We can't use the traversal flag AVL FLAG TRAV LEFT here, the tree may have been modified in substantial parts. B still has the address of E's former, now removed child: we can check E's left or right field to determine which child it was and relink the new child A accordingly.) ? B = [E plus AVL RIGHT] -> Avl Remove Node Case Three Relink Right; (-> Avl Remove Node Case Three Relink Left;) ("Avl Remove Node Case Three Relink Left") [E plus AVL LEFT] = A; (A becomes new left child of E) -> Avl Remove Node Relinked; "Avl Remove Node Case Three Relink Right" [E plus AVL RIGHT] = A; (A becomes new right child of E) (-> Avl Remove Node Relinked;) (-------------------------------------------------------------------------) (All three cases meet here, after that all neccessary relinks have been done. There is only one exception in case 1, when the removal of the node has left us with a now empty tree: in that case we jumped directly to label 'Avl Remove Node Compact'. For the others which meet here: B still has the node to be removed, but it was detached from the AVL tree and completely isolated, i.e. it would not be found anymore when searched for.) "Avl Remove Node Relinked" (We have one element less now in the AVL tree. Because we want the whole tree without any fragmentation, we fill the gap now with the /bottommost/ node, then compact the tree. If the removed node B was the bottommost, then no further relinks are neccessary, the space just can be given up by modifying [avl tree extent]. Also no relinks can be done, when there are no more nodes, but this we handled already above in case 1, where we jump directly to the Compact label. The percentage of nodes jumping is pretty considerable: half of all nodes jump here.) ? B = [avl tree extent] -> Avl Remove Node Compact; (There exists an active node at [avl tree extent], which needs to be moved to that space, which used to be occupied by B. Because relationship pointers are addresses, this will make a relink of that node neccessary.) C = [avl tree extent]; (Move away from here) [B plus AVL KEY] = [C plus AVL KEY]; [B plus AVL DATA] = [C plus AVL DATA]; [B plus AVL LEFT] = [C plus AVL LEFT]; (Relative is unchanged!) [B plus AVL RIGHT] = [C plus AVL RIGHT]; (Relative is unchanged!) [B plus AVL PARENT] = [C plus AVL PARENT]; (Relative is unchanged!) [B plus AVL FLAGS] = [C plus AVL FLAGS]; (includes possible userdata) ([C plus AVL LEFT] and [C plus AVL RIGHT] refer to the children, which still are in place, so that we don't need to modify these fields. But we must tell the children, if existing, that their parent was relocated. There is no point to predict anything here: Typically, half all nodes have in average one such children, but they of course may have both or also none.) ? [B plus AVL LEFT] = minus 1 -> Avl Remove Node Relocate No Left; D = [B plus AVL LEFT]; (Get address of left child) [D plus AVL PARENT] = B; (Tell child about parent address) "Avl Remove Node Relocate No Left" ? [B plus AVL RIGHT] = minus 1 -> Avl Remove Node Relocate No Right; D = [B plus AVL RIGHT]; (Get address of right child) [D plus AVL PARENT] = B; (Tell child about parent address) "Avl Remove Node Relocate No Right" (Similarly, the parent of B refers to a node which still is at its old place, so we don't need to modify that field. But we must tell the parent, that its child was relocated. That child can be the parent's left or the right child. If the relocated node is the root, we need to set the system address, but then no parent exists to be informed.) D = [B plus AVL PARENT]; (Get address of parent, may be -1) (UPDATE 1.3. Branch prediction optimization. The chance to relocate the root is marginal if even encountered at all in ordinary tree usage: the vast majority shall fall through. Once again, we move the relevant code fragment down so that it is out of our highway and enables fall-through.) ? D = minus 1 -> Avl Remove Node Relocate Is Root; (-> Avl Remove Node Relocate Not Root;) "Avl Remove Node Relocate Not Root" (The parent D needs to know its child's new position B, it still thinks it's at C, the old bottommost node. It can be the left or the right child, 50% chance for either case.) ? C = [D plus AVL RIGHT] -> Avl Remove Node Relocate Parent Right; [D plus AVL LEFT] = B; (Parent's left child's new addr.) -> Avl Remove Node Compact; (Reduce tree memory) "Avl Remove Node Relocate Parent Right" [D plus AVL RIGHT] = B; (Parent's right child's new addr.) (-> Avl Remove Node Compact;) (Reduce tree memory) (-------------------------------------------------------------------------) (The tree area has been compacted, i.e. the former node at library variable [avl tree extent] has been moved to B, the position of the just removed node, and the neccessary pointers have been updated. The position at [avl tree extent] is obsolete now and can be freed for further usage. Note, that it will /not/ be nulled: should that position be occupied again by a newly added node, we don't need a clean record. If the library user makes use of the space between [avl tree bottom] and [avl tree extent], it is in his responsibility to clean the area, when he expects a such to be nulled.) "Avl Remove Node Compact" (All nodes incl. root emptying tree) [avl tree extent] + AVL NODE SIZE; (+, not -; shrinking is upwards) (=========================================================================) (EXIT) "Avl Remove Node Exit OK" <-- E; <-- D; <-- C; <-- B; <-- A; [avl error] = AVL ERR NONE; end; "Avl Remove Node Exit Failed" <-- E; <-- D; <-- C; <-- B; <-- A; fail; (=========================================================================) (UPDATE 1.3 START. Error handlers moved down from inline handling for Branch Prediction Optimization.) "Avl Remove Node Init Not Ok" [avl error] = AVL ERR NO INIT; -> Avl Remove Node Exit Failed; "Avl Remove Node No Root" [avl error] = AVL ERR EMPTY TREE; -> Avl Remove Node Exit Failed; "Avl Remove Node Not Existing" [avl error] = AVL ERR NO SUCH KEY; -> Avl Remove Node Exit Failed; (UPDATE 1.3 END.) (=========================================================================) (UPDATE 1.3 START. Code flow optimization. Moved down from above to handle the rare case that there is a child C which needs to be relinked to parent B.) "Avl Remove Node Case Three Succ Has Right" [C plus AVL PARENT] = B; (New parent for right kid of E) -> Avl Remove Node Case Three Succ No Right; (Target is above) (-------------------------------------------------------------------------) (Moved down from above to handle the rare case that we remove the root in case 3.) "Avl Remove Node Case Three Was Root" (A becomes the new root.) [A plus AVL PARENT] = minus 1; (A has no parent anymore) [avl tree root] = A; (Register new root) -> Avl Remove Node Relinked; (Target is above) (-------------------------------------------------------------------------) (Handling the exceptional case that we need to relocate the root to keep the tree compact.) "Avl Remove Node Relocate Is Root" (The relocated node currently is the root. No parent to inform, value remains -1, but the library needs to know the root's new position.) [avl tree root] = B; (New root position) -> Avl Remove Node Compact; (Reduce tree memory, see above) (UPDATE 1.3 END) (=========================================================================) (*****************************************************************************) "Avl Find Node" (=========================================================================) (Finds an existing node by Key. Returns the content of its data field.) (-------------------------------------------------------------------------) (Input: E: Key of the node. The key is a 32 bit integer. Output: D: 32 bit Data associated with the key when inserting node. All registers except D unchanged. If routine returns with an error, then even D is unchanged. May return errors, check [avl error] after 'failed'. It may contain: AVL ERR NO INIT: The library was not initalized with 'Avl Init'. AVL ERR EMPTY TREE: There are no nodes at all in the tree. AVL ERR NO SUCH KEY: A node with the requested key does not exist. ) (=========================================================================) (ENTRY) B -->; (=========================================================================) (VERIFICATION) (Verify, that successfully initialized with 'Avl Init'. If not, leave with an error: we would likely crash if we operated with nonsensical pointers.) (UPDATE 1.3. Branch prediction optimization. Handler at bottom of routine.) ? [avl status] - AVL STATUS INIT -> Avl Find Node Init Not Ok; (-------------------------------------------------------------------------) (If the tree is empty, there is no node to be found. The tree is empty if there is no root. There is no root, if the status bit AVL STATUS ROOT is clear. This bit is clear just after 'Avl Init' before the insertion of the first node, and also when the last node has been deleted.) ? [avl status] - AVL STATUS ROOT -> Avl Find Node Root Not Existing; (-------------------------------------------------------------------------) (There already exist nodes. Search the node to be found. When the node can be found, the flag AVL STATUS FOUND will be set in the status word. If this flag is not set, we leave with error AVL ERR NO SUCH KEY: we can not return data for not existing nodes.) (E = E;) (Key as per input) => Avl Locate Node; (B has found node or pot. parent) (Terminates with 'leave': do not query 'ok' or 'failed'!) ? [avl status] - AVL STATUS FOUND -> Avl Find Node Node Not Existing; (=========================================================================) (PROCESS) ("Avl Find Node Exists") (A node with key E does exist, its address is in B. Load its data into D and return successfully to caller. B won't be unveiled to caller: the address may change.) D = [B plus AVL DATA]; (-> Avl Find Node Exit OK;) (=========================================================================) (EXIT) "Avl Find Node Exit OK" <-- B; [avl error] = AVL ERR NONE; end; "Avl Find Node Exit Failed" <-- B; fail; (=========================================================================) (UPDATE 1.3 START. Error handlers moved down from inline handling for Branch Prediction Optimization.) "Avl Find Node Init Not Ok" [avl error] = AVL ERR NO INIT; -> Avl Find Node Exit Failed; "Avl Find Node Root Not Existing" [avl error] = AVL ERR EMPTY TREE; -> Avl Find Node Exit Failed; "Avl Find Node Node Not Existing" [avl error] = AVL ERR NO SUCH KEY; -> Avl Find Node Exit Failed; (UPDATE 1.3 END.) (=========================================================================) (*****************************************************************************) "Avl Locate Node" (=========================================================================) (This routine is used internally. It assumes, that we already checked for a proper initialization. It also assumes, that at least 1 node, the root, already exists. If this restriction is observed by a client, the routine can be used safely also from the outside world.) (-------------------------------------------------------------------------) (Follows the tree from the root down to the node of the searched key. If a node with the given key can not be found, then that node is provided which would become the new node's parent, when we would insert it.) (-------------------------------------------------------------------------) (Input: E: Key of the node. The key is a 32 bit integer. Output: B: Address of found node, or of potential parent node. [avl status]: Contains the information if the searched node was found [the flag AVL STATUS FOUND is set], or if it only is the potential parent [cleared flag]. All registers except B unchanged. Causes no errors. Terminates with 'leave': do not query 'ok' or 'failed'! ) (=========================================================================) (ENTRY) C -->; (=========================================================================) (PROCESS) (Clear the 'found' flag and the 'left' flag from previous operations.) [avl status] & ffff ffffh minus AVL STATUS FOUND minus AVL STATUS LEFT; (B is loaded with the root, then with the node of the appropriate subtree that we take, until we either find the key or the parent under which it would be inserted. The two cases are distinguished by the AVL STATUS FOUND flag.) B = [avl tree root]; (-------------------------------------------------------------------------) (If the key matches the input, we found the node and leave.) "Avl Locate Node Test" (Loop start: Jumped to from below) ? E = [B plus AVL KEY] -> Avl Locate Node Found; (-------------------------------------------------------------------------) (Keep this section efficient! It typically executes millions of times.) (Decide, what subtree to follow: Left if E is smaller than the current node's key, right when B is larger. Since this is a height-balanced AVL tree, the chance is approx. 50% for either side.) ? E > [B plus AVL KEY] -> Avl Locate Node Follow Right; (Follow Left, if child exists. If not, the actual node is a potential parent for the insertion of a /left/ node. The direction is stored into the status word for evaluation by the internal caller.) ("Avl Locate Node Follow Left") [avl status] | AVL STATUS LEFT; (We took the left subtree) C = [B plus AVL LEFT]; (Address left subtree or -1) [B plus AVL FLAGS] | AVL FLAG TRAV LEFT; (followed left) -> Avl Locate Node Followed; (Follow Right, if child exists. If not, the actual node is a potential parent for the insertion of a /right/ node. The direction is stored into the status word for evaluation by the internal caller.) "Avl Locate Node Follow Right" [avl status] & ffff ffffh minus AVL STATUS LEFT; (took right subtree) C = [B plus AVL RIGHT]; (Address right subtree or -1) [B plus AVL FLAGS] & ffff ffffh minus AVL FLAG TRAV LEFT; (right) (-> Avl Locate Node Followed;) "Avl Locate Node Followed" (Both branches meet here) (-------------------------------------------------------------------------) (If C is -1, there is no such node as searched. The node at which it would be attached as a left or right child is B. If it would be B's left child, the flag AVL STATUS LEFT in the [avl status] word is set; if it would be the right child, the flag is clear. That's all the client needs to know: We leave.) ? C = minus 1 -> Avl Locate Node Exit OK; (C was not -1, thus it is the node to be made the active node now. From there, everything repeats.) B = C; (New active node) -> Avl Locate Node Test; (Test for match) (-------------------------------------------------------------------------) (When this routine finds the searched node, a flag is set. This flag then is interpreted by the caller: it can be good or bad to find a node.) "Avl Locate Node Found" [avl status] | AVL STATUS FOUND; (-> Avl Locate Node Exit OK;) (=========================================================================) (EXIT) "Avl Locate Node Exit OK" <-- C; [avl error] = AVL ERR NONE; leave; (No 'end: never fails) (=========================================================================) (*****************************************************************************) "Avl Successor" (=========================================================================) (This routine is used internally. It assumes, that we already checked for a proper initialization.) (-------------------------------------------------------------------------) (Finds a node's inorder successor. NOTE: Do not use externally to find a node's real successor: Usually we need to distinguish two cases, namely if there is a right child or not. However, in the circumstances we call this routine, there /always/ is a right child.) (-------------------------------------------------------------------------) (Input: B: Address of the node. Output: A: Address of successor node, or -1 if there is none. All registers except A unchanged. Causes no errors. Terminates with 'leave': do not query 'ok' or 'failed'! ) (=========================================================================) (PROCESS) (The inorder successor is just the leftmost node in B's right subtree. Get B's right subtree. If it does not exist, leave with -1 as result.) A = [B plus AVL RIGHT]; ? A = minus 1 -> Avl Successor Node Exit OK; (From there, follow left nodes as long as existing.) (-------------------------------------------------------------------------) "Avl Successor Follow Left" (Loop start) ? [A plus AVL LEFT] = minus 1 -> Avl Successor Node Exit OK; A = [A plus AVL LEFT]; -> Avl Successor Follow Left; (-------------------------------------------------------------------------) (EXIT) "Avl Successor Node Exit OK" [avl error] = AVL ERR NONE; leave; (No 'end': never fails) (=========================================================================) (*****************************************************************************) "Avl Rebalance" (=========================================================================) (This routine is used internally. It assumes, that we already checked for a proper initialization. Don't use directly from anywhere outside of this library: a crash will inevitably occur.) (-------------------------------------------------------------------------) (Rebalances a too heavy subtree.) (-------------------------------------------------------------------------) (Input: A: Address of too heavy node. Output: - All registers unchanged. Causes no errors. Terminates with 'leave': do not query 'ok' or 'failed'! ) (=========================================================================) (ENTRY) A -->; B -->; C -->; D -->; E -->; (=========================================================================) (PROCESS) (The rotations make a relink of the heavy node A neccessary. It will be detached from its parent, unless its the root itself, of course. In that case we will receive a new root. A's parent is stored into E.) E = [A plus AVL PARENT]; (A's parent, or -1 if A is root) ? E = minus 1 -> Avl Rebalance Parent Ok; (A is not the root node. Determine, if A is the parent's left or right child. Store the info into the status word's flag AVL STATUS LCOP. 'LCOP' means 'Left Child Of Parent'.) (*************************************************************************) (UPDATE 1.3 START. Code flow optimization. All nodes had a conditional branch, all left children an additional unconditional jump. The latter was removed by rearranging the code fragment.) [avl status] & FFFF FFFFh minus AVL STATUS LCOP; (Assume right) ? A = [E plus AVL RIGHT] -> Avl Rebalance Parent Ok; [avl status] | AVL STATUS LCOP; (Admit left) (UPDATE 1.3 END) (*************************************************************************) "Avl Rebalance Parent Ok" (A's relationship to its parent is as follows: E = -1: A is root, has no parent E != -1 and AVL STATUS LCOP set A is E's left child E != -1 and AVL STATUS LCOP clear A is E's right child) (-------------------------------------------------------------------------) (The node at address A is heavy. Which rotations need to be executed to rebalance this subtree depends on the balance direction of node at A /and also/ on the balance of the outer child of A. The outer child is the left child for leftheavy A, or the right child for rightheavy A, resp. Determine the needed rotation.) ? [A plus AVL FLAGS] - AVL FLAG UNB LEFT -> Avl Rebalance Right; (-> Avl Rebalance Left;) (-------------------------------------------------------------------------) (A is leftheavy, examine left child's balance.) "Avl Rebalance Left" C = [A plus AVL LEFT]; (In insertions, C can only be left-balanced or right-balanced, not absolutely balanced. However, in deletions it also can be absolutely balanced: this can be handled with LL, only that the heights for A and C won't be set to 0 each, but to LB and RB, resp.) (HEAVY x x x x UNB LEFT 0 0 1 1 UNBALANCED 0 1 0 1 Rotation LL LR LL LL) ? [C plus AVL FLAGS] - AVL FLAG UNBALANCED -> Avl Rebalance LL; ? [C plus AVL FLAGS] - AVL FLAG UNB LEFT -> Avl Rebalance LR; -> Avl Rebalance LL; (-------------------------------------------------------------------------) (A is rightheavy, examine right child's balance.) "Avl Rebalance Right" C = [A plus AVL RIGHT]; (The child C can only be left-balanced or right-balanced, not absolutely balanced. Examination of one flag suffices to tell about the needed rebalance rotation.) (In insertions, C can only be left-balanced or right-balanced, not absolutely balanced. However, in deletions it also can be absolutely balanced: this can be handled with RR, only that the heights for A and C won't be set to 0 each, but to RB and LB, resp.) (HEAVY x x x x UNB LEFT 0 0 1 1 UNBALANCED 0 1 0 1 Rotation RR RR RR RL) ? [C plus AVL FLAGS] - AVL FLAG UNBALANCED -> Avl Rebalance RR; ? [C plus AVL FLAGS] - AVL FLAG UNB LEFT -> Avl Rebalance RR; -> Avl Rebalance RL; (-------------------------------------------------------------------------) "Avl Rebalance LL" (A is leftheavy, child C is left-balanced) (Next to nodes E, A and C, also C's right child, if existing, will be relinked. It is stored in D.) D = [C plus AVL RIGHT]; (may be -1 if not existing) (Changes: Node Parent Left Right E unchanged old:A, new:C A old:E or root, new:C old:C, new:D unchanged C old:A, new:E or root unchanged old:D,new:A D old:C, new A unchanged unchanged) (If A was root, then C becomes root now) (UPDATE 1.3. Branch prediction optimization. It is a rare case that A was the root. Using fall-through for the major case, and moving the code fragment updating the root to the end of the routine.) ? E = minus 1 -> Avl Rebalance LL New Root; (A was not root) ? [avl status] - AVL STATUS LCOP -> Avl Rebalance LL2; [E plus AVL LEFT] = C; (ELeft <-- C) -> Avl Rebalance LL3; "Avl Rebalance LL2" [E plus AVL RIGHT] = C; (ERight <-- C) (-> Avl Rebalance LL3;) "Avl Rebalance LL3" (UPDATE 1.3. Stall optimization. Rearranged the following 4 instructions to remove 2 WWS and 1 WRS for 1 WRS only.) [A plus AVL PARENT] = C; [C plus AVL PARENT] = E; (C --> E, E is -1 if C is new root) [A plus AVL LEFT] = D; [C plus AVL RIGHT] = A; ? D = minus 1 -> Avl Rebalance LL4; [D plus AVL PARENT] = A; (A <-> D done) "Avl Rebalance LL4" (All meet here again) (New weights:) (A was leftheavy, C was left-balanced. Both are absolutely balanced now. Only in deletions it is possible, that A was leftheavy, but C absolutely balanced: in this situation A becomes left-balanced, and C right-balanced.) ? [C plus AVL FLAGS] - AVL FLAG UNB MASK -> Avl Rebalance LL5; (A was leftheavy, C was left-balanced. Both are absolutely balanced now.) [A plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; [C plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; -> Avl Rebalance Exit OK; "Avl Rebalance LL5" (A was leftheavy and becomes left-balanced now, C was absolutely balanced and becomes right-balanced now.) [A plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; [C plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; [A plus AVL FLAGS] | AVL FLAG LEFT BALANCED; [C plus AVL FLAGS] | AVL FLAG RIGHT BALANCED; -> Avl Rebalance Exit OK; (-------------------------------------------------------------------------) "Avl Rebalance LR" (A is leftheavy, child C is right-balanced) (This involves a /double/ rotation: Left around C and right around A. Both are compacted into one go, but we need a register more than actually available, thus E, initially the parent, is re-used. See follwing remarks.) (Next to nodes E, A and C, also C's right child will be relinked. It is stored in D. Furthermore we need D's left child, if existing, stored in B, and D's right child, if existing. If the latter does exist, it is relinked after the parent has been handled, using register E for it. In the 'changes diagram' below, it is denoted with E2, while the parent is labeled E1.) D = [C plus AVL RIGHT]; B = [D plus AVL LEFT]; (may be -1 if not existing) (Changes: Node Parent Left Right E1 unchanged old:A, new:D A old:E1 or root, new:D old:C, new:E2 unchanged B old:D, new:C unchanged unchanged C old:A, new:D unchanged old:D, new:B D old:C, new E1 or root old:B, new:C old:E2, new:A E2 old:D, new A unchanged unchanged) (If A was root, then D becomes root now) (UPDATE 1.3. Branch prediction optimization. It is a rare case that A was the root. Using fall-through for the major case, and moving the code fragment updating the root to the end of the routine.) ? E = minus 1 -> Avl Rebalance LR New Root; (A was not root) ? [avl status] - AVL STATUS LCOP -> Avl Rebalance LR2; [E plus AVL LEFT] = D; (ELeft <-- D) -> Avl Rebalance LR3; "Avl Rebalance LR2" [E plus AVL RIGHT] = D; (ERight <-- D) (-> Avl Rebalance LR3;) "Avl Rebalance LR3" [D plus AVL PARENT] = E; (D --> E, E is -1 if D is new root) (Now E will be loaded with D's right child, taking the role E2 as per above diagram) E = [D plus AVL RIGHT]; (E2, not the old parent of A!) (Relink the rest) (UPDATE 1.3. Stall optimization. Rearranged the following 6 instructions to remove 3 Write/Write-Stalls.) [A plus AVL PARENT] = D; [C plus AVL PARENT] = D; [A plus AVL LEFT] = E; [D plus AVL LEFT] = C; [C plus AVL RIGHT] = B; [D plus AVL RIGHT] = A; ? B = minus 1 -> Avl Rebalance LR4; [B plus AVL PARENT] = C; (Does not neccessarily exist) "Avl Rebalance LR4" ? E = minus 1 -> Avl Rebalance LR5; [E plus AVL PARENT] = A; (Does not neccessarily exist) "Avl Rebalance LR5" (All meet here) (New weights:) (C was right-balanced. If D was right-balanced, then C will become left- balanced now, if it was left-balanced, then C will become absolutely balanced. If D was leaf and had no sub-trees, it was absolutely balanced and also C will become absolutely balanced.) [C plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; (A was left-heavy. If D was right-balanced, then A will become absolutely balanced now; if D was left-balanced, then A will become right-balanced. If D was leaf and has no subtrees, then it was absolutely balanced and also A will become absolutely balanced.) [A plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; (Now the dependencies on D's former balance) ? [D plus AVL FLAGS] - AVL FLAG UNBALANCED -> Avl Rebalance LR7; ? [D plus AVL FLAGS] - AVL FLAG UNB LEFT -> Avl Rebalance LR6; [A plus AVL FLAGS] | AVL FLAG RIGHT BALANCED; -> Avl Rebalance LR7; "Avl Rebalance LR6" [C plus AVL FLAGS] | AVL FLAG LEFT BALANCED; "Avl Rebalance LR7" (D was left- or right-balanced: it is absolutely balanced in both cases now. B and E2 are unchanged.) [D plus AVL FLAGS] & ffff ffffh minus AVL FLAG UNB MASK; -> Avl Rebalance Exit Ok; (-------------------------------------------------------------------------) "Avl Rebalance RL" (A is rightheavy, child C is left-balanced) (This involves a /double/ rotation: Right around C and left around A. Both are compacted into one go, but we need a register more than actually available, thus E, initially the parent, is re-used. See follwing remarks.) (Next to nodes E, A and C, also C's left child will be relinked. It is stored in D. Furthermore we need D's right child, if existing, stored in B, and D's left child, if existing. If the latter does exist, it is relinked after the parent has been handled, using register E for it. In the 'changes diagram' below, it is denoted with E2, while the parent is labeled E1.) D = [C plus AVL LEFT]; B = [D plus AVL RIGHT];