B tree family rules 1) B tree of order m is an m-way search tree\\ B+ tree of order m is an m-way search trie.\\ B* tree of oirder m is an m-way search trie. The rules below apply to both B trees, B+ trees and B* trees. As you should have come to expect, the keys are in every node of the B tree of order m. The B+ tree and B*~tree keys are all in the leaves; the internal nodes contain guide values to direct search to the correct leaf. 2) all leaf nodes appear at the same level 3) root is either a leaf or has at least two non-empty subtrees 4) all internal nodes of a B tree of order m or a B+ tree of order m must have at least ceiling(m/2) and at most m non-empty subtrees. all internal nodes of a B* tree of order m must have at least ceiling(2m/3) non-empty subtrees and atr most m non-empty subtrees. 5) all leaf nodes of a B tree of order m or a B+ tree of order m are at "least half full", and those of a B* tree of order m are "at least two thirds full" For the B and B+ trees of order m, this is made more precise by saying that a leaf must contain at least ceiling(p/2) and at most p keys ---the real data values, and associated record or record pointer, where there are no restrictions on the values of p and m, other than keeping the nodes at least half full. For the B* trees of order m, this is made more precise by saying that a leaf must contain at least ceiling(2p/3) and at most p keys ---the real data values, and associated record or record pointer, where there are no restrictions on the values of p and m, other than keeping the nodes at least two thirds full. Please, be self consistent. note: since the b+ tree leaf nodes that you will implement will not have children permitted, there is a reasonable argument that the B+ tree leaves are techically extensions of a k-ary tree of order m (or an m-ary tree). since the lowest level nodes contain the true keys, and each node is to be treated as a page in memory, we really have two different types of objects: internal nodes (also called guide nodes) and external nodes (also called leaf nodes) Now, a bit more clarification--the previous points were the rules that the both the B tree and B+ tree has to satify. Since B+ trees are the most commonly used for data bases and directory structures we will focus on them first. Warning: you need to read shaffer no offense implied, you need to read the whole section of the chapter--not just look at the picture and read the pseudo code--to understand why the number of KEYSs (corresponding to actual data records) in the leaves may be different than the number of GUIDE values in internal nodes. however, before we can do that, we need to get some nomenclature regarding IMPLEMENTATION a B+ tree of order m. 1) internal nodes and external nodes are devised to take one page of main memory. that is, the order, m, of the B+ tree is selected so that the amount of space required by all the stuff in the node is as close to a page as possible, without exceeding a page. (to find out what a page is, read some of the page-specific email available, or go visit google using "virtual memory page faults" or " memory page faults" 2) internal nodes contain space for "pointers" or "references" or LINKS to m child nodes, which can be either internal or external nodes. internal nodes also contain space for the values, called GUIDE values, that are used to determine which of the m child subtrees of the node will be accessed next. So, space for m-1 of these guides is needed. 3) external nodes contain space for the actual KEY values and associated data record or data record pointer. the question is, how much? that is determined by the size of the KEY and the size of the data record or data record pointer So, the only specification is that the leaf must be half full. I typically say that we assume that each leaf can contain, say, p KEYs, where p need not have any relation to m, the order of the B+ tree 4) To make B+ tree maintenance easier, nodes on each level are typically connected as a doubly linked lists. This has the added advantage of supporting range queries. note: this is all consistent with shaffer, and will match the part 2 spec when i get it fixed.