In your Anatomy of an Index section, you outline exactly how B-Tree indexes are traversed. This was exemplified in this picture (Figure 1.3): To me, it seems odd that each entry is processed in ascending order. You also state that each node packs as many entries as will possibly fit on the page, which could be thousands (if, say, only a single 32bit integer was being indexed). To me, it would seem that there would be no payoff using an index until you had more than one node to search, as the efficiency would be O(n) either way. I'm curious as to why databases don't locate entries in a node using a binary search algorithm instead. Is there a reason why this wouldn't work, or does processing each entry in ascending order offer some advantage that I'm not seeing? Thanks! asked Sep 28 '12 at 22:41 Hiawatha |
Hi all, just one data point: a quick look at the PostgreSQL 9.3 source ( Cheers, answered Jul 11 '13 at 09:08 Teggy |
Because to do that binary search (aka dichotomy) you need to know where rows are located on the disk. Which you don't unless you have an index. And if you meant doing the dichotomy on the index, it doesn't work (on big tables) because you can't create big continuous index (as Markus explained in one of his chapters). You need to update the index (add & remove columns), which would have a tremendous price on big contiguous indexes. That's why the indexes are fragmented into a tree. Each leaf of this tree is cheaply readable, cachable, and updatable. By the way this tree is the same idea than the dichotomy, except that there are n items per iteration and not just 2. answered Jul 06 '13 at 14:09 Simon |