In your "Anatomy of an Index" section, you outline exactly how B-Tree indexes are traversed. This is exemplified in this picture (Figure 1.3):
To me, it seems odd that each entry is processed in ascending order. You also state that each node packs as many entries as will fit on the page, which could be thousands (if, say, only a single 32-bit integer were being indexed). It would seem that there is no payoff in using an index until you have more than one node to search, since the efficiency would be O(n) either way.
I'm curious as to why databases don't locate entries in a node using a binary search algorithm instead. Is there a reason why this wouldn't work, or does processing each entry in ascending order offer some advantage that I'm not seeing? Thanks!
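To make the question concrete, here is a hedged sketch of the two search strategies within a single node's sorted entries. The node layout and key values are made up for illustration and are not taken from any particular database engine:

```python
import bisect

# Hypothetical node: keys held in sorted order, as the chapter describes.
node_keys = [4, 9, 13, 18, 27, 35, 42, 57, 64, 78]

# Linear scan in ascending order: O(n) comparisons in the worst case.
def linear_find(keys, target):
    for i, k in enumerate(keys):
        if k >= target:
            return i
    return len(keys)

# Binary search: O(log n) comparisons via the stdlib bisect module.
def binary_find(keys, target):
    return bisect.bisect_left(keys, target)

print(linear_find(node_keys, 35))   # 5
print(binary_find(node_keys, 35))   # 5
```

Both return the same position; the question is only about how many comparisons it takes to get there.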
asked Sep 28 '12 at 22:41
Absolutely correct. That's the reason why databases often refuse to use indexes on very small tables.
Well, I guess I should write this somewhere in the book as well: Chapter 1, "Anatomy of an Index", only describes indexes to the level needed to properly understand how to use them. Limiting it to this purpose allowed me to skip many details. In other words: it is a model for understanding indexes from the user's point of view, but it's not sufficient to implement a database.
Databases may or may not use a binary search, or even sub-B-Trees, or whatever inside the index nodes. I don't know. These things are usually not documented, and I don't think you need to know them to use indexes properly.
What's best (binary search, linear search, whatever) probably depends on micro-optimizations like cache-line alignment and branch prediction. The last time I was involved in optimizations at that level was a decade ago, so I don't have any more details on that. But drop us a note if you find something.
answered Sep 29 '12 at 14:22
Markus Winand ♦♦
Just one data point: a quick look at the PostgreSQL 9.3 source (
answered Jul 11 '13 at 09:08
Because to do that binary search (a.k.a. dichotomy) you need to know where the rows are located on disk, which you don't unless you have an index.
And if you meant doing the dichotomy on the index itself, it doesn't work (on big tables) because you can't maintain one big contiguous index (as Markus explains in one of his chapters). You need to update the index (add & remove entries), which would have a tremendous cost on a big contiguous index.
That's why indexes are fragmented into a tree. Each leaf of this tree is cheap to read, cache, and update.
By the way, this tree is the same idea as the dichotomy, except that there are n items per iteration instead of just 2.
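That "n items per iteration" point can be made concrete: with b entries per node, each level of the tree eliminates all but roughly 1/b of the remaining entries, so a lookup visits about log_b(N) nodes. A toy sketch, with illustrative numbers and a made-up function name:

```python
# With b entries per node, each step narrows the search to one of b
# children, so a lookup touches about log_b(N) nodes.
def tree_depth(total_entries, entries_per_node):
    depth = 0
    remaining = total_entries
    while remaining > 1:
        # ceiling division: how many nodes the remaining entries occupy
        remaining = (remaining + entries_per_node - 1) // entries_per_node
        depth += 1
    return depth

# Binary search is the b = 2 special case; a wide B-Tree node is b >> 2.
print(tree_depth(1_000_000, 2))     # 20 steps for a plain binary search
print(tree_depth(1_000_000, 1000))  # 2 node visits for a wide B-Tree
```

This is why wide nodes pay off even though the search within one node may be linear: the dominant cost is the number of pages touched, not the comparisons inside a page.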