principleMinor
Is it possible to use a copy-on-write strategy to modify a B+ tree?
Viewed 0 times
writepossiblestrategyusetreemodifycopy
Problem
Alright, I'm not sure if this is more of a stack overflow question, but I'm going to try here because you folks seem more suited.
CouchDB makes an interesting claim about using an "append only" B+ tree to index its documents. Specifically...
"In a B-tree, data is kept only in leaf nodes. CouchDB B-trees append
data only to the database file that keeps the B-tree on disk and grows
only at the end. Add a new document? The file grows at the end. Delete
a document? That gets recorded at the end of the file."
and
"CouchDB is actually using a B+ tree, which is a slight variation of
the B-tree that trades a bit of (disk) space for speed. When we say
B-tree, we mean CouchDB’s B+ tree."
SOURCE: CouchDB Documentation
This paper, describes the problem with an "append only" or "copy on write" implementation of a B+ tree. It suggests that such optimistic concurrency strategies are only available to a standard B tree. Specifically...
"In a regular b-tree leaves are chained together. This is used for
tree rebalancing and range lookups. In a b-tree that is updated
using copy-on- write leaves cannot be linked together. For example,
Figure 2 shows a tree whose rightmost leaf node is C and where the
leaves are linked from left to right. If C is updated the entire tree
needs to be shadowed. Without leaf-pointers only C, B, and A require
shadowing."
SOURCE: Rodeh O.
So the actual question...
Is possible to implement a copy-on-write B+ tree?
CouchDB makes an interesting claim about using an "append only" B+ tree to index its documents. Specifically...
"In a B-tree, data is kept only in leaf nodes. CouchDB B-trees append
data only to the database file that keeps the B-tree on disk and grows
only at the end. Add a new document? The file grows at the end. Delete
a document? That gets recorded at the end of the file."
and
"CouchDB is actually using a B+ tree, which is a slight variation of
the B-tree that trades a bit of (disk) space for speed. When we say
B-tree, we mean CouchDB’s B+ tree."
SOURCE: CouchDB Documentation
This paper, describes the problem with an "append only" or "copy on write" implementation of a B+ tree. It suggests that such optimistic concurrency strategies are only available to a standard B tree. Specifically...
"In a regular b-tree leaves are chained together. This is used for
tree rebalancing and range lookups. In a b-tree that is updated
using copy-on- write leaves cannot be linked together. For example,
Figure 2 shows a tree whose rightmost leaf node is C and where the
leaves are linked from left to right. If C is updated the entire tree
needs to be shadowed. Without leaf-pointers only C, B, and A require
shadowing."
SOURCE: Rodeh O.
So the actual question...
Is possible to implement a copy-on-write B+ tree?
Solution
Yes it is. The trick is to use the size of file (where is the tree stored) as the means of calculating offset of the new root. In other words, when one modifies a leaf, most of the tree wont change and the blocks which do change get appended at the end, with root coming last.
More here: http://www.bzero.se/ldapd/btree.html
More here: http://www.bzero.se/ldapd/btree.html
Context
StackExchange Computer Science Q#51590, answer score: 3
Revisions (0)
No revisions yet.