Does Boost have a library that defines a file-based container, perhaps based on a B-tree or equivalent, for very large collections? Or does anyone know of such a thing, or has there been talk about developing one?
We have a few file-based collections with millions of entries. We don't especially want to install a database. The file is too large to load into memory. The file's format could be changed to make it compatible with
At 11:23 AM 2/23/2005, John.Wismar@autozone.com wrote: the
type of collection I'm thinking of, but for the moment, we're doing binary searches by reading the file on disc. As you might guess, this can be quite slow, especially when the file is on a DVD and not the HD.
Anyway, I'd like to be able to write something like:
//file.data contains 100,000,000 entries file_map
some_data("file.data"); file_map ::iterator it = some_data.find(56789012); and have the container (for example via a B-tree or whatever) be able to locate the item with very few disc accesses.
Any libraries that can do this type of thing?
In 1999 I did some work on a B-tree implementation for Boost, but became discouraged. For maximum portability and very high performance, it would have had to have a lot of limitations; memcopy'able POD types only, an interface that differed in several ways from the standard library containers, etc. Even with those limitations, the prototype took twice as long for typical operations as the old C code B-tree I've used for 20 years. Perhaps someone else will give it a try; a B-tree is an incredibly valuable tool and it would be great to have a good one in Boost. --Beman