On 03/13/2016 01:28 PM, Glen Fernandes wrote:
1. Gathering these benchmarks is always useful, even if block_ptr was not designed to be a drop-in replacement for shared_ptr. So you had the right idea, even if the execution needed some work. It's important to get the execution correct, though, otherwise the results are not meaningful, which is where I (and I believe Rob, also) was trying to steer you.
You and Rob pointed out the errors rather quickly ;)
2. Looking at the numbers you obtained after modifying my benchmark example, they are close enough that at least you could say "Using block_ptr over shared_ptr won't be a significant performance loss":
unique_ptr (new): 47.7686
unique_ptr (make_unique): 46.8545
shared_ptr (new): 77.8261
shared_ptr (make_shared): 50.8072
shared_ptr (allocate_shared_noinit): 33.021
block_ptr (new): 69.6554
It would be interesting to see the performance of block_ptr_base<> as well. Perhaps I could eventually let block_ptr<> pick its underlying smart pointer through a template parameter, but that's just a thought, not a necessity for now.
3. There are more benchmarks beyond just creation (though creation is likely the most meaningfully expensive one). Copying overhead might be one: I haven't looked at what is involved in copying a block_ptr, but some work happens when you copy a shared_ptr.
Copying is another story, because a copy of a block_proxy would refer to the same pointers; i.e., it would not copy the nodes of a container, for example. Perhaps I can work on that.
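To make the shared_ptr side of point 3 concrete, here is a minimal sketch of the work a copy involves. It uses only the standard library; the function names are made up for illustration, and nothing here says anything about how block_ptr behaves.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: copy a shared_ptr and report the resulting use count.
long use_count_after_copy() {
    auto a = std::make_shared<int>(42);
    auto b = a;                    // copy: bumps the shared reference count
                                   // (typically an atomic increment)
    assert(a.get() == b.get());    // both handles refer to the same object
    return a.use_count();
}

// Hypothetical helper: move a shared_ptr and report the resulting use count.
long use_count_after_move() {
    auto a = std::make_shared<int>(42);
    auto b = std::move(a);         // move: ownership is transferred, the count is unchanged
    assert(a == nullptr);          // the moved-from shared_ptr is empty
    return b.use_count();
}
```

A copy benchmark would time the first pattern in a loop; the second is shown only to contrast it with a move, which skips the count update.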
4. shared_ptr by way of allocate_shared allows creation to not involve 'new' expressions, or even a call to '::operator new(std::size_t)' at all.
I.e., for those C++ projects that require all dynamic allocation to go through instances of some stateful custom allocator type, they can still use shared_ptr (with allocate_shared). Is this possible with block_ptr?
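For reference, the shared_ptr half of that question can be sketched as follows. Arena and ArenaAllocator are hypothetical names invented for this example (a simple bump allocator over a fixed buffer); only std::allocate_shared is a real library call. It is a sketch under those assumptions, not a recommended allocator design.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical stateful arena: hands out memory from a fixed internal buffer.
struct Arena {
    alignas(std::max_align_t) unsigned char buffer[1024];
    std::size_t used = 0;
    std::size_t allocations = 0;

    void* allocate(std::size_t bytes, std::size_t align) {
        std::size_t start = (used + align - 1) / align * align;  // round up to alignment
        if (start + bytes > sizeof(buffer)) throw std::bad_alloc();
        used = start + bytes;
        ++allocations;
        return buffer + start;
    }
};

// Hypothetical minimal allocator that forwards to one Arena instance.
template <typename T>
struct ArenaAllocator {
    using value_type = T;
    Arena* arena;

    explicit ArenaAllocator(Arena& a) noexcept : arena(&a) {}
    template <typename U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) noexcept {}  // memory is reclaimed with the arena
};

template <typename T, typename U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return a.arena == b.arena;
}
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return !(a == b);
}

// Creates one shared_ptr<int> through the arena and reports how many
// allocations the arena served; the arena outlives the shared_ptr.
std::size_t arena_allocations_for_one_shared_int() {
    Arena arena;
    {
        std::shared_ptr<int> p =
            std::allocate_shared<int>(ArenaAllocator<int>(arena), 7);
        assert(*p == 7);
    }
    return arena.allocations;
}
```

Both the object and the control block come from the arena here, so neither a 'new' expression nor '::operator new' is involved in creating the shared_ptr.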
An easy way to use a custom allocator is by deriving from block<>:
template <typename T>
struct userblock : public block
I hope to take a look at the block_ptr motivation, design, and implementation when I have some time next week. The title of the thread caught my attention; "X is 600% faster than Y" always has a high excitement potential.
The motivation is mainly to get rid of garbage collectors once and for all. I am working on WebKit, and in a brave new world its memory manager would be deterministic; unfortunately, the garbage collector is deeply embedded in WebKit, so it's not an easy thing to replace. But I am looking at alternatives like Duktape, etc. Also, the documentation is outdated, as proxies are now explicit, but it'll give you the general idea.

Thanks again for all your help!

Regards,
-Phil