Oleg Grunin
Glen Fernandes
writes: Dear Boost community,
The formal review of Louis Dionne's Hana library starts Monday, 10th June and ends on 24th June.
After playing for a few of hours with Hana I take my hat off to Louis for the impressive feat he accomplished with the library and the excellent documentation. MPL unlocked a whole new world of C++ programming that few, at the time, suspected existed. I can only hope that Hana will do the same. But I am not yet certain of it. Maybe it is due to the heavy paradigm shift which I haven't yet made. In MPL and Fusion there was a clear distinction between the compile and run-time worlds (In the former pretty much everything, save for_each, was compile-time, and in latter all the compile time stuff lived in result_of namespace ). With Hana that line is fuzzy. I understand that Louis considers it its big achievement and it may be so but I am not yet convinced. To me type manipulations (metaprogramming) were always something that went on between the programmer and the compiler and were entirely gone from the resulting binary.
They should be entirely gone from the resulting binary, and this is the case with basically everything I've tried so far. The optimizer has all the necessary information to remove this code entirely. The example you are presenting is worthy of filing a bug against Clang, which I will do once I'm done reducing the test case. See below for details.
Here is a small example that uses an mpl type map and the corresponding code I put together (likely sub-optimally) using hana:
[...]
Indeed, you are using a Map when what you really want is a Tuple, since
your keys are integers. Here's the version I would have written:
#include
Despite Louis's claims of a significant compile-time speed up with Hana, I get the following with clang 3.6.0:
$time clang++ -O3 -D_USE_HANA -stdlib=libc++ -std=c++14 hana.cpp -o hana real 0m55.233s user 0m54.190s sys 0m0.631s
$time clang++ -O3 -stdlib=libc++ -std=c++14 hana.cpp -o mpl real 0m2.162s user 0m1.632s sys 0m0.108s
There are several reasons for the above timings.
In order of importance:
1. Map and Set (the associative containers) are naively implemented right
now. Optimizing and improving those data structures is the subject of
my GSoC project of this summer, which I will focus on more seriously
after the review (which is time consuming). I am aware of several
possible optimizations that can be applied to heterogeneous
associative structures, some of which would make your use case
much more efficient.
As you may have noticed, I never actually show benchmarks for
associative data structures in the tutorial, but only for Tuple,
because I know the performance of the associative containers to be
bad (but not as bad as you shown, frankly). The compile-time
performance claims made in the tutorial may be a bit bold for the
time being, so I added a note about the associative data structures
until they are optimized.
2. I think the compiler is incorrectly losing time for generating and
optimizing code that should not be in the executable at all. Like I
said, I consider such bad codegen to be a compiler bug.
3. You're using the wrong data structure for the task. This should
have some impact on the performance, but never as much as here.
I've reimplemented your example in several different ways. You can find
the full code in the Gist at [1]. Here is a chart showing the compilation
time of the different solutions as a function of the number of
`std::array
And at run-time: (run a loop of 10000)
$ time ./hana 100000 real 0m2.834s user 0m2.825s sys 0m0.006s
$ time ./mpl 100000 real 0m0.021s user 0m0.002s sys 0m0.001s
Which clearly shows that the stuff intended to be purely compile-time has leaked into the run-time.
There are several reasons for the above timings. In order of importance (I think): 1. Clang generates horrible code for your example. I think the fault is principally on the compiler, since it generates perfect code for up to ~20 elements, and then starts emitting complete crap. I'll follow up with the Clang team about this. 2. The fault is also partially on the library, which does not compress the storage of empty types right now. Hence, your compile-time map, which contains (int_<...>, type<...>) pairs, actually has a huge sizeof, whereas it should have sizeof(char) since all those pairs are empty. Indeed, I was able to make the runtime performance issue go away by compressing the storage of empty types in Hana pairs. I opened an issue [2] to track the progress of this feature. By the way, I consider this a high priority feature. Here is a chart showing the execution time for the examples in the Gist: http://i.imgur.com/CWUoDtB.png Here is a chart showing the executable size for the examples in the Gist: http://i.imgur.com/uCThWXb.png As you can see, only your original Map example is bad, and it starts sudenly at about 30 elements, which reinforces my thinking that it's a compiler bug or quality of implementation issue. As you can see, my version using Hana Tuple is indeed 0 overhead, just like the MPL. I am confident that compressing empty types and making the Map implementation more clever should resolve the issue you presented.
Also it appears that Hana lets the compiler swallow all kinds of code that doesn't appear to have any practical meaning:
I understand (maybe erroneously) that a hana::map can have a useful application only if its keys are compile-time constructs, and yet I was able to compile: make_map(make_pair(1, "one"), make_pair(2, "two"));
You are correct when you state that Hana Maps require the keys to be
comparable at compile-time, i.e. the comparison of two keys must return
an IntegralConstant.
The above usage, which is generally meaningless, is impossible to catch
at compile-time in general, because the type of the object returned by
the comparison of 1 with something else may depend on that something
else. In other words, we can only flag the above usage if there exists
no 'x' (of any type) such that '1 == x' returns an IntegralConstant.
However there is no way to assess that in general.
However, if you tried to access an element of your map, then
you would see a compile-time failure:
auto map = make_map(make_pair(1, "one"), make_pair(2, "two"));
std::string one = map[1];
// error:
error: static_assert failed
"hana::value<T>() requires T to be a Constant"
static_assert(_models and even
make_map(1,2,3); This one is easy to catch, because make_map requires Products as
arguments. This was fixed and you should now get the following
pretty compile-time error:
error: static_assert failed
"hana::make<Map>(pairs...) requires all the 'pairs' to be Products"
static_assert(hana::all(are_pairs),
^ ~~~~~~~~~~~~~~~~~~~~
note: in instantiation of function template specialization
'hana::make_impl