Re: [boost] BOOST_PP array extension proposal

11 Sep 2015


On 9/10/2015 7:58 PM, Matt Calabrese wrote:
...
On Thu, Sep 10, 2015 at 4:15 PM, Edward Diener <eldiener@tropicsoft.com>
wrote:
...
I admit I have never done any benchmarking of preprocessor code. This is
not only compile time code but is preprocessor code, which occurs pretty
early in the compilation phases. So I have never thought very hard and
long about how one would measure the time spent by the compiler in macro
expansion depending on whether you use one Boost PP construct versus
another. Any thoughts about how anybody could accurately benchmark such
time would be most welcome.
My experiences are anecdotal, so I don't want to make precise claims; I'm
just raising this as something to consider and it might be necessary to
benchmark before making too many recommendations. When I was working on
Boost.Generic I at one point reached a blocking point where preprocessing
was consuming so much memory that I'd run out of address space (32-bit)!
I encountered this with gcc quite often when testing VMD until I 
specified '-ftrack-macro-expansion=0' and that solved the problem for 
gcc. I have also encountered an internal compiler error from clang on one 
of my VMD tests, most probably due to some clang limit being exceeded; I 
reported it to clang but so far no resolution has occurred. I also ran 
into situations where VC++ would give errors only to have everything 
work without errors when I reran the VMD tests; that sounds like some 
out of memory error.
...
I
just couldn't proceed when dealing with complicated concepts until I
revised how I did my repetition, which brought down the memory usage
considerably (switching between several disparate fold operations with
small states and a single fold operation with a large state is one change
that I remember vividly). I imagine that looping constructs are always more
directly the culprit for these types of issues, though if you are deep
inside of some repetition I wonder if even the difference of tuple and
array can have noticeable impact, especially for a large number of
elements. I really don't know as I've never really analyzed the problem or
done rigorous profiling of these types of things, but I've stopped making
too many assumptions as I've been bitten before. It also could be pretty
compiler-dependent as well.
...
I have no doubt manipulating tuples is probably slower than manipulating
arrays when variadic macros are being used, since calculating the tuple
size is slower than having it there as a preprocessor number.
To be clear, I'm not strictly sure about even that, even though my initial
intuition was to prefer array. I'm just hesitant to state for sure that
recommending a preference of tuple is necessarily the best recommendation
or the best default. Some operations are probably simpler for tuples (e.g.,
I'm imagining that joining two tuples together is probably faster than
joining two arrays, since you can just expand both by way of variadics
without caring about or having to calculate the size of the result). There
could even be no or minimal measurable difference in all practical cases.
Agreed.
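
To make the tuple-join point concrete, here is a minimal sketch, assuming a conforming variadic preprocessor (gcc, clang, or VC++ with /Zc:preprocessor). The JOIN_* macros are hypothetical names made up for illustration; they are not Boost PP's definitions. Joining two tuples only strips the parentheses and re-wraps the elements, whereas an array join would additionally have to produce the combined size as a preprocessor number.

```cpp
// Hypothetical macros for illustration only; not Boost PP's definitions.
#define JOIN_REM(...) __VA_ARGS__                       // strip one pair of parentheses
#define JOIN_TUPLES(t1, t2) (JOIN_REM t1, JOIN_REM t2)  // no size bookkeeping needed
#define JOIN_TO_LIST(t) JOIN_REM t                      // tuple -> comma-separated list

// The joined tuple initializes an array so that the result can be inspected.
constexpr int joined[] = { JOIN_TO_LIST(JOIN_TUPLES((1, 2), (3, 4, 5))) };

static_assert(sizeof(joined) / sizeof(joined[0]) == 5, "five elements after the join");
static_assert(joined[0] == 1 && joined[4] == 5, "element order is preserved");
```

This says nothing about relative speed on any particular compiler; it only shows that the tuple join needs no size arithmetic.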
...
I've just in practice seen surprising behavior of implementations during
preprocessing that I wouldn't have expected had I not seen it happen. Memory
usage in particular tends to be surprisingly intense during preprocessing
(surprising to me, at least), especially if tracking of macro expansions is
enabled in your compiler.
See the gcc note above.
...
I should have specified that "phase out the use of Boost PP arrays" does
not mean that they will ever be eliminated from Boost PP AFAICS.
Okay, that's good, then.
I am pretty sure I know Paul's stance since he was the one who mentioned to
me that with the use of variadic macros the Boost PP array is "obsolete".
I usually assume whatever Paul suggests is best when in this domain, so my
paranoia could be unfounded here.
My recommendation is simply based on ease of use and syntactic 
simplicity. Without variadic macros PP arrays are mechanisms which track 
the number of elements whereas with PP tuples the number of elements has 
to be either hardcoded for a particular use (it is known and never changes)
or passed separately. Once variadic macros are supported the size of 
the PP tuple is always known and the tuple is therefore much easier to use, but of 
course the size cannot be 0 elements as it can with a PP array. In VMD 
you can pass "emptiness" as a tuple of 0 elements if you like, but of 
course you need to check for emptiness and act accordingly. As I said I 
think I can improve the latter.
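
The size-tracking difference and the zero-element limitation can be sketched as follows, again assuming a conforming variadic preprocessor. The CNT_* macros are hypothetical and capped at 8 elements purely for brevity; Boost PP's own BOOST_PP_ARRAY_SIZE and BOOST_PP_TUPLE_SIZE are implemented differently. An array carries its size as its first element, while a tuple's size has to be counted from its variadic elements, and that counting cannot tell "no elements" apart from "one empty element".

```cpp
// Hypothetical macros for illustration only; not Boost PP's definitions.

// Array form: (size, (e0, e1, ...)). The size is stored, so it is simply read off.
#define CNT_ARRAY_SIZE(array) CNT_ARRAY_SIZE_I array
#define CNT_ARRAY_SIZE_I(size, data) size

// Tuple form: (e0, e1, ...). The size is counted by letting the elements
// shift a descending list of numbers (this sketch handles up to 8 elements).
#define CNT_TUPLE_SIZE(tuple) CNT_VA_SIZE tuple
#define CNT_VA_SIZE(...) CNT_VA_SIZE_I(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, )
#define CNT_VA_SIZE_I(a1, a2, a3, a4, a5, a6, a7, a8, n, ...) n

static_assert(CNT_ARRAY_SIZE((3, (a, b, c))) == 3, "array: size is stored");
static_assert(CNT_TUPLE_SIZE((a, b, c)) == 3, "tuple: size is counted");
static_assert(CNT_ARRAY_SIZE((0, ())) == 0, "an array can have 0 elements");
static_assert(CNT_TUPLE_SIZE(()) == 1, "an 'empty' tuple counts as 1 empty element");
```

The last assertion is why emptiness has to be checked for separately when a tuple of 0 elements is passed.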