On 9/10/2015 7:58 PM, Matt Calabrese wrote:
On Thu, Sep 10, 2015 at 4:15 PM, Edward Diener wrote:
I admit I have never done any benchmarking of preprocessor code. This is not only compile-time code but preprocessor code, which runs very early in the compilation phases. So I have never thought very long and hard about how one would measure the time the compiler spends in macro expansion depending on whether you use one Boost PP construct versus another. Any thoughts about how anybody could accurately benchmark such time would be most welcome.
My experiences are anecdotal, so I don't want to make precise claims. I'm just raising this as something to consider, and it might be necessary to benchmark before making too many recommendations. When I was working on Boost.Generic, I at one point hit a wall where preprocessing was consuming so much memory that I'd run out of address space (32-bit)!
I encountered this with gcc quite often when testing VMD until I specified '-ftrack-macro-expansion=0', and that solved the problem for gcc. I have also encountered an internal compiler error from clang on one of my VMD tests, most probably due to some clang limit being exceeded; I reported it to clang but so far no resolution has occurred. I also ran into situations where VC++ would give errors, only to have everything work without errors when I reran the VMD tests; that sounds like some out-of-memory error.
I just couldn't proceed when dealing with complicated concepts until I revised how I did my repetition, which brought down the memory usage considerably (switching between several disparate fold operations with small states and a single fold operation with a large state is one change that I remember vividly). I imagine that looping constructs are always more directly the culprit for these types of issues, though if you are deep inside of some repetition I wonder if even the difference between tuple and array can have a noticeable impact, especially for a large number of elements. I really don't know, as I've never really analyzed the problem or done rigorous profiling of these types of things, but I've stopped making too many assumptions as I've been bitten before. It could also be pretty compiler-dependent.
I have no doubt manipulating tuples is probably slower than manipulating arrays when variadic macros are being used, since calculating the tuple size is slower than having it there as a preprocessor number.
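As a rough illustration of why the two size lookups differ, here is a minimal self-contained sketch. The macro names (ARRAY_SIZE, TUPLE_SIZE, COUNT_ARGS, PICK_9TH) are hypothetical stand-ins written for this example, not Boost PP's actual implementation. It assumes a conforming variadic preprocessor (gcc, clang, or VC++ in its conforming mode) and tuples of 1 to 8 elements.

```cpp
// Hypothetical helpers for illustration only; not Boost PP's implementation.

// A PP array is (size, (elements...)), so its size is just the first field.
#define ARRAY_SIZE(arr) ARRAY_SIZE_I arr
#define ARRAY_SIZE_I(size, data) size

// A PP tuple is (elements...), so its size has to be computed by counting
// the variadic arguments: the elements shift the right number into slot n.
#define TUPLE_SIZE(tup) COUNT_ARGS tup
#define COUNT_ARGS(...) PICK_9TH(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define PICK_9TH(a1, a2, a3, a4, a5, a6, a7, a8, n, ...) n

// Both checks are evaluated at compile time.
static_assert(ARRAY_SIZE((3, (10, 20, 30))) == 3, "size is read from the array");
static_assert(TUPLE_SIZE((10, 20, 30)) == 3, "size is counted from the tuple");
```

The array lookup is a single projection, while the tuple lookup needs an extra argument-counting expansion. Whether that extra step is measurable in practice is the open question in this thread.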
To be clear, I'm not strictly sure about even that, even though my initial intuition was to prefer array. I'm just hesitant to state for sure that recommending a preference for tuple is necessarily the best recommendation or the best default. Some operations are probably simpler for tuples (i.e., I'm imagining that joining two tuples together is probably faster than joining two arrays, since you can just expand both by way of variadics without caring about or having to calculate the size of the result). There could even be no or minimal measurable difference in all practical cases.
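The tuple-joining idea can be sketched in a few lines. This is only a sketch under assumptions: REMOVE_PARENS, TUPLE_JOIN, and TUPLE_UNPACK are hypothetical names made up for this example, not Boost PP macros, and a conforming variadic preprocessor is assumed.

```cpp
// Hypothetical helpers for illustration only; not Boost PP macros.
#define REMOVE_PARENS(...) __VA_ARGS__

// Join two tuples by stripping each one's parentheses; no sizes are involved.
#define TUPLE_JOIN(a, b) (REMOVE_PARENS a, REMOVE_PARENS b)

// Expand a tuple into a plain comma-separated list.
#define TUPLE_UNPACK(tup) REMOVE_PARENS tup

// Use the joined tuple as an initializer list so the result can be checked.
constexpr int joined[] = { TUPLE_UNPACK(TUPLE_JOIN((1, 2), (3, 4, 5))) };

static_assert(sizeof(joined) / sizeof(joined[0]) == 5, "all five elements survive");
static_assert(joined[0] == 1 && joined[4] == 5, "order is preserved");
```

Joining two arrays would additionally require computing the new size field (2 + 3 here) with preprocessor arithmetic, which is the extra work the tuple version avoids.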
Agreed.
I've just seen, in practice, surprising behavior from implementations during preprocessing that I wouldn't have expected had I not seen it happen. Memory usage in particular tends to be surprisingly intense during preprocessing (surprising to me, at least), especially if tracking of macro expansions is enabled in your compiler.
See the gcc note above.
I should have specified that "phase out the use of Boost PP arrays" does
not mean that they will ever be eliminated from Boost PP AFAICS.
Okay, that's good, then.
I am pretty sure I know Paul's stance since he was the one who mentioned to
me that with the use of variadic macros the Boost PP array is "obsolete".
I usually assume whatever Paul suggests is best when in this domain, so my paranoia could be unfounded here.
My recommendation is simply based on ease of use and syntactic simplicity. Without variadic macros, PP arrays are mechanisms which track the number of elements, whereas with PP tuples the number of elements has to be either hardcoded for a particular use (it is known and never changes) or passed separately. Once variadic macros are supported, the size of the PP tuple is always known and the tuple is therefore much easier to use, but of course the size cannot be 0 elements as it can with a PP array. In VMD you can pass "emptiness" as a tuple of 0 elements if you like, but of course you need to check for emptiness and act accordingly. As I said, I think I can improve the latter.
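The zero-element limitation can be seen in a small sketch. The macro names below are hypothetical and written only for this example; they are not Boost PP or VMD macros, and the argument-counting technique shown is just one common approach, assumed here for illustration with a conforming variadic preprocessor.

```cpp
// Hypothetical helpers for illustration only; not Boost PP or VMD macros.
#define ARRAY_SIZE(arr) ARRAY_SIZE_I arr
#define ARRAY_SIZE_I(size, data) size
#define TUPLE_SIZE(tup) COUNT_ARGS tup
#define COUNT_ARGS(...) PICK_9TH(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define PICK_9TH(a1, a2, a3, a4, a5, a6, a7, a8, n, ...) n

// An array can say "no elements" explicitly through its size field.
static_assert(ARRAY_SIZE((0, ())) == 0, "empty array has size 0");

// A tuple written as () still counts as one (empty) element.
static_assert(TUPLE_SIZE(()) == 1, "argument counting cannot report 0");
```

This is why a tuple used to represent "emptiness" needs a separate emptiness check before its size or elements are used.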