On 8 Jun 2014 at 18:37, Stephan T. Lavavej wrote:
>> Do also bear in mind that MSVC is not an AST based compiler, so all your benchmarks will look totally different on MSVC anyway. What may have O(N^N) scaling on an AST compiler might well be O(N) on MSVC - in fact, that is exactly why MSVC isn't AST based, as internal Microsoft code is so complex it is uncompilable by an AST based compiler.
> 1. That's not why. The reason is that manipulating tokens seemed like a good idea 20-30 years ago, when that was sufficient to deal with C and C++ (before templates, variadic templates, decltype, constexpr, expression SFINAE, etc.) and when computers were far smaller and slower.
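As an aside for anyone following along who hasn't met expression SFINAE: here is a minimal toy sketch (mine, not Stephan's) of the kind of construct which obliges a compiler to build and type check an expression during overload resolution, rather than just stream tokens:

#include <cstddef>
#include <vector>

// The template overload participates only if t.size() is a well formed
// expression, which the compiler can only decide by actually checking
// that expression against T.
template <class T>
auto size_of(const T& t) -> decltype(t.size()) { return t.size(); }

// Fallback for types without a size() member.
std::size_t size_of(...) { return 0; }

int main()
{
    std::vector<int> v{1, 2, 3};
    return static_cast<int>(size_of(v) + size_of(42));  // 3 + 0
}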
For a good chunk of MSVC's (and C++'s) userbase, template use never really exceeds cookie cutter use, e.g. template<class T> class vector. And for those guys - which includes a good chunk of Microsoft itself - MSVC's approach makes perfect sense. It is, after all, blindingly quick and uses virtually no RAM, relatively speaking.

By "internal Microsoft code is so complex it is uncompilable by an AST based compiler" I was referring to a conversation I had with Gaby and James McNellis at C++Now, with Chandler occasionally jumping in to confirm the insanity :). They were telling me about the problems of getting 18k line long functions to compile with *any* AST based compiler, and that a unique problem facing MSVC is that a ton of Microsoft internal code is automatically generated under the expectation that MSVC doesn't mind really long statements, such as an if() statement spanning a few thousand lines. One then faces the choice of either retrofitting those automated generators to produce less insane output, or keeping MSVC non-AST based for non-template, non-constexpr code. I was told that, to date, the latter has been chosen.
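To give a flavour, here is a made up toy sketch of the shape being described (absolutely nothing like the real generated code, which I have never seen), just vastly shorter:

// Machine generated style: one flat boolean if() built from many clauses.
// The real generators emit clauses by the thousand in a single statement.
bool is_supported(int id)
{
    if (id == 0x0001 || id == 0x0002 || id == 0x0003 || id == 0x0004
        || id == 0x0005 || id == 0x0006 || id == 0x0007 || id == 0x0008
        /* ...imagine a few thousand more machine generated clauses... */
        || id == 0x0FFF)
        return true;
    return false;
}

Which is presumably why such code stresses an AST based front end: the entire condition becomes one enormous expression tree that must be held in memory before anything can be done with it.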
> 2. MS has a whole bunch of code, but only a tiny fraction approaches the complexity of Boost (in terms of "doing tricky things with the language"; stuff that's just loops and pointers is not complicated by this metric, no matter how twisty). Our STL implementation is usually the high water mark for complexity, and while we do tricky stuff, I know that Boost is trickier.
Oh sure. I didn't make clear that I meant things like a single boolean if() containing thousands of tests. I meant sheer size complexity, not language complexity.
> 3. I am not really qualified to talk about compiler tech, but I cannot possibly understand how keeping an AST around would change the *asymptotic* complexity of translation (it introduces constant factors which are an issue in practice). The AST has a strict superset of the information available by looking at a few tokens at a time. By definition, any speedups with a token-based approach that can't be obtained through an AST would be due to nonconformance.
The problem is RAM consumption, because you need to create an AST for the item being compiled before you can compile it. Louis' empirical tests of candidate MPL11 (now Hana) implementation techniques did a lot of time and space comparisons of the various metaprogramming approaches. Both clang and GCC are AST based, so their space consumption graphs looked not dissimilar. MSVC has taken considerable effort to use sane amounts of RAM under very trying conditions, so I would suspect, without evidence, that MSVC would produce *very* different graphs - some worse, but probably many better. What is O(N^2) on clang or GCC I could see being O(N) on MSVC, for example, simply due to its lack of two phase lookup (a toy illustration of which is in the PS below).

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
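PS: since two phase lookup came up above, a minimal toy illustration (mine, nothing to do with Louis' benchmarks) of the extra work it implies. A conforming compiler must look up non-dependent names when the template is defined (phase one) and dependent names again at each instantiation (phase two), so it has to keep a checked representation of the template body around; a purely token based compiler can get away with re-expanding the tokens at instantiation time:

void f(int);        // non-dependent: must be findable when g() is parsed

template <class T>
void g(T t)
{
    f(1);           // phase one: looked up at template definition time
    h(t);           // dependent on T: looked up at instantiation time via ADL
}

struct X {};
void h(X);          // only found when g<X> is instantiated

int main()
{
    g(X{});
}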