----- Original Message -----
From: "Thomas Wenisch"
On Mon, 9 Dec 2002, Paul Mensonides wrote:
[ snip: lengthy example of 3 different ways to generate repeated code over a cartesian product of substitutions ]
These examples were very enlightening, thanks for going to the trouble of posting them to the list for all of us to see. I have a few questions:
No problem. Any questions about the Preprocessor Library from anyone here I'd be happy to answer. I am also happy to help anyone that needs it if I can--they don't call me the "PP Guru" for nothing, and if I can't (i.e. no time (or patience) ;)) I can still point you in the right direction.
1) Is there any reason to prefer lists over sequences on any compilers? It seems that the syntax for sequences is much nicer, since you don't have to spend time counting parentheses.
Yes. A list has direct support for the nil state, i.e. 'BOOST_PP_NIL' is a nil list. Sequences, on the other hand, cannot be empty. In many application areas this doesn't matter, but in others the "hacking around" needed to deal with faked empty sequences makes lists the better choice. Incidentally, when/if C++ gets the preprocessor upgrades from C, a nil state will be directly supported. (C99 allows empty arguments.)

Dealing with sequences is typically faster than dealing with lists, and sequences can easily replace tuples (e.g. (x, y, z)) as random-access data caches. Element-wise sequence access is a *very* fast operation, even if the sequence is huge (supported up to 256 elements, I believe). Lists cannot be used effectively this way--you'd bring Comeau C++ (for example) to a grinding halt. Also, appending to either the front or back of a sequence is as fast as you can get:

    #define SEQ (a)(b)(c)

    BOOST_PP_SEQ_ELEM(4, SEQ (x)(y)(z) SEQ) // y
    BOOST_PP_SEQ_ELEM(6, SEQ (x)(y)(z) SEQ) // a

Sequences have many advantages over lists, but there is no such thing as an "empty" sequence--and that can be a major disadvantage. Therefore, both are supported in the current CVS (plus there is a lot of "legacy" code that uses lists). Note that the 1.29 release does not contain the sequence implementation.
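To make the element-access and append claims above concrete, here is a minimal self-contained sketch that does not use the library at all. Every DEMO_ macro and the demo_streq helper are hypothetical names invented for this illustration; the sketch handles only indices 0 through 8, assumes a standard-conforming preprocessor such as GCC or Clang, and the library's own implementation may well differ.

```cpp
// Hypothetical mini-implementation for illustration only. Every DEMO_/demo_
// name is made up here and is not part of the Boost.PP library.

// Two-step stringizing so the argument is macro-expanded before '#' applies.
#define DEMO_STR(x) DEMO_STR_I(x)
#define DEMO_STR_I(x) #x

// DEMO_ELEM_n swallows one "(x)" element and hands over to DEMO_ELEM_(n-1).
// DEMO_ELEM_0 yields its element, a comma, and a dummy token.
#define DEMO_ELEM_0(x) x, DEMO_REST
#define DEMO_ELEM_1(x) DEMO_ELEM_0
#define DEMO_ELEM_2(x) DEMO_ELEM_1
#define DEMO_ELEM_3(x) DEMO_ELEM_2
#define DEMO_ELEM_4(x) DEMO_ELEM_3
#define DEMO_ELEM_5(x) DEMO_ELEM_4
#define DEMO_ELEM_6(x) DEMO_ELEM_5
#define DEMO_ELEM_7(x) DEMO_ELEM_6
#define DEMO_ELEM_8(x) DEMO_ELEM_7

// Keep only what precedes the comma produced by DEMO_ELEM_0.
#define DEMO_SEQ_ELEM(i, seq) DEMO_FIRST_I(DEMO_ELEM_##i seq)
#define DEMO_FIRST_I(pair) DEMO_FIRST(pair)
#define DEMO_FIRST(x, rest) x

#define SEQ (a)(b)(c)

// Compile-time string comparison (C++11 constexpr, single return statement).
constexpr bool demo_streq(const char* p, const char* q) {
    return *p == *q && (*p == '\0' || demo_streq(p + 1, q + 1));
}

// Appending is plain juxtaposition: SEQ (x)(y)(z) SEQ is
// (a)(b)(c)(x)(y)(z)(a)(b)(c), nine elements with indices 0..8.
constexpr const char* demo_elem4 = DEMO_STR(DEMO_SEQ_ELEM(4, SEQ (x)(y)(z) SEQ));
constexpr const char* demo_elem6 = DEMO_STR(DEMO_SEQ_ELEM(6, SEQ (x)(y)(z) SEQ));

static_assert(demo_streq(demo_elem4, "y"), "index 4 is y");
static_assert(demo_streq(demo_elem6, "a"), "index 6 is a");
```

In this sketch the append costs nothing because the preprocessor never builds a new structure: the three sequences are simply adjacent tokens by the time the element macros walk over them.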
2) What is the data structure you use in the file iteration example (the structure which requires prepended lengths)? Are the hardcoded lengths necessary for file iteration to work, or could one of the other data structures (i.e. sequences) be used instead? I really like the simple syntax of the sequence type. Nothing to screw up :)
I assume that you mean this one --> (3, (a, b, c)). That data structure is called an "array," which is just an arbitrary name choice. It is, in effect, a high-level tuple that encodes its own size. The actual file iteration parameters are *required* to be arrays. The reason is that arrays and file-iteration existed before sequences, and I needed a way to pass variable-sized datasets into the iteration mechanism.

I wanted the sample implementation earlier in this thread to automatically adjust if the size of the datasets increased. This meant I had to use a data type whose size I could calculate: array, sequence, list, but *not* tuple. Lists would be a *really* bad choice here because they are terrible for random access. Furthermore, sequences are not part of the 1.29 release so I was keeping the options open.

Altogether, there are four data types that the library currently supports:

tuple: (a, b, c)

The strength of tuples is that element access is as fast as you can get. The downsides are that the size must be known to access it (since it can't be detected without variadic macros), the maximum size of a tuple is limited to around 25, and tuples are no good for anything that requires "resizing" the tuple.

array: (3, (a, b, c))

The strengths of arrays are that element access is nearly (but not quite) as fast as with tuples, the size is built into the structure itself (so the caller need not supply it for element access), and resizing is directly supported by the library with various primitives. The downsides are that arrays have the same size limitations as tuples and they must have their size specified when they are created. Ultimately, the difference between tuples and arrays is that tuples require the size on access while arrays require the size on construction. Either way, conversion back and forth is fairly trivial.

list: (a, (b, (c, BOOST_PP_NIL)))

Lists are typical, well-understood, singly-linked lists. They directly support an empty state and can be very good for "algorithmic" manipulation. As with a runtime list, random access is terrible relative to the other structures, but random access is not the typical use of lists. They are good for folding (a.k.a. accumulation) and other pursuits that often need to deal with a nil state.

sequence: (a)(b)(c) // also known as "seq" by me ;)

This is kind of a universal type. It is good for everything that lists are good for--except the nil state issue--and is still very fast for random access, so it can replace tuples and arrays in many situations. The size is not needed for access or for construction (it can be computed easily and efficiently). This structure is a great "general purpose" type. Also, you can't beat it for appending efficiency, since appending seq2 to seq1 requires only: seq1 seq2.
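The trade-offs between the four data types can be sketched with a few hand-written macros. This is a toy illustration under stated assumptions, not the library's code: every DEMO_ name and demo_streq are hypothetical, the tuple helpers cover only size 3, the sequence counter stops at 4, and a standard-conforming preprocessor is assumed.

```cpp
// Hypothetical mini-implementation for illustration only. Every DEMO_/demo_
// name is made up here and is not part of the Boost.PP library.
#define DEMO_STR(x) DEMO_STR_I(x)
#define DEMO_STR_I(x) #x
#define DEMO_CAT(a, b) DEMO_CAT_I(a, b)
#define DEMO_CAT_I(a, b) a##b

// Helpers that split a parenthesized pair "(first, second)".
#define DEMO_PAIR_FIRST(first, second) first
#define DEMO_PAIR_SECOND(first, second) second

// tuple: the caller passes the size, which selects a macro of matching arity.
#define DEMO_TUPLE_ELEM(size, i, tuple) DEMO_TUPLE_ELEM_I(size, i, tuple)
#define DEMO_TUPLE_ELEM_I(size, i, tuple) DEMO_TUPLE_ELEM_##size##_##i tuple
#define DEMO_TUPLE_ELEM_3_0(a, b, c) a
#define DEMO_TUPLE_ELEM_3_1(a, b, c) b
#define DEMO_TUPLE_ELEM_3_2(a, b, c) c

// array: the size travels with the data, so access does not ask for it.
#define DEMO_ARRAY_SIZE(array) DEMO_PAIR_FIRST array
#define DEMO_ARRAY_DATA(array) DEMO_PAIR_SECOND array
#define DEMO_ARRAY_ELEM(i, array) \
    DEMO_TUPLE_ELEM(DEMO_ARRAY_SIZE(array), i, DEMO_ARRAY_DATA(array))

// list: head/tail access only; reaching element n means n "rest" steps.
#define DEMO_LIST_FIRST(list) DEMO_PAIR_FIRST list
#define DEMO_LIST_REST(list) DEMO_PAIR_SECOND list

// sequence: the size is computed by letting each element bump a counter name.
#define DEMO_SEQ_SIZE(seq) DEMO_CAT(DEMO_SIZE_IS_, DEMO_SIZE_0 seq)
#define DEMO_SIZE_0(x) DEMO_SIZE_1
#define DEMO_SIZE_1(x) DEMO_SIZE_2
#define DEMO_SIZE_2(x) DEMO_SIZE_3
#define DEMO_SIZE_3(x) DEMO_SIZE_4
#define DEMO_SIZE_IS_DEMO_SIZE_1 1
#define DEMO_SIZE_IS_DEMO_SIZE_2 2
#define DEMO_SIZE_IS_DEMO_SIZE_3 3
#define DEMO_SIZE_IS_DEMO_SIZE_4 4

// The same three elements in each of the four representations.
#define DEMO_TUPLE (a, b, c)
#define DEMO_ARRAY (3, (a, b, c))
#define DEMO_LIST (a, (b, (c, DEMO_NIL)))
#define DEMO_SEQ (a)(b)(c)

// Compile-time string comparison (C++11 constexpr, single return statement).
constexpr bool demo_streq(const char* p, const char* q) {
    return *p == *q && (*p == '\0' || demo_streq(p + 1, q + 1));
}

constexpr const char* demo_tuple_1 = DEMO_STR(DEMO_TUPLE_ELEM(3, 1, DEMO_TUPLE));
constexpr const char* demo_array_1 = DEMO_STR(DEMO_ARRAY_ELEM(1, DEMO_ARRAY));
constexpr const char* demo_list_2 =
    DEMO_STR(DEMO_LIST_FIRST(DEMO_LIST_REST(DEMO_LIST_REST(DEMO_LIST))));
constexpr int demo_seq_size = DEMO_SEQ_SIZE(DEMO_SEQ);
constexpr int demo_seq_size_appended = DEMO_SEQ_SIZE(DEMO_SEQ (x));

static_assert(demo_streq(demo_tuple_1, "b"), "tuple element 1 is b");
static_assert(demo_streq(demo_array_1, "b"), "array element 1 is b");
static_assert(demo_streq(demo_list_2, "c"), "list element 2 is c");
static_assert(demo_seq_size == 3 && demo_seq_size_appended == 4, "seq sizes");
```

Note how the tuple call has to spell out the 3, the array call does not, and the list access needs one DEMO_LIST_REST step per skipped element, which mirrors the "size on access" versus "size on construction" distinction and the poor random access of lists described above.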
I must admit that I haven't read the docs of Boost.PP since before 1.28 was released, so I know very little about file iteration. My apologies if these questions are already answered in the docs.
The docs (and nearly the entire library) have been rewritten with the release of 1.29--it is mostly backward compatible but not quite (there are docs that discuss incompatibilities). I highly urge people to get the latest CVS sources of the PP lib (and the PP lib docs) because they are better. In particular, I added the sequence support and full-fledged support for the array types mentioned above.

As for file-iteration, it is conceptually very simple, but it requires a slightly different way of thinking about how preprocessing works. I personally like to think of it as an "execution path" with file iteration representing a for-loop that iterates over files (or parts of files). It is a very powerful tool--just ask the Python guys or the MPL guy (Aleksey). In any case, there is a topic devoted to file-iteration in the docs, so it might interest you to read that.

Basically, the differences between 1.28 and 1.29+ versions of the library are phenomenal. There are *massive* speed increases on EDG-based preprocessors as well as significantly more functionality. Specifically, preprocessor metaprogramming has moved out of "just macros" into other interesting areas (file-iteration is an example of this).
Regards,
-Tom Wenisch
Computer Architecture Lab
Carnegie Mellon University
If anyone here has any questions about the PP lib, I'll answer as best I can.

Regards,
Paul Mensonides