Re: [Boost-users] boost::pp application
Thanks! Today I managed to get a little bit further, let me tell you where I am. First I aliased every symbol of the alphabet as an integer. #define A 1 #define B 2 ... Then I guess there are two options: 1) #define NOT_EQ(s,E1,E2) \ BOOST_PP_NOT_EQUAL(E1,E2) #define REMOVE_ALL(Elem,Seq) \ BOOST_PP_SEQ_FILTER(NOT_EQ, Elem, Seq) #define REMOVE_ELEM(s,Seq,Elem) REMOVE_ALL(Elem,Seq)(Elem) #define REMOVE_DUPLICATES(Seq) \ BOOST_PP_SEQ_FOLD_LEFT(REMOVE_ELEM, Seq, Seq) This starts with the full sequence, and for each element, removes all occurences and add it in the end. 2) The other option is similar but instead of starting with the full sequence and remove each element, it starts with an empty sequence and adds each element if it is not already there (kind of insertion sort but without sorting). I think the solution 1) is faster, but I confess it's only an intuition :) Do you see any other way, more efficient than this? Thanks a lot for trying to help! Marco On Friday 03 February 2006 22:48, you wrote:
Hi Marco,
I just say this post now, so I'll get back to it after I get home from work. I the pp-lib can help you with this.
Regards, Paul Mensonides
-----Original Message----- From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Marco Correia
I would like to ask if there is a way boost::preprocessor may be used to solve the following problem:
The problem is to define a macro that expands to the concatenation of their arguments, without duplicates. The arguments are symbols belonging to a known alphabet.
--
Marco Correia
-----Original Message----- From: Marco Correia [mailto:mvc@netcabo.pt]
Thanks!
Today I managed to get a little bit further, let me tell you where I am.
First I aliased every symbol of the alphabet as an integer.
#define A 1 #define B 2 ...
This is a good start, but the wrong way to do this part of it. Instead, you should do something like: #define CONVERT(sym) BOOST_PP_CAT(CONVERT_, sym) #define CONVERT_A 1 #define CONVERT_B 2 ...
#define NOT_EQ(s,E1,E2) \ BOOST_PP_NOT_EQUAL(E1,E2)
#define REMOVE_ALL(Elem,Seq) \ BOOST_PP_SEQ_FILTER(NOT_EQ, Elem, Seq)
#define REMOVE_ELEM(s,Seq,Elem) REMOVE_ALL(Elem,Seq)(Elem)
#define REMOVE_DUPLICATES(Seq) \ BOOST_PP_SEQ_FOLD_LEFT(REMOVE_ELEM, Seq, Seq)
This starts with the full sequence, and for each element, removes all occurences and add it in the end.
I like that you're thinking about this algorithmically. This is an interesting solution. The downside is that it is redundantly adding and later removing elements of the result sequence when the input sequence contains the same element again.
Do you see any other way, more efficient than this?
This is how I did it:
#include
Hi Paul, Your algorithm perfoms much better than mine, and I think it would never occur to me. Thanks a lot! Marco On Saturday 04 February 2006 12:57, Paul Mensonides wrote:
-----Original Message----- From: Marco Correia [mailto:mvc@netcabo.pt]
Thanks!
Today I managed to get a little bit further, let me tell you where I am.
First I aliased every symbol of the alphabet as an integer.
#define A 1 #define B 2 ...
This is a good start, but the wrong way to do this part of it. Instead, you should do something like:
#define CONVERT(sym) BOOST_PP_CAT(CONVERT_, sym) #define CONVERT_A 1 #define CONVERT_B 2 ...
#define NOT_EQ(s,E1,E2) \ BOOST_PP_NOT_EQUAL(E1,E2)
#define REMOVE_ALL(Elem,Seq) \ BOOST_PP_SEQ_FILTER(NOT_EQ, Elem, Seq)
#define REMOVE_ELEM(s,Seq,Elem) REMOVE_ALL(Elem,Seq)(Elem)
#define REMOVE_DUPLICATES(Seq) \ BOOST_PP_SEQ_FOLD_LEFT(REMOVE_ELEM, Seq, Seq)
This starts with the full sequence, and for each element, removes all occurences and add it in the end.
I like that you're thinking about this algorithmically. This is an interesting solution. The downside is that it is redundantly adding and later removing elements of the result sequence when the input sequence contains the same element again.
Do you see any other way, more efficient than this?
This is how I did it:
#include
#include #include #include #include #include #define CONVERT(sym) BOOST_PP_CAT(CONVERT_, sym)
#define CONVERT_0 0 #define CONVERT_A 1 #define CONVERT_B 2 #define CONVERT_C 3 #define CONVERT_D 4 #define CONVERT_E 5 #define CONVERT_F 6 #define CONVERT_G 7 #define CONVERT_H 8 #define CONVERT_I 9 #define CONVERT_J 10 // ...
#define ID(x) (x)
#define PRED_1(d, tuple) \ CONVERT( \ BOOST_PP_SEQ_HEAD(BOOST_PP_TUPLE_ELEM(2, 1, tuple)) \ ) \ /**/ #define PRED_2(s, x, y) \ BOOST_PP_NOT_EQUAL(CONVERT(x), CONVERT(y)) \ /**/
#define OP_1(d, tuple) OP_2 tuple #define OP_2(result, seq) \ ( \ result(BOOST_PP_SEQ_HEAD(seq)), \ BOOST_PP_SEQ_FILTER( \ PRED_2, BOOST_PP_SEQ_HEAD(seq), \ BOOST_PP_SEQ_TAIL(seq) \ ) \ ) \ /**/
#define REMOVE_DUPLICATES(seq) \ BOOST_PP_TUPLE_ELEM( \ 2, 0, \ BOOST_PP_WHILE( \ PRED_1, OP_1, \ (ID, seq(0)) \ ) \ ) \ /**/
REMOVE_DUPLICATES( (A)(B)(A)(C)(B)(B)(A)(C) ) // (A)(B)(C)
Instead of using FOLD_LEFT, this one works by using WHILE to iterate over the elements of the input sequence. This allows you to *change* the sequence that you're iterating over. At the beginning of the algorithm, a (0) is added as a rogue element to the end of the sequence (since you can't have nil sequences). An extra CONVERT_ macro is defined that converts 0 to 0 (i.e. a no-op).
The predicate passed to WHILE tests whether the head of the to-be-processed sequence is non-zero after being CONVERT'ed.
The operation passed to WHILE appends the first element of the sequence to the result, and removes all occurrences of that element in the to-be-processed sequence.
Note that we "initialize" the result sequence to ID, but append to the tail of the result (and thus ID removes itself when the first element gets added to the result).
With each application of the operation, the tuple changes:
( ID , (A)(B)(A)(C)(B)(B)(A)(C)(0) ) // start ( (A) , (B)(C)(B)(B)(C)(0) ) ( (A)(B) , (C)(C)(0) ) ( (A)(B)(C) , (0) ) // end
At the end of the loop, we simply extract the first element of tuple. From there, if you want to concatenate all the elements of the result together, just use SEQ_CAT.
The worst-case scenario for any given length of input sequence occurs when there are no duplicates. E.g.
REMOVE_DUPLICATES( (A)(B)(C)(D)(E)(F)(G)(H)(I)(J) )
Even though this version is more complex, it is consistently faster because it doesn't do any redundant work.
Regards, Paul Mensonides
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
--
Marco Correia
participants (2)
-
Marco Correia
-
Paul Mensonides