Two tokenizer questions
1) I do not understand the constructor parameters of the char_delimiters_separator class, which is the default TokenizerFunction model. I assume the constructor lets one specify the delimiters for each token. I assume that the "returnable" parameter is a string literal of delimiters which are also returned as tokens, and that "nonreturnable" is a string literal of delimiters which are not returned as tokens. I assume that the union of these string literals consists of all the delimiters. Given these assumptions, what is the purpose of the bool "return_delims" first parameter? If "returnable" specifies a non-empty string literal, then why shouldn't these be the delimiters returned as tokens regardless of the setting of "return_delims"? If "returnable" specifies an empty string literal, then no delimiters will be returned as tokens, once again regardless of the setting of "return_delims". Would someone please explain why "return_delims" exists as a parameter in conjunction with the meaning of the "returnable" parameter? It seems utterly redundant and unnecessary.
2) How does one create one's own TokenizerFunction to be plugged into the tokenizer class template as the first template type? The documentation on the TokenizerFunction concept leaves me utterly confused on a practical level. I assume I am creating a template class here, despite the name which suggests it is a function, but the documentation never explains what it is. Maybe someone from Boost or John Bandela should consider updating it.
Hi Edward,
Several others have also found the return_delims parameter confusing, and
I've proposed that we remove that parameter.
John Bandela, would you mind if I go ahead and make that change?
On 2/4/02 2:03 PM, "Edward Diener" wrote:
2) How does one create one's own TokenizerFunction to be plugged into the tokenizer class template as the first template type? The documentation on the TokenizerFunction concept leaves me utterly confused on a practical level. I assume I am creating a template class here, despite the name which suggests it is a function, but the documentation never explains what it is. Maybe someone from Boost or John Bandela should consider updating it.
John, will you address this?
Cheers,
Jeremy
--
Jeremy Siek http://www.osl.iu.edu/~jsiek
Ph.D. Student, Indiana Univ. B'ton email: jsiek@osl.iu.edu
C++ Booster (http://www.boost.org) office phone: (812) 855-3608
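For readers following question 2, here is a minimal sketch of what a TokenizerFunction model can look like. It assumes the concept as the tokenizer documentation describes it: a copyable function object with a reset() member and an operator() that takes (next, end, tok), stores one token in tok, advances next past it, and returns false when the input is exhausted. The names fixed_width_splitter and split_fixed are hypothetical and are not part of Boost; the helper drives the object by hand so that the sketch compiles with the standard library alone.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical TokenizerFunction model (not part of Boost):
// yields consecutive chunks of at most `width` characters.
class fixed_width_splitter {
public:
    explicit fixed_width_splitter(std::size_t width) : width_(width) {}

    // Called before a new sequence is parsed. This model keeps no
    // per-sequence state, so there is nothing to clear.
    void reset() {}

    // Parse one token from [next, end). On success, store it in tok,
    // advance next past the token, and return true.
    // Return false when no further token can be produced.
    template <typename InputIterator, typename Token>
    bool operator()(InputIterator& next, InputIterator end, Token& tok) {
        tok = Token();
        if (next == end || width_ == 0) return false;
        for (std::size_t i = 0; i < width_ && next != end; ++i, ++next)
            tok += *next;
        return true;
    }

private:
    std::size_t width_;
};

// Drives the function object by hand, roughly the way a tokenizer's
// iterator would: call it repeatedly until it reports end of input.
inline std::vector<std::string> split_fixed(const std::string& text,
                                            std::size_t width) {
    fixed_width_splitter f(width);
    f.reset();
    std::vector<std::string> out;
    std::string tok;
    std::string::const_iterator next = text.begin();
    while (f(next, text.end(), tok)) out.push_back(tok);
    return out;
}
```

So it is an ordinary class whose instances are callable, not necessarily a class template. Assuming the interface above matches the library, such a type would be passed as the first template argument, along the lines of boost::tokenizer<fixed_width_splitter>.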
Hi All,
This is what I've done to fix the confusing constructor of
char_delimiters_separator.
I basically wanted to add another less-confusing constructor to the class,
and document the old constructor as deprecated (don't want to break
existing code). However, because of the default arguments in the original
constructor, there would be bad interactions between the new constructor
and the old.
Therefore, I instead just created a new class named char_separator and put
a note in the documentation for char_delimiters_separator that the whole
class is deprecated.
I've checked in the code, tests, examples, and documentation for
char_separator. Let me know what you think.
Cheers,
Jeremy
P.S. It is still a bit annoying that it takes several lines to create a
tokenizer. When I get the time I'll add type generators and helper
functions (a la the iterator adaptor library) to make it easier to create
tokenizers (or if anyone else has the time, go for it!).