Two issues with lexical_cast

newer
Boost graph problem removing last...

older
[boost.preprocessor] generating...

Yuval Ronen

15 Jan 2005 15 Jan '05

4:47 p.m.

Hi. I'm probably just another one out of a million who make these remarks, but I found nothing in the documentation about them, so I'll mention them anyway. 1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason. 2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that int a = lexical_cast<int>("3 "); // three-space will work and return 3, but int a = lexical_cast<int>(" 3"); // space-three will throw. Why? There's no rationale here. Both cases should be treated in the same manner, and IMHO, that manner should be throwing. After a look at the code, it seems that the trailing-whitespace-ignore code ('stream >> std::ws' in line 151) was specifically added, so there was supposed to be a reason for this. But as I said, I believe this is wrong, at least as the default behaviour when converting a string to something else. Maybe some policies (IgnoreLeadingWhitespace, IgnoreTrailingWhitespace) can be added to control this? And again, I suppose I'm not the first to think of it... Thanks, Yuval

Show replies by date

Yuval Ronen

18 Jan 18 Jan

9:25 p.m.

Anyone? Hello? Please...

Yuval Ronen

20 Jan 20 Jan

9:01 p.m.

Any chance of hearing from the official booster on these matters? Does he agree with me or maybe I'm talking rubbish? Should I be looking forward to the next boost version where it will be fixed, or not to bother and continue to use my implementation of lexical_cast? Any info would be greatly appreciated. Thanks, Yuval

Gennadiy Rozental

9:20 p.m.

...

Any chance of hearing from the official booster on these matters?

If you mean original author of lexical_cast, I wouln't count on it. Kevlin is rare guest here. As for "official" part: you are as official as anybody else. Gennadiy

Yuval Ronen

10:12 p.m.

...

...
Any chance of hearing from the official booster on these matters?

If you mean original author of lexical_cast, I wouln't count on it. Kevlin is rare guest here. As for "official" part: you are as official as anybody else.

Does that mean that I can obtain access to the boost CVS with write permissions and start making changes? I hope that the boost code is a bit better protected. If any hooligan like myself can come and make changes to fit his desires, then it sounds weird... Or in a more humoristic note: "I don't care to belong to any club that will have me as a member" (Groucho Marx of course, inaccurate quote) :-)

Terje Slettebø

29 Jan 29 Jan

10:22 p.m.

...

From: "Yuval Ronen" <ronen_yuval@yahoo.com>

...

...
...
Any chance of hearing from the official booster on these matters?

...

...
Gennadiy Rozental:

If you mean original author of lexical_cast, I wouln't count on it. Kevlin is rare guest here. As for "official" part: you are as official as anybody else.

Does that mean that I can obtain access to the boost CVS with write permissions and start making changes?

Yes, if there's a good reason for it, such as if you have a library that has been accepted into Boost, or is helping maintaining one of them.

...

I hope that the boost code is a bit better protected. If any hooligan like myself can come and make changes to fit his desires, then it sounds weird...

You first need to get permission. :) That typically also means that at least someone among admin, or other boosters, knows you well enough to think you're not a "hooligan". Also, I think anyone starting to screw up the CVS will see their write permission revoked faster than you can say "commit". :) Any files "deleted" may always be brought back, for example. I think you'll find that the Boost library is in good hands.

...

Or in a more humoristic note: "I don't care to belong to any club that will have me as a member" (Groucho Marx of course, inaccurate quote)

:) Regards, Terje

John A Meinel

18 Jan 18 Jan

10:18 p.m.

Yuval Ronen wrote:

...

Hi. I'm probably just another one out of a million who make these remarks, but I found nothing in the documentation about them, so I'll mention them anyway.

1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

There actually was a reason. Something about certain functions not working properly with a const&. Like the std::limits or something like that. (I ran into this problem because lexical_cast(Parent p) when given a child will actually create a parent object, so no amount of overriding virtuals changes things.)

...

2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that

int a = lexical_cast<int>("3 "); // three-space

will work and return 3, but

int a = lexical_cast<int>(" 3"); // space-three

Here, I agree with you, but I don't know the rationale at all.

...

will throw. Why? There's no rationale here. Both cases should be treated in the same manner, and IMHO, that manner should be throwing. After a look at the code, it seems that the trailing-whitespace-ignore code ('stream >> std::ws' in line 151) was specifically added, so there was supposed to be a reason for this. But as I said, I believe this is wrong, at least as the default behaviour when converting a string to something else. Maybe some policies (IgnoreLeadingWhitespace, IgnoreTrailingWhitespace) can be added to control this? And again, I suppose I'm not the first to think of it...

Thanks, Yuval

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

John =:->

Yuval Ronen

19 Jan 19 Jan

6:56 a.m.

...

...
1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

There actually was a reason. Something about certain functions not working properly with a const&. Like the std::limits or something like that. (I ran into this problem because lexical_cast(Parent p) when given a child will actually create a parent object, so no amount of overriding virtuals changes things.)

The thing about the parent being created is serious. So the problem is even more severe than I thought...

Gennadiy Rozental

18 Jan 18 Jan

10:22 p.m.

...

1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

There numerous optimization/enhancements that could be done to lexical_cast. In my version I use different interpretators, type wrappers, specializations e.t.c. And still in my recent study I found that lexical_cast ~ 20 times slower then sprintf and ~200 times slower then my own handcrafted version for decimal integer parcing. Conclusion: do NOT use lexical_cast in performance critical (even performance aware) parts of your code. The only place where I use it is during initialization. Accordingly no need to bother about reverence vs. value semantic.

...

2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that

My recomendation is: trim the value on both sodes before cast. And do not rely on lexical cast to detect your whitespaces.

...

Thanks, Yuval

HTH, Gennadiy

Yuval Ronen

19 Jan 19 Jan

7:17 a.m.

...

...
1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

There numerous optimization/enhancements that could be done to lexical_cast. In my version I use different interpretators, type wrappers, specializations e.t.c. And still in my recent study I found that lexical_cast ~ 20 times slower then sprintf and ~200 times slower then my own handcrafted version for decimal integer parcing. Conclusion: do NOT use lexical_cast in performance critical (even performance aware) parts of your code. The only place where I use it is during initialization. Accordingly no need to bother about reverence vs. value semantic.

...
2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that

My recomendation is: trim the value on both sodes before cast. And do not rely on lexical cast to detect your whitespaces.

Yep, that's a possibility that will work (and will add another string copy to the mix)...

Cory Nelson

5:37 a.m.

For 1, I'm not sure. For 2, It seems more like a problem with stringstream - I think lexical_cast does: stringstream ss; ss << input; ss >> output; return output; On Sat, 15 Jan 2005 18:47:57 +0200, Yuval Ronen <ronen_yuval@yahoo.com> wrote:

...

Hi. I'm probably just another one out of a million who make these remarks, but I found nothing in the documentation about them, so I'll mention them anyway.

1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that

int a = lexical_cast<int>("3 "); // three-space

will work and return 3, but

int a = lexical_cast<int>(" 3"); // space-three

will throw. Why? There's no rationale here. Both cases should be treated in the same manner, and IMHO, that manner should be throwing. After a look at the code, it seems that the trailing-whitespace-ignore code ('stream >> std::ws' in line 151) was specifically added, so there was supposed to be a reason for this. But as I said, I believe this is wrong, at least as the default behaviour when converting a string to something else. Maybe some policies (IgnoreLeadingWhitespace, IgnoreTrailingWhitespace) can be added to control this? And again, I suppose I'm not the first to think of it...

Thanks, Yuval

_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- Cory Nelson http://www.int64.org

Yuval Ronen

6:53 a.m.

...

For 2, It seems more like a problem with stringstream - I think lexical_cast does:

stringstream ss; ss << input; ss >> output; return output;

Not quite. In lexical_cast there's a code to specifically *not* ignore leading whitespace - look for for the line 'stream.unsetf(std::ios::skipws)'. It also contains a code to specifically *ignore* trailing whitespace (but not other characters) - look for 'stream >> std::ws'. So this was done intentionally, even though mistakenly in my opinion. My version of 'lexical_cast<T, string>' looks like (I hope I remember correctly): template <typename T> T convert_from_string(const string &str) { T val; istringstream iss(str); iss >> noskipws >> val; if (iss.fail() || !iss.eof()) throw ...; return val; } and it works fine. No problems with the stringstream.

Terje Slettebø

29 Jan 29 Jan

10:06 p.m.

...

From: "Yuval Ronen" <ronen_yuval@yahoo.com>

Hi.

...

I'm probably just another one out of a million who make these remarks, but I found nothing in the documentation about them, so I'll mention them anyway.

Very good. :) Actually, the second issue hasn't to my knowledge been brought up, before, and while the first one has, it's a good question that deserves reconsideration. I've brought this thread to Kevlin Henney's attention, as I've tended to take care of lexical_cast related questions on the list (including forwarding things needing his consideration, such as any changes to lexical_cast, to him). I'm replying here to let you know that your report is being taken care of.

...

1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

I heard from Kevlin that the reason was that there originally were some issues with binding to string literals. Changing it to use const reference in my local copy still gives a lot of error messages, having to do with std::numeric_limits being used in the implementation. If we find a way to fix this (yet letting it still work on compilers with no class template partial specialisation, like MSVC 6), without breaking existing programs, then apparently, this change should be ok. If anyone makes a patch for this, it would be welcome.

...

2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that

int a = lexical_cast<int>("3 "); // three-space

will work and return 3, but

int a = lexical_cast<int>(" 3"); // space-three

will throw. Why? There's no rationale here. Both cases should be treated in the same manner, and IMHO, that manner should be throwing.

In Kevlin's reply to me, he agreed with this. This has been fixed, and will likely be committed soon, and a note posted about the update. Thanks for your feedback, and sorry for the late response. Regards, Terje

Yuval Ronen

10:57 p.m.

Terje Slettebø wrote:

...

I've brought this thread to Kevlin Henney's attention, as I've tended to take care of lexical_cast related questions on the list (including forwarding things needing his consideration, such as any changes to lexical_cast, to him). I'm replying here to let you know that your report is being taken care of.

Many, many thanks!

...

...
1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

I heard from Kevlin that the reason was that there originally were some issues with binding to string literals. Changing it to use const reference in my local copy still gives a lot of error messages, having to do with std::numeric_limits being used in the implementation. If we find a way to fix this (yet letting it still work on compilers with no class template partial specialisation, like MSVC 6), without breaking existing programs, then apparently, this change should be ok. If anyone makes a patch for this, it would be welcome.

Hmmm, interesting. I'll give it a look. Maybe I'll be lucky enough to think of something...

...

...
2. lexical_cast ignores trailing whitespaces, but not leading whitespaces. Meaning that

int a = lexical_cast<int>("3 "); // three-space

will work and return 3, but

int a = lexical_cast<int>(" 3"); // space-three

will throw. Why? There's no rationale here. Both cases should be treated in the same manner, and IMHO, that manner should be throwing.

In Kevlin's reply to me, he agreed with this. This has been fixed, and will likely be committed soon, and a note posted about the update.

Great! Thanks again.

Terje Slettebø

30 Jan 30 Jan

2:54 a.m.

...

From: "Yuval Ronen" <ronen_yuval@yahoo.com>

...

Terje Slettebø wrote:

...
I've brought this thread to Kevlin Henney's attention, as I've tended to take care of lexical_cast related questions on the list (including forwarding things needing his consideration, such as any changes to lexical_cast, to him). I'm replying here to let you know that your report is being taken care of.

Many, many thanks!

...

...
...
1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

I heard from Kevlin that the reason was that there originally were some issues with binding to string literals. Changing it to use const reference in my local copy still gives a lot of error messages, having to do with std::numeric_limits being used in the implementation. If we find a way to fix this (yet letting it still work on compilers with no class template partial specialisation, like MSVC 6), without breaking existing

...

...
then apparently, this change should be ok. If anyone makes a patch for

Thanks, yourself. :) Feedback is always welcome (regardless who's library it is); it's an opportunity to improve it. programs, this,

...

...
it would be welcome.

Hmmm, interesting. I'll give it a look. Maybe I'll be lucky enough to think of something...

I've been looking at this, as well, and after trying one approach (doing a compile-time switch to select between the ordinary, and a "dummy" std::numeric_limits (if the source type is an array)), I bumped into another problem: the specialisations for wchar_t, which assumes "const wchar_t *", not "const wchar_t[...]". This might again be fixed, using partial specialisation, to work with array, but another approach could be to "go back to square one" and do the array-to-pointer decay "manually": template<typename Target, typename Source> Target lexical_cast(const Source &arg) { // Array-to-pointer decay typedef typename mpl::if_< is_array<Source>, add_pointer<add_const<remove_bounds<Source>::type>::type>::type, Source >::type NewSource; detail::lexical_stream<Target, NewSource> interpreter; Target result; if(!(interpreter << arg && interpreter >> result)) throw_exception(bad_lexical_cast(typeid(NewSource), typeid(Target))); return result; } This passes all tests on Intel C++ 7.1, MSVC 7.1 and g++ 3.2. I've sent it as a suggestion to Kevlin, as well, I think he's pretty busy with other things. Anyone having a better suggestion than the above is of course welcome to contribute it. One thing about the above change is that it requires a few more includes, but this is really the simplest way I found to be able to use pass by const reference for lexical_cast. The additional includes are: #include <boost/type_traits/is_array.hpp> #include <boost/type_traits/remove_bounds.hpp> #include <boost/type_traits/add_pointer.hpp> #include <boost/type_traits/add_const.hpp> #include <boost/mpl/if.hpp> Regards, Terje

Terje Slettebø

11:30 a.m.

...

From: "Terje Slettebø" <tslettebo@broadpark.no>

...

...
From: "Yuval Ronen" <ronen_yuval@yahoo.com>

...
...
...
1. lexical_cast(Source arg) takes the source argument by value. Why not by const& ? Calling lexical_cast(some_basic_string) now makes a copy of the source argument for no reason.

template<typename Target, typename Source> Target lexical_cast(const Source &arg) { // Array-to-pointer decay

typedef typename mpl::if_< is_array<Source>, add_pointer<add_const<remove_bounds<Source>::type>::type>::type, Source >::type NewSource;

detail::lexical_stream<Target, NewSource> interpreter; Target result;

if(!(interpreter << arg && interpreter >> result)) throw_exception(bad_lexical_cast(typeid(NewSource), typeid(Target))); return result; }

The additional includes are:

#include <boost/type_traits/is_array.hpp> #include <boost/type_traits/remove_bounds.hpp> #include <boost/type_traits/add_pointer.hpp> #include <boost/type_traits/add_const.hpp> #include <boost/mpl/if.hpp>

Come to think of it, it can be done simpler and clearer, and avoiding any extra includes, like this: namespace detail { template<class T> struct array_to_pointer_decay { typedef T type; }; template<class T, size_t N> struct array_to_pointer_decay<T[N]> { typedef const T * type; }; } template<typename Target, typename Source> Target lexical_cast(const Source &arg) { typedef typename detail::array_to_pointer_decay<Source>::type NewSource; detail::lexical_stream<Target, NewSource> interpreter; Target result; if(!(interpreter << arg && interpreter >> result)) throw_exception(bad_lexical_cast(typeid(NewSource), typeid(Target))); return result; } This requires support for partial specialisation, but so does remove_bounds used in the first suggestion, so it doesn't change the compiler requirements. Besides, the current version of lexical_cast in CVS doesn't work with MSVC 6, anyway. Of course, if this was very important, I guess it would be possible to fall back to pass by value for MSVC 6, and possibly fix other problems with it. I also see that MSVC 6 is no longer part of the Boost regression test. Regards, Terje

Yuval Ronen

9:20 p.m.

...

Come to think of it, it can be done simpler and clearer, and avoiding any extra includes, like this:

namespace detail { template<class T> struct array_to_pointer_decay { typedef T type; };

template<class T, size_t N> struct array_to_pointer_decay<T[N]> { typedef const T * type; }; }

template<typename Target, typename Source> Target lexical_cast(const Source &arg) { typedef typename detail::array_to_pointer_decay<Source>::type NewSource;

detail::lexical_stream<Target, NewSource> interpreter; Target result;

if(!(interpreter << arg && interpreter >> result)) throw_exception(bad_lexical_cast(typeid(NewSource), typeid(Target))); return result; }

I haven't tried it, but I have a feeling that it will work for string literals, but not for c-string variables, so another specialization might be needed. But it shouldn't be too hard to add, so no real problem here. Or maybe c-string variables aren't supposed to be supported anyway?

...

This requires support for partial specialisation, but so does remove_bounds used in the first suggestion, so it doesn't change the compiler requirements. Besides, the current version of lexical_cast in CVS doesn't work with MSVC 6, anyway. Of course, if this was very important, I guess it would be possible to fall back to pass by value for MSVC 6, and possibly fix other problems with it. I also see that MSVC 6 is no longer part of the Boost regression test.

Do you mean that boost has finally dropped support for MSVC 6? Congratulations, it's about time... :-) Anyway, the above solution looks excellent to me.

Terje Slettebø

31 Jan 31 Jan

7:07 p.m.

...

From: "Yuval Ronen" <ronen_yuval@yahoo.com>

...

...
Come to think of it, it can be done simpler and clearer, and avoiding any extra includes, like this:

namespace detail { template<class T> struct array_to_pointer_decay { typedef T type; };

template<class T, size_t N> struct array_to_pointer_decay<T[N]> { typedef const T * type; }; }

template<typename Target, typename Source> Target lexical_cast(const Source &arg) { typedef typename detail::array_to_pointer_decay<Source>::type NewSource;

detail::lexical_stream<Target, NewSource> interpreter; Target result;

if(!(interpreter << arg && interpreter >> result)) throw_exception(bad_lexical_cast(typeid(NewSource), typeid(Target))); return result; }

I haven't tried it, but I have a feeling that it will work for string literals, but not for c-string variables

With "c-string variables", do you mean pointer to char? Or something like MFC's CString? Anyway, the first is supported as before. The above just ensures that the type passed to the rest of lexical_cast is the same type it used to be, when it used pass by value. So arrays will decay to pointers, and everything else will pass through unaltered. I ran the above through the unit test, and it passes all, on the three compilers I tested it with. I've since talked with Kevlin, and this was also what he had in mind, to make pass by const reference work.

...

...
This requires support for partial specialisation, but so does remove_bounds used in the first suggestion, so it doesn't change the compiler requirements. Besides, the current version of lexical_cast in CVS doesn't work with MSVC 6, anyway. Of course, if this was very important, I guess it would be possible to fall back to pass by value for MSVC 6, and possibly fix other problems with it. I also see that MSVC 6 is no longer part of the Boost regression test.

Do you mean that boost has finally dropped support for MSVC 6?

Well, eh, not really. :) I asked about this on the developer list, and got this reply: --- Start quote --- From: "Jeff Garland" <jeff@crystalclearsoftware.com> To: <boost@lists.boost.org>; <boost@lists.boost.org> Sent: Sunday, January 30, 2005 7:33 PM Subject: Re: [boost] lexical_cast: Needing to work for MSVC 6 (and similar)? On Sun, 30 Jan 2005 17:55:55 +0100, Terje Slettebø wrote

...

Because of this, Kevlin Henney and I was wondering whether or not MSVC 6 support for this component (and components in Boost in general) is no longer demanded, or in general, what the required conformance level is?

We've discussed this a number of times, but there is no 'boost-wide' policy. Some libraries have slowly started dropping VC6 support -- spirit comes to mind. Some authors (as I recall the Boost.Python folks) have a strong user base still using VC6 and don't want to cut it off. With date-time I've been taking the approach of attempting to maintain existing functionality, but new functionality that breaks VC6 is essentially simply marked. Even maintaining existing functions takes significant effort because of the VC6 fragilities. For example, someone recently reported that VC6 with STLPort and date-time creates an ICE. Personally, I think the time has come to cut the cord on VC6 testing and compatibility. In my mind Boost as a whole is being held back by the time Boost developers waste hacking apart their implementations and answering questions for this 7 year old, seriously non-compliant compiler. Perhaps the fact that we move on will spur some people to upgrade compilers, improving the life of a few c++ developers along the way (I suppose this also might increase sales in Redmond -- of course they already mention Boost on the VC7 packaging...). And the folks stuck with VC6 can always use older versions of Boost or test out new versions against VC6 for themselves, submit patches, or pay someone to do this work. BTW, I have an interest in your decision b/c date-time depends on lexical_cast. So far whatever is breaking the regression tests doesn't seem to be affecting date-time... Jeff --- End quote --- However, since lexical_cast is such a simple component, and that - like it or not - it seems that quite a few people are still using MSVC 6 at work (possibly due to the codebase not handling a newer compiler...), it would be a shame not to make it available for them, as well. Therefore, we'll use the code above (which requires partial specialisation), and supply preprocessor-switched workaround-code for compilers lacking PTS, using full specialisation.

...

Congratulations, it's about time... :-)

I wish... :) One problem is that it's only a couple of years since MSVC 7.1 came, and even though MSVC 6 is old, 7.1 was the first version with PTS. So we might have to live with 6.0 a little longer. At least for existing libraries.

...

Anyway, the above solution looks excellent to me.

Thanks. :) Regards, Terje

Yuval Ronen

1 Feb 1 Feb

6:50 a.m.

...

...
I haven't tried it, but I have a feeling that it will work for string literals, but not for c-string variables

With "c-string variables", do you mean pointer to char? Or something like MFC's CString?

pointer to char

...

Anyway, the first is supported as before. The above just ensures that the type passed to the rest of lexical_cast is the same type it used to be, when it used pass by value. So arrays will decay to pointers, and everything else will pass through unaltered. I ran the above through the unit test, and it passes all, on the three compilers I tested it with. I've since talked with Kevlin, and this was also what he had in mind, to make pass by const reference work.

Ok. As I said, this comment was made without trying it, so it's quite possible (and even probable) that I was mistaking. I'll be more careful next time (spank on the tush for me...)

7469

Age (days ago)

7486

Last active (days ago)

List overview

Download

0 comments

1 participants

participants (1)

Cory Nelson
Gennadiy Rozental
John A Meinel
Terje Slettebø
Yuval Ronen