[Serialization] XML: float format is scientific instead of human-readable since Boost 1.57
Hello Boost guys,
when I switched from Boost 1.55 to 1.57 there was a change in the
formatting of floats and doubles for XML archives. Look at this example:
#include <fstream>
#include
Frank Stähr wrote
I compiled it with VS 2013 for 32 bit. The output file has the line:
2.99999999999999990e-002
In Boost 1.55 it was much more human-readable: 0.03.
Is this behaviour (this change) intended? In any case, what is the best way to format floats and doubles? (i.e. a manipulator for the ofstream?)
This is surely an unintended side effect of changes to guarantee correct "round tripping" of floating point numbers. That is, the floating point number read in is bit for bit the same as the floating point number written out. This was seen as a good thing. I'm not sure if this creates some unintended problems.
Robert Ramey
-- View this message in context: http://boost.2283326.n4.nabble.com/Serialization-XML-float-format-is-scienti... Sent from the Boost - Users mailing list archive at Nabble.com.
-----Original Message----- From: Boost-users [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Robert Ramey Sent: 28 January 2015 15:17 To: boost-users@lists.boost.org Subject: Re: [Boost-users] [Serialization] XML: float format is scientific instead of human-readable since Boost 1.57
Short answer:
If you want to be sure that what you get back from de-serialization is bit-for-bit identical to what you serialized,
then it is definitely a Good Thing.
For some MS compilers at least, it is also necessary to use scientific format for this to avoid some
very intermittent failures to 'round-trip' correctly. (This was discovered by a real-time user - and was only pinned down by random testing!)
Long Answer:
For more than you will want to know see:
Exploring Binary
But the other thing is that by setting precision to 17 lexical_cast is bloating the string representations of the doubles with lots of 9s in both Visual Studio 2010 and Visual Studio 2013. Setting precision to 15 instead prevents this, and makes the original test run faster even with Visual Studio 2013 (about 4 seconds rather than 10).
In order to be sure of 'round-tripping' one needs to output std::numeric_limits<FPT>::max_digits10 decimal digits. max_digits10 is 17 for double, enough to ensure that all *possibly* significant digits are used. digits10 is 15 for double and using this will work for *your* example, but will fail to 'round-trip' exactly for *some* values of double.

The reason for a rewrite *might* be that for VS <=11, there was a slight 'feature' ('feature' according to Microsoft, 'bug' according to many, though the C++ Standard does NOT require round-tripping to be exact. Recent GCC and Clang achieve exact round-tripping.)

// The original value causing trouble using serialization was 0.00019075645054089487;
// wrote 0.0019075645054089487
// read 0.0019075645054089489
// an increase of just 1 bit.
// Although this test uses a std::stringstream, it is possible that
// the same behaviour will be found with ALL streams, including cout and cin?
// The wrong inputs are only found in a very narrow range of values:
// approximately 0.0001 to 0.004, with exponent values of 3f2 to 3f6
// and probably every third value of significand (tested using nextafter).

However, a re-test reveals that this 'feature' is still present using VS2013 (version 12.0). (This test uses random double values to find round-trip or loopback failures.)

So the price of accuracy is lots of digits (and time to output and re-digest them) :-(

Paul
---
Paul A. Bristow Prizet Farmhouse Kendal UK LA8 8AB +44 (0) 1539 561830
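The difference between digits10 and max_digits10 described above can be illustrated with a minimal sketch. It is not the code Boost.Serialization uses; the helper names to_text, from_text and round_trips are hypothetical, and the sketch assumes a standard library whose text-to-double conversion is correctly rounded (as reported above for recent GCC and Clang).

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: write a double in scientific notation with the
// requested number of significant decimal digits.
std::string to_text(double value, int significant_digits) {
    std::ostringstream out;
    // With std::scientific, precision counts the digits after the decimal
    // point, so one digit is subtracted for the leading digit.
    out << std::scientific << std::setprecision(significant_digits - 1) << value;
    return out.str();
}

// Hypothetical helper: read the text back into a double.
double from_text(const std::string& text) {
    return std::stod(text);
}

// True if writing and re-reading reproduces exactly the same double.
bool round_trips(double value, int significant_digits) {
    return from_text(to_text(value, significant_digits)) == value;
}

// For IEEE 754 double these are 15 and 17, as stated in the message above.
const int kDigits10 = std::numeric_limits<double>::digits10;
const int kMaxDigits10 = std::numeric_limits<double>::max_digits10;
```

Under these assumptions, round_trips(0.30000000000000004, kDigits10) is false because 15 digits collapse the value to 0.3, while round_trips(0.30000000000000004, kMaxDigits10) is true.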
On 28.01.2015 at 16:16, Robert Ramey wrote:
This is surely an unintended side effect of changes to guarantee correct "round tripping" of floating point numbers. That is, the floating point number read in is bit for bit the same as the floating point number written out.
Yes, I read about "round tripping" in some mail archives, thanks for the clarification. In my opinion it would be more sensible to use the format options from the ofstream. This would fit the std much more elegantly … but anyway, I am not familiar with round tripping …
This was seen as good thing. I'm not sure if this creates some unintended problems.
Not really, this just makes it harder to read or edit XMLs now. Is it possible for us to write the “old” XML format?
Frank
-----Original Message----- From: Boost-users [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Frank Stähr Sent: 28 January 2015 17:07 To: boost-users@lists.boost.org Subject: Re: [Boost-users] [Serialization] XML: float format is scientific instead of human-readable since Boost 1.57
Frank Stähr wrote
Yes, I read about "round tripping" in some mail archives, thanks for the clarification. In my opinion it would be more sensible to use the format options from the ofstream. This would fit the std much more elegantly … but anyway, I am not familiar with round tripping …
As long as you don't mind the risk (small but not zero) that some values in your archive will be slightly different after de-serialization. I'd worry about that personally. Most would assume that it won't happen?
Paul
On 29-01-2015 11:30, Paul A. Bristow wrote:
As long as you don't mind the risk (small but not zero) that some values in your archive will be slightly different after de-serialization.
I'd worry about that personally. Most would assume that it won't happen?
Yes, we really don't want to lose information. It's so hard to track down errors originating from these kinds of silent changes.
-Thorsten
On Fri, Jan 30, 2015 at 4:04 AM, Frank Stähr
Here is a reminder of my most important question:
On 28.01.2015 at 18:07, Frank Stähr wrote:
Not really, this just makes it harder to read or edit XMLs now. Is it possible for us to write the “old” XML format?
So I guess no?
The way our company chose to create human-readable XML is to serialize strings instead of doubles or floats. We used a stringstream with std::setprecision to get a certain number of decimal places. The number has a chance of being slightly different after de-serialization and pulling it out of the stringstream, just like the old serialization method. This is okay for us, as we need human-readable XML more than exact round tripping. Hope this helps.
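The string-based workaround described above might look roughly like the following sketch. The helper names to_readable and from_readable are hypothetical, the choice of std::fixed is an assumption, and the step of handing the resulting std::string to the archive is left out; this is not the company's actual code.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: render a double with a fixed number of decimal
// places, giving a short human-readable string such as "0.03".
std::string to_readable(double value, int decimal_places) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(decimal_places) << value;
    return out.str();
}

// Hypothetical helper: pull the number back out of the string after
// de-serialization.
double from_readable(const std::string& text) {
    std::istringstream in(text);
    double value = 0.0;
    in >> value;
    return value;
}
```

The trade-off mentioned in the message shows up directly: to_readable(0.30000000000000004, 2) gives "0.30", and reading that back yields the double nearest to 0.3, which is not the value that was written.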
From: Boost-users [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Clark Cianfarini
Sent: 30 January 2015 14:45
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] [Serialization] XML: float format is scientific instead of human-readable since Boost 1.57
On Fri, Jan 30, 2015 at 4:04 AM, Frank Stähr
Frank Stähr wrote
Here is a reminder of my most important question:
On 28.01.2015 at 18:07, Frank Stähr wrote:
Not really, this just makes it harder to read or edit XMLs now. Is it possible for us to write the “old” XML format?
So I guess no?
Not necessarily. The archive classes have provision for options to be set upon opening. One way would be to add a new option - no_roundtripping or whatever. Of course that means tweaking the implementation of the library. These days I only do this in response to a cogently formulated request entered into the trac system. The problem is that these things always turn out to have unintended consequences. Since we test serialization pretty thoroughly (though not as thoroughly as I would like), this always ends up as more work than anticipated so I tend to drag my feet in these matters.

Speaking from memory, I believe that all the text-based archives use set_precision on the underlying stream (and reset it back to what it was upon leaving). So maybe the best thing would be to create an option - no_stream_modification - which would suppress this behavior (speaking from memory, there might even be such an option specified already!).

So if it's really important to you and you want to do a little work, feel free to delve into this aspect of the library and propose an enhancement/modification. If you're really confident you can fork the serialization library on github, test out your updates (don't forget to update the documentation!!!!). Then you can issue a pull request and start bugging me to include it. I've generally (though not always) been willing to merge these into the library.

So I'm willing to consider accommodating this request. The real question is: is it sufficiently important for you to do the work or only important enough for me to do the work?

Robert Ramey
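The "set the precision, then reset it upon leaving" behaviour recalled above can be sketched with a small RAII guard on a standard stream. This is only an illustration of the general pattern: PrecisionGuard and write_double are hypothetical names, not part of Boost.Serialization, and the library's actual mechanism may differ.

```cpp
#include <cassert>
#include <ios>
#include <limits>
#include <ostream>
#include <sstream>

// Hypothetical guard: sets a new precision on construction and restores
// the caller's precision on destruction.
class PrecisionGuard {
public:
    PrecisionGuard(std::ios_base& stream, std::streamsize new_precision)
        : stream_(stream), saved_(stream.precision(new_precision)) {}
    ~PrecisionGuard() { stream_.precision(saved_); }

    PrecisionGuard(const PrecisionGuard&) = delete;
    PrecisionGuard& operator=(const PrecisionGuard&) = delete;

private:
    std::ios_base& stream_;
    std::streamsize saved_;  // precision that was in effect before the guard
};

// Hypothetical writer: emits one double with max_digits10 significant
// digits, leaving the stream's own precision untouched afterwards.
void write_double(std::ostream& out, double value) {
    PrecisionGuard guard(out, std::numeric_limits<double>::max_digits10);
    out << value;
}
```

An option that suppressed this behaviour, as proposed above, would amount to skipping the guard so that whatever precision and format flags the caller set on the stream are used as they are.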
participants (5)
- Clark Cianfarini
- Frank Stähr
- Paul A. Bristow
- Robert Ramey
- Thorsten Ottosen