On Tue, Aug 31, 2004 at 05:04:48PM +0300, Peter Dimov wrote:
Carlo Wood wrote:
On Tue, Aug 31, 2004 at 02:27:02PM +0300, Peter Dimov wrote:
A general "binary serialization class" would never know where builtin-variables begin and end (or what type they are) and therefore cannot swap bytes on Big-Endian machines.
You may be right, depending on the meaning of "general", but useful binary serializers do exist. You just need to decide how to represent the built-in types. For example, I've chosen that the external representation of a char is 8 bits, a short 16, int and long 32. This is perfectly portable, as long as the values of my variables do not exceed these limits.
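A minimal sketch of what such a fixed-width external representation might look like, for illustration only. The names Buffer, put_u32, put_long and get_long are hypothetical, and writing the least significant byte first is an assumption; the post does not say which byte order was chosen. The range check corresponds to the caveat that values must not exceed the external limits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helpers, not taken from any poster's library.
// Assumed external format: a long is always 32 bits on the wire,
// written least significant byte first, whatever the host looks like.
using Buffer = std::vector<unsigned char>;

inline void put_u32(Buffer& out, std::uint32_t v) {
    // Shifting extracts bytes by value, so host endianness never matters.
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
}

// A long may be 64 bits in memory, so reject values that do not fit
// the 32-bit external representation instead of silently truncating.
inline void put_long(Buffer& out, long v) {
    if (v < std::numeric_limits<std::int32_t>::min() ||
        v > std::numeric_limits<std::int32_t>::max())
        throw std::range_error("value does not fit in 32 bits");
    put_u32(out, static_cast<std::uint32_t>(static_cast<std::int32_t>(v)));
}

inline long get_long(const Buffer& in, std::size_t pos) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i)
        u |= static_cast<std::uint32_t>(in.at(pos + i)) << (8 * i);
    // Map the two's complement bit pattern back to a signed value
    // without relying on implementation-defined narrowing.
    if (u <= 0x7FFFFFFFu)
        return static_cast<long>(u);
    return static_cast<long>(u - 0x80000000u) +
           std::numeric_limits<std::int32_t>::min();
}
```

Because both directions go through shifts rather than memcpy of the in-memory object, the same bytes come out on Big-Endian and Little-Endian hosts.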
What does size have to do with endianness? If you have the following struct:
struct Data {
    char c1;
    char lt[3];
    int s;
    unsigned short p1;
    unsigned short p2;
    bool init;
    bool flags[5];
    unsigned char t[8];
};
Then how would your general (== does not know anything about the internals of 'Data') serializer write that to a TCP/IP socket when running on a Big-Endian machine, such that both Big-Endian and Little-Endian machines can read the result from the network?
No way. But my serializer isn't "general" by your definition (endianness is only one of the problems, there's also sizeof() and padding). It is an ordinary boost::serialization-style serializer that happens to use a binary external format.
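To make the distinction concrete, here is a rough sketch of a member-wise serializer for the Data struct quoted above. It only imitates the `ar & member` style of boost::serialization; BinaryOut is a hypothetical class written for this example, not a Boost archive, and the wire format (least significant byte first, 16-bit short, 32-bit int) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Data {
    char c1; char lt[3]; int s;
    unsigned short p1; unsigned short p2;
    bool init; bool flags[5];
    unsigned char t[8];
};

// Hypothetical output archive. Each overload knows the type it is given,
// which is exactly what a "general" memcpy-style serializer lacks.
class BinaryOut {
public:
    std::vector<unsigned char> bytes;

    BinaryOut& operator&(char v)           { put(static_cast<unsigned char>(v), 1); return *this; }
    BinaryOut& operator&(unsigned char v)  { put(v, 1); return *this; }
    BinaryOut& operator&(bool v)           { put(v ? 1u : 0u, 1); return *this; }
    BinaryOut& operator&(unsigned short v) { put(v, 2); return *this; }
    BinaryOut& operator&(int v)            { put(static_cast<std::uint32_t>(v), 4); return *this; }

private:
    // Writes the low nbytes of v, least significant byte first.
    void put(std::uint32_t v, int nbytes) {
        for (int i = 0; i < nbytes; ++i)
            bytes.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xFFu));
    }
};

// The struct's author lists the members; padding is never written.
template <class Archive>
void serialize(Archive& ar, const Data& d) {
    ar & d.c1;
    for (char c : d.lt) ar & c;
    ar & d.s & d.p1 & d.p2 & d.init;
    for (bool f : d.flags) ar & f;
    for (unsigned char b : d.t) ar & b;
}
```

With this layout the struct always occupies 26 bytes on the wire, regardless of sizeof(Data) or the host's byte order.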
Hmm, since this subject is coming up here and has also just popped up on comp.lang.c++.moderated, I'll venture an offering. A little over a year ago, I wrote a general-purpose binary packing/unpacking class (inspired by Perl's and Python's pack() and unpack()) that has held up pretty well in production ever since. I've always wanted to ask my company for permission to release it into the public domain (maybe even into Boost?!?), but haven't pushed it, since I hadn't heard of anyone looking for anything quite like it until now. If anyone's interested, I'll pursue getting permission from my company to open it up ASAP.

To solve the endianness, size, and padding problems, the approach I'd take with my class would be to make sure the protocol expects a single character indicating the byte order of the data stream first; then you'd grab the appropriate unpacker from the factory and start unpacking away. From there, it's up to the protocol writer to know how to unpack the bytes in the proper order into the proper structures.

Of course, packing/unpacking isn't quite the same as serialization, I'm sure, but maybe serialization isn't exactly the term that applies here?

Mike
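For readers who want a picture of the byte-order-marker idea, a minimal sketch follows. It is not the class described above, which has not been published; Unpacker, make_unpacker and the marker characters are all hypothetical. The markers '<' (little-endian) and '>' (big-endian) are borrowed from the format characters of Python's struct module.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Hypothetical interface: one unpacker per byte order.
class Unpacker {
public:
    virtual ~Unpacker() = default;
    virtual std::uint16_t u16(const unsigned char* p) const = 0;
    virtual std::uint32_t u32(const unsigned char* p) const = 0;
};

class LittleEndianUnpacker : public Unpacker {
public:
    std::uint16_t u16(const unsigned char* p) const override {
        return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
    }
    std::uint32_t u32(const unsigned char* p) const override {
        return static_cast<std::uint32_t>(p[0]) |
               (static_cast<std::uint32_t>(p[1]) << 8) |
               (static_cast<std::uint32_t>(p[2]) << 16) |
               (static_cast<std::uint32_t>(p[3]) << 24);
    }
};

class BigEndianUnpacker : public Unpacker {
public:
    std::uint16_t u16(const unsigned char* p) const override {
        return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
    }
    std::uint32_t u32(const unsigned char* p) const override {
        return (static_cast<std::uint32_t>(p[0]) << 24) |
               (static_cast<std::uint32_t>(p[1]) << 16) |
               (static_cast<std::uint32_t>(p[2]) << 8) |
               static_cast<std::uint32_t>(p[3]);
    }
};

// Factory keyed on the first character of the stream (assumed markers).
inline std::unique_ptr<Unpacker> make_unpacker(char marker) {
    switch (marker) {
    case '<': return std::unique_ptr<Unpacker>(new LittleEndianUnpacker);
    case '>': return std::unique_ptr<Unpacker>(new BigEndianUnpacker);
    default:  throw std::invalid_argument("unknown byte order marker");
    }
}
```

The reader consumes the marker byte, asks the factory for the matching unpacker, and then pulls fields out in the order the protocol defines. Field sizes and order remain the protocol writer's responsibility, as noted above.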