Same semantics as, e.g., stringstream, but it operates on binary data. I already have the analogues to istringstream/stringbuf (you can take a look at ibitstream/bitbuf at https://github.com/dplong/bstream). There are some interesting features, but I won't go into them here. If there is interest, I would implement obitstream and bitstream and possibly support Boost::dynamic_bitset.
The rationale is _succinct_ expression of predictive parsing of binary data.
I've used ibitstream in production code for decoding RTP headers and frames of various video encodings. Below is a simple function that decodes an RTP header using ibitstream; I also have a GitHub repo that contains more complex code for high-level parsing of H.264 frames at https://github.com/dplong/rtspudph264
struct RtpHeader {
    bool padding, marker;
    bitset<7> payloadType;
    WORD sequenceNumber;
    DWORD timestamp, ssrcIdentifier;
    vector<DWORD> csrcIdentifier;
    struct {
        bool present;
        WORD identifier;
        vector<BYTE> contents;
    } extension;
};
bool ParseRtpHeader(const BYTE *buffer, RtpHeader &rtp) {
    ibitstream bin(buffer);
    static const bitset<2> version(0x2);
    bitset<4> csrcCount;
    WORD extensionLength;
    bin >> version >> rtp.padding >> rtp.extension.present
        >> csrcCount >> rtp.marker >> rtp.payloadType
        >> rtp.sequenceNumber >> rtp.timestamp >> rtp.ssrcIdentifier
        >> setrepeat(csrcCount.to_ulong()) >> rtp.csrcIdentifier;
    if (rtp.extension.present) {
        bin >> rtp.extension.identifier >> extensionLength
            >> setrepeat(extensionLength * sizeof(DWORD)) >> rtp.extension.contents;
    }
    return bin.good();
}
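For comparison, here is a rough sketch of what the fixed 12-byte part of the same header looks like when decoded by hand with shifts and masks. It is not part of bstream; the names RtpFixedHeader, readBigEndian and parseRtpFixedHeader are made up for illustration, and the field layout is the one given in RFC 3550. It only aims to show the kind of code that the single extraction expression above is meant to replace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fixed 12-byte part of an RTP header (layout per RFC 3550).
// Hypothetical type for illustration; not the RtpHeader from the post.
struct RtpFixedHeader {
    unsigned version;      // 2 bits
    bool padding;          // 1 bit
    bool extension;        // 1 bit
    unsigned csrcCount;    // 4 bits
    bool marker;           // 1 bit
    unsigned payloadType;  // 7 bits
    std::uint16_t sequenceNumber;
    std::uint32_t timestamp;
    std::uint32_t ssrc;
};

// Hypothetical helper: assemble a big-endian unsigned value from `bytes` octets.
inline std::uint32_t readBigEndian(const unsigned char *p, std::size_t bytes) {
    assert(bytes <= 4);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < bytes; ++i)
        value = (value << 8) | p[i];
    return value;
}

// Returns false if the buffer is too short or the version field is not 2.
inline bool parseRtpFixedHeader(const unsigned char *buf, std::size_t size,
                                RtpFixedHeader &h) {
    if (size < 12)
        return false;
    h.version = buf[0] >> 6;
    h.padding = ((buf[0] >> 5) & 1) != 0;
    h.extension = ((buf[0] >> 4) & 1) != 0;
    h.csrcCount = buf[0] & 0x0F;
    h.marker = ((buf[1] >> 7) & 1) != 0;
    h.payloadType = buf[1] & 0x7F;
    h.sequenceNumber = static_cast<std::uint16_t>(readBigEndian(buf + 2, 2));
    h.timestamp = readBigEndian(buf + 4, 4);
    h.ssrc = readBigEndian(buf + 8, 4);
    return h.version == 2;
}
```

Every field needs its own shift and mask here, and the CSRC list and extension would add loops on top, which is the bookkeeping the stream-style interface hides.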
I would be interested in such a stream type. However, only if there is an optional way to force the endianess (via implicit conversion) or any other feature allowing easy serialization on multiple platforms. Joel Lamotte
On 6/28/2013 9:33 AM, Klaim - Joël Lamotte wrote:
I would be interested in such a stream type. However, only if there is an optional way to force the endianess (via implicit conversion)
There are two endiannesses (?) at play here--in the bit stream and on the platform. To which are you referring? If the latter, there isn't anything to worry about because ibitstream makes no assumptions about the endianness of the platform, but it does assume network order, or big endian, in the bit stream. If you are concerned about the bit-stream endianness, we'd need to work out the specifics, because I don't think the semantics would be obvious for other endians. For example, can the bit endianness be specified separately from the byte endianness? (Hmm... that reminds me--it currently supports STL bitsets but I don't think it supports C bitfields. I'll have to look into that.) Also, what bit-stream endiannesses should bitstream support--big, little, mixed, middle?
or any other feature allowing easy serialization on multiple platforms.
Like I said, it should already behave the same on all platforms, but maybe I'm missing something. What is your specific concern? Paul
On Fri, Jun 28, 2013 at 8:03 PM, Paul Long
Like I said, it should already behave the same on all platforms, but maybe I'm missing something. What is your specific concern?
I meant platform A uses one endianness, platform B uses another, but if i put data from either into the bitstream, the bitstream will use one specific endianness whatever the platform. Which is basically what you describe (the network packet case indeed) if I understood correctly. Joel Lamotte
On 6/28/2013 3:44 PM, Klaim - Joël Lamotte wrote:
I meant platform A uses one endianness, platform B uses another, but if i put data from either into the bitstream, the bitstream will use one specific endianness whatever the platform. Which is basically what you describe (the network packet case indeed) if I understood correctly.
Correct, but it is important to note that integrals are always encoded big endian in the bitstream. I suppose it's possible that someone would want some other endian, but that's got to be an obscure use case.
Actually, I can think of an example. Years ago, I worked on a WinTel, little-endian product, and a colleague didn't use any serialization for passing objects between two network components. The thinking was, pshaw, it'll always be a Windows product, so why bother with serialization. Then, of course, Apple paid us to port it to their 68000, big-endian platform while maintaining interoperability. In a use case like this, it might be useful to support other-than-big endianness. What do you think? Should the bitstream library support multiple endian schemes in the bit stream or is big endian enough? Paul
On Sat, Jun 29, 2013 at 12:50 AM, Paul Long
What do you think? Should the bitstream library support multiple endian schemes in the bit stream or is big endian enough?
In all the use cases I am thinking about in my own projects, the application has client and server parts, and both can/will be on different platforms (win/mac/linux plus some platform I'm not sure about). Currently I'm using RakNet to communicate between the different processes; it provides a specific stream type, but it's not very general. Does that answer your question? Joel Lamotte
On Sat, Jun 29, 2013 at 1:16 AM, Klaim - Joël Lamotte
Currently I'm using Raknet to communicate between the different processes, it provide a specific stream type but it's not very general.
To be clear: currently there is only one module of one of my applications that uses RakNet and should depend on it, but the other modules should be independent. Still, they need to provide some data. So I'm using a std::string as raw data "streams" and move data into the module, which will then push the data into RakNet facilities. It would be better if I was using a bitstream (or something similar) in this specific case, instead of std::string (in parts of the application which should not use RakNet). Joel Lamotte
On 6/28/2013 6:19 PM, Klaim - Joël Lamotte wrote:
On Sat, Jun 29, 2013 at 1:16 AM, Klaim - Joël Lamotte wrote:
Currently I'm using Raknet to communicate between the different processes, it provide a specific stream type but it's not very general.
To be clear: currently there is only one module of one of my applications that uses RakNet and should depend on it, but the other modules should be independent. Still they need to provide some data. So I'm using a std::string as raw data "streams" and move data into the module which will then push the data into Raknet facilities. It would be better if I was using a bitstream (or something similar) in this specific case, instead of std::string (in parts of the application which should not use Raknet).
(Hah! I've used std::string to hold raw binary data, too.)
Oh, so you need to be interoperable with whatever that Raknet component does on the wire? If it does not serialize, the components of a different endianness have to convert to their native endianness. Yuck. If unserialized data is common "in the wild," perhaps the proposed bitstream library should support various endianness in the bit stream. I could either implement this initially or defer it until the need is recognized. Paul
On Sat, Jun 29, 2013 at 2:44 AM, Paul Long
Oh, so you need to be interoperable with whatever that Raknet component does on the wire? If it does not serialize, the components of a different endianness have to convert to their native endianness. Yuck. If unserialized data is common "in the wild," perhaps the proposed bitstream library should support various endianness in the bit stream. I could either implement this initially or defer it until the need is recognized.
It's ok, the data I'm sending is generated via protobuf (which makes my request not really necessary I guess). Joel Lamotte
On 6/28/2013 6:16 PM, Klaim - Joël Lamotte wrote:
On Sat, Jun 29, 2013 at 12:50 AM, Paul Long wrote:
What do you think? Should the bitstream library support multiple endian schemes in the bit stream or is big endian enough?
In all the use cases I am thinking about in my own projects, the application have a client and server parts and both client and server can/will be on different platforms (win/mac/linux plus some platform I'm not sure). Currently I'm using Raknet to communicate between the different processes, it provide a specific stream type but it's not very general. Does it answer your question?
Not really, since you didn't tell me what you thought the proposed bitstream library should support, but I think your answer would be that big endian in the bit stream is indeed enough.
BTW, I took a look at Raknet. Since C++ does not support reflection and I see a Raknet example where an entire struct is being written, the Raknet::Bitstream class must effectively call memcpy or, as Mathias Gaunard said in his email, call something like write(&x, sizeof x). However, I also see that if one undefines the manifest constant, __BITSTREAM_NATIVE_END, before calling Bitstream I/O functions for fundamental types or just using the Serialize() member function for fundamental types, Raknet performs "endian swapping." This implies a canonical form which is most likely big endian. Since the components of your projects are interoperable across platforms with different endianness, I assume you have undefined this constant (or are using Serialize()) and are therefore most likely conveying data in network byte order on the wire. This is how my bitstream library works--integrals are normalized to network byte order in the bit stream. Although conveying data without serialization consumes fewer CPU cycles--less work to do--it is of course not interoperable between platforms of different endianness. Even without worrying about endianness, it is also not interoperable between the same platforms running software built with different compiler settings regarding alignment and padding within structs, unions, and classes. In other words, it's a really bad idea. That's why I personally think that the proposed bitstream library does not need to support anything other than big endian in the bit stream. But I'll go with whatever the consensus is. Paul
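To make the padding point concrete, here is a small sketch (hypothetical names, not taken from RakNet or bstream) that contrasts the in-memory size of a struct with a field-by-field big-endian encoding of it. The wire form is always five bytes, whereas sizeof(Sample) depends on the compiler's alignment and padding settings, which is why copying the raw struct is not interoperable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical message type: one byte followed by a 32-bit value.
struct Sample {
    std::uint8_t kind;
    std::uint32_t value;
};

// Encode field by field in network byte order; the result does not depend on
// the platform's endianness or on how the compiler pads Sample.
inline std::vector<unsigned char> serialize(const Sample &s) {
    std::vector<unsigned char> out;
    out.push_back(s.kind);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>((s.value >> shift) & 0xFF));
    return out;
}

// Usage: the wire form is 5 bytes; the in-memory struct is at least that big
// and is typically larger because of padding after `kind`.
inline void demoSerialize() {
    const Sample s = {7, 0x01020304u};
    const std::vector<unsigned char> wire = serialize(s);
    assert(wire.size() == 5);
    assert(sizeof(Sample) >= wire.size());
}
```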
Hi, Paul Long wrote:
What do you think? Should the bitstream library support multiple endian schemes in the bit stream or is big endian enough?
Various communication protocols or files may define the endianness of data differently. There are even cases like TIFF files which define the endianness of data in the header. Therefore this type of library should support different endiannesses. It should also probably support switching them for the same stream multiple times. E.g. some manipulators could be provided:
namespace bs = boost::bitstream;
mystream >> bs::big >> my_int16 >> my_IEEE754_float_32 >> bs::little >> other_int16;
Of course the endianness of variables on a specific platform should be taken into account.
It could also support some non-C++ formats like 16-bit half precision or 128-bit quad precision floats. Some typedefs would probably be required in namespace boost::bitstream.
Best Regards, Adam Wulkiewicz
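A minimal sketch of how such switchable byte-order manipulators might work, assuming a toy byte-oriented reader rather than the real ibitstream. ByteReader, ByteOrder and readUnsigned are hypothetical names; the proposal above spells the manipulators bs::big and bs::little, which do not exist in any library as far as this thread shows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ByteOrder { big, little };

// Toy reader that remembers the byte order currently in effect.
class ByteReader {
public:
    ByteReader(const unsigned char *data, std::size_t size)
        : data_(data), size_(size), pos_(0), order_(ByteOrder::big), good_(true) {}

    void order(ByteOrder o) { order_ = o; }
    bool good() const { return good_; }

    // Read `bytes` octets as an unsigned value using the current byte order.
    std::uint32_t readUnsigned(std::size_t bytes) {
        assert(bytes <= 4);
        if (!good_ || bytes > size_ - pos_) {
            good_ = false;
            return 0;
        }
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < bytes; ++i) {
            const std::uint32_t octet = data_[pos_ + i];
            if (order_ == ByteOrder::big)
                v = (v << 8) | octet;      // first octet is most significant
            else
                v |= octet << (8 * i);     // first octet is least significant
        }
        pos_ += bytes;
        return v;
    }

private:
    const unsigned char *data_;
    std::size_t size_, pos_;
    ByteOrder order_;
    bool good_;
};

// "Manipulator": extracting a ByteOrder value changes the reader's state.
inline ByteReader &operator>>(ByteReader &in, ByteOrder o) {
    in.order(o);
    return in;
}

inline ByteReader &operator>>(ByteReader &in, std::uint16_t &v) {
    const std::uint32_t t = in.readUnsigned(2);
    if (in.good()) v = static_cast<std::uint16_t>(t);
    return in;
}

inline ByteReader &operator>>(ByteReader &in, std::uint32_t &v) {
    const std::uint32_t t = in.readUnsigned(4);
    if (in.good()) v = t;
    return in;
}
```

With this, an expression such as `in >> ByteOrder::big >> a >> ByteOrder::little >> b` switches order mid-stream. Because values are assembled with shifts, the platform's own endianness never enters into it.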
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Adam Wulkiewicz Sent: Saturday, June 29, 2013 12:43 PM To: boost@lists.boost.org Subject: Re: [boost] Any interest in bitstream class?
Hi,
Paul Long wrote:
What do you think? Should the bitstream library support multiple endian schemes in the bit stream or is big endian enough?
Various communication protocols or files may define the endianess of data differently. There are even cases like TIFF files which define the endianess of data in the header. Therefore this type of library should support different endianesses. It should also probably support switching them for the same stream multiple times. E.g. some manipulators could be provided:
namespace bs = boost::bitstream;
mystream >> bs::big >> my_int16 >> my_IEEE754_float_32 >> bs::little >> other_int16;
Of course the endianess of variables on a specific platform should be taken into account.
It could also support some non-C++ formats like 16-bit half precision or 128-bit quad precision floats. Some typedefs would probably be required in namespace boost::bitstream.
A proposal for floating-point typedefs http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3626.pdf may provide some help - eventually ;-) For example: "Specifying 128-bit precision The proposed typedef float128_t provides a standardized way to specify quadruple-precision (Quadruple-precision floatingpoint format) in C++." Paul --- Paul A. Bristow, Prizet Farmhouse, Kendal LA8 8AB UK +44 1539 561830 07714330204 pbristow@hetp.u-net.com
On 29/06/13 14:06, Paul A. Bristow wrote:
The proposed typedef float128_t provides a standardized way to specify quadruple-precision (Quadruple-precision floatingpoint format) in C++."
I wish all compilers actually provided 128-bit floating-point. I suspect that even if this is accepted, a lot of implementations will not have those typedefs simply because they won't have float128 support.
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Mathias Gaunard Sent: Saturday, June 29, 2013 6:52 PM To: boost@lists.boost.org Subject: Re: [boost] Any interest in bitstream class?
On 29/06/13 14:06, Paul A. Bristow wrote:
The proposed typedef float128_t provides a standardized way to specify quadruple-precision (Quadruple-precision floatingpoint format) in C++."
I wish all compilers actually provided 128-bit floating-point. I suspect that even if this is accepted, a lot of implementations will not have that typedefs simply because they won't have float128 support.
In that case, using Boost.Multiprecision will probably be an option (if probably slower than a hardware or hand-coded or compiler solution). But I doubt this is the main use case for bitstreaming? Paul
On 6/29/2013 6:42 AM, Adam Wulkiewicz wrote:
... this type of library should support different endianesses. It should also probably support switching them for the same stream multiple times. E.g. some manipulators could be provided:
namespace bs = boost::bitstream;
mystream >> bs::big >> my_int16 >> my_IEEE754_float_32 >> bs::little >> other_int16;
Good idea. I have started assembling a to-do list in the bstream wiki on GitHub: https://github.com/dplong/bstream/wiki/To-do
Of course the endianess of variables on a specific platform should be taken into account.
This is what I wrote in a previous post: "[ibitstream] does not assume any particular endianness of the platform (for example, it does not blindly copy platform memory to the bit stream) ... The effect, however, is that ... integrals on the platform are always their native endian. That said, awareness of platform endianness could inform future optimizations. "
It could also support some non-C++ formats like 16-bit half precision or 128-bit quad precision floats. Some typedefs would probably be required in namespace boost::bitstream.
Okay, sure. BTW, ibitstream consumers can easily support new types by overloading operator>>. For example, this is how bool is supported:
ibitstream &operator>>(ibitstream &ibs, bool &b)
{
    bitfield value;
    ibs.read(value, 1);
    b = value != 0;
    return ibs;
}
Paul
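The same pattern can be sketched end to end with a toy bit reader. BitReader below is a hypothetical stand-in written for this example, not the bstream API, and Nibble is an invented user-defined type; the sketch only illustrates how a consumer-side operator>> overload can be layered on a read(value, bitCount) primitive like the one used above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ibitstream; reads bits most-significant first.
class BitReader {
public:
    BitReader(const unsigned char *data, std::size_t sizeBytes)
        : data_(data), sizeBits_(sizeBytes * 8), pos_(0), good_(true) {}

    // Read `count` bits into `value`; on underrun, clear good() and leave
    // `value` untouched.
    BitReader &read(std::uint32_t &value, unsigned count) {
        assert(count <= 32);
        if (!good_ || count > sizeBits_ - pos_) {
            good_ = false;
            return *this;
        }
        std::uint32_t v = 0;
        for (unsigned i = 0; i < count; ++i, ++pos_) {
            const unsigned bit = (data_[pos_ / 8] >> (7 - pos_ % 8)) & 1u;
            v = (v << 1) | bit;
        }
        value = v;
        return *this;
    }

    bool good() const { return good_; }

private:
    const unsigned char *data_;
    std::size_t sizeBits_, pos_;
    bool good_;
};

// Built-in style overload: a bool occupies one bit.
inline BitReader &operator>>(BitReader &in, bool &b) {
    std::uint32_t value = 0;
    if (in.read(value, 1).good()) b = value != 0;
    return in;
}

// Consumer-defined type occupying four bits in the stream.
struct Nibble {
    unsigned value;
};

inline BitReader &operator>>(BitReader &in, Nibble &n) {
    std::uint32_t value = 0;
    if (in.read(value, 4).good()) n.value = value;
    return in;
}
```

Because the left operand is BitReader and not a std::istream, none of the standard text extractors take part in overload resolution for these expressions.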
Paul Long wrote:
It could also support some non-C++ formats like 16-bit half precision or 128-bit quad precision floats. Some typedefs would probably be required in namespace boost::bitstream.
Okay, sure. BTW, ibitstream consumers can easily support new types by overloading operator>>. For example, this is how bool is supported:
ibitstream &operator>>(ibitstream &ibs, bool &b)
{
    bitfield value;
    ibs.read(value, 1);
    b = value != 0;
    return ibs;
}
Ok, however those would only apply for C++-like types. What if we have some file which stores 16-bit floats or floats with a different radix than the native one, or even in some other standard? And this is just the tip of the iceberg. There are a lot of different ways to describe types similar to C++ native types. Should the user be forced to e.g. wrap native types and write overloads? Should those different formats be supported by bitstream somehow (e.g. by predefined wrappers)? And who should define how those types should be converted between their stored and native formats? Regards, Adam
On 6/30/2013 7:50 AM, Adam Wulkiewicz wrote:
Ok, however those would only apply for C++-like types. What if we have some file which stores 16-bit floats or floats with different radix than the native one, or even in some other standard? And this is jus a tip of an iceberg. There is a lot of different ways to describe similar types to C++ native types.
Oh, okay, I understand. I think you're asking for serialization functionality, which would be akin to wanting stringstream to handle, say, JSON. That is an "anti-goal" of bitstream, which I list here: https://github.com/dplong/bstream/wiki/Goals
Should the user be forced to e.g. wrap native types and write overloads? Should those different formats be supported by bitstream somehow (e.g. by predefined wrappers)? And who should define how those types in stored and native format should be converted between each other?
Yes, unless you can convince me that my thinking is wrong and this should inherently be part of a bit-streaming library. Paul
On Jun 28, 2013, at 6:50 PM, Paul Long
On 6/28/2013 3:44 PM, Klaim - Joël Lamotte wrote:
I meant platform A uses one endianness, platform B uses another, but if i put data from either into the bitstream, the bitstream will use one specific endianness whatever the platform. Which is basically what you describe (the network packet case indeed) if I understood correctly.
Correct, but it is important to note that integrals are always encoded big endian in the bitstream. I suppose it's possible that someone would want some other endian, but that's got to be an obscure use case.
There are plenty of little endian only contexts that would want to avoid the host->network->host conversions for all transfers. ___ Rob (Sent from my portable computation engine)
On 6/29/2013 10:04 AM, Rob Stewart wrote:
There are plenty of little endian only contexts that would want to avoid the host->network->host conversions for all transfers.
...and that's one reason why support for little endian is now planned (https://github.com/dplong/bstream/wiki/To-do). I'd like to localize the endianness behavior so that a consumer could even define an otherwise unsupported endianness via, for example, overloading a single function. Paul
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Paul Long Sent: Friday, June 28, 2013 2:44 AM To: boost@lists.boost.org Subject: [boost] Any interest in bitstream class?
Same semantics as, e.g., stringstream, but it operates on binary data. I already have the analogues to istringstream/stringbuf (you can take a look at ibitstream/bitbuf at https://github.com/dplong/bstream) There are some interesting features, but I won't go into them here. If there is interest, I would implement obitstream and bitstream and possibly support Boost::dynamic_bitset.
The rationale is _succinct_ expression of predictive parsing of binary data.
I've used ibitstream in production code for decoding RTP headers and frames of various video encodings. Below is a simple function that decodes an RTP header using ibitstream; I also have a GitHub repo that contains more complex code for high-level parsing of H.264 frames at https://github.com/dplong/rtspudph264
struct RtpHeader {
    bool padding, marker;
    bitset<7> payloadType;
    WORD sequenceNumber;
    DWORD timestamp, ssrcIdentifier;
    vector<DWORD> csrcIdentifier;
    struct {
        bool present;
        WORD identifier;
        vector<BYTE> contents;
    } extension;
};
bool ParseRtpHeader(const BYTE *buffer, RtpHeader &rtp) {
    ibitstream bin(buffer);
    static const bitset<2> version(0x2);
    bitset<4> csrcCount;
    WORD extensionLength;
    bin >> version >> rtp.padding >> rtp.extension.present
        >> csrcCount >> rtp.marker >> rtp.payloadType
        >> rtp.sequenceNumber >> rtp.timestamp >> rtp.ssrcIdentifier
        >> setrepeat(csrcCount.to_ulong()) >> rtp.csrcIdentifier;
    if (rtp.extension.present) {
        bin >> rtp.extension.identifier >> extensionLength
            >> setrepeat(extensionLength * sizeof(DWORD)) >> rtp.extension.contents;
    }
    return bin.good();
}
Looks very useful - for some. To get into Boost (in fact to get much further) it needs some tests (using Boost.Test of course) and some fully worked and commented examples, and of course some docs. However I'm delighted to see that you already have lots of Doxygen comments in the code. These can be used with text and tutorial stuff in the easy-to-use mark-up language Quickbook to provide an indexed reference section that will meet people's needs. (Contact me privately if you are thinking of embarking on this - I can build a prototype version that will get you going quickly). Paul
On 6/28/2013 10:07 AM, Paul A. Bristow wrote:
To get into Boost (in fact to get much further) it needs some tests (using Boost.test of course) and some fully worked and commented examples, and of course some docs.
I just included a link to the source as part of my initial query, not necessarily to be reviewed. I know I need lots more documentation, but thanks for an outline of what I need next. I think I have an assert-based test suite somewhere that I can convert to using Boost.Test.
However I'm delighted to see the you already have lots of Doxygen comments in the code. These can be used with text and tutorial stuff in easy-to-use mark-up language Quickbook
Oh, I hadn't heard about Quickbook. I'll look into it.
(Contact me privately if you are thinking of embarking on this - I can build a prototype version that will get you going quickly).
I will contact you privately. Thanks.
Based on your and Joel Lamotte's positive responses, I will proceed with a full bitstream submission. Paul
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Mathias Gaunard Sent: Friday, June 28, 2013 9:28 PM To: boost@lists.boost.org Subject: Re: [boost] Any interest in bitstream class?
On 28/06/13 17:07, Paul A. Bristow wrote:
To get into Boost (in fact to get much further) it needs some tests (using Boost.test of course)
Using Boost.Test is not a requirement AFAIK.
No - but it is very popular and well understood, and simple to use for simple cases, and works well with the suite of testers on the many platforms. I can't think of any reason not to use it. Perhaps a few simple loopback tests are all that is really essential? (I note, with approval, that lots of potential 'naughty' abuses are already caught by assert compile-time checks in the code). And some examples already exist, but one or two really simple ones would be desirable too. The latter could be referenced from the Quickbook text as snippets of code. Paul PS I will reply privately about prototype docs.
On 29/06/13 10:11, Paul A. Bristow wrote:
Using Boost.Test is not a requirement AFAIK.
No - but it is very popular and well understood, and simple to use for simple cases, and works well with the suite of testers on the many platforms. I can't think of any reason why not to use it.
Two main reasons:
- you already have a test suite built with another tool. In which case, all that's needed is that the output can be collected by the regression scripts.
- you want something more powerful than Boost.Test or more specialized to testing your problem domain
On 6/29/2013 12:32 PM, Mathias Gaunard wrote:
Two main reasons:
- you already have a test suite built with another tool. In which case, all that's needed is that the output can be collected by the regression scripts.
- you want something more powerful than Boost.Test or more specialized to testing your problem domain
I cannot find the regression tests I thought existed for ibitstream (if they ever existed); therefore, I will write new ones using Boost.Test. I assume this is acceptable. Paul
On 28/06/13 03:44, Paul Long wrote:
Same semantics as, e.g., stringstream, but it operates on binary data. I already have the analogues to istringstream/stringbuf (you can take a look at ibitstream/bitbuf at https://github.com/dplong/bstream) There are some interesting features, but I won't go into them here. If there is interest, I would implement obitstream and bitstream and possibly support Boost::dynamic_bitset.
The rationale is _succinct_ expression of predictive parsing of binary data.
I've used ibitstream in production code for decoding RTP headers and frames of various video encodings. Below is a simple function that decodes an RTP header using ibitstream; I also have a GitHub repo that contains more complex code for high-level parsing of H.264 frames at https://github.com/dplong/rtspudph264
A couple of questions:
- how do you overload operator>> or operator<< for your stream? How do you avoid conflicts with text-based overloads?
- what does writing an int to a stream actually do? Is it the same behaviour as write(&i, sizeof(int)), or does it translate the int to big-endian first?
On 6/28/2013 3:32 PM, Mathias Gaunard wrote:
- how do you overload operator>> or operator<< for your stream?
I'm not sure what you mean by "how." To date, I have only implemented the input side and so haven't overloaded operator<<. For operator>>, I have overloaded it for various right-hand parameters, including integrals and a few stream manipulators, such as aligng(). The left hand is always an ibitstream reference (which will be extended to bitstream and obitstream references once I support output streams).
How do you avoid conflicts with text-based overloads?
Maybe I misunderstand your question, but I'm not even aware of how there can be a conflict. Since these operators are overloaded with left hand of bitstream, ibitstream, or obitstream, they cannot be applied to the std::iostream derivatives. Is that what you mean?
- what does writing an int to a stream actually do? Is it the same behaviour as write(&i, sizeof(int)), or does it translate the int to big-endian first?
Effectively the latter. It does not assume any particular endianness of the platform (for example, it does not blindly copy platform memory to the bit stream) and therefore does not actually "translate" between endians. The effect, however, is that integrals in the bit stream are always big endian and integrals on the platform are always their native endian. That said, awareness of platform endianness could inform future optimizations.
Paul
On Jun 28, 2013, at 7:18 PM, Paul Long
On 6/28/2013 3:32 PM, Mathias Gaunard wrote:
- how do you overload operator>> or operator<< for your stream?
I'm not sure what you mean by "how." To date, I have only implemented the input side and so haven't overloaded operator<<. For operator>>, I have overloaded it for various right-hand parameters, including integrals and a few stream manipulators, such as aligng(). The left hand is always an ibitstream reference (which will be extended to bitstream and obitstream references once I support output streams).
I suspect that Mathias is assuming ibitstream derives from istream, so there would be issues selecting the right extraction overload for a UDT.
How do you avoid conflicts with text-based overloads?
Maybe I misunderstand your question, but I'm not even aware of how there can be a conflict. Since these operators are overloaded with left hand of bitstream, ibitstream, or obitstream, they cannot be applied to the std::iostream derivatives. Is that what you mean?
If you derive from the standard library types, then the text based operators can also apply. In that case, any bitstream overload not provided for a UDT will imply use of the text based operator instead. That would mean silently getting text based encoding.
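A small sketch of that silent fallback, assuming a hypothetical stream class that, unlike ibitstream, derives from a standard stream. DerivedBitStream, Flag and Version are invented for this example. Flag has a binary extractor; Version only has the ordinary text extractor written for std::istream, yet `in >> v` still compiles and quietly parses text.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical "bit stream" that derives from a standard character stream.
class DerivedBitStream : public std::istringstream {
public:
    explicit DerivedBitStream(const std::string &bytes)
        : std::istringstream(bytes) {}
};

struct Flag {
    bool set;
};

// Binary extractor: consumes exactly one raw byte.
inline DerivedBitStream &operator>>(DerivedBitStream &in, Flag &f) {
    const int c = in.get();
    if (in) f.set = (c != 0);
    return in;
}

struct Version {
    int majorPart;
    int minorPart;
};

// Ordinary text extractor written for any std::istream: parses e.g. "2.5".
inline std::istream &operator>>(std::istream &in, Version &v) {
    char dot = 0;
    return in >> v.majorPart >> dot >> v.minorPart;
}

// Usage: no binary overload exists for Version, but the expression compiles
// anyway because DerivedBitStream converts to std::istream&.
inline void demoSilentFallback() {
    DerivedBitStream in(std::string("\x01" "2.5", 4));
    Flag f = {false};
    Version v = {0, 0};
    in >> f >> v;   // f is read as a raw byte, v is parsed as text
    assert(f.set);
    assert(v.majorPart == 2 && v.minorPart == 5);
}
```

Nothing at the call site signals that the second extraction switched from binary to text decoding, which is the hazard described above.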
- what does writing an int to a stream actually do? Is it the same behaviour as write(&i, sizeof(int)), or does it translate the int to big-endian first?
Effectively the latter. It does not assume any particular endianness of the platform (for example, it does not blindly copy platform memory to the bit stream) and therefore does not actually "translate" between endians.
Surely you're doing host to network swaps. That means you assume input and output in host order, and transmission in network order. That fits "translation" well enough I think. ___ Rob (Sent from my portable computation engine)
On 6/29/2013 10:12 AM, Rob Stewart wrote:
I suspect that Mathias is assuming ibitstream derives from istream, so there would be issues selecting the right extraction overload for a UDT.
ibitstream does not derive from istream. I wanted to but found that istream is not sufficiently abstract--it is intentionally and specifically a character-streaming class. I just couldn't make it work for bit streaming. I could have another look at it, though.
If you derive from the standard library types, then the text based operators can also apply. In that case, any bitstream overload not provided for a UDT will imply use of the text based operator instead. That would mean silently getting text based encoding.
Yeah, I understand. That would be nice.
Surely you're doing host to network swaps. That means you assume input and output in host order, and transmission in network order. That fits "translation" well enough I think.
No, ibitstream does not do that. As an example, given the definitions,
uint32_t u;
char networkByteOrder[sizeof u];
...this would not be portable:
memcpy(networkByteOrder, &u, sizeof networkByteOrder);
...but this would:
networkByteOrder[0] = u;
networkByteOrder[1] = u >> CHAR_BIT;
networkByteOrder[2] = u >> CHAR_BIT * 2;
networkByteOrder[3] = u >> CHAR_BIT * 3;
ibitstream does not do this specific thing, but it shows how one can write portable code without regard to platform endianness. Therefore, ibitstream does not "translate," at least IMO.
Paul
On Jun 29, 2013, at 10:49 PM, Paul Long
On 6/29/2013 10:12 AM, Rob Stewart wrote:
If you derive from the standard library types, then the text based operators can also apply. In that case, any bitstream overload not provided for a UDT will imply use of the text based operator instead. That would mean silently getting text based encoding.
Yeah, I understand. That would be nice.
Maybe. Text-based encodings are not necessarily reversible and they are inefficient. I was actually thinking that the silent fallback was a problem. OTOH, explicitly reusing a text-based operator, via a customization point, might work well.
Surely you're doing host to network swaps. That means you assume input and output in host order, and transmission in network order. That fits "translation" well enough I think.
No, ibitstream does not do that. As an example, given the definitions,
uint32_t u; char networkByteOrder[sizeof u];
...this would not be portable:
memcpy(networkByteOrder, &u, sizeof networkByteOrder);
...but this would:
networkByteOrder[0] = u; networkByteOrder[1] = u >> CHAR_BIT; networkByteOrder[2] = u >> CHAR_BIT * 2; networkByteOrder[3] = u >> CHAR_BIT * 3;
ibitstream does not do this specific thing, but it shows how one can write portable code without regard to platform endianess. Therefore, ibitstream does not "translate," at least IMO.
If you do the same thing on all platforms, then you have no portability. If you do that only on little endian platforms, then you've reinvented the network/host byte ordering mechanism. If something else, then I don't understand you. ___ Rob (Sent from my portable computation engine)
On 6/30/2013 6:25 AM, Rob Stewart wrote:
... explicitly reusing a text-based operator, via a customization point, might work well.
I don't understand. What's a "customization point?"
uint32_t u; char networkByteOrder[sizeof u];
...this would not be portable:
memcpy(networkByteOrder, &u, sizeof networkByteOrder);
...but this would:
networkByteOrder[0] = u; networkByteOrder[1] = u >> CHAR_BIT; networkByteOrder[2] = u >> CHAR_BIT * 2; networkByteOrder[3] = u >> CHAR_BIT * 3;
I realized, lying in bed this morning, that I made a mistake. _This_ would produce network byte order, or big endian:
networkByteOrder[3] = u; networkByteOrder[2] = u >> CHAR_BIT; networkByteOrder[1] = u >> CHAR_BIT * 2; networkByteOrder[0] = u >> CHAR_BIT * 3;
ibitstream does not do this specific thing, but it shows how one can write portable code without regard to platform endianness. Therefore, ibitstream does not "translate," at least IMO.
If you do the same thing on all platforms, then you have no portability. If you do that only on little endian platforms, then you've reinvented the network/host byte ordering mechanism. If something else, then I don't understand you.
Unless you're referring to my mistake, this should be portable because integrals are operated on as _values_, not according to their representation in memory. It's only when one aliases them as a char array that their representation, e.g., endianness, becomes relevant. Paul
On Jun 30, 2013, at 12:14 PM, Paul Long wrote:
On 6/30/2013 6:25 AM, Rob Stewart wrote:
... explicitly reusing a text-based operator, via a customization point, might work well.
I don't understand. What's a "customization point?"
A CP is a template your code uses, to effect some desired behavior, that users specialize to control your code. Thus, you could call a static member function of a class template to determine how to convert a type. The primary specialization, for supported types, invokes your bitstream functionality, and is not implemented for unsupported/unknown types. Thus, there is no support for such types built into the library. Users can specialize the CP for their type to use, say, insertion into an ostringstream and then use your string support for insertion into a bitstream.
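A minimal sketch of the customization-point idea described above. Everything here is hypothetical and not part of the proposed library: byte_sink stands in for an output bit stream, bit_insert is the customization point, and point is a user type. The primary template is left undefined, so an unsupported type fails to compile instead of silently falling back to text.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an output bit stream; it only collects bytes.
struct byte_sink {
    std::vector<unsigned char> bytes;
    void put(unsigned char b) { bytes.push_back(b); }
    void put_string(const std::string &s) {
        bytes.insert(bytes.end(), s.begin(), s.end());
    }
};

// The customization point: declared but deliberately not defined, so an
// unsupported type is a compile-time error rather than a silent fallback.
template <typename T>
struct bit_insert;

// A specialization the library itself might ship for a supported type.
template <>
struct bit_insert<std::uint16_t> {
    static void apply(byte_sink &out, std::uint16_t v) {
        out.put(static_cast<unsigned char>(v >> 8));    // high byte first
        out.put(static_cast<unsigned char>(v & 0xFF));  // then low byte
    }
};

// The stream operator forwards every type through the customization point.
template <typename T>
byte_sink &operator<<(byte_sink &out, const T &value) {
    bit_insert<T>::apply(out, value);
    return out;
}

// A user-defined type that already has a text-based inserter.
struct point { int x, y; };

std::ostream &operator<<(std::ostream &os, const point &p) {
    return os << p.x << ',' << p.y;
}

// The user opts in, explicitly, to reusing the text encoding.
template <>
struct bit_insert<point> {
    static void apply(byte_sink &out, const point &p) {
        std::ostringstream text;
        text << p;                     // text-based operator
        out.put_string(text.str());    // then the sink's string support
    }
};
```

With this arrangement, inserting a type that has no bit_insert specialization is rejected by the compiler, which addresses the silent-fallback concern raised earlier in the thread.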
uint32_t u; char networkByteOrder[sizeof u];
...this would not be portable:
memcpy(networkByteOrder, &u, sizeof networkByteOrder);
...but this would:
networkByteOrder[0] = u; networkByteOrder[1] = u >> CHAR_BIT; networkByteOrder[2] = u >> CHAR_BIT * 2; networkByteOrder[3] = u >> CHAR_BIT * 3;
I realized, lying in bed this morning, that I made a mistake. _This_ would produce network byte order, or big endian:
networkByteOrder[3] = u; networkByteOrder[2] = u >> CHAR_BIT; networkByteOrder[1] = u >> CHAR_BIT * 2; networkByteOrder[0] = u >> CHAR_BIT * 3;
ibitstream does not do this specific thing, but it shows how one can write portable code without regard to platform endianness. Therefore, ibitstream does not "translate," at least IMO.
If you do the same thing on all platforms, then you have no portability. If you do that only on little endian platforms, then you've reinvented the network/host byte ordering mechanism. If something else, then I don't understand you.
Unless you're referring to my mistake, this should be portable because integrals are operated on as _values_, not according to their representation in memory. It's only when one aliases them as a char array that their representation, e.g., endianness, becomes relevant.
You are aliasing a value as a char array, so you are relying on the endianness, and if you do that for every value on every platform, then the bitstream's order depends upon the endianness of the host that creates it and isn't portable. If you only do such things on little endian hosts, and not on big endian hosts, then you're doing the normal host->network->host order swaps of network communications, albeit with your own reordering code. ___ Rob (Sent from my portable computation engine)
On Mon, 1 Jul 2013 05:16:22 -0400, Rob Stewart wrote:
On Jun 30, 2013, at 12:14 PM, Paul Long wrote:
uint32_t u; char networkByteOrder[sizeof u];
networkByteOrder[3] = u; networkByteOrder[2] = u >> CHAR_BIT; networkByteOrder[1] = u >> CHAR_BIT * 2; networkByteOrder[0] = u >> CHAR_BIT * 3;
ibitstream does not do this specific thing, but it shows how one can write portable code without regard to platform endianness. Therefore, ibitstream does not "translate," at least IMO.
If you do the same thing on all platforms, then you have no portability. If you do that only on little endian platforms, then you've reinvented the network/host byte ordering mechanism. If something else, then I don't understand you.
Unless you're referring to my mistake, this should be portable because integrals are operated on as _values_, not according to their representation in memory. It's only when one aliases them as a char array that their representation, e.g., endianness, becomes relevant.
You are aliasing a value as a char array, so you are relying on the endianness, and if you do that for every value on every platform, then the bitstream's order depends upon the endianness of the host that creates it and isn't portable.
If you only do such things on little endian hosts, and not on big endian hosts, then you're doing the normal host->network->host order swaps of network communications, albeit with your own reordering code.
I'm still not sure what you mean. I wrote encode and decode functions on http://codepad.org/nml6RjX5 which apparently runs on a big-endian platform. Maybe that'll help our understanding.
I'm saying that, using this technique, one can write portable code--without the hton family of functions--that makes no assumptions about the underlying endianness of the platform. Of course, one will have to choose an endianness for the encoding. In this example, I have chosen big endian.
Is that what you're concerned about--that any decoder will have to know the endianness of the encoded value and therefore it must "translate?" I agree that a decoder will have to know the endianness of the encoded value, but it does not need to know the endianness of the platform on which it is running; see my decode function on http://codepad.org/nml6RjX5.
As an aside, I bet there is a dichotomy of developers. Some, like me, mostly write code that supports a standardized encoding or protocol, in which the endianness is prescribed. Others are just interested in communicating data between proprietary entities and don't much care which endianness is used, so it might as well be that of one, and hopefully all, of the platforms. Paul
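For readers without access to the codepad link, here is a minimal sketch of the value-based technique being argued for. It is not the code behind that link; the names encode_be32 and decode_be32 are made up, and 8-bit bytes are assumed. Both functions work only on the integer's value, so they give the same bytes on little-endian and big-endian hosts.

```cpp
#include <cassert>
#include <cstdint>

// Store a 32-bit value as four octets, most significant first (network
// order). Only the value is used; the host's memory layout never matters.
inline void encode_be32(std::uint32_t u, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>((u >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((u >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((u >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(u & 0xFF);
}

// Rebuild the value from four big-endian octets. The decoder must know the
// encoding's byte order, but not the byte order of the host it runs on.
inline std::uint32_t decode_be32(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

No conditional on host byte order appears anywhere, which is the difference from the htonl/ntohl approach: those functions are a no-op on big-endian hosts and a swap on little-endian ones.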
On 29/06/13 01:18, Paul Long wrote:
On 6/28/2013 3:32 PM, Mathias Gaunard wrote:
- how do you overload operator>> or operator<< for your stream?
I'm not sure what you mean by "how." To date, I have only implemented the input side and so haven't overloaded operator<<. For operator>>, I have overloaded it for various right-hand parameters, including integrals and a few stream manipulators, such as aligng(). The left hand is always an ibitstream reference (which will be extended to bitstream and obitstream references once I support output streams).
What should I do if I want to define the operators on my own user-defined type?
How do you avoid conflicts with text-based overloads?
Maybe I misunderstand your question, but I'm not even aware of how there can be a conflict. Since these operators are overloaded with left hand of bitstream, ibitstream, or obitstream, they cannot be applied to the std::iostream derivatives. Is that what you mean?
See Rob's response. I assumed ibitstream derived from istream. Is that not the case? Does the ibitstream even use similar virtual functions and buffering mechanisms as istream?
Effectively the latter. It does not assume any particular endianness of the platform (for example, it does not blindly copy platform memory to the bit stream) and therefore does not actually "translate" between endians. The effect, however, is that integrals in the bit stream are always big endian and integrals on the platform are always their native endian. That said, awareness of platform endianness could inform future optimizations.
I suspect writing the value directly to the stream (possibly after byte swapping) would be faster than shifts and bitwise ands and writing it byte per byte. Nevertheless this answers my question about semantics.
On 6/29/2013 12:40 PM, Mathias Gaunard wrote:
What should I do if I want to define the operators on my own user-defined type?
Write them. It should be easy. For example, this is all it takes for
ibitstream to support std::bitset:
template
See Rob's response. I assumed ibitstream derived from istream. Is that not the case?
That is not the case. See my reply to Mathias' similar query at http://article.gmane.org/gmane.comp.lib.boost.devel/242607
Does the ibitstream even use similar virtual functions and buffering mechanisms as istream?
Well, it follows std::istringstream and therefore std::istream. As far as possible, ibitstream and bitbuf have the same member functions and semantics as std::istringstream and std::stringbuf.
I suspect writing the value directly to the stream (possibly after byte swapping) would be faster than shifts and bitwise ands and writing it byte per byte.
I agree. However, my main concern was stability through simplicity. There is plenty of room for optimization, which I plan to do. Paul
On 30/06/13 05:09, Paul Long wrote:
On 6/29/2013 12:40 PM, Mathias Gaunard wrote:
What should I do if I want to define the operators on my own user-defined type?
Write them. It should be easy. For example, this is all it takes for ibitstream to support std::bitset:
template
ibitstream &operator>>(ibitstream &ibs, std::bitset<N> &bs) { decltype(bs.to_ulong()) value; ibs.read(value, N);
That is surprising. istream's read takes a char* and a size in bytes. ibitstream's read takes an unsigned long& and a size in bits!?
On 6/30/2013 8:45 AM, Mathias Gaunard wrote:
On 30/06/13 05:09, Paul Long wrote:
template
ibitstream &operator>>(ibitstream &ibs, std::bitset<N> &bs) { decltype(bs.to_ulong()) value; ibs.read(value, N);
That is surprising. istream's read takes a char* and a size in bytes. ibitstream's read takes an unsigned long& and a size in bits!?
The entire class hierarchy to which istream belongs processes sequences of _characters_--it should really be called icharstream--so of course it takes a size in bytes. The proposed bitstream library processes unaligned sequences of bits, so it takes a size in bits. What's surprising? Paul
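To make the "size in bits" semantics concrete, here is a small self-contained sketch. mini_ibitstream is a hypothetical stand-in, not the real ibitstream; it assumes 8-bit bytes and MSB-first bit order, and its read() only mirrors the shape of the call quoted above. The operator>> shows how little a user-defined extractor needs.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical stand-in for ibitstream: reads unaligned, MSB-first fields.
class mini_ibitstream {
    const unsigned char *data_;
    std::size_t size_bits_;
    std::size_t pos_ = 0;   // current position, counted in bits
    bool ok_ = true;
public:
    mini_ibitstream(const unsigned char *data, std::size_t size_bytes)
        : data_(data), size_bits_(size_bytes * 8) {}

    // Reads `bits` bits into the low bits of `value`. The count is in bits,
    // not bytes, and the field may start and end anywhere within a byte.
    mini_ibitstream &read(unsigned long &value, std::size_t bits) {
        if (!ok_ || bits > size_bits_ - pos_) {
            ok_ = false;    // not enough input left: fail, leave value alone
            return *this;
        }
        value = 0;
        for (std::size_t i = 0; i < bits; ++i, ++pos_) {
            unsigned bit = (data_[pos_ / 8] >> (7 - pos_ % 8)) & 1u;
            value = (value << 1) | bit;
        }
        return *this;
    }

    bool good() const { return ok_; }
};

// Extractor for std::bitset, written purely in terms of read().
template <std::size_t N>
mini_ibitstream &operator>>(mini_ibitstream &ibs, std::bitset<N> &bs) {
    static_assert(N <= sizeof(unsigned long) * CHAR_BIT,
                  "bitset too wide for this sketch");
    unsigned long value = 0;
    if (ibs.read(value, N).good())
        bs = std::bitset<N>(value);
    return ibs;
}
```

An extractor for any other fixed-width user-defined type would follow the same pattern: call read() once per field, then assign.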
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Paul Long Sent: Saturday, June 29, 2013 12:19 AM To: boost@lists.boost.org Subject: Re: [boost] Any interest in bitstream class?
On 6/28/2013 3:32 PM, Mathias Gaunard wrote:
- how do you overload operator>> or operator<< for your stream?
I'm not sure what you mean by "how." To date, I have only implemented the input side and so haven't overloaded operator<<. For operator>>, I have overloaded it for various right-hand parameters, including integrals and a few stream manipulators, such as aligng(). The left hand is always an ibitstream reference (which will be extended to bitstream and obitstream references once I support output streams).
How do you avoid conflicts with text-based overloads?
Maybe I misunderstand your question, but I'm not even aware of how there can be a conflict. Since these operators are overloaded with left hand of bitstream, ibitstream, or obitstream, they cannot be applied to the std::iostream derivatives. Is that what you mean?
- what does writing an int to a stream actually do? Is it the same behaviour as write(&i, sizeof(int)), or does it translate the int to big-endian first?
Effectively the latter. It does not assume any particular endianness of the platform (for example, it does not blindly copy platform memory to the bit stream) and therefore does not actually "translate" between endians. The effect, however, is that integrals in the bit stream are always big endian and integrals on the platform are always their native endian. That said, awareness of platform endianness could inform future optimizations.
Don't forget that a Boost.Endian library exists and has been accepted - even if the final implementation has been delayed by Beman being busy with C++11 standards work. Docs are here http://boost.cowic.de/rc/endian/doc/ I have no idea if this is relevant to the bitstream library and endian streams. Paul --- Paul A. Bristow, Prizet Farmhouse, Kendal LA8 8AB UK +44 1539 561830 07714330204 pbristow@hetp.u-net.com
Hi Paul, Thanks for offering this. Paul Long wrote:
bool ParseRtpHeader(const BYTE *buffer, RtpHeader &rtp) { ibitstream bin(buffer); static const bitset<2> version(0x2); bitset<4> csrcCount; WORD extensionLength;
bin >> version >> rtp.padding >> rtp.extension.present >> csrcCount >> rtp.marker >> rtp.payloadType >> rtp.sequenceNumber >> rtp.timestamp >> rtp.ssrcIdentifier >> setrepeat(csrcCount.to_ulong()) >> rtp.csrcIdentifier;
if (rtp.extension.present) { bin >> rtp.extension.identifier >> extensionLength >> setrepeat(extensionLength * sizeof(DWORD)) >> rtp.extension.contents; }
return bin.good(); }
Have you measured the performance of your implementation, compared to something more like this (with the shifts etc. obviously made up here): version = (data[0] >> 14) & 0x07f; rtp.padding = (data[0] >> 6) & 0x0ff; rtp.extension.present = data[0] & 0x020; etc. etc. ? I'm curious to know if you are doing bit-by-bit processing of the fields with shift amounts known only at run time, compared to extracting groups of contiguous bits simultaneously with shift amounts known at compile time - and if so, how much that affects performance. I have previously tried to do some basic MPEG stream processing on an embedded device and started trying to write a general-purpose facility like yours, but decided that it was not fast enough. My conclusion at the time was that the ideal solution would be to create a "yacc-like" tool that would parse the bitstream spec (preferably copied straight from the MPEG documentation) and generate stream decoding code in the style shown. This was never more than a vague idea of mine. Perhaps a Boost.Spirit kind of parser would be another possibility. Regards, Phil.
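For comparison, a hand-coded extraction in the style described above might look like the following sketch. The struct and function names are made up; the masks and shifts follow the RTP fixed-header layout in RFC 3550 (V:2, P:1, X:1, CC:4, then M:1, PT:7, then a 16-bit sequence number) rather than the placeholder values in the message, and every shift amount is a compile-time constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the first four octets of an RTP header.
struct rtp_fixed_fields {
    unsigned version;
    bool padding;
    bool extension;
    unsigned csrc_count;
    bool marker;
    unsigned payload_type;
    std::uint16_t sequence_number;
};

// Hand-coded extraction: whole groups of contiguous bits are pulled out at
// once. The caller must supply at least four readable octets.
inline rtp_fixed_fields parse_rtp_prefix(const unsigned char *data) {
    rtp_fixed_fields f;
    f.version      = data[0] >> 6;            // top two bits of octet 0
    f.padding      = (data[0] & 0x20) != 0;   // bit 5
    f.extension    = (data[0] & 0x10) != 0;   // bit 4
    f.csrc_count   = data[0] & 0x0F;          // low four bits
    f.marker       = (data[1] & 0x80) != 0;   // top bit of octet 1
    f.payload_type = data[1] & 0x7F;          // low seven bits
    f.sequence_number =
        static_cast<std::uint16_t>((data[2] << 8) | data[3]);  // big endian
    return f;
}
```

The cost of this style is that every field offset is repeated by hand, which is exactly what the stream-based version avoids.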
On 6/29/2013 9:48 AM, Phil Endecott wrote:
Have you measured the performance of your implementation ... ?
No, I have not measured the performance of ibitstream.
I'm curious to know if you are doing bit-by-bit processing of the fields with shift amounts known only at run time, compared to extracting groups of contiguous bits simultaneously with shift amounts known at compile time - and if so, how much that affects performance.
ibitstream performs bit-by-bit processing. It can be optimized to a certain degree as a follow-on task. I suppose that by exposing some bitstream code to the calling context via inlining, the compiler could perform some optimizations, e.g., strength reduction, but I haven't thought about it.
I have previously tried to do some basic MPEG stream processing on an embedded device and started trying to write a general-purpose facility like yours, but decided that it was not fast enough.
Hmm... I guess I have been thinking that bitstream would be used for something less intensive and more high level like packetization as opposed to being the sole means of writing and reading bits for a media codec. I believe the performance--once optimized--can be quite good and often good enough, but it will never be as efficient as hand-coded logic. Paul
Paul Long wrote:
ibitstream performs bit-by-bit processing.
it will never be as efficient as hand-coded logic.
You should be more ambitious! Let's start with this:
(Please consider all of this code pseudo-code!)
class ibitstream {
uint64_t buffer;
unsigned int bits_in_buffer;
uint32_t get_next_32() { .... }
template <unsigned int bits>
void require() {
if (bits_in_buffer < bits) {
buffer |= static_cast<uint64_t>(get_next_32()) << bits_in_buffer;
bits_in_buffer += 32;
}
}
public:
template
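The rest of this message is missing from the archive. As a rough, self-contained sketch of the same general idea (a wide buffer refilled in bulk, with field widths supplied as template arguments), one could write something like the following. It is not a reconstruction of the lost code: the class name, the MSB-first bit order, and the byte-at-a-time refill are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical buffered reader: unread bits sit in the low
// bits_in_buffer_ bits of a 64-bit word.
class buffered_bit_reader {
    const unsigned char *next_;
    const unsigned char *end_;
    std::uint64_t buffer_ = 0;
    unsigned bits_in_buffer_ = 0;

    // Top the buffer up a byte at a time while a whole byte still fits.
    void refill() {
        while (bits_in_buffer_ <= 56 && next_ != end_) {
            buffer_ = (buffer_ << 8) | *next_++;
            bits_in_buffer_ += 8;
        }
    }

public:
    buffered_bit_reader(const unsigned char *data, std::size_t size)
        : next_(data), end_(data + size) {}

    // The field width is a template argument, so the mask is a
    // compile-time constant; the shift still depends on the buffer level.
    template <unsigned Bits>
    std::uint32_t get() {
        static_assert(Bits >= 1 && Bits <= 32, "field width must be 1..32");
        if (bits_in_buffer_ < Bits) refill();
        if (bits_in_buffer_ < Bits)
            throw std::out_of_range("bit stream exhausted");
        bits_in_buffer_ -= Bits;
        return static_cast<std::uint32_t>(
            (buffer_ >> bits_in_buffer_) & ((std::uint64_t(1) << Bits) - 1));
    }
};
```

Whether this beats bit-by-bit processing in practice would have to be measured; the sketch only shows that whole fields can be extracted with one shift and one mask each.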
Date: Thu, 27 Jun 2013 20:44:19 -0500 From: plong@packetizer.com
Same semantics as, e.g., stringstream, but it operates on binary data. I already have the analogues to istringstream/stringbuf (you can take a look at ibitstream/bitbuf at https://github.com/dplong/bstream) There are some interesting features, but I won't go into them here. If there is interest, I would implement obitstream and bitstream and possibly support Boost::dynamic_bitset.
Looking at the header file "bstream.h"
1. Can't "uintmax_t" be used for "bitfield"?
2. Why doesn't "bitbuf" have constructors that take "char *"?
3. The names used for the methods of the Standard text stream family of classes are horrid. Since you're not inheriting from the old classes, can you improve the names? Importantly, the legacy names used "ImportantName" for the protected implementation functions, forcing the use of "pubImportantName" for the user-facing interface. You should reverse them to protected "do_action" and public "action." (The standard locale facets did this fix.)
4. The "ibitstream" class has an "operator !" without a Boolean conversion operator? There's no way to use a stream in a test. You should create an (explicit) "bool" conversion operator. Then you can use a stream in built-in Boolean tests (including "operator!"!). Also, this operator should be moved to the ios_base/basic_ios class when you create it. (Make sure to "using" it in the istream, ostream, and iostream classes.)
Daryle W.
On Mon, 1 Jul 2013 08:45:26 -0400, Daryle Walker wrote:
Looking at the header file "bstream.h"
1. Can't "uintmax_t" be used for "bitfield"?
It certainly could be. IIRC, I chose the current definition arbitrarily.
2. Why doesn't "bitbuf" have constructors that take "char *"?
It... does. Are you asking why bitbuf constructors have char * modifiers, i.e., signed and unsigned? Since the signedness of char sans modifier is implementation dependent, I thought I'd be explicit. Is that a problem? Is it unnecessary?
3. The names used for the methods of the Standard text stream family of classes are horrid.
heh heh heh
Since you're not inheriting from the old classes, can you improve the names?
Sure. However, I thought mimicking stringstream made the classes more familiar and therefore was more beneficial than using my own, possibly more intuitive and consistent names. I'm certainly open to new names, but I'd like to hear what others think. Familiar or intuitive?
Importantly, the legacy names used "ImportantName" for the protected implementation functions, forcing the use of "pubImportantName" for the user-facing interface. You should reverse them to protected "do_action" and public "action." (The standard locale facets did this fix.)
FWIW, as currently planned, locale does not apply to the proposed bitstream library.
4. The "ibitstream" class has an "operator !" without a Boolean conversion operator? There's no way to use a stream in a test. You should create an (explicit) "bool" conversion operator. Then you can use a stream in built-in Boolean tests (including "operator!"!).
Okay, thanks! Will do.
Interesting... VS2012 never overloads operator bool. Instead it
provides this functionality via a void * overload in xiobase:
__CLR_OR_THIS_CALL operator void *() const
{ // test if any stream operation has failed
return (fail() ? 0 : (void *)this);
}
gcc 4.7.3 does the same thing in basic_ios.h for class basic_ios:
operator void*() const
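{ return this->fail() ? 0 : const_cast<basic_ios*>(this); }
but provides an operator bool overload "downstream" in istream/ostream for basic_istream/basic_ostream:
operator bool() const { return _M_ok; } };
VS2012 never provides an overload for operator bool. The proposed bitstream library has neither, though, so I definitely need to fix this.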
Also, this operator should be moved to the ios_base/basic_ios class when you create it. (Make sure to "using" it in the istream, ostream, and iostream classes.)
Yeah, I had been thinking about mimicking the stringstream class hierarchy more closely, all the way up to ios_base. Not sure why, offhand, but it must have been done that way for a reason. Paul
Date: Mon, 1 Jul 2013 12:47:36 -0500 From: plong@packetizer.com
On Mon, 1 Jul 2013 08:45:26 -0400, Daryle Walker wrote:
Looking at the header file "bstream.h"
1. Can't "uintmax_t" be used for "bitfield"?
It certainly could be. IIRC, I chose the current definition arbitrarily.
2. Why doesn't "bitbuf" have constructors that take "char *"?
It... does. Are you asking why bitbuf constructors have char * modifiers, i.e., signed and unsigned? Since the signedness of char sans modifier is implementation dependent, I thought I'd be explicit. Is that a problem? Is it unnecessary?
I mean "char," not "unsigned char" or "signed char." The "char" type is a strong-typedef for either "unsigned char" or "signed char," which one is implementation-defined. But since it's a strong-typedef (which only exists for some built-in integer types), you have to give it a separate overload, "char*" is NOT covered by what you currently have.
3. The names used for the methods of the Standard text stream family of classes are horrid.
heh heh heh
Since you're not inheriting from the old classes, can you improve the names?
Sure. However, I thought mimicking stringstream made the classes more familiar and therefore was more beneficial than using my own, possibly more intuitive and consistent names. I'm certainly open to new names, but I'd like to hear what others think. Familiar or intuitive?
Importantly, the legacy names used "ImportantName" for the protected implementation functions, forcing the use of "pubImportantName" for the user-facing interface. You should reverse them to protected "do_action" and public "action." (The standard locale facets did this fix.)
FWIW, as currently planned, locale does not apply to the proposed bitstream library.
That's just an example; I wasn't saying to include a std::locale analogue (somehow).
4. The "ibitstream" class has an "operator !" without a Boolean conversion operator? There's no way to use a stream in a test. You should create an (explicit) "bool" conversion operator. Then you can use a stream in built-in Boolean tests (including "operator!"!).
Okay, thanks! Will do.
Interesting... VS2012 never overloads operator bool. Instead it provides this functionality via a void * overload in xiobase:
__CLR_OR_THIS_CALL operator void *() const { // test if any stream operation has failed return (fail() ? 0 : (void *)this); }
gcc 4.7.3 does the same thing in basic_ios.h for class basic_ios:
operator void*() const { return this->fail() ? 0 : const_cast<basic_ios*>(this); }
but provides an operator bool overload "downstream" in istream/ostream for basic_istream/basic_ostream:
operator bool() const { return _M_ok; } };
VS2012 never provides an overload for operator bool. The proposed bitstream library has neither, though, so I definitely need to fix this.
Before C++11, we had to use the "Safe-bool" implementation, using a pointer (or, better, pointer-to-member) type since it can't silently be used for regular numeric operations (unlike "bool").
Also, this operator should be moved to the ios_base/basic_ios class when you create it. (Make sure to "using" it in the istream, ostream, and iostream classes.)
Yeah, I had been thinking about mimicking the stringstream class hierarchy more closely, all the way up to ios_base. Not sure why, offhand, but it must have been done that way for a reason.
You don't need both "ios_base" and "basic_ios," since there's no character and/or traits templating; use a single class as an analogue for both. This class is meant to hold data that's common to both input and output (and dual) streams. Daryle W.
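A minimal C++11 sketch of the two suggestions above: one base class standing in for both ios_base and basic_ios, carrying the stream state and an explicit operator bool. The names bit_ios and ibitstream_sketch are hypothetical, and the state flags merely imitate the standard ones.

```cpp
#include <cassert>

// Hypothetical single base class for state shared by input, output, and
// dual bit streams. No character or traits templating is needed.
class bit_ios {
public:
    enum iostate { goodbit = 0, eofbit = 1, failbit = 2, badbit = 4 };

    bool good() const { return state_ == goodbit; }
    bool fail() const { return (state_ & (failbit | badbit)) != 0; }
    void setstate(iostate s) { state_ |= s; }

    // explicit: usable in if/while/! tests, but never silently converted
    // to an integer, which is what the safe-bool idiom worked around.
    explicit operator bool() const { return !fail(); }
    bool operator!() const { return fail(); }

protected:
    bit_ios() = default;

private:
    unsigned state_ = goodbit;
};

// A derived stream inherits the conversion without redeclaring it.
class ibitstream_sketch : public bit_ios {
public:
    ibitstream_sketch() = default;
};
```

With this in place, `if (in)` and `while (in >> field)` style tests work, while an accidental `int n = in;` is rejected by the compiler.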
On Mon, 1 Jul 2013 15:24:16 -0400, Daryle Walker wrote:
Date: Mon, 1 Jul 2013 12:47:36 -0500 From: plong@packetizer.com
On Mon, 1 Jul 2013 08:45:26 -0400, Daryle Walker wrote:
Looking at the header file "bstream.h"
2. Why doesn't "bitbuf" have constructors that take "char *"?
It... does. Are you asking why bitbuf constructors have char * modifiers, i.e., signed and unsigned? Since the signedness of char sans modifier is implementation dependent, I thought I'd be explicit. Is that a problem? Is it unnecessary?
I mean "char," not "unsigned char" or "signed char." The "char" type is a strong-typedef for either "unsigned char" or "signed char," which one is implementation-defined. But since it's a strong-typedef (which only exists for some built-in integer types), you have to give it a separate overload, "char*" is NOT covered by what you currently have.
Hmm... What is best? Overloads for unsigned char *, signed char *, and char *, plus their const versions? IOW, must I provide overloads for all six variations?
(The standard locale facets did this fix.)
FWIW, as currently planned, locale does not apply to the proposed bitstream library.
That's just an example; I wasn't saying to include a std::locale analogue (somehow).
I understood. That's why I said, "FWIW."
You don't need both "ios_base" and "basic_ios," since there's no character and/or traits templating; use a single class as an analogue for both. This class is meant to hold data that's common to both input and output (and dual) streams.
Okay, great. Paul
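On the question of how many constructor overloads are needed: for a read-only buffer, one sketch is to provide only the three pointer-to-const overloads, because a pointer to non-const converts implicitly to a pointer to const. bitbuf_sketch is a hypothetical stand-in for bitbuf, and an output buffer would need non-const pointers and is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// char is a distinct type from both signed char and unsigned char, so
// overloads for those two never match a plain char pointer.
static_assert(!std::is_same<char, signed char>::value, "distinct types");
static_assert(!std::is_same<char, unsigned char>::value, "distinct types");

// Hypothetical read-only buffer over caller-owned memory.
class bitbuf_sketch {
    const unsigned char *data_;
    std::size_t size_;
public:
    // Three pointer-to-const overloads accept all six pointer variations.
    bitbuf_sketch(const unsigned char *p, std::size_t n)
        : data_(p), size_(n) {}
    bitbuf_sketch(const signed char *p, std::size_t n)
        : data_(reinterpret_cast<const unsigned char *>(p)), size_(n) {}
    bitbuf_sketch(const char *p, std::size_t n)
        : data_(reinterpret_cast<const unsigned char *>(p)), size_(n) {}

    unsigned char at(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
};
```

Reading any object through an unsigned char pointer is permitted by the aliasing rules, so the reinterpret_cast calls are safe for this purpose.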
On 07/01/2013 07:47 PM, Paul Long wrote:
Interesting... VS2012 never overloads operator bool. Instead it provides this functionality via a void * overload in xiobase:
For an explanation see: http://www.artima.com/cppsource/safebool.html
C++03 mandated operator void *() in iostreams. C++11 mandates explicit operator bool(), which has been implemented in VS 2013 Preview. See http://blogs.msdn.com/b/vcblog/archive/2013/06/28/c-11-14-stl-features-fixes... for a description of the compiler and STL changes in VS 2013, some of which will affect Boost.
STL
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Bjorn Reese Sent: Tuesday, July 02, 2013 2:13 AM To: boost@lists.boost.org Subject: Re: [boost] Any interest in bitstream class?
On 07/01/2013 07:47 PM, Paul Long wrote:
Interesting... VS2012 never overloads operator bool. Instead it provides this functionality via a void * overload in xiobase:
For an explanation see: http://www.artima.com/cppsource/safebool.html
participants (10)
- Adam Wulkiewicz
- Bjorn Reese
- Daryle Walker
- Klaim - Joël Lamotte
- Mathias Gaunard
- Paul A. Bristow
- Paul Long
- Phil Endecott
- Rob Stewart
- Stephan T. Lavavej