Fwd: Concatenated input streams
Hi, I'm interested in working with an input stream which comprises multiple input streams. The resulting stream comprises all of the input from the first stream (until EOF), then all of the input from the second, etc. For what it's worth, such an input stream is called a concatenated input stream in at least one other context (namely Common Lisp).
The input streams in question are file input streams, although conceptually I don't know if that changes anything. I am hoping to find something equivalent to
$ cat file1 file2 file3 ... > alldata
and then reading alldata.
Is there a Boost class which implements such a thing? I know that this is a pretty simple concept, and therefore it's just a small matter of programming, but I'm hoping that someone has already worked out the details. If there is such a thing in some library other than Boost, I would be interested to hear about that too.
Thank you for your time, I appreciate your help very much.
Robert Dodier
On 8/03/2022 16:00, Robert Dodier wrote:
I'm interested in working with an input stream which comprises multiple input streams. The resulting stream comprises all of the input from the first stream (until EOF), then all of the input from the second, etc. For what it's worth, such an input stream is called a concatenated input stream in at least one other context (namely Common Lisp).
The input streams in question are file input streams, although conceptually I don't know if that changes anything. I am hoping to find something equivalent to
$ cat file1 file2 file3 ... > alldata
and then reading alldata.
Is there a Boost class which implements such a thing? I know that this is a pretty simple concept, and therefore it's just a small matter of programming, but I'm hoping that someone has already worked out the details.
It's not streams, but you might find this (and related pages) of interest, in particular the scatter/gather section right at the bottom: https://www.boost.org/doc/libs/1_78_0/doc/html/boost_asio/reference/buffer.h...
Having said that, while ASIO does provide file streams as well, this is sort of the opposite of your use case, and it doesn't directly address what you're looking for. Still, it would be possible to use similar techniques to write your own multi-stream that could be read as if it were one stream. Writing ASIO integrations is not for the faint of heart, though, so this is likely overkill unless you already want to mix in ASIO for other reasons.
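For small inputs, the `cat file1 file2 file3 ... > alldata` idea from the question can be approximated eagerly in memory with the standard library alone. The following is a minimal sketch, not an existing Boost class: `slurp_all` is a hypothetical name, and the sketch assumes that the combined data fits in memory and that the caller keeps the input streams alive.

```cpp
#include <cassert>
#include <istream>
#include <sstream>   // std::stringstream; std::istringstream is handy for trying this out
#include <string>
#include <vector>

// Hypothetical helper (not a Boost or standard facility): copy every input
// stream, in order, into one in-memory buffer, much like `cat a b c > alldata`.
inline std::stringstream slurp_all(const std::vector<std::istream*>& inputs) {
    std::stringstream all;
    for (std::istream* in : inputs) {
        assert(in != nullptr);
        // operator<<(streambuf*) copies until EOF. It sets failbit on `all`
        // when nothing was copied (an empty input), so reset the state.
        all << in->rdbuf();
        all.clear();
    }
    return all;
}
```

With files, the pointers would refer to open `std::ifstream` objects. The result can then be read like any other stream, at the cost of holding all of the data in memory at once.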
Hi Gavin, thanks for your reply. Given that there does not appear to be a concatenated stream class in Boost, I wonder what is the minimal set of member functions which must be implemented for a new input stream class. For example, it is plausible that it is necessary to define a constructor and a member function to get the next character (ignoring efficiency concerns). If there is any 25 words or less summary, or a pointer to a tutorial or blog post, etc., about implementing new input stream classes, I would be very interested to hear about it. All the best, Robert Dodier
On 15/03/2022 07:13, Robert Dodier wrote:
Given that there does not appear to be a concatenated stream class in Boost, I wonder what is the minimal set of member functions which must be implemented for a new input stream class. For example, it is plausible that it is necessary to define a constructor and a member function to get the next character (ignoring efficiency concerns). If there is any 25 words or less summary, or a pointer to a tutorial or blog post, etc., about implementing new input stream classes, I would be very interested to hear about it.
Boost.Iostreams is a library dedicated to making creation of new stream types easier. Having said that, it's still rooted in the C++03 world, so it might be missing some potential optimisations from modern C++. Mind you, so are streams. Surprisingly, it doesn't look like Iostreams already has a concatenating Source, which seems like it might have been a good fit for your use case and the library itself. Perhaps you could contribute one?
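As a rough answer to the question about the minimal set of member functions: with the standard library alone, a read-only stream buffer usually only needs to override `underflow()`. The sketch below illustrates that approach and is not part of Boost. `concat_streambuf` and `read_all` are hypothetical names, and the buffers passed in are assumed to outlive the object.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>    // std::istringstream, a stand-in for files when trying this out
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: a read-only stream buffer that drains each underlying
// buffer in turn. It does not own the buffers it is given.
class concat_streambuf : public std::streambuf {
public:
    explicit concat_streambuf(std::vector<std::streambuf*> parts)
        : parts_(std::move(parts)) {}

protected:
    // Called when the get area is empty: refill it from the current part,
    // moving on to the next part whenever one reports EOF.
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        while (index_ < parts_.size()) {
            assert(parts_[index_] != nullptr);
            const std::streamsize n = parts_[index_]->sgetn(chunk_, sizeof chunk_);
            if (n > 0) {
                setg(chunk_, chunk_, chunk_ + n);
                return traits_type::to_int_type(chunk_[0]);
            }
            ++index_;  // current part is exhausted
        }
        return traits_type::eof();
    }

private:
    std::vector<std::streambuf*> parts_;
    std::size_t index_ = 0;
    char chunk_[4096];
};

// Hypothetical helper: read an entire stream into a string.
inline std::string read_all(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

To use it, collect the `rdbuf()` pointers of the opened input streams into a vector, construct a `concat_streambuf` from it, and hand its address to a `std::istream`.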
Hi Gavin, hi everyone. I'm thinking about the concatenated input
stream concept off and on, and I'd like to ask for some help in
organizing a solution.
The easy part to me is figuring out how to manage a collection of
input doodads (be they streams, stream buffers, sources, or whatever)
and satisfy read requests by getting the requested characters from one
or more of them. I have some experience with C++ development (from
about 20+ years ago, so C++03 is, for better or worse, maybe more
familiar to me ...) and so I feel pretty comfortable with the whole
business about classes, constructors, methods, etc.
What I need is for someone to give me some advice about how to
organize the class declarations. Which existing Boost class should
I be deriving from? What is the type of the underlying resources that
are to be marshalled?
To be more concrete, here is a snippet of code which uses existing
Boost classes to open an input stream.
boost::iostreams::file_source fs ("foo.txt");
boost::iostreams::stream_buffer<boost::iostreams::file_source> sb (fs);
std::istream is (&sb);
Now I'd like to say something like
concatenated_source cs (<list of resources to be marshalled goes here>);
boost::iostreams::stream_buffer<concatenated_source> sb (cs);
std::istream is (&sb);
On 28/03/2022 19:17, Robert Dodier wrote:
What I need is for someone to give me some advice about how to organize the class declarations. Which existing Boost class should I be deriving from? What is the type of the underlying resources that are to be marshalled?
To be more concrete, here is a snippet of code which uses existing Boost classes to open an input stream.
boost::iostreams::file_source fs ("foo.txt");
boost::iostreams::stream_buffer<boost::iostreams::file_source> sb (fs);
std::istream is (&sb);
Now I'd like to say something like
concatenated_source cs (<list of resources to be marshalled goes here>);
boost::iostreams::stream_buffer<concatenated_source> sb (cs);
std::istream is (&sb);
What seems obvious to me is to say
class concatenated_source: public boost::iostreams::source { ... }
but apparently source = device<input> and device says, according to the docs (https://www.boost.org/doc/libs/1_78_0/libs/iostreams/doc/classes/device.html...),
template<typename Mode, typename Ch = char>
struct device {
    typedef Ch char_type;
    typedef see below category;
    void close();
    void close(std::ios_base::openmode);
    void imbue(const std::locale&);
};
Hmm. I was kind of assuming that a generic source would require something about reading some bytes and maybe opening or otherwise creating something to read bytes from, but that doesn't appear to be the case here.
As the doc states, the "source" class is intended as a base class, but only as an aid to implementation, not as a requirement. It does not implement all the methods of the Source concept because (like the standard library) it relies on template duck-typing and not virtual methods. As such, a Source need not actually derive from source.
Have a read of https://www.boost.org/doc/libs/1_78_0/libs/iostreams/doc/concepts/source.htm... for the full interface. (You'll also have to read some other concept pages for additional methods "inherited" but not via C++ inheritance.) In addition to the required methods, you can define additional ones as needed by your specific implementation. You may also want to read the actual implementation of e.g. file_source, but only to compare it with the concept documentation.
Note that due to the duck-typing, to properly implement a concatenation of generic sources you'd have to use a templated entry method and a heterogeneous, type-erased collection mechanism, which may not be for the faint of heart. Implementing a file_source-only concatenation would be more straightforward but less flexible.
Template concepts are powerful, but they can be quite a pain to deal with at the provider end.
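To make the Source concept discussed above more concrete, here is a minimal sketch of the simpler, non-generic variant: a class whose `read` follows the documented Source contract (return the number of characters read, or -1 at EOF). The name `concatenated_source` comes from the question. The use of non-owned `std::istream*` inputs, the `drain` helper, and leaving the `category` typedef as a comment (so that the block compiles without Boost) are assumptions of this sketch rather than anything prescribed by the library.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>   // std::istringstream, a stand-in for files when trying this out
#include <string>
#include <utility>
#include <vector>

// Hypothetical class modelling the read() half of the Source concept.
// It does not own the streams it reads from.
class concatenated_source {
public:
    typedef char char_type;
    // To plug this into boost::iostreams::stream_buffer, a category is also
    // needed, e.g. include <boost/iostreams/categories.hpp> and add:
    //   typedef boost::iostreams::source_tag category;

    explicit concatenated_source(std::vector<std::istream*> inputs)
        : inputs_(std::move(inputs)) {}

    // Store up to n characters in s and return how many were stored,
    // or -1 once every input is exhausted.
    std::streamsize read(char_type* s, std::streamsize n) {
        assert(s != nullptr && n >= 0);
        std::streamsize total = 0;
        while (total < n && index_ < inputs_.size()) {
            inputs_[index_]->read(s + total, n - total);
            total += inputs_[index_]->gcount();
            if (total < n) ++index_;  // short read: this input hit EOF
        }
        return (total == 0 && n > 0) ? -1 : total;
    }

private:
    std::vector<std::istream*> inputs_;
    std::size_t index_ = 0;
};

// Hypothetical helper: pull everything out of a source through a small buffer.
inline std::string drain(concatenated_source& src) {
    std::string out;
    char buf[8];
    for (std::streamsize n; (n = src.read(buf, sizeof buf)) != -1; )
        out.append(buf, static_cast<std::size_t>(n));
    return out;
}
```

The class is copyable, which matters because Boost.Iostreams stores devices by value. A fully generic version accepting arbitrary Sources would need the type-erasure machinery mentioned above.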
Gavin, thanks a lot for your reply, it makes a lot more sense now. Previously I read the "Concepts" stuff but it didn't sink in -- I wasn't sufficiently enlightened to understand the documentation, as they used to say in the old days. The stuff about template duck typing is mostly new to me in the C++ context -- I don't think I understood that stuff when I was working on C++ projects long ago -- but it makes sense to me now as an example of expression unification in a computer science-y sense; that's the mechanism for a lot of algebraic identities in a computer algebra project I'm participating in (namely Maxima). Incidentally a web search for C++ template duck typing brings up a lot of entertaining discussions, including complaints about the obscurity of compiler error messages -- I've bumped into that too. I'll digest this new stuff and try again soon. Thanks again for your help. Robert Dodier
participants (2)
- Gavin Lambert
- Robert Dodier