A solution to return dynamic strings without heap allocation. Any interest?
Hi all,
I haven't been successful at attracting interest in a formatting
library [1] I've been working on lately. But recently I realized that
part of it could be isolated as a small standalone library that
could solve an old and common troublesome situation in C++:
Suppose you need to create a function that returns/provides a string
whose content and size are unknown at compile time. The first
approach is to make it return a `std::string`. But if it needs to be
usable in environments like a bare-metal real-time system,
then one usually makes it take a raw character buffer as an output argument,
more or less like this:
struct result{ char* it; bool truncated; };
result get_message(char* dest, std::size_t dest_len);
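For example, a typical call site looks more or less like this (the buffer size here is arbitrary, chosen just for illustration):

char buf[64];                            // the caller has to guess a size up front
result r = get_message(buf, sizeof buf);
if (r.truncated) {
    // not much can be done here: whatever did not fit is already lost
}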
But this is clearly not a perfect solution, since there's nothing
really effective the caller can do when get_message fails because
the destination buffer is too small.
So I present the `outbuf` abstract class. It somewhat resembles
`std::streambuf`, but with a simpler and lower-level design,
which is the result of many attempts to find the best
performance [2] and usability in my formatting library.
Afaics, it does not require a hosted C++ implementation,
though I would like someone else to confirm that.
Now the caller of `get_message` has to choose or create a
suitable class type deriving from `outbuf`, which dictates where
the message is written. For example, if the user wants to
get a `std::string`, then `string_maker` will do the job:
#include
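Roughly like this (the header name, the exact namespace, and the finish() member shown here are illustrative; the real names in the library may differ):

#include <string>
#include <boost/outbuf.hpp>              // assumed header name, for illustration only

void get_message(boost::outbuf::outbuf& dest);   // writes the message into dest

std::string example()
{
    boost::outbuf::string_maker dest;    // derives from outbuf and appends to a std::string
    get_message(dest);
    return dest.finish();                // assumed member: returns the accumulated string
}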
On Thu, Aug 29, 2019 at 9:24 AM Roberto Hinz via Boost
I would prefer it to be part of Boost.Core instead of being a standalone library, and also to remove the `outbuf` namespace. But that is up to you.
Boost.Core is for Boost facilities used by other Boost libraries for simpler tasks. You could propose it for Boost.Utility. However, it seems more worthy of its own library. In either case (Utility, or your own library), the process for a Boost formal review is at: https://www.boost.org/community/reviews.html Glen
Didn't you see the reply from Glen Fernandes? Check the archive.
--
Janek Kozicki, PhD. DSc. Arch. Assoc. Prof.
Gdańsk University of Technology
Faculty of Applied Physics and Mathematics
Department of Theoretical Physics and Quantum Information
--
pg.edu.pl/jkozicki (click English flag on top right)
On 3 Sep 2019, 14:21 +0200, Roberto Hinz via Boost
No one interested?
On Tue, Sep 3, 2019 at 11:11 AM Janek Kozicki
Didn't you see the reply from Glen Fernandes? Check the archive.
Well, yes, but I didn't see his message as showing interest, just as guidance. And I presume I first need to check whether there is any interest, otherwise what's the point of going any further? Am I misunderstanding something?
On Tue, Sep 3, 2019 at 5:21 AM Roberto Hinz via Boost
No one interested?
`std::basic_ostream` is actually quite usable once you figure out how it works (which is admittedly more difficult than it should be). It can be set up to not perform any memory allocations, depending on the implementation of the derived class. It might not be perfect but it is part of the standard library and thus has a natural advantage that would require extraordinary functionality from an external component to overcome. And I'm not seeing that in the proposed `outbuf`. Regards
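A minimal sketch of such a non-allocating setup (the class here is illustrative, not from the standard library or from this thread): the derived streambuf points its put area at a caller-provided array, so the ostream built on top of it never allocates.

#include <cstddef>
#include <ostream>
#include <streambuf>

class fixed_buf : public std::streambuf {
public:
    fixed_buf(char* data, std::size_t size) { setp(data, data + size); }  // put area = user buffer
    std::size_t written() const { return static_cast<std::size_t>(pptr() - pbase()); }
};

void demo()
{
    char storage[128];                    // stack storage, no heap involved
    fixed_buf buf(storage, sizeof storage);
    std::ostream os(&buf);                // formatted output goes into 'storage'
    os << "answer: " << 42;
    // storage[0 .. buf.written()) now holds "answer: 42";
    // the default overflow() simply reports failure once the buffer is full.
}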
On Tue, Sep 10, 2019 at 11:05 PM Vinnie Falco
`std::basic_ostream` is actually quite usable once you figure out how it works
Let me illustrate why I disagree with that. Suppose you want to implement a base64 encoder. You want it to be fast, agnostic, and simple to use. Now suppose you adopt `std::ostream` as the destination type:

void to_base64( std::ostream& dest, const std::byte* src, std::size_t count );

You will face two issues:

1) It doesn't matter how well you (as the library author) understand basic_ostream. The *user* needs to implement classes derived from basic_ostream to customize the destination types.

2) It's impossible to achieve decent performance. If you used `outbuf` you could write directly into the buffer. But with `std::ostream` you have to call member functions like `put` or `write` for each little piece of the content, or use an additional intermediate buffer.

And this is far from being a specific use case. The same issues apply to any kind of encoding, binary or text.
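To make issue (2) concrete, here is a sketch of the innermost step of such an encoder (both function names are made up for illustration; the actual encoder logic is omitted):

#include <ostream>

// ostream flavour: one member-function call per output character,
// each of which may end up in the virtual overflow() path
void emit_quantum(std::ostream& dest, char a, char b, char c, char d)
{
    dest.put(a); dest.put(b); dest.put(c); dest.put(d);
}

// raw-buffer flavour: plain stores into memory the encoder was handed,
// trivially inlined and easy for the compiler to optimize
char* emit_quantum(char* out, char a, char b, char c, char d)
{
    out[0] = a; out[1] = b; out[2] = c; out[3] = d;
    return out + 4;
}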
On Thu, Sep 12, 2019 at 10:21 AM Roberto Hinz
Let me illustrate why I disagree with that...
Okay, these are two fair points, but then the title of the original post is not accurate. What you're really proposing is "a better std::ostream" which is an entirely different conversation. Regards
On Fri, Sep 13, 2019 at 9:46 AM Vinnie Falco
On Thu, Sep 12, 2019 at 10:21 AM Roberto Hinz
wrote: Let me illustrate why I disagree with that...
Okay, these are two fair points, but then the title of the original post is not accurate. What you're really proposing is "a better std::ostream" which is an entirely different conversation.
Regards
You are right. Thanks for the comments
On Fri, Sep 13, 2019 at 6:50 AM Roberto Hinz
You are right. Thanks for the comments
Please don't take any of these comments as discouragement. Quite the opposite, they are well intended with a goal of progress in mind. There has been a chorus of voices clamoring for "a better std::ostream" and it has been the subject of a few papers. Offering users better versions of or replacements for standard library types such as std::ostream lands squarely within the purview of the Boost Libraries. There is already precedent for this, such as boost::system::error_category and boost::shared_ptr, both of which are superior to their standard library equivalents. However, any proposed replacement needs to address the body of work that has already been done in this area. What makes it better or more usable? Thanks
On Fri, Sep 13, 2019 at 11:42 AM Vinnie Falco
On Fri, Sep 13, 2019 at 6:50 AM Roberto Hinz
wrote: You are right. Thanks for the comments
Please don't take any of these comments as discouragement. Quite the opposite, they are well intended with a goal of progress in mind. There has been a chorus of voices clamoring for "a better std::ostream" and it has been the subject of a few papers. Offering users better versions of or replacements for standard library types such as std::ostream lands squarely within the purview of the Boost Libraries. There is already precedent for this, such as boost::system::error_category and boost::shared_ptr, both of which are superior to their standard library equivalents.
However, any proposed replacement needs to address the body of work that has already been done in this area. What makes it better or more usable?
Thanks
Not discouraged at all. I will enhance the documentation based on your feedback and come back later. Thank you
Hi all, this is a continuation of the thread "A solution to return dynamic strings without heap allocation. Any interest?" Just telling you that I rewrote the docs, especially the rationale: https://robhz786.github.io/outbuf/doc/outbuf.html Best regards Robhz
Hi Roberto,
On 3. Oct 2019, at 14:22, Roberto Hinz via Boost
wrote: Hi all, this is a continuation of the thread "A solution to return dynamic strings without heap allocation. Any interest?" Just telling you that I rewrote the docs, especially the rationale: https://robhz786.github.io/outbuf/doc/outbuf.html Best regards Robhz
Quoted from the rationale: "Your function is complex to use. The user needs to implement a class that derives from ostream to customize the destination. It's a complex task for most C++ programmers." Agreed, although Boost.Iostreams makes that easier.

"It's impossible to achieve good performance. std::ostream does not provide direct access to the buffer. to_base64 needs to call member functions like write or put for every little piece of the content, or to use an intermediate buffer." It is not impossible to achieve good performance; page 68 of http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf lists problems, which are solvable. In practice, increasing the buffer size helps, as does turning off synchronisation with stdio: https://stackoverflow.com/questions/5166263/how-to-get-iostream-to-perform-b... The SO answer lists several examples where C++'s iostreams beat C's stdio in performance.

Your argument is also not convincing. Just calling member functions doesn't make something slow if you compile with optimisations, which is a must with C++. I think it is quite natural that the stream makes it hard for you to touch the buffer. The stream objects hide buffer management under an interface. The ostream object handles the buffer for you; you don't have to know when you hit the boundary and things need to be flushed to the device. You can't hide something and expose it at the same time; that breaks encapsulation, so naturally the streams make it difficult to touch the buffer directly. Although you can, if you really want to, and it is pretty simple to set up:

char Buffer[N];
std::ofstream file("file.txt");
file.rdbuf()->pubsetbuf(Buffer, N);

Now you can mess around with the stack-allocated buffer. It is not clear to me what the advantage of outbuf is over this. I think the real problem with iostreams is that it lacks good documentation and tutorials on how to do the more complicated things. Best regards, Hans
Hi Hans,
On Mon, Oct 7, 2019 at 5:14 AM Hans Dembinski
"It’s impossible to achieve a good perfomance. std::ostream does not provide direct access to the buffer. to_base64 needs to call member functions like write or put for every little piece of the content, or to use an itermediate buffer."
It is not impossible to achieve good performance; page 68 of http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf lists problems, which are solvable.
In practice, increasing the buffer size helps, as does turning off synchronisation with stdio:
https://stackoverflow.com/questions/5166263/how-to-get-iostream-to-perform-b... The SO answer lists several examples where C++'s iostreams beat C's stdio in performance.
Your argument is also not convincing. Just calling member functions doesn't make something slow if you compile with optimisations, which is a must with C++. (...)
Thanks for the feedback. I removed that part from the docs. I did some benchmarks. First I implemented a base64 encoder using outbuf and std::streambuf, and I couldn't find any conclusive evidence that either one is faster than the other (I get different results from seemingly irrelevant code changes). Then I implemented a simple JSON writer. In this case the streambuf version was about 30% slower than the outbuf version. Not a tremendous difference. I chose to write directly into the streambuf instead of going through ostream, so that we can set aside many of the possible QoI issues related to std::ostream. The article you reference seems to only address optimizations of facet usage and formatting, which I think should not have any effect on these benchmarks. The SO discussion seems not to apply either, since the streambuf I used does not write into a file but only into a char array. The benchmark implementations are available at https://github.com/robhz786/outbuf/tree/master/performance Anyway, it's clear now that my statement was a bit reckless. Best Regards
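For reference, a sketch of what "writing directly into the streambuf" looks like (illustrative only; this is not the code from the benchmark repository):

#include <cstddef>
#include <streambuf>

// emit a JSON object key followed by ':' straight into the streambuf,
// bypassing the ostream formatting and sentry layers
void write_json_key(std::streambuf& dest, const char* name, std::size_t len)
{
    dest.sputc('"');
    dest.sputn(name, static_cast<std::streamsize>(len));
    dest.sputc('"');
    dest.sputc(':');
}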
participants (5)
-
Glen Fernandes
-
Hans Dembinski
-
Janek Kozicki
-
Roberto Hinz
-
Vinnie Falco