Re: [boost] [Async] Review of proposed Boost.Async begins

18 Aug 2023

      On Fri, Aug 18, 2023 at 10:08 PM Vinnie Falco <vinnie.falco@gmail.com> wrote:
...
On Thu, Aug 17, 2023 at 5:25 PM Klemens Morgenstern via Boost <boost@lists.boost.org> wrote:
...
...
What's the rationale behind BOOST_ASYNC_USE_STD_PMR and friends?
I see that Boost.Async is using allocators in some way but I have not looked at the library or its documentation. Was this something that was shoved in "just because" or will I find some examples of how to use the library with allocators to achieve specific goals such as improving performance? A discussion of when custom allocators are helpful? Any sorts of benchmarks or list of tradeoffs? Some analysis on common patterns for how the library allocates memory and how to optimize it?
C++20 coroutines need to allocate their function frame somehow. Since
the library is single-threaded, it makes sense to optimize for that, so
async uses std::pmr (or boost::container::pmr for clang < 16) for
that. In most cases it just provides a thread_local
std::pmr::unsynchronized_pool_resource for the coroutine frame, and
for async operations it uses a small monotonic buffer resource for the
associated allocator.
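
A rough sketch of those two pieces in plain std::pmr, in case it helps
to see them. This is not the library's code: thread_frame_resource,
op_buffer and the 512 byte size are names and numbers made up for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory_resource>

// Hypothetical helper (not Boost.Async API): one pool per thread, so the
// resource never needs to lock.
inline std::pmr::memory_resource* thread_frame_resource()
{
    thread_local std::pmr::unsynchronized_pool_resource pool;
    return &pool;
}

// Hypothetical stand-in for the small per-operation buffer. Allocations are
// carved out of `storage`; deallocate is a no-op and everything is released
// at once when the object goes away. The null upstream makes an oversized
// request throw std::bad_alloc instead of silently hitting the heap.
struct op_buffer
{
    alignas(std::max_align_t) std::byte storage[512];
    std::pmr::monotonic_buffer_resource resource{
        storage, sizeof(storage), std::pmr::null_memory_resource()};

    // True if p points into the inline storage.
    bool owns(const void* p) const
    {
        auto* b = static_cast<const std::byte*>(p);
        return std::greater_equal<>{}(b, storage)
            && std::less<>{}(b, storage + sizeof(storage));
    }
};

// Usage: a small request is served from the inline storage.
inline void small_request_stays_inline()
{
    op_buffer buf;
    void* p = buf.resource.allocate(64);
    assert(buf.owns(p));
}
```

The pool fits coroutine frames because blocks are freed and requested
again at the same sizes; the monotonic buffer fits short-lived
operation state that is thrown away as a whole.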

To visualize this, let's say you have a promise like this:

promise<int> dummy() {co_return 42; }

Whenever you call dummy(), there's a non-zero chance your compiler will
allocate its function frame on the heap, because compilers are not yet
good enough at optimizing that allocation away.
Thus if you just write

co_await dummy();
co_await dummy();
co_await dummy();

You may have three allocations & deallocations of the same size on the
same thread. A thread_local resource helps to minimize that cost: it
can reuse the freed block and does not need any locking.
I did not want to roll my own thread_local solution like asio's
awaitables do, which is why std::pmr was the obvious choice.

Note that because it's async, I cannot assume that the allocations
happen in a stack-like pattern, as asio does; I can only assume they
all happen on one thread for thread, main & run. For spawn no such
resource is used, as one can spawn onto a strand.
But a user might spawn onto a single-threaded io_context, in which
case they can use an unsynchronized resource manually. I do not expect
many customizations here, as async just does the right thing.

TL;DR: async is not really using allocators, but std::pmr to optimize
its own allocations for the single-threaded environment. Most users
shouldn't ever need to touch this.