On 1/07/2014 08:09, Niall Douglas wrote:
> There is also a strong argument that anything in ASIO which isn't async needs to go. Plus, some might feel that ASIO's current design uses too much malloc for embedded systems/ultra-low-latency work, and I would sympathise :)
Most of the mallocs can be avoided by using the custom allocator framework. It's the locks that I object to from an embedded/low-latency standpoint, and they're why I ended up rolling my own. :) That's only really an issue, though, if you're trying to use it for generic embedded-async (or at least non-network-async) purposes. Once you're using it for its original purpose of network I/O, the locks make more sense.
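For reference, the framework I mean is the asio_handler_allocate/asio_handler_deallocate hook pair, along the lines of Asio's custom allocation example. Something roughly like this (just a sketch; handler_memory and custom_alloc_handler are my names here, not Asio's):

#include <boost/asio.hpp>
#include <cstddef>
#include <new>
#include <utility>

class handler_memory
{
public:
    handler_memory() : in_use_(false) {}

    void* allocate(std::size_t size)
    {
        if (!in_use_ && size <= sizeof(storage_))
        {
            in_use_ = true;
            return &storage_;
        }
        return ::operator new(size);  // too big (or busy): fall back to the heap
    }

    void deallocate(void* p)
    {
        if (p == &storage_)
            in_use_ = false;
        else
            ::operator delete(p);
    }

private:
    alignas(std::max_align_t) unsigned char storage_[1024];  // reused per-op buffer
    bool in_use_;
};

template <typename Handler>
class custom_alloc_handler
{
public:
    custom_alloc_handler(handler_memory& m, Handler h)
        : memory_(m), handler_(std::move(h)) {}

    template <typename... Args>
    void operator()(Args&&... args) { handler_(std::forward<Args>(args)...); }

    // Found by ADL when Asio needs scratch memory for the wrapped operation.
    friend void* asio_handler_allocate(std::size_t size, custom_alloc_handler* self)
    {
        return self->memory_.allocate(size);
    }

    friend void asio_handler_deallocate(void* p, std::size_t /*size*/,
                                        custom_alloc_handler* self)
    {
        self->memory_.deallocate(p);
    }

private:
    handler_memory& memory_;
    Handler handler_;
};

You then wrap whatever handler you pass to async_read_some/async_write_some etc. in a custom_alloc_handler and the per-operation allocations come out of the fixed buffer instead of the heap -- which takes the mallocs out of the hot path, but of course does nothing about the locks.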
> Finally, as a personal viewpoint I don't care much for ASIO's internal implementation. I find it obtuse and hard to debug, or indeed to figure out much at all, as bits of the implementation are scattered all over the place. Some if not much of that is because ASIO implements a sort of generic concept-type framework, all of which requires checking; the obvious course of action here is to use proper C++17 concepts and do away with the legacy design cruft. I'd also personally split the genericity away from the implementation, and push the implementation out of the headers into a separately built, stable ABI so we can avoid pulling in so many system header files like all of windows.h.
Unfortunately one of its most powerful features (concept-based callable handlers and the ability to chain special conditions such as custom allocators and strands onto them) also means that templates permeate almost everything, making it really hard to do anything other than header-only. (I haven't read up on C++17 concepts and whether they would help with this or not.)

When I was rolling my own I did elect to use a private implementation instead, but it came at a cost of flexibility and performance (particularly since I'm using boost::function to break the template chain instead of something more lightweight). In particular, mine doesn't support custom allocators, and while it does have strands, they're a bit more brittle: you need to be more careful how you use them or you can get unexpected concurrency. (It's still a net win in my case due to lower latency and lock avoidance, but it's not a tradeoff everyone would want to make.)

I might be wrong about this, but I get the impression that a lot of the internal implementation code is duplicated rather than factored into common methods, presumably to avoid conditionals and improve performance without relying on compiler inlining. That can also contribute to making it hard to read and understand.
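To illustrate the tradeoff, the core of what I did is roughly this (a sketch with made-up names, not my actual code): the public API is still templated on the handler, but it immediately erases the handler into a boost::function and hands it to an out-of-line implementation. That's what lets the implementation live in a separately compiled library, and also why per-handler hooks like custom allocators disappear.

#include <boost/function.hpp>
#include <boost/system/error_code.hpp>
#include <cstddef>

class async_channel  // made-up name; stands in for my socket/stream wrapper
{
public:
    typedef boost::function<void(const boost::system::error_code&, std::size_t)>
        read_callback;

    // The templated overload lives in the header for convenience...
    template <typename Handler>
    void async_read_some(void* buffer, std::size_t size, Handler handler)
    {
        // ...but immediately erases the handler's type, so everything from
        // here down can be compiled out of line behind a stable interface.
        // The erasure may allocate, and any per-handler customisation
        // (allocators, hooks) is lost at this point.
        async_read_some_impl(buffer, size, read_callback(handler));
    }

private:
    // Defined in a .cpp file; never needs to see the concrete handler type,
    // so it never drags windows.h etc. into client translation units.
    void async_read_some_impl(void* buffer, std::size_t size, read_callback cb);
};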