On Tue, Oct 15, 2024 at 11:16 PM Klemens Morgenstern via Boost < boost@lists.boost.org> wrote:
Please explicitly state that you either *accept* or *reject* the inclusion of this library into boost.
What follows is my review of the async-mqtt5 library by Ivica Siladic (through Mireo). I believe this library should be ACCEPTED. I spent about 20 hours looking at the documentation, the source code, Internet resources related to the MQTT protocol, and talking with the author. I was not able to compile the library and examples, for reasons related to the dependency on OpenSSL. In my opinion this submission represents the gold standard of library proposals. It has been deployed by multiple companies in commercial settings for over a year. It has received considerable field testing and bug fixes. While not perfect, the API clearly works and delivers value for users. In short, async-mqtt5 has proven itself outside of Boost. The library could use improvement in the documentation, the example code, the APIs, and the implementation that would bring it up to an outstanding level of quality. I struggled quite a lot with deciding whether or not some of these improvements should be conditions of acceptance and ultimately I have concluded that Boost would be better with this library in it as the library is currently implemented, rather than not. Writing out a list of conditions is laborious and I aborted two attempts at doing so. Furthermore, the nature of some of the improvements are quite complicated and require experimentation, measurement, and iteration. It doesn't make sense to hold out for a "perfect" submission. Furthermore a good MQTT client implementation is necessarily complex; the interaction of reading and writing combined with quality of service and server behavior is tricky to get right. Optimizing such an implementation requires making a lot of correct decisions regarding tradeoffs. I have rough ideas for improvements, and only the author can explore them in a way that preserves the unique features that MQTT clients need. ANSWERS Q: Does this library bring real benefit to C++ developers for real world use-case? A: It obviously does. MQTT is the industry-wide peer to peer communication solution for the Internet of Things ("IoT"). This library has already benefited its users for over a year. Q: Do you have an application for this library? A: I personally do not have a use for this library. However I will note that the use of async-mqtt5 would be a natural fit for a reimplementation of Beast's websocket chat server. Q: Does the API match with current best practices? A: I personally am unfamiliar with MQTT and IoT best practices. Yet I am confident that it either matches or will match best practices, as the pressures from its commercial usage create the necessary incentives to do so. I have some concerns that the Q: Is the documentation helpful and clear? A: The documentation needs a lot of work. Q: Did you try to use it? What problems or surprises did you encounter? A: I wanted to build it locally but cmake and vcpkg conspired against me. My willingness to offer acceptance despite the inability to build the thing comes from the existence of actual users prior to submission. Q: What is your evaluation of the implementation? A: The implementation is good, not great. More importantly, the author is skilled with Asio and he appears to have his wits about him so I would bet that the support and improvement for this library should it become part of Boost, will be very good. Q: Do you have experience with asio? A: I am an expert with Asio, and I have over 30 years of experience writing concurrent network programs. Q: Did you work with MQTT in the past? A: I did not work with MQTT in the past, but I have heard about it. Eight years ago Michael Caisse presented a talk at CppCon 2016 on it, yet the promised code was never published. NOTES The implementation looks very complicated, just like the MQTT protocol. I have confidence that it mostly does the right thing, since it is used in production. There are a lot of memory allocations and a lot of intermediate structures, and I wonder if this could be improved. The implementation queues both incoming and outgoing packets, and these queues appear to be unbounded. My experience tells me this is a problem, as the lack of backpressure subverts TCP's application-level flow control. Yet I do not have evidence that this is a problem in practice. A lot of time spent in my review is struggling over whether to demand improvements here or to just let it be. I do not know how MQTT is used in the IoT, and a refactoring of this library's APIs to handle backpressure would be an enormous undertaking. I have a lot of ideas about how to reduce the allocations, and make backpressure available to the application but these need experimentation and design work to fit into the protocol. I trust that the author will explore these in his own time. The async_run interface might be unnecessary. A typical program using mqtt_client will always have an outstanding call to async_receive, and periodically call async_publish. In a fashion similar to beast websocket, it is possible to piggyback the low level reading and writing on top of the async_receive and async_publish calls. If there are no outstanding calls to those functions, then the implementation will effectively stop communicating. And this might be a good thing for better support of application level TCP flow control. The library dedicates a lot of ceremonies to allocators, and I appreciate the effort. However there are no motivating use-cases and no measurements. Given that there are so many calls to allocate, in the current state I would not advertise the allocator support so much. A refactoring of the implementation to cause fewer calls to allocate and deallocate may make allocators relevant to performance. Disclaimer, I have also not done any measuring. The async_receive API only returns one message at a time, which is probably inefficient. A better technique would be to read as much data from the socket into a large buffer as possible in a single I/O, extract all complete messages into a range, and return that as a batch to the caller. I am of course glossing over many important details such as QoS, the need for ACK packets in response, and so on, and this idea will need to be explored further and measured to determine if it is practical. Is the necessary? I don't know. Header files should be included in the order of most specific to most generic. That is, library specific (async-mqtt5) headers first, then dependencies (asio, boost), and then standard library headers. This helpfully increases the chances of getting a compiler error when a header is missing. None of the examples work when you run them, because the broker "<your-mqtt-broker>" does not exist. I'm not sure what will happen if you run them as-is, but I would guess that since the library does infinite reconnection attempts with no logging, it will appear to hang. Publishing, receiving, and properties are core aspects of MQTT yet these words do not appear anywhere in the table of contents. Nor do the subjects appear by other names. Using operator[] instead of get<> for properties is unusual, as this creates something which looks like an array where each index holds a different typed object. I'm not going to get too worked up about it, since according to the author properties are an advanced usage. The implementation is in sore need of comments especially in the more complex areas where it is not clear which executor or which allocator is being used. Examples need to have command line arguments for brokers and credentials. Otherwise, default to HIveMQ's public demo broker here https://www.mqtt-dashboard.com/. Running the example with no arguments could show a "usage" string which shows the HiveMQ broker as choice for broker Table of Contents needs to have "Reading Messages", "Publishing Messages", and "Using Properties" as list items. The documentation needs a section which explains the MQTT protocol at a high level. How it is used, the client and broker roles, and links to more supporting documentation and information. This will help users who come across on async-mqtt5 that do not have any knowledge of the protocol, to determine if the library is a fit for their needs. One common opinion is that explaining MQTT in the documentation is out of scope. Is isn't strictly wrong for an introduction to the protocol to be delegated to outside sources, yet it is better for Boost if this is included. More clear documentation regarding properties. How commonly used they are, why they are used, and especially to point out how the indexes to operator[] result in objects of varying types depending on the index. More information in the documentation is strictly better than less. Things like design notes, rationale for trade offs, references to where the library is used, benchmarks, and so on all have value and including this distinguishes the library from its competitors. As pointed out in other reviews, the beast dependency should be optional and the TlsContext specification should be better, or the feature of TLS should be refactored. CONCLUSION Overall this is a good, working library which has considerable room to become great. Boost is better off with it than without it. Thanks to the author for investing the time to prepare the library for review. Regards