Re: [boost] [async-mqtt5] Formal review starts today

27 Oct 2024

      Hi Klemens Mireo and Ivica. My review is below.

On 16/10/2024 4:16 pm, Klemens Morgenstern via Boost wrote:
...
Please explicitly state that you either *accept* or *reject* the inclusion
of this library into boost.
The library is useful and implements the MQTT (client) protocol - does 
that mean it should be in boost?
I'm not 100% sure. It's not "low quality" but its also not obviously 
"the" way to do MQTT clients in C++.
...
Also please indicate the time & effort spent on the evaluation and give the
reasons for your decision.
- Does this library bring real benefit to C++ developers for real world
use-case?
It is useful for MQTT clients that are not resource constrained embedded 
systems.
It is useful for examples of such systems that can't do or don't need 
better queuing at source than storing pending messages in a queue that 
isn't durable.
It is useful for systems where queuing rather than dropping old (in 
favour of only sending "latest") messages is the right policy.
Note this applies regardless of the QoS selected - one might think that 
selecting an at most once delivery policy would mean that messages don't 
queue - but that isn't true.

The user is free to avoid queuing by only calling async_publish once the 
completion handler for the previous publication has been called. And one 
can manage that per topic if appropriate. The fact there is a 
convenience feature doesn't mean one has to use it but I found using the 
library to be a little surprising and lacking in transparency due to the 
lack of any notifications about connectivity or timeouts. This is 
another face of the "it needs logging" comments from other reviewers. If 
the user can be informed what the library is doing, then the user can 
simply log this, or, potentially, react in other ways.

I immediately ran into this lack of visibility when I used EMQX as the 
broker to test asnyc-mqtt5 against. As I was eventually able to deduce, 
EMQX doesn't like a blank client name (and the async-mqtt5 examples 
don't set it / use the default empty string. They probably shouldn't - 
EMQX isn't the only MQTT server that requires client names). At least 
this sent me digging through the code to see how it works, adding 
"logging" so I could see when messages were being built, sent, received. 
I've taken a fairly thorough look at the code in my review.

Addendum: It turns out the example code embedded in the documentation 
DOES set client id... Its only the code in the examples directory that 
.. doesn't..

If it isn't clear from how this is going, I've not actually found a 
match for my use cases. Id really like to see a more composed/layered 
approach used. While its been suggested that acceptance could be seen as 
encouragement to refactor it is not clear that this makes a lot of 
sense. Wouldn't that produce "asymc-mqtt5-v2"? Or result in a large 
delay from acceptance to actual inclusion in boost?

At this point I'd vote to reject, but that is dependent on how much work 
(it looks like a lot) the authors are prepared to invest in making the 
library more generally useful for building systems that use MQTT.
...
- Do you have an application for this library?
Not immediately.
...
- Does the API match with current best practices?
If we took Beast as an example of best practice, probably not.

As an interface for implementing latency sensitive real time distributed 
systems, probably not.

As a set of interfaces that support thorough testing / validation (for 
"security" aka robustness) probably not.
...
- Is the documentation helpful and clear?
It is helpful and generally clear, but is also limited.

Given MQTTs intentionally limited scope of specification and hence 
variety of server behaviours supported, examples that work with a 
variety of common servers should be provided.

Existing examples are opaque - they either silently do nothing (visible) 
because they are working, or because they are not (the magic of the 
"hidden" connection management).
An example chat app or similar would be useful to see what the other 
examples are doing. People are definitely going to want to try out the 
library using examples against their broker of choice.
Including a basic application "wrapper" for examples so that setting up 
connections to a broker / trying various brokers is easier (commandline 
parameters would be enough)  would be nice  and likely necessary to be 
able to support users just trying out the library for the first time.

Related - is there a specific reason why interoperation with MQTT 3.1.1 
servers is not supported? It is unfortunate that the OASIS specification 
for MQTT is less than clear on how/if revisions are or can be backwards 
compatible. It doesn't appear to be difficult to support 3.1.1.

The description of the queuing behaviour talks about "multiple MQTT 
packets for transmission within a single TCP packet whenever feasible." 
and "The DISCONNECT packet is sent in a single TCP packet...". Obviously 
there is no such thing as a TCP packet (its a stream transport) and the 
socket API, network stack and IP and physical network layers all 
contrive to break assumptions regarding "packet boundaries". It is fair 
to say that the implementation aims to, when it can, interact with the 
hosts TCP/IP network stack in a way that presents the TCP/IP stack with 
larger segments (vs individual MQTT messages) and hence allows it to 
send a sequence of MQTT messages using less IP datagrams and resulting 
in less TCP level acks than would result from a more naive implementation.

MQTT 5 topic alias's appear to be supported. Great. I guess they "just 
work"? In any case a more direct documented list (with tests/examples) 
of MQTT protocol capabilities supported and usage of them would be helpful.

An MQTT client library has relatively simple protocol conformance 
requirements. It should be tested for both specification conformance and 
for interoperability. In many ways this would be easier if the library 
had basic support for both client and server "ends" of the protocol and 
separated concerns in a more user accessible and defined way.

The tests do include an implementation of a "server" (ie something 
capable of receive and send of valid (and less so) MQTT messages and 
using the codecs for their content. I built the tests and read some - 
they are fairly minimal.

Some potential users would be looking for a library that lets them use 
server and client capability together to implement edge bridges and 
address items such as (especially) offline queuing or other aggregation 
and storage behaviour there, not in each individual client. This applies 
even if (especially if?) MQTT is being used as a "message bus" on a 
relatively powerful "edge" device whether or not it is also being used 
to connect less capable/more deeply embedded devices.

Its notable, if only as one published example of clients needing to 
control and be aware of the state of their own server connection, that 
the Sparkplug specification imposes some specific semantics around when 
certain messages should be published with respect to other MQTT message 
types that form part of the protocol (connect, specifically).

Sparkplug is a specification for a "service" on top of/using MQTT 
transport. Other services with their own semantics (publicly published 
or not) can be expected to have broadly similar needs to be aware of 
connection and server state, and react to this in a way that implies 
more message by message control over MQTT than the library appears to 
offer.

On digging in deeper a lot of the machinery is there, buried below the 
opinionated public interface. However, a lot of changes would be needed 
to support a non "autoconnect" stream for example. Or one that doesn't 
have boost::asio::ip::tcp::socket as its lowest layer.

This also relates to interoperability/usability testing. There are at 
least the following widely used MQTT servers used with their own use 
cases in a larger system:

mosquitto - probably the least opinionated example.
EMQX
AWS IoT Core (probably the most opinionated)

This is without considering libraries that provide server-side 
capability to build servers that are not simply brokers.
Examples exist in various languages

https://github.com/mochi-mqtt/server (golang)
and
https://github.com/moscajs/aedes (node.js)
spring to mind as libraries/servers that any mqtt client is likely to 
find itself connected to.

This does mean the challenge of maintaining a "high quality" (from a 
user perspective) library for MQTT is substantially higher than 
"traditional" boost libraries which have mainly been targeting 
portability across the OS and CPU they run on, except for .. HTTP 
(Beast) and the redis and mysql client libraries that at least have a 
single implementation to target/support - not a large number, of various 
(allowed) behaviours and implementation quality. How much support is the 
maintainer prepared or expected to offer?
...
- Did you try to use it? What problems or surprises did you encounter?
See comments above.
...
- What is your evaluation of the implementation?
Not bad. It is however not as lightweight as a protocol as raw as MQTT 
could potentially be (at least at the client side).
...
- Do you have experience with asio?
Yes.
...
- Did you work with MQTT in the past?
Yes, a little, and not recently. At the time, MQTT was both immature as 
a protocol and the MQTT ecosphere lacked off-the-shelf solutions or even 
established best practices for building more complex distributed systems 
using MQTT as a transport.

Its important to note that MQTT, like HTTP is just a point to point 
"transport" albeit one with radically different semantics - pub/sub, 
stateful in the sense that a topic outlives a single message delivery, 
one way message flows not request/response, multiplexing over a single 
underlying transport.

What it isn't is a specification for a full solution to pub/sub message 
based distributed systems. MQTT does not specify that a MQTT server is a 
"broker" or what the semantics of a broker are. There is now more 
extensibility in MQTT so I was interested to both look at this library 
and to see if MQTT brokers as well as clients could now be used to build 
such systems without resorting to customisation. It seems not. This 
isn't a problem specific to, or solvable by async-mqtt5 alone. but a 
more flexible library could help.

Looking at alternative MQTT libraries, I'm not seeing anything that is 
obviously superior to the proposed library, so its not that the library 
isn't up to scratch compared to the obvious competition, its just not 
quite "Ready" to support the inherently varied and complex environment 
that is MQTT usage beyond the most basic usage (for which it is 
potentially somewhat over-engineered).
...
I think I should mention that there's another mqtt library that has
been proposed for boost, but has yet to find a review manager:
https://github.com/redboltz/async_mqtt
Some ideas of how we could review them together have been discussed on
the mailing list, but this is an independent review of mireo's
async-mqtt5.
You don't have to consider redboltz/async_mqtt when writing your review.
I did take a very brief look at redboltz async-mqtt. I do think there is 
merit in a more composable approach so in concept, it is closer to what 
Id expect. That isn't an endorsement, more a reflection on the challenge 
of providing a library fit for a wide variety of MQTT usage.

I'd like to complement Mireo and Ivica on their efforts on this library 
so far and I hope they are able to improve it further, based on some of 
the excellent feedback coming from the review process - most of it a lot 
better than mine.

Re: [boost] [async-mqtt5] Formal review starts today

Darryl Green