Re: [boost] [asio-users] [http] Formal review of Boost.Http

8 Aug 2015

      On 7 Aug 2015 at 19:29, Vinícius dos Santos Oliveira wrote:
...
...
I do not believe Http is ready for peer review for these reasons, and
I personally would urge you withdraw it until you have fixed these
issues and then come back to us. If you do not withdraw it, I will
vote for rejection.
Can I persuade you to review the Boost.Http design? I'd reject myself if
you find any fundamental flaw on the design and even if you're going to
reject the library because the lack of tooling (not directly related to
design or documentation, which are the most important items on any
library), it'd be useful to know what needs to be addressed on the design
before I submit again (in the case the submission is rejected).
I think my earlier email was poorly phrased now I've slept on it. I 
made my objections seem like they were about portability, when they 
were really about design and intent of Http.

What I should have said was this: Standalone ASIO is the reference 
Networking TS implementation for standardisation. Http is currently a 
set of extensions to Boost.ASIO specifically when it *should* be a 
set of extensions to the reference Networking TS implementation for 
C++ standardisation. This is why your current design and architecture 
is flawed, but luckily there isn't too much work to fix this by 
finishing the port of Http to the Networking TS.

The Internet of Things tends to involve lots of small somewhat 
underpowered internet devices where C++ is likely going to play a 
much more important role than historically in embedded systems. The 
chances are that any Networking TS in C++ will be very extensively 
used in such IoT applications. If there were also a high quality set 
of extensions adding HTTP support to the Networking TS, I think your 
chances are good that your Http library would be seriously considered 
by WG21 for standardisation.

*This* is why I very strongly believe that Http needs to cleave 
itself tightly to the Networking TS reference implementation, and not 
to Boost.ASIO. Moreover, if you make Http an extension of the 
Networking TS, you get Boost support with that for free - Http 
becomes dual use.
...
About C++03 support, it's not required for Boost acceptance, so I can add
later. I want to increase tests coverage a lot before porting code to
C++03, so I don't introduce accidental bugs in the process.
As a strong advocate for C++ 11 over 03 I can understand. However, 
IoT developers tend to be stuck on some pretty ancient toolsets for 
longer than others. 03 support I think would greatly increase your 
potential userbase.
...
In the case the abstractions I need enter on the C++ standard, maybe your
APIbind will help me.
I actually would advise you to choose ASIO's STL abstraction 
machinery over my own alternative APIBind in this instance.
...
But I just don't find any reference to BOOST_ASIO_HAS_STD_SYSTEM_ERROR
outside of config.hpp. Even if it was used, it is undocumented, which means
it is an implementation detail that won't be guaranteed to be there forever
(accidental behaviour versus a documented guarantee).
I'm pretty sure that implementation detail won't change in a hurry. 
STL abstraction machinery changes as quickly as a STL does i.e. every 
decade or so.
...
...
https://svn.boost.org/trac/boost/wiki/BestPracticeHandbook#a7.SAFETY:S
tronglyconsideranightlyorweeklyinputfuzzautomatedtestifyourlibraryisab
letoacceptuntrustedinput
That's why this first implementation uses a proven-parser that is very well
tested.
The buck always stops with the top most layer, not internally used 
third party libraries.

You NEED to fuzz test untrusted inputs when parsing HTTP. For all you 
know, your STL implementation could be the thing with the overflow 
bug, or the way in which you use it.
...
...
* I don't understand why you cannot issue more than one async read at
a time nor async write at a time. In something like pipelined HTTP
that seems like a very common pattern. I smell a potential design
flaw, and while the docs mention a "queue_socket" I can't see such a
thing in the reference section.
Too much synchronization and buffering.
The pipeline has the following requests:
A => B => C
If I were to allow replies to request B, the design would be more complex,
increasing the difficult to implement alternative backends and everyone's
life. Also, the reply to request B cannot actually be sent while the reply
to request A is on hold. This means that the whole response to request B
would need buffering and if response B is a video live stream, this means
an implicit (the user cannot know) crash after an unfortunate memory
exhaustion. I couldn't claim a **lightweight** server if this excessive
buffering was to occur. I couldn't claim the project is appropriate for
embedded devices if I had chosen the buffering approach. It's a trade-off.
I should write about it in the "design choice" of the documentation.
HTTP/2.0 does have proper multiplexing and it can be more useful to respond
several requests without "head of line blocking".
Ok, let's accept that HTTP before 2.0 can't multiplex (not my 
experience personally, but I agree it's got big gotchas before 2.0).

You need a user facing API for Http which lets user code multiplex 
where that is available, and not multiplex when that is not 
available. In other words, identical code written against Http must 
automatically be optimal in either connection case.

You're probably going to ask how I might implement that right? I 
think my first guess is you need two completion handler layers and 
therefore two completion dispatcher reactors: the bottom layer is the 
same as ASIO's and uses the ASIO reactor, whilst the top layer which 
users of Http sees is a reactor operated by Http and is disconnected, 
though operated by, the ASIO reactor.
...
A queue socket is a concept, I don't provide one. I can look into how make
the documentation less confusing by adding a page exclusively dedicated to
explain the problem it solves. Would that work for you?
It could be this queue socket concept of yours is exactly what I just 
proposed as two completion handler and reactor layers. In either 
case, I don't think queue sockets should be a concept, I think they 
need to be the core of your library's user facing API.

If power users wish to bypass that user facing API layer for less 
hand holding and less latency, that should also be possible. But I 
think HTTP multiplexing should be assumed to be the default starting 
point for Http applications. Once HTTP 2.0 is more common, it will 
seem madness that Http's user facing API is _not_ 100% multiplexing.
...
* Your tutorial examples use "using namespace std;" at global level.
...
That's a showstopper of bad practice, and you need to purge all of
those. Same for "using namespace boost;" at global level.
The tutorials should be as small as possible to not scary the users.
But I can put a comment "you should avoid using namespace". Does it help?
using namespace is fine at local scope. It's not fine at global 
scope.

Replacing all global using namespace with local using namespace is 
all you need to do.
...
* Your reference section is one very long thin list. You might want
...
to use more horizontal space.
I'd be happy already if I knew hot to remove the TOC from the reference
page. It'd be much more clean already.
Personally in AFIO I hack the CSS to make the list into columns. ASIO 
employs a table. There are many solutions.
...
* I personally found the design rationale not useful. I specifically
...
want to know:
1. Why did Http choose ASIO as my base over leading competitor X
and Y?
  2. What sacrifices occurred due to that choice? i.e. things I'd
love weren't the case.
I assume I don't need to reply these questions during the review, given the
reviewers may already be experienced on the answers.
Well ... I'm not sure that's actually the case.

It's not we don't have opinions on the answers, it's whether our 
answers are different to your answers. You'll see this during the 
AFIO review later this month - I have some really weird opinions no 
one else shares like choosing my own custom lightweight monadic 
future-promise implementation over async_result. That will be the 
source of much criticism I'm sure.
...
3. What alternatives may be better suited for the user examining
...
Http for suitability and why? i.e. make your documentation useful to
people with a problem to solve, not just a howto manual.
I don't quite understand this point.
What I'm really asking is for details of when before starting Http 
you reviewed the options available and the designs they used, and 
exactly why you chose the design choices in Http you did with 
reference and comparision to prior art.

Like a literature review, but for programming libraries. Does this 
make sense? It's partially for us to help us understand your choices, 
but also to help those examining Http to see if it's useful to them 
to figure out the best solution to their problem (which may be to use 
an alternative to Http).

Documentation which is useful over documentation which is reference.
...
* Benchmarks, especially latency ones which are important for many
...
uses of HTTP are missing. I'd particularly like to know confidence
intervals on the statistical distribution so I know worst case
outcomes.
These values will change when I replace the parser used now. Also, they can
change based on the buffer sizes and parser options, which the user is/will
be able to change.
Nevertheless, I do want to do some serious benchmarking, specially because
it'll give me a reference point when I write my own parser and want to
compare performance speedup. I can do some poor benchmark this weekend and
show the results.
That would be interesting, but don't go crazy on it.

Also, please do enable the undefined behaviour sanitiser permanently 
including in release mode and with full stack smashing protection 
before running benchmarks. Anything parsing malicious input should 
have all those turned on where available.
...
I hope all the above isn't too disheartening Vinícius.
Not at all. I think you're the second person to dedicate more time
reviewing my work and this isn't disheartening at all. Just that your
feedback is usually around tooling and not the design itself (everybody has
its preferences).
That's because I like the design a lot and have little bad to say 
about it.

I've also been subscribed to your github since the beginning, and I 
have been reviewing your code on and off as you wrote it. You're a 
talented programmer.
...
...
Jamfile for the main library build,
Consider this done. I won't integrate into Boost without removing CMake
first. This was already partially stated in the Bjorn's announcement (but
somehow implicit, to be fair).
I have no problem at all with dual build systems if you're happy 
keeping both maintained.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/

Re: [boost] [asio-users] [http] Formal review of Boost.Http

Niall Douglas