On 7 Aug 2015 at 19:29, Vinícius dos Santos Oliveira wrote:
I do not believe Http is ready for peer review for these reasons, and I personally would urge you withdraw it until you have fixed these issues and then come back to us. If you do not withdraw it, I will vote for rejection.
Can I persuade you to review the Boost.Http design? I'd reject myself if you find any fundamental flaw on the design and even if you're going to reject the library because the lack of tooling (not directly related to design or documentation, which are the most important items on any library), it'd be useful to know what needs to be addressed on the design before I submit again (in the case the submission is rejected).
I think my earlier email was poorly phrased now I've slept on it. I made my objections seem like they were about portability, when they were really about design and intent of Http. What I should have said was this: Standalone ASIO is the reference Networking TS implementation for standardisation. Http is currently a set of extensions to Boost.ASIO specifically when it *should* be a set of extensions to the reference Networking TS implementation for C++ standardisation. This is why your current design and architecture is flawed, but luckily there isn't too much work to fix this by finishing the port of Http to the Networking TS. The Internet of Things tends to involve lots of small somewhat underpowered internet devices where C++ is likely going to play a much more important role than historically in embedded systems. The chances are that any Networking TS in C++ will be very extensively used in such IoT applications. If there were also a high quality set of extensions adding HTTP support to the Networking TS, I think your chances are good that your Http library would be seriously considered by WG21 for standardisation. *This* is why I very strongly believe that Http needs to cleave itself tightly to the Networking TS reference implementation, and not to Boost.ASIO. Moreover, if you make Http an extension of the Networking TS, you get Boost support with that for free - Http becomes dual use.
About C++03 support, it's not required for Boost acceptance, so I can add later. I want to increase tests coverage a lot before porting code to C++03, so I don't introduce accidental bugs in the process.
As a strong advocate for C++ 11 over 03 I can understand. However, IoT developers tend to be stuck on some pretty ancient toolsets for longer than others. 03 support I think would greatly increase your potential userbase.
In the case the abstractions I need enter on the C++ standard, maybe your APIbind will help me.
I actually would advise you to choose ASIO's STL abstraction machinery over my own alternative APIBind in this instance.
But I just don't find any reference to BOOST_ASIO_HAS_STD_SYSTEM_ERROR outside of config.hpp. Even if it was used, it is undocumented, which means it is an implementation detail that won't be guaranteed to be there forever (accidental behaviour versus a documented guarantee).
I'm pretty sure that implementation detail won't change in a hurry. STL abstraction machinery changes as quickly as a STL does i.e. every decade or so.
https://svn.boost.org/trac/boost/wiki/BestPracticeHandbook#a7.SAFETY:S tronglyconsideranightlyorweeklyinputfuzzautomatedtestifyourlibraryisab letoacceptuntrustedinput
That's why this first implementation uses a proven-parser that is very well tested.
The buck always stops with the top most layer, not internally used third party libraries. You NEED to fuzz test untrusted inputs when parsing HTTP. For all you know, your STL implementation could be the thing with the overflow bug, or the way in which you use it.
* I don't understand why you cannot issue more than one async read at a time nor async write at a time. In something like pipelined HTTP that seems like a very common pattern. I smell a potential design flaw, and while the docs mention a "queue_socket" I can't see such a thing in the reference section.
Too much synchronization and buffering.
The pipeline has the following requests:
A => B => C
If I were to allow replies to request B, the design would be more complex, increasing the difficult to implement alternative backends and everyone's life. Also, the reply to request B cannot actually be sent while the reply to request A is on hold. This means that the whole response to request B would need buffering and if response B is a video live stream, this means an implicit (the user cannot know) crash after an unfortunate memory exhaustion. I couldn't claim a **lightweight** server if this excessive buffering was to occur. I couldn't claim the project is appropriate for embedded devices if I had chosen the buffering approach. It's a trade-off. I should write about it in the "design choice" of the documentation.
HTTP/2.0 does have proper multiplexing and it can be more useful to respond several requests without "head of line blocking".
Ok, let's accept that HTTP before 2.0 can't multiplex (not my experience personally, but I agree it's got big gotchas before 2.0). You need a user facing API for Http which lets user code multiplex where that is available, and not multiplex when that is not available. In other words, identical code written against Http must automatically be optimal in either connection case. You're probably going to ask how I might implement that right? I think my first guess is you need two completion handler layers and therefore two completion dispatcher reactors: the bottom layer is the same as ASIO's and uses the ASIO reactor, whilst the top layer which users of Http sees is a reactor operated by Http and is disconnected, though operated by, the ASIO reactor.
A queue socket is a concept, I don't provide one. I can look into how make the documentation less confusing by adding a page exclusively dedicated to explain the problem it solves. Would that work for you?
It could be this queue socket concept of yours is exactly what I just proposed as two completion handler and reactor layers. In either case, I don't think queue sockets should be a concept, I think they need to be the core of your library's user facing API. If power users wish to bypass that user facing API layer for less hand holding and less latency, that should also be possible. But I think HTTP multiplexing should be assumed to be the default starting point for Http applications. Once HTTP 2.0 is more common, it will seem madness that Http's user facing API is _not_ 100% multiplexing.
* Your tutorial examples use "using namespace std;" at global level.
That's a showstopper of bad practice, and you need to purge all of those. Same for "using namespace boost;" at global level.
The tutorials should be as small as possible to not scary the users.
But I can put a comment "you should avoid using namespace". Does it help?
using namespace is fine at local scope. It's not fine at global scope. Replacing all global using namespace with local using namespace is all you need to do.
* Your reference section is one very long thin list. You might want
to use more horizontal space.
I'd be happy already if I knew hot to remove the TOC from the reference page. It'd be much more clean already.
Personally in AFIO I hack the CSS to make the list into columns. ASIO employs a table. There are many solutions.
* I personally found the design rationale not useful. I specifically
want to know:
1. Why did Http choose ASIO as my base over leading competitor X and Y? 2. What sacrifices occurred due to that choice? i.e. things I'd love weren't the case.
I assume I don't need to reply these questions during the review, given the reviewers may already be experienced on the answers.
Well ... I'm not sure that's actually the case. It's not we don't have opinions on the answers, it's whether our answers are different to your answers. You'll see this during the AFIO review later this month - I have some really weird opinions no one else shares like choosing my own custom lightweight monadic future-promise implementation over async_result. That will be the source of much criticism I'm sure.
3. What alternatives may be better suited for the user examining
Http for suitability and why? i.e. make your documentation useful to people with a problem to solve, not just a howto manual.
I don't quite understand this point.
What I'm really asking is for details of when before starting Http you reviewed the options available and the designs they used, and exactly why you chose the design choices in Http you did with reference and comparision to prior art. Like a literature review, but for programming libraries. Does this make sense? It's partially for us to help us understand your choices, but also to help those examining Http to see if it's useful to them to figure out the best solution to their problem (which may be to use an alternative to Http). Documentation which is useful over documentation which is reference.
* Benchmarks, especially latency ones which are important for many
uses of HTTP are missing. I'd particularly like to know confidence intervals on the statistical distribution so I know worst case outcomes.
These values will change when I replace the parser used now. Also, they can change based on the buffer sizes and parser options, which the user is/will be able to change.
Nevertheless, I do want to do some serious benchmarking, specially because it'll give me a reference point when I write my own parser and want to compare performance speedup. I can do some poor benchmark this weekend and show the results.
That would be interesting, but don't go crazy on it. Also, please do enable the undefined behaviour sanitiser permanently including in release mode and with full stack smashing protection before running benchmarks. Anything parsing malicious input should have all those turned on where available.
I hope all the above isn't too disheartening Vinícius.
Not at all. I think you're the second person to dedicate more time reviewing my work and this isn't disheartening at all. Just that your feedback is usually around tooling and not the design itself (everybody has its preferences).
That's because I like the design a lot and have little bad to say about it. I've also been subscribed to your github since the beginning, and I have been reviewing your code on and off as you wrote it. You're a talented programmer.
Jamfile for the main library build,
Consider this done. I won't integrate into Boost without removing CMake first. This was already partially stated in the Bjorn's announcement (but somehow implicit, to be fair).
I have no problem at all with dual build systems if you're happy keeping both maintained. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/