2015-08-07 12:53 GMT-03:00 Niall Douglas
I do not believe Http is ready for peer review for these reasons, and I personally would urge you withdraw it until you have fixed these issues and then come back to us. If you do not withdraw it, I will vote for rejection.
Can I persuade you to review the Boost.Http design? I'd reject myself if you find any fundamental flaw on the design and even if you're going to reject the library because the lack of tooling (not directly related to design or documentation, which are the most important items on any library), it'd be useful to know what needs to be addressed on the design before I submit again (in the case the submission is rejected). 1. You require a C++ 11 compiler and are very much an extension of
ASIO throughout, and then don't take advantage of ASIO's extensive internal C++ 11 vs Boost switching infrastructure.
It's not my intention to make Boost.Http work with C++. I use Boost as dependency and I'm not willing to change that (unless the pieces I use from Boost enter on the C++ standard). About C++03 support, it's not required for Boost acceptance, so I can add later. I want to increase tests coverage a lot before porting code to C++03, so I don't introduce accidental bugs in the process. In the case the abstractions I need enter on the C++ standard, maybe your APIbind will help me. On the second page
of your documentation you state:
"std::error_code, for instance, cannot be transparently replaced by boost::system::error_code within ASIO-using code"
This is utterly untrue, and I told you that several months ago. ASIO itself maps some error_code implementation into namespace asio - go look at its config.hpp. Therefore you simply use whatever error_code ASIO itself uses, using the same internal STL abstraction infrastructure ASIO uses. The same goes for every other instance where you are using Boost dependencies.
I tried to find such macros on Asio and I couldn't find any for error_code: http://www.boost.org/doc/libs/1_58_0/doc/html/boost_asio/overview/cpp2011.ht... Now that you mentioned config.hpp, I used my grep/find skills and I found: BOOST_ASIO_HAS_STD_SYSTEM_ERROR But I just don't find any reference to BOOST_ASIO_HAS_STD_SYSTEM_ERROR outside of config.hpp. Even if it was used, it is undocumented, which means it is an implementation detail that won't be guaranteed to be there forever (accidental behaviour versus a documented guarantee). I said several months ago that you should properly finish the ASIO
integration by ripping out all your current hard coded usage of Boost and using ASIO's own STL abstraction layer, and thereby making Http identical in supported platforms and configurations as ASIO (i.e. standalone C++ 11 or with-Boost) reusing the same ASIO configuration macros. I then suggested it would be a very short step to port Http to C++ 03 as you are not using anything which particularly demands C++ 11 outside the examples and tests and which ASIO hasn't abstracted for you.
Http should be a model EXTENSION of ASIO in the cleanest possible sense in my opinion. You're not far from achieving it, and I really think you need to finish it.
Delay acceptance of the library will only delay its usage. Design of core abstractions is finished and I fail to see why its acceptance should be delayed while I improve implementation details. Standalone Boost.Http isn't as exciting as a WebSocket implementation. There is a roadmap of the features: https://boostgsoc14.github.io/boost.http/design_choices/roadmap.html C++03 is on the roadmap, but standalone Boost.Http is not. 2. You claim Boost.Build doesn't work outside the Boost tree. This is
untrue - Boost.Build doesn't need Boost. I don't mind cmake for the unit testing alone as a temporary stopgap until you bite the Boost.Build bullet before it enters Boost, but I draw the line at requiring cmake for normal builds. Before presenting for review normal library builds cannot use cmake OR should be header only like Hana.
CMake will be replaced by Boost.Build before any integration. 3. ... which leads me to asking why is Http not a header only library
like ASIO? That would eliminate your Boost.Build requirement in the strictest sense. Standalone ASIO doesn't need Boost.Build. Neither should Http.
You mention in the FAQ the reason why not header only is mainly due to the use of a third party HTTP C parser. It could be insufficient familiarity on my part with the vaguaries of parsing HTTP, but it has never struck me as difficult to parse - you don't even need to tokenise, and indeed my first instinct would be it is better to NOT tokenise HTTP as that is both slower and opens a wider attack surface to a hand written "stupid" parser.
(BTW static custom error code categories work just fine header only if correctly set up. Indeed Boost.System can be configured to be header only)
This parser is well tested and largely used. I want to increase test coverage a lot before replacing the parser with one of my own. HTTP spec used to be very confusing before RFC7230. Just now it's easy to implement all corner-cases. Previously, you needed a lot of experience. 4. Http is the first line of parsing code in a world full of
malicious exploitation - just this week all Android devices were made powned with a single malformed MMS message causing a buffer overflow. Anything parsing untrusted input needs to be regularly input fuzz tested period, and very especially Http. Until Http is running at least one of the below fuzz tools (american fuzzy lop or LLVM libfuzzer) at least weekly on a CI it is not safe to allow into production code.
https://svn.boost.org/trac/boost/wiki/BestPracticeHandbook#a7.SAFETY:S tronglyconsideranightlyorweeklyinputfuzzautomatedtestifyourlibraryisab letoacceptuntrustedinput
That's why this first implementation uses a proven-parser that is very well tested. All tests from the Boost.Http socket will run with varying buffer sizes (from 1 to 2047). Of course I do enjoy tests and I'm constantly implementing more (the last experiment was using sanitizers). Until you address a minimum of items 1 and 4, I am very sorry but I
must vote for rejection.
Other notes:
* I don't understand why you cannot issue more than one async read at a time nor async write at a time. In something like pipelined HTTP that seems like a very common pattern. I smell a potential design flaw, and while the docs mention a "queue_socket" I can't see such a thing in the reference section.
Too much synchronization and buffering. The pipeline has the following requests: A => B => C If I were to allow replies to request B, the design would be more complex, increasing the difficult to implement alternative backends and everyone's life. Also, the reply to request B cannot actually be sent while the reply to request A is on hold. This means that the whole response to request B would need buffering and if response B is a video live stream, this means an implicit (the user cannot know) crash after an unfortunate memory exhaustion. I couldn't claim a **lightweight** server if this excessive buffering was to occur. I couldn't claim the project is appropriate for embedded devices if I had chosen the buffering approach. It's a trade-off. I should write about it in the "design choice" of the documentation. HTTP/2.0 does have proper multiplexing and it can be more useful to respond several requests without "head of line blocking". A queue socket is a concept, I don't provide one. I can look into how make the documentation less confusing by adding a page exclusively dedicated to explain the problem it solves. Would that work for you? * Your tutorial examples use "using namespace std;" at global level.
That's a showstopper of bad practice, and you need to purge all of those. Same for "using namespace boost;" at global level.
The tutorials should be as small as possible to not scary the users. But I can put a comment "you should avoid using namespace". Does it help? * Your reference section is one very long thin list. You might want
to use more horizontal space.
I'd be happy already if I knew hot to remove the TOC from the reference page. It'd be much more clean already. * I personally found the design rationale not useful. I specifically
want to know:
1. Why did Http choose ASIO as my base over leading competitor X and Y? 2. What sacrifices occurred due to that choice? i.e. things I'd love weren't the case.
I assume I don't need to reply these questions during the review, given the reviewers may already be experienced on the answers. However, these are indeed good questions that I could add to the "design choices" page. Thank you. 3. What alternatives may be better suited for the user examining
Http for suitability and why? i.e. make your documentation useful to people with a problem to solve, not just a howto manual.
I don't quite understand this point. * Benchmarks, especially latency ones which are important for many
uses of HTTP are missing. I'd particularly like to know confidence intervals on the statistical distribution so I know worst case outcomes.
These values will change when I replace the parser used now. Also, they can change based on the buffer sizes and parser options, which the user is/will be able to change. Nevertheless, I do want to do some serious benchmarking, specially because it'll give me a reference point when I write my own parser and want to compare performance speedup. I can do some poor benchmark this weekend and show the results. * Personally speaking, I'd like if in your tutorial you took ten
self-contained steps in building a simple HTTP fileserver, each with a finished working program at the end of each section. Like the ASIO tutorial does. In particular, I'd like to see layers which add on etags and other of the fancy facilities in stages.
That's a good idea, thank you. I won't replace the current tutorial, though, as it teaches the main players on HTTP. I can, however, add a second tutorial showing how to implement a **simple** file server. * I would personally worry about investing on developing code based
on Http until you have HTTP 2.0 support in there, only because I would suspect you might need to break the API a bit to get the all-binary format of HTTP 2.0 working as it's such a change from HTTP 1.0. It may merely be paranoia, but I suspect I wouldn't be alone on this silent assumption.
You're right about being paranoid here. I was too. But I've spent some time researching HTTP/2.0 and I got confident about the current design. But you could vote for conditional acceptance based on HTTP/2.0 and I'd think it is fair. * You're still missing CI testing with valgrind, thread sanitiser,
coveralls.io coverage testing, etc etc all the stuff from https://svn.boost.org/trac/boost/wiki/BestPracticeHandbook.
I already started the experiments with the sanitizers, but I got a problem with Boost Test and the experiment was put on hold. I'll come back to it later. I hope all the above isn't too disheartening Vinícius. Not at all. I think you're the second person to dedicate more time reviewing my work and this isn't disheartening at all. Just that your feedback is usually around tooling and not the design itself (everybody has its preferences). If you took
six weeks extra to finish the port of Http to ASIO proper,
I disagree about this point. Boost.Http is properly using Asio. Just because it won't support standalone Asio, it doesn't mean it's improperly using Asio. added a
Jamfile for the main library build,
Consider this done. I won't integrate into Boost without removing CMake first. This was already partially stated in the Bjorn's announcement (but somehow implicit, to be fair). -- Vinícius dos Santos Oliveira https://about.me/vinipsmaker