On Fri, Aug 7, 2015 at 3:08 AM, Bjorn Reese
Dear Boost and ASIO communities.
The formal review of VinÃcius dos Santos Oliveira's Boost.Http library starts today, August 7th and ends on Sunday August 16th.
Boost.Http is designed for various forms of modern HTTP interaction, from normal HTTP request, over HTTP chunking and pipelining, to upgrading to other web protocols like WebSocket. This library builds on top of Boost.ASIO, and follows the threading model of ASIO.
The two basic building-blocks are http::socket, which is socket that talks HTTP, and http::message with contains HTTP meta-data and body information. You can use these building-blocks to build a HTTP server that fits your exact needs; for instance, an embedded HTTP server for a ReST API. Boost.Http comes with a light-weight HTTP server and a static file server.
Currently, Boost.Http is limited to server-side interaction, but the design principles used extends to client-side as well.
Boost.Http was originally developed as part of GSoC 2014 and VinÃcius has continued to develop and improve the library since then.
The documentation can be found here:
http://boostgsoc14.github.io/boost.http/
The source code can be found here:
https://github.com/BoostGSoC14/boost.http
The source code is build using CMake, but Boost.Build is in the pipeline (already done for documentation.)
Please answer the following questions:
1. Should Boost.Http be accepted into Boost? Please state all conditions for acceptance explicity.
I do not think Boost.Http should be accepted in its current state. My main reasons (read portions below for expanded comments/thoughts): (1) Incomplete design - There is no built-in mechanism for writing out client headers, which means there is no client support. (2) Unproven design - A low-level abstraction for a "HttpSocket" is provided, but only two higher-order methods for file serving are provided. Why aren't there higher-order functions for more common use cases? (3) More considerations for design choices (or documentation wording) should be considered - All data reads / writes for the message body require a copy to be made. (4) More time is needed to think about the design patterns (this follows from a lack of (2)) - why is `async_write_response_continue` a requirement for the http::ServerSocket concept? (5) Documentation can be better organized and lacks critical information in a few areas. 2. What is your evaluation of the design?
- There is no built-in mechanism for writing out client headers, which means there is no client support. The provided rationale is that it took a long-time to design the server portion, and the client side will take equally as long. I don't see this as a valid argument; several current library authors have indicated that providing a "boost-ready" library is an incredible amount of work. I would expect a boost::http library to be more feature complete before being accepted. - An abstraction for a "HttpSocket" is provided, but only a few higher-order functions are provided (for file serving). The current lower-level abstractions _should_ allow for useful utilities to be written, but I would like to see more functionality implemented before boost acceptance, so the lower-level design is more "proven". - The http::ServerSocket concept defines methods for writing response headers, writing the body, writing trailers, and writing an entire response message at once. This concept provides everything necessary for writing a valid HTTP response. So why is `async_write_response_continue` mandated in the http::ServerSocket concept when a free function can be implemented, that uses the other concept functions? If the reasoning is performance (which appears to be the reason) - why isn't a more performant method (probably some form of pre-generation) of sending messages added? This would prevent the socket concept from continually absorbing additional special case messages as higher order functions are developed, and allow for clients to pre-generate error responses. - `async_read_request`, and `async_write_response_metadata` in the http::ServerSocket concept require an entire http::Message concept as an argument, when only the AssociativeContainer concept should be necessary for this function. - `async_write` and `async_read_some` in the http::Socket concept requires an entire http::Message concept as an argument, when only the SequenceContainer concept should be necessary for this function. - All data reads / writes for the message body require a copy to be made, because memory from a SequenceContainer concept cannot be given to ASIO/OS directly (std::liststd::uint8_t is a SequenceContainer). The message body should likely be restricted to the C++1z ContiguousContainer concept instead. I think users would prefer speed here more than flexibility. Its also easier to relax requirements after release than the opposite. - Ideally the read / write functions for the payload would have the same requirements as an asio buffer. Requiring a container prevents memory locations from other sources (such as directly from a streambuf), and prevents scatter gather i/o. I'd say this is less critical, as its something most users are unlikely to use. - `async_read_trailers` and `async_writer_trailers` in the http::Socket concept require an entire http::Message concept as an argument, when only the AssociativeContainer concept should be necessary. - I mentioned this before; client and server specific header writing/reading is a shame due to the overwhelming similarities. My earlier suggestions were criticized for being stringly typed (or having more states), but I think this was a somewhat weak criticism since the type system can be used more effectively. `async_write_header(http::response{200, "OK"}, ...)` or `async_write_header(http::request{http::request::GET, "/"}, ...)`. - It might be worth eventually relaxing the requirements for the key/values in the AssociativeMap used by the field parsing functions to a subset of string functions. begin(), end(), size(), empty(), push_back(), pop_back(), operator<(), a hash implementation, and contigous data. This way a quick and dirty small string optimization could be written for field handling. - What is the purpose of the standalone `async_write_response` and `async_write_response_metadata` functions? They only forward to functions with the same name in the http::ServerSocket concept, is this cruft? - I like that all of the asynchronous callbacks take consistent parameters. asio::coroutine works nicely.
3. What is your evaluation of the implementation?
Overall the quality is good. I didn't go read the code thoroughly, but I did notice a few things - - The current `async_write_response` implementation assumes contiguous elements on the SequenceContainer concept, which is incorrect. - A std::ostringstream is used, when spirit::karma would be faster and have less memory allocations (minor). - The current `async_read_some` implementation has an error code path that can initiate a write. This is not documented anywhere, and effectively issuing an `async_read_some` command blocks both the send and receive channels of the underlying ASIO stream. This problem is likely to occur in a few other spots, but I didn't check. - There are some error code paths in async operations that modify the entire message object (clear() is invoked on all parts of the message). Seems unnecessary, and is at least undocumented. 4. What is your evaluation of the documentation?
Needs better organization, and someone to edit the rationale section in particular. - The reference section is organized strangely. Every class, concept, function, and file is listed under the same header, and then its broken into sections at the bottom. I didn't notice the bottom portion at first, and it should replace the top portion that lacks organization. - How do the various write functions manipulate the headers? What fields are added, if any? This is partially mentioned at the bottom of the http::ServerSocket page, but I saw "content-length: 22" added to my message, and this was never explicitly stated (although obviously assumed). This should likely be explicitly specified in the concept itself, AND the location should be specified (i.e. these fields are explicitly appended). Should also document that some fields, such as content-length cannot be listed twice according to the specification, while comma de-limited fields can be listed multiple times as a valid form of appension. I.e. "content-encoding: gzip, sdch\r\n" and "content-encoding: gzip\r\ncontent-encoding: sdch\r\n" indicate equivalent content encodings applied to the data. Should probably mention that Transfer-Encoding is a comma delimited list too. - The `key_type` and `mapped_type` for the headers portion of the message::Concept indicate that they must meet the "String concept". I don't recall seeing this concept being defined in C++ or boost, where does it come from? If its from C++1z, it might be to merge the behavior of string_ref and basic_string, so that concept would be unlikely to require storage like a container. Might want to re-think the concept requirements here, or state that only a conforming std::basic_string implementation is acceptable for now. - Mentioned previously - the documentation for basic_socket should state when a function is using the read or write portion of the stream. Currently some functions use both unexpectedly, such as async_read_some.
5. What is your evaluation of the potential usefulness of the library?
This library has multiple potentially use cases, especially if a good low-level design is provided. Additional higher-level functions would be nice, because many (perhaps most) users have similar requirements, don't need extreme performance, and just want something simple and relatively efficient. Hopefully work will be continued on this library, regardless of the outcome of this review process. 6. Did you try to use the library? With what compiler? Did you have
any problems?
I ran all of the tests on OSX, which had no complaints. I also wrote a quick asio::coroutine implementation, and viewed the data passing by in wireshark for good measure.
7. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I spent a good amount of time reading the documentation, and implementation. A little less testing / running code.
8. Are you knowledgeable about the problem domain?
I have fairly extensive knowledge of protocols, and experience providing implementations for protocols (including HTTP). I also have a reasonable amount of ASIO experience. Lee