First of all, sorry to all members of the list about my unavailability. I
was planning to write an experimental HTTP 2.0 backend, so I could show more
confidence that the Boost.Http core I present for
review is the right abstraction.
Anyway, it looks like I took the wrong approach. I should have answered the
easiest questions first and implemented the HTTP 2.0 backend (experimental,
using an existing library and not Boost-quality) later.
2015-08-11 14:35 GMT-03:00 Niall Douglas
Where I'm really at is I think if Http is accepted you're going to either have to ditch it and reimplement atop the Networking TS as Chris folds the substantial changes WG21 will force onto ASIO into Boost.ASIO, or end up refactoring Http to cleave more closely to the Networking TS anyway.
I'm okay with that.
And then you aren't following C++ rule number #1 anymore: You only pay for what you use. That's why Asio itself doesn't solve this problem for you. You can use boost::http::basic_socket if you need to work around Asio composed operations at this level. All customization points are there for anyone.

I am finding myself unconvinced by your arguments here. What stands in the way of a two layer API design? Bottom layer is racy but lowest latency. Top layer is not racy, but adds some latency.
I think for the majority of HTTP users they just want it to work without surprises to a high default performance level. If you look at the history of the HTTP library support in Python you'll see what I mean - firstly, it's surprisingly easy to get a HTTP library API design wrong, even in a v2 refactor. And secondly that people need both a stupid-simple API and a more bare metal API *simultaneously* with HTTP, and therein lies the design gotcha.
If you want a high-level API, you're going to use coroutines. Nothing in the wild is as readable as coroutines. Coroutines are **the** solution to spaghetti code in asynchronous abstractions. Lambdas and futures will never be as readable as coroutines.

Anyway, if you want a high-level API, you're going to use coroutines, and coroutines already suspend your code until the previous operation completes. You end up not scheduling too many operations at once (fewer resources consumed) and you use the API the right way. If you don't want to use coroutines and still want a somewhat high-level API, just change the underlying socket. It's not a problem. The only problem I see here is the lack of a page documenting composed operations, given the confusion that arose on this matter.

Really, you should NOT pay for what you do NOT use. Eventually we'll have coroutines in the language, so you will be unable to even do something more efficient and will end up just using coroutines. And with the design I propose, you won't even pay for scheduling/storing multiple operations that just cannot be used right now. If the "pay for what I don't use" design was what Boost wanted, I believe Boost.Asio would have been designed differently (fair enough, Boost.Asio is a low-level library).

Now, about a real high-level API. Boost.Http is somewhat low-level, but not for the reasons given. I can provide a higher-level API and it won't change the points you're against. If you look at the Boost.Http roadmap, you'll notice where Boost.Http is really low-level (lack of a request router, form parsing, HTTP session management...). This kind of stuff can get hugely controversial and I think it's very unwise to integrate it all at once. It would be like comparing frameworks that are completely different (like Python's Django or Flask) and asking "hey, what is the **right** choice?". This kind of question is really unhelpful.
These frameworks continue to evolve, and sometimes they break API or completely new approaches arise (like the recent rise in popularity of web microframeworks). I'd rather provide really generic and flexible building blocks than state that my view of web development is correct. Not too long ago, LAMP was a very popular solution, and it assumed you would use a MySQL database, but sometimes you do not even need a database.

You should check the answer to "Why isn't a router available?" in the Boost.Http FAQ: https://boostgsoc14.github.io/boost.http/design_choices.html

It's very controversial and I need to develop a NEW approach, not import the design from some place or another. I need to reconcile current approaches (I'd like to use the word paradigm here). Not only reconcile them, but also allow some kind of collaboration. And it's C++, so it's harder. It's not harder because it's C++ itself. It's harder because the C++ community takes software development very seriously. And then, it's not just C++, it's Boost, the small group within C++ that is known among the community as the group that strives to deliver even higher quality software.

If we stick to what we need for now, I believe the correct question to focus on is "If I need to communicate HTTP messages, what is the correct approach?". That is what we need now: passing HTTP messages around. And then, the HTTP protocol may not even be involved, which is why I focused so much on allowing alternative HTTP backends.

Extremely detailed and careful requirements like the ones written for Message[1] and ServerSocket[2] are not something you'll see anyone doing. And it takes a lot of care because you need to be careful about who you're excluding. I chose not to exclude HTTP 1.0 and I've made HTTP upgrade and HTTP chunking optional features that the user must check for. I also chose not to exclude alternative backends. I also chose to allow **lightweight** implementations, so embedded devices would not be left out.
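To make the "optional features that the user must check" point concrete, here is a minimal sketch of what such a check could look like from the user's side. Everything in it is hypothetical: fake_socket, supports_streaming, write_chunk, write_end and write_whole are names invented for illustration and are not the Boost.Http interface. The only idea taken from the text above is that chunking may be unavailable (for instance with an HTTP 1.0 peer) and the caller decides what to do in that case.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a server-side HTTP channel. The names are
// illustrative only and are NOT the Boost.Http API.
struct fake_socket {
    bool chunking_supported;        // e.g. false for an HTTP 1.0 peer
    std::vector<std::string> wire;  // records what would have been sent

    bool supports_streaming() const { return chunking_supported; }
    void write_whole(const std::string& body) { wire.push_back("whole:" + body); }
    void write_chunk(const std::string& part) { wire.push_back("chunk:" + part); }
    void write_end() { wire.push_back("end"); }
};

// Streams the parts when the channel allows it; otherwise buffers them
// and sends one complete body. The decision is explicit in user code.
template <class Socket>
void send_parts(Socket& sock, const std::vector<std::string>& parts) {
    if (sock.supports_streaming()) {
        for (const auto& p : parts) sock.write_chunk(p);
        sock.write_end();
    } else {
        std::string body;
        for (const auto& p : parts) body += p;
        sock.write_whole(body);
    }
}
```

The point of the sketch is only that the fallback is visible and chosen by the application, instead of being an implicit decision made inside the library.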
cpp-netlib and pion have their own thread pools and aren't very friendly to embedded devices. I also go fully async, allowing really fine-grained control by the user. Even being so ambitious, I'd not say that the design went so low-level that it becomes unusable, as the spawn example proves[3] (163 lines, and a good part of that is Boost.Asio boilerplate, not Boost.Http).

I have some experience with HTTP libraries (one of the most important being the Tufão project[4]). This experience is what makes me capable of judging which design decisions can be a mistake. Not considering alternative HTTP backends from the beginning is one of them. Other mistakes are less obvious. The Node.js API is really tricky to get right (implicit decision on whether to use chunking or not) if you're concerned with portability among applications (there is an old, fixed Tufão bug related to just this[5]).

The rush to get high-level APIs can also become a problem because you can very easily lose the possibility of doing fine-grained adjustments. First you're fine with single-threaded, and then you want the handling split among threads. Then you want to distribute among threads not the connections, but each request-response pair, with a scheduler that is cleverer than round-robin[6]. Eventually you'll end up losing the interoperability and having to rewrite large parts of the application.

I'm very concerned about interoperability, which is why I mentioned it in the very first GSoC proposal last year, along the following lines: "In fact, there is a lot of higher-level abstractions competing with each
other, providing mostly incompatible abstractions. By not targeting this field, this library can actually become a new base for all these higher-level abstractions and allow greater interoperability among them." -- https://github.com/vinipsmaker/gsoc2014-boost#non-goals
The more I think about it, the more I believe such a "middle-level" API is underestimated. I mentioned in scattered places how much I appreciate the message-oriented approach I'm using. Now I think I should have dedicated a whole chapter to the topic. If it weren't for this message-oriented approach, the design would be much more complex. It'd be like trying to solve the problems of a high-level API (set_timeout, set_scheduler, set_handler_factory, set_allocator, set_pool, set_feature_xyz) at this level already and thinking really hard to not miss **ANY** feature that the user could possibly want to customize.

The message-oriented approach has one immediate impact: communication channels and message representations are decoupled. With this simple separation, you do not need to complicate communication channels so much. And you also gain the ability to use your own allocators, pools, data structures, non-allocating buffers and so on.

I've read some messages from people concerned that the API is too low level and yet will still allocate sometimes. All the examples I've written allocate, and the reason is that none of them puts a bound on the number of HTTP connections (there isn't a single big stack-allocated pool/buffer/...). A new "handler" is created as each new connection appears. But if you set a limit, implement your own data structures and read the documentation (it doesn't even need to be an extremely careful read), you're done. There are some gritty implementation decisions that made me allocate in some spots (use of functors, lack of dynarray...), but then I should discuss the implementation and not mix it with API design (unless you propose an interface to specify allocators).

It's not really that Boost.Http is too low level. It's more that only the least controversial building blocks are available now.
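As a rough illustration of the decoupling between communication channels and message representations described above, consider the following sketch. It is not Boost.Http code: loopback_channel, dynamic_message, fixed_message and the append member are all hypothetical names made up for this example. It only assumes that the channel writes into whatever message object the caller supplies, so the caller picks the storage strategy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical channel: it only delivers bytes into the message object
// handed in by the caller. NOT the Boost.Http API.
struct loopback_channel {
    std::string pending;  // bytes "received" from the peer

    template <class Message>
    bool read_body(Message& msg) {
        for (char c : pending)
            if (!msg.append(c)) return false;  // message ran out of room
        pending.clear();
        return true;
    }
};

// Message representation 1: heap-backed, grows as needed.
struct dynamic_message {
    std::string body;
    bool append(char c) { body.push_back(c); return true; }
    std::size_t size() const { return body.size(); }
};

// Message representation 2: fixed capacity, never allocates.
template <std::size_t N>
struct fixed_message {
    std::array<char, N> body{};
    std::size_t used = 0;
    bool append(char c) {
        if (used == N) return false;  // capacity exhausted
        body[used++] = c;
        return true;
    }
    std::size_t size() const { return used; }
};
```

The same read_body call works with both message types, so a user with a bounded number of connections could hand in fixed_message objects from a preallocated pool, while a simpler application could keep using the heap-backed one. The channel needs no set_allocator or set_pool knob for that.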
You'll see the same main players in a higher-level abstraction, and the main difference is most likely that you won't manage them yourself.

And I advocate for a Boost.Http instead of a Boost.NetExtensions because I allow alternative backends. Asio will always be involved because of the pure async nature, but you might not even use the network and instead use an alternative backend that communicates through shared memory or other means.

There are many more thoughts, but I think I'm straying too far from Niall's concern, so I'll stop here and answer the rest of the questions. If you have any questions about specific points, just raise them.

[1] https://boostgsoc14.github.io/boost.http/reference/message_concept.html
[2] https://boostgsoc14.github.io/boost.http/reference/server_socket_concept.htm...
[3] https://github.com/BoostGSoC14/boost.http/blob/0fc8dd7a594bb5ebb676d2d55621a...
[4] https://github.com/vinipsmaker/tufao
[5] https://github.com/vinipsmaker/tufao/issues/41
[6] https://en.wikipedia.org/wiki/Round-robin_scheduling

--
Vinícius dos Santos Oliveira
https://about.me/vinipsmaker