Dear All, This review is not very thorough - I've only looked at the docs and some small parts of the code, and read the discussion on the list. I've not tried to use the library. But I think I've seen enough to express an opinion. I've never used ASIO. I've studied it in the past, and concluded that it offers portability to platforms that I don't use but doesn't work well with other platforms that I do use. It also seems more complicated than I would have liked, perhaps in order to achieve some optimised level of performance that I don't need (or perhaps for no good reason). I have however quite often used HTTP (hasn't everyone?) and I think it's an embarrassment that std::c++ can't download something from the 'net with one or two lines of code in the way that almost every other language can. So, my first reaction is that it's a disappointment to see that a proposed HTTP library is tied to ASIO. On further investigation, however, I think this is no great loss to me as the functionality that this library provides is so limited in scope. Parsing HTTP headers, for example, is not much work. Even my own hacked-together-in-an-afternoon HTTP classes do more than this - to mention just one thing, parsing and formatting of dates in the correct format. Everyone who uses this library will have to implement this sort of thing themselves on top of it. (And many of them will implement it wrongly.) Vinnie says that he hopes that more functionality will be layered on top of this work. The danger is that the person who comes along wanting to implement the other 75% will not like Vinnie's starting point, and re-implement it all in their own style. (I know I would.) Unless, of course, Vinnie is going to offer this "other 75%" himself eventually. *That would be great*. If that is the plan, my review is "this is a fine start but please remove the ASIO dependency and come back when it's finished". But I think Vinnie has said that the library is to be reviewed "as is". In that case, my review is "just add this to ASIO and call it boost.asio.http and boost.asio.websocket, no need for a review" (assuming that the ASIO developer is happy with that). Regarding what I have seen of the implementation: - Security must be one of the most important properties of an HTTP library, and I'd really hope to find that such a library had been implemented by someone with significant experience in this area. I never cease to be amazed by the ingenuity of attackers. - I fear that there may be much premature optimisation in the parser code that I've looked at. For example: static bool is_digit(char c) { return static_cast<unsigned char>(c-'0') < 10; } Is that right? Say c='!' (ASCII 33). '0' is ASCII 48. So c-'0' is -15. But you cast that to an unsigned char, so -15 = 241, and 241 > 10, so the function returns false. So it is correct - but (1) it needs a comment to explain it, (2) it's the sort of thing that makes me worry from a security viewpoint, and (3) is it actually an improvement on the more obvious c >= '0' && c <= '9'? Well I've done a quick test and I get the same assembler for both - so why bother? (Except on close inspection actually I don't get exactly the same assembler. To get the same assembler I have to replace the constant 10 in your code with 10U, which changes the final comparison from signed to unsigned. I don't think that matters, but it again makes me wonder whether this security-critical code is ideal.) But hey, why even do c >= '0' && c <= '9' when we have std::isdigit()? But it would be much slower, right? Wrong, it's actually one instruction shorter; somehow it manages to avoid one of the zero-extension instructions. So this is not a premature optimisation, it's a premature pesimisation. (My test code is at the end of this message.) Now, is_digit is just about the simplest thing in the parser; Vinnie has also implemented his own code for parsing numbers, and then we come to parse_field. Please have a look at this (around line 666 of basic_parser.hpp). Note that it calls find_fast() near the top of that file, which uses some Intel SIMD instruction to, I think, skip forward over the header field name, and then over the field value to the CRLF. Not that there are any comments in find_fast to explain what it does, for those of us who don't know what _mm_loadu_si128 and _mm_cmpestri do. Has this code actually been benchmarked against a much simpler and more obviously-correct version, and found to have a worthwhile performance benefit? (I believe that some of the discussion has used this "optimised parser" as a justification for not making the code more generic, i.e. to work with any iterator pairs. I believe that to be flawed, since (a) I don't think this optimisation is useful, and (b) even if it were, the code could use it for any contiguous iterators, and fall back to "unoptimised" code in other cases. I think that having the "unoptimised" code present would be useful for documenting what the parser is trying to do, in any case.) Could it be that the reason why the scope of the library seems so limited is that Vinnie has spent his time fiddling with this optimisation, rather than adding useful features? My opinion: having looked at the implementation, reject. Test for is_digit: #include <cctype> bool is_digit_1(char c) { return static_cast<unsigned char>(c-'0') < 10U; } bool is_digit_2(char c) { return c >= '0' && c <= '9'; } bool is_digit_3(char c) { return std::isdigit(c); } Aarch64 assembler output from g++ 6.3.0 -O3, with the uninteresting bits removed: _Z10is_digit_1c: uxtb w0, w0 sub w0, w0, #48 uxtb w0, w0 cmp w0, 9 cset w0, ls ret _Z10is_digit_2c: uxtb w0, w0 sub w0, w0, #48 uxtb w0, w0 cmp w0, 9 cset w0, ls ret _Z10is_digit_3c: uxtb w0, w0 sub w0, w0, #48 cmp w0, 9 cset w0, ls ret Regards, Phil.
On Sun, Jul 9, 2017 at 11:45 AM, Phil Endecott via Boost
I'd really hope to find that such a library had been implemented by someone with significant experience in this area.
An off-list request was made to describe the nature of the bug discovered through Jens' fuzz tests, here is the fix: https://github.com/vinniefalco/Beast/commit/09af312f8242da1cb134fb7d5fb8149e... Thanks
On 09-07-17 20:45, Phil Endecott via Boost wrote:
(I believe that some of the discussion has used this "optimised parser" as a justification for not making the code more generic, i.e. to work with any iterator pairs. I believe that to be flawed, since (a) I don't think this optimisation is useful, and (b) even if it were, the code could use it for any contiguous iterators, and fall back to "unoptimised" code in other cases. I think that having the "unoptimised" code present would be useful for documenting what the parser is trying to do, in any case.) +1 for this sentiment.
Boost has an excellent library for fast and versatile parsing. Using it instead would at once lend a level of testing infeasible with the hand-written code here AND make sure it runs on all target architectures, as well as on any type of iterators. I think indeed premature optimization does not benefit the library - certainly in this stage of adoption.
Am 10.07.2017 um 00:22 schrieb Seth via Boost:
On 09-07-17 20:45, Phil Endecott via Boost wrote:
(I believe that some of the discussion has used this "optimised parser" as a justification for not making the code more generic, i.e. to work with any iterator pairs. I believe that to be flawed, since (a) I don't think this optimisation is useful, and (b) even if it were, the code could use it for any contiguous iterators, and fall back to "unoptimised" code in other cases. I think that having the "unoptimised" code present would be useful for documenting what the parser is trying to do, in any case.) +1 for this sentiment.
Boost has an excellent library for fast and versatile parsing. Using it instead would at once lend a level of testing infeasible with the hand-written code here AND make sure it runs on all target architectures, as well as on any type of iterators. +1 for suggesting to use a parser library for parser tasks
On Mon, Jul 10, 2017 at 1:22 AM, Seth via Boost
Boost has an excellent library for fast and versatile parsing. Using it instead would at once lend a level of testing infeasible with the hand-written code here AND make sure it runs on all target architectures, as well as on any type of iterators.
If you mean Boost.Spirit then from my experience it is not always the best tool. Not that I benchmarked that code much but I had a number of cases when it was actually easier to write the correct parser by hand than to express all the corner cases and error handling in Boost.Spirit. The resulting code looks cleaner and more easy to follow (for me, at least). This is most often the case for micro-parsers. And obviously, hand-made parsers are almost always faster to compile. As an example, settings parsers in Boost.Log were initially implemented in Boost.Spirit. There was a lot of criticism at that time that Boost.Log was too heavy to compile and the binaries were huge. I had to rewrite most of the parsers by hand to alleviate the problem. I didn't review Beast implementation and I can't tell whether Vinnie's choice was justified, but I can understand why he could make that choice.
I think indeed premature optimization does not benefit the library - certainly in this stage of adoption.
I can argue that there is no such thing as too much performance. There's a balance with maintainability and other concerns, but as long as other requirements are satisfied, more performance is always welcome. And in many areas performance is the key point, so having faster code would certainly help the adoption. I'd like to support Vinnie in his willingness to write and maintain code with the aim at performance. (Of course, as long as the performance is actually improved according to benchmarks.)
On Tue, Jul 11, 2017 at 5:08 PM, Andrey Semashev via Boost
I can argue that there is no such thing as too much performance. There's a balance with maintainability and other concerns, but as long as other requirements are satisfied, more performance is always welcome. And in many areas performance is the key point, so having faster code would certainly help the adoption.
I'd like to support Vinnie in his willingness to write and maintain code with the aim at performance. (Of course, as long as the performance is actually improved according to benchmarks.)
I have spent relatively little time optimizing specific areas of Beast. The parts I invested some time in were those parts which showed very obvious opportunities for non-trival improvements, when measured with a profiler. Although I am certain there are plenty more optimizations yet to be discovered, I am in no rush to do so. An area which I HAVE spent some time in, is making sure that Beast's interfaces do not inhibit future optimization from taking place. Stream and buffer algorithms are designed to allocate memory as minimally as possible. For algorithms that require memory allocation, the interface permits caller provided allocators to be passed. This extends especially to the asynchronous interfaces, which use the asio_handler_allocate, asio_handler_deallocate hooks. For example, I have developed a special allocator called "session_alloc" which wraps your completion handlers to customize the allocation hooks, and also functions as a standard Allocator for your containers and buffers. You create a new session_alloc allocator for each connection, and after a quick "warm-up" period where it learns the allocation requirements for the heaviest allocating call chain, future asynchronous operations become zero-alloc: https://github.com/vinniefalco/Beast/blob/25efdf1ba2b1f2f12781597c4a953a4989... This type of optimization is possible because of diligent adherence to Asio best practices and being mindful of customization points. Anything that Beast does which you don't like for performance reasons, you can usually replace. You can use your own Allocator with basic_fields, or implement your own Fields. You can use your own stream algorithm using basic_parser or serializer. You can derive from basic_parser and represent messages different ways, and Beast's stream read algorithms will still work for you. Beast is designed as a flexible, customizable set of tools you can use to assemble your own solution the way that you want.
On Sun, Jul 9, 2017 at 11:45 AM, Phil Endecott via Boost
Could it be that the reason why the scope of the library seems so limited is that Vinnie has spent his time fiddling with this optimisation, rather than adding useful features?
I don't find this kind of comment to be constructive or helpful in a review context. Vinnie had a concept about what the library should be and spent considerable time and effort designing it. He's repeatedly stated that he wanted something limited in scope and explained his rationale behind that. I'm sure he presented the best design he could come up with. To suggest that the limited scope is, instead, because you feel he micro-optimized something you don't think should be micro-optimized seems petty.
Test for is_digit:
#include <cctype>
bool is_digit_1(char c) { return static_cast<unsigned char>(c-'0') < 10U; }
bool is_digit_2(char c) { return c >= '0' && c <= '9'; }
bool is_digit_3(char c) { return std::isdigit(c); }
Your criticism of custom reimplementations of things like isdigit isn't entirely off the mark. Such things should be avoided almost always, and when reimplemented, there should be a clear and convincing reason. Maybe that's the case here, maybe it isn't - that's a legitimate concern and a fair one to bring up. But have you considered, for example, the fact that std::isdigit can return true for digits other than 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9? According to cppreference, "some implementations (e.g. Microsoft in 1252 codepage) may classify additional single-byte characters as digits"
Nik Bougalis wrote:
Your criticism of custom reimplementations of things like isdigit isn't entirely off the mark. Such things should be avoided almost always, and when reimplemented, there should be a clear and convincing reason. Maybe that's the case here, maybe it isn't - that's a legitimate concern and a fair one to bring up.
But have you considered, for example, the fact that std::isdigit can return true for digits other than 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9? According to cppreference, "some implementations (e.g. Microsoft in 1252 codepage) may classify additional single-byte characters as digits"
Yes, I dislike std locale-related things for this reason. If there were std::is_ascii_digit() I'd use it. I was actually surprised by the result that I got for std::isdigit, which I only tried as an afterthought. Regards, Phil.
participants (6)
-
Andrey Semashev
-
Mike Gresens
-
Nik Bougalis
-
Phil Endecott
-
Seth
-
Vinnie Falco