Boost.Beast includes CRLF in HTTP fields
Hello, It looks like HTTP response headers are saved with trailing CRLF in the message upon receiption after (async_)read. Since it usually does not add additional valuable information, would it be possible to remove them in next version of Boost.Beast? It will be more convenient as you don't need to trim them when you want to capture the field value. Regards, -- David
On Fri, Jul 6, 2018 at 3:45 AM, David Demelier via Boost-users
It looks like HTTP response headers are saved with trailing CRLF in the message upon receiption after (async_)read.
Yes, beast::http::basic_fields stores each field/value pair in its serialized format, which includes the trailing CRLF and also the colon and space. It is done this way so that beast::http::serializer requires no memory allocations to perform its function.
Since it usually does not add additional valuable information, would it be possible to remove them in next version of Boost.Beast?
No
It will be more convenient as you don't need to trim them when you want to capture the field value.
You shouldn't have to do any trimming. basic_fields member functions and iterator value_type which return field values do so using a string_view, whose length excludes the trailing CRLF and also any colon and space. Callers should not assume the string is null-terminated, and work with the interface provided by string_view. See: https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/ref/boost__b... https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/ref/boost__b... https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/ref/boost__b... Thanks
On Fri, 2018-07-06 at 08:31 -0700, Vinnie Falco via Boost-users wrote:
You shouldn't have to do any trimming. basic_fields member functions and iterator value_type which return field values do so using a string_view, whose length excludes the trailing CRLF and also any colon and space. Callers should not assume the string is null-terminated, and work with the interface provided by string_view.
Ah yes, I didn't see the basic_fields' value_type has dedicated member functions, I was blindly doing: cout << foo[field::location]; Thanks! -- David
On Sat, Jul 7, 2018 at 1:19 AM, David Demelier via Boost-users
I was blindly doing:
cout << foo[field::location];
If `foo` is an instance of http::basic_fields, then the expression
above should not produce the CRLF sequence. It should output only the
value of the Location field.
On Sat, Jul 7, 2018 at 1:43 AM, Richard Hodges via Boost-users
Out of interest, why do these functions return a const string_view?
To prevent assignments such as: foo[field::location] = "localhost:8080/index.html"; The statement above has no effect, the `const` annotation prevents it from compiling.
On Fri, 6 Jul 2018 at 17:32, Vinnie Falco via Boost-users < boost-users@lists.boost.org> wrote:
On Fri, Jul 6, 2018 at 3:45 AM, David Demelier via Boost-users
wrote: It looks like HTTP response headers are saved with trailing CRLF in the message upon receiption after (async_)read.
Yes, beast::http::basic_fields stores each field/value pair in its serialized format, which includes the trailing CRLF and also the colon and space. It is done this way so that beast::http::serializer requires no memory allocations to perform its function.
Since it usually does not add additional valuable information, would it be possible to remove them in next version of Boost.Beast?
No
It will be more convenient as you don't need to trim them when you want to capture the field value.
You shouldn't have to do any trimming. basic_fields member functions and iterator value_type which return field values do so using a string_view, whose length excludes the trailing CRLF and also any colon and space. Callers should not assume the string is null-terminated, and work with the interface provided by string_view.
See:
< https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/ref/boost__b...
< https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/ref/boost__b...
< https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/ref/boost__b...
Out of interest, why do these functions return a const string_view?
Thanks _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
On Fri, 2018-07-06 at 08:31 -0700, Vinnie Falco via Boost-users wrote:
It will be more convenient as you don't need to trim them when you want to capture the field value.
You shouldn't have to do any trimming. basic_fields member functions and iterator value_type which return field values do so using a string_view, whose length excludes the trailing CRLF and also any colon and space. Callers should not assume the string is null-terminated, and work with the interface provided by string_view.
Okay, maybe I'm stupid or I don't understand something. I've done a simple request to google.fr and the http::response[field::location] still has CRLF even with the following code using value_type's value() function: if (res_.result() == status::moved_permanently) { const auto it = res_.find(field::location); if (it != res_.end()) cout << "new location: [" << it->value << "]" << std::endl; } The res_ object is a http::responsehttp::string_body, And once printed to the console I have: new location: [http://www.google.fr/ ] May I miss something from your original answer? Regards, -- David
On Sat, Jul 7, 2018 at 8:11 AM, David Demelier via Boost-users
The res_ object is a http::responsehttp::string_body, And once printed to the console I have:
new location: [http://www.google.fr/ ]
Hmm...that's not what I'm seeing. I added this function to
test/beast/http/fields.cpp:
void
testValue()
{
request
On Sat, 2018-07-07 at 08:20 -0700, Vinnie Falco via Boost-users wrote:
On Sat, Jul 7, 2018 at 8:11 AM, David Demelier via Boost-users
wrote: The res_ object is a http::responsehttp::string_body, And once printed to the console I have:
new location: [http://www.google.fr/ ]
Hmm...that's not what I'm seeing. I added this function to test/beast/http/fields.cpp:
void testValue() { request
req; req.insert(field::location, "http://www.google.fr/"); auto it = req.find(field::location); if(it != req.end()) log << "[" << it->value() << "]"; } The output is:
beast.http.fields [http://www.google.fr/]25.6s, 1 suite, 1 case, 334 tests total, 0 failures The program '[0x6BF4] tests-beast-http.exe' has exited with code 0 (0x0).
If the CRLF was being included in the value, then none of these tests would pass, as the equalities would evaluate to false:
My bad, I apologize I wrongly converted the boost::string_view to std::string using the .data() member function and I just realized it's not null terminated... That's why my string was filled with random characters. Sorry for the noise! I should search how people convert boost::string_view to std::string in a convenient manner instead ;) Regards, -- David
On Sat, Jul 7, 2018 at 8:34 AM, David Demelier via Boost-users
I wrongly converted the boost::string_view to std::string using the .data() member function and I just realized it's not null terminated...
Ah! Glad to see the mystery is solved. However, you would be better off working directly with the string_view instead of constructing a string, to avoid an unnecessary dynamic memory allocation. Regards
Le sam. 7 juil. 2018 à 11:34, David Demelier via Boost-users < boost-users@lists.boost.org> a écrit :
On Sat, 2018-07-07 at 08:20 -0700, Vinnie Falco via Boost-users wrote:
On Sat, Jul 7, 2018 at 8:11 AM, David Demelier via Boost-users
wrote: The res_ object is a http::responsehttp::string_body, And once printed to the console I have:
new location: [http://www.google.fr/ ]
Hmm...that's not what I'm seeing. I added this function to test/beast/http/fields.cpp:
void testValue() { request
req; req.insert(field::location, "http://www.google.fr/"); auto it = req.find(field::location); if(it != req.end()) log << "[" << it->value() << "]"; } The output is:
beast.http.fields [http://www.google.fr/]25.6s, 1 suite, 1 case, 334 tests total, 0 failures The program '[0x6BF4] tests-beast-http.exe' has exited with code 0 (0x0).
If the CRLF was being included in the value, then none of these tests would pass, as the equalities would evaluate to false:
My bad, I apologize I wrongly converted the boost::string_view to std::string using the .data() member function and I just realized it's not null terminated... That's why my string was filled with random characters.
Why do you use data() without using size()/length() ? with string_view this is a recipe for hard to find bugs. At my job I've seen same behavior, and I am trying to explain to people that data() in std::string & std::string_view is mostly for C interfaces.
Sorry for the noise!
I should search how people convert boost::string_view to std::string in a convenient manner instead ;)
Regards,
-- David _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
-- Daniel «Il faut imaginer Sisyphe heureux» Albert Camus
On Sat, 2018-07-07 at 13:48 -0400, Daniel Anderson wrote:
Why do you use data() without using size()/length() ?
with string_view this is a recipe for hard to find bugs.
Because at that time I didn't know that it was not null terminated and I needed a way to convert the boost::string_view to std::string because I need to make a copy of the Beast's result. Regards -- David
On Sun, Jul 8, 2018 at 12:47 AM, David Demelier via Boost-users
I need to make a copy of the Beast's result.
What if there was a way to "detach" the field/value element from the container so that it could outlive the message? Would that be sufficient to allow you to avoid making a copy? Thanks
participants (4)
-
Daniel Anderson
-
David Demelier
-
Richard Hodges
-
Vinnie Falco