On Tue, Oct 12, 2021 at 11:46 PM Gavin Lambert via Boost
> it is not possible to take an arbitrary URL and round-trip it between encoded and decoded forms as a whole (even ignoring equivalent variants like %20 vs + or %-encoding more characters than strictly required).
That's right.
> Where a / occurs inside a single path-component, it must be escaped so as to not be seen as a path-component-separator. And as such, it's not possible to (reliably) pass multiple unencoded path-components as a single string.
Not exactly. If you call set_path() it treats slash as a separator, since it works with the entire path portion of the URL. If you call segments().push_back( "my/slash" ) then you get the slash percent-encoded. The library APIs are biased towards interpreting the path hierarchically but you can still treat the path as a monolithic string using set_path and set_encoded_path.
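To make the distinction concrete, here is a rough sketch in plain standard C++. It does not use Boost.URL at all: encode_segment and join_segments are hypothetical helpers that only mimic the idea of encoding per segment, and the real library may well choose a different set of characters to escape.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, NOT part of Boost.URL: percent-encode one path
// segment, escaping everything except the RFC 3986 unreserved characters.
// In particular '/' becomes "%2F", so it cannot be read as a separator.
inline std::string encode_segment(const std::string& s)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s)
    {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved)
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 15];
        }
    }
    return out;
}

// Hypothetical helper: build an encoded path from individual segments.
// Each segment is encoded on its own, then joined with a literal '/'.
inline std::string join_segments(const std::vector<std::string>& segs)
{
    std::string path;
    for (const auto& s : segs)
    {
        path += '/';
        path += encode_segment(s);
    }
    return path;
}
```

With these helpers, join_segments({"a", "my/slash"}) keeps "my/slash" as one segment with its slash escaped, whereas join_segments({"a", "my", "slash"}) produces three segments. The caller decides which one is meant by choosing how to split the input, which is the same choice as segments().push_back() versus set_path().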
> The same problems occur with query parameters and &= characters, or with ?# characters appearing anywhere.
There isn't any actual ambiguity here. set_query() treats '&' and '=' as separators. If that is not your intent, you can use params().insert() or params().push_back(), which will percent-encode these symbols. If you call set_query() with a literal string, you will get that literal string as a percent-encoded query in the final URL. If you then call query(), you will get back the original unencoded string. The one wrinkle is that "+" becomes a space on decoding, but even that can be controlled by the user: https://master.url.cpp.al/url/ref/boost__urls__pct_decode_opts.html
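The same idea for query parameters, again as a sketch in plain standard C++ rather than the library's API. encode_param and decode_param are hypothetical names, and the plus_to_space flag is only an assumption about the kind of control that the decode options give you, not their actual interface.

```cpp
#include <string>

// Hypothetical helper, NOT part of Boost.URL: percent-encode a single
// query key or value. Everything except RFC 3986 unreserved characters is
// escaped, so '&', '=' and '+' in the data cannot be read as syntax.
inline std::string encode_param(const std::string& s)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s)
    {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved)
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 15];
        }
    }
    return out;
}

// Value of one hex digit, or -1 if the character is not a hex digit.
inline int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: decode a key or value. The plus_to_space flag is an
// assumed stand-in for a user-controllable decode option.
inline std::string decode_param(const std::string& s, bool plus_to_space = true)
{
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i)
    {
        const char c = s[i];
        if (c == '%' && i + 2 < s.size() + 0 &&
            hex_value(s[i + 1]) >= 0 && hex_value(s[i + 2]) >= 0)
        {
            // Valid escape: fold the two hex digits into one byte.
            out += static_cast<char>(hex_value(s[i + 1]) * 16 + hex_value(s[i + 2]));
            i += 2;
        }
        else if (c == '+' && plus_to_space)
        {
            out += ' ';
        }
        else
        {
            out += c;
        }
    }
    return out;
}
```

Encoding a value such as "a&b=c" and then decoding it returns the original string, because the separators were escaped before they entered the query. A literal "+" in already-encoded input is where the flag matters: it decodes to a space by default and stays a "+" when the flag is false.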
> (Where the authority component is not a DNS hostname there may be issues with :@/ characters appearing there too.)
Well, there is no set_authority() function yet, only set_encoded_authority(). I haven't looked closely at it, but it would work similarly to the others. However, I'm not convinced it is necessary. If you call set_host() you can put any string you want in there and it will be correctly percent-encoded. If there is no user, password, or port, then that percent-encoded string is effectively the entire "authority", so there is already a way to set the authority to an arbitrary string.

Thanks