On 13/10/2021 18:56, Julien Blanc wrote:
> And then there's the issue of the '/' character in set_path if you take unencoded strings (not sure how this should be handled...)
Exactly, and that's a fundamental problem with the concept of accepting not-yet-encoded strings: URL syntax does not have one uniform encoding. Each component reserves a different set of characters, so it is not possible to take an arbitrary URL and round-trip it between encoded and decoded forms as a whole (even ignoring equivalent variants like %20 vs + or %-encoding more characters than strictly required).

Where a / occurs inside a single path component, it must be escaped so that it is not seen as a path-component separator. As such, it's not possible to (reliably) pass multiple unencoded path components as a single string. The same problem occurs with & and = characters in query parameters, or with ? and # characters appearing anywhere. (Where the authority component is not a DNS hostname, there may be issues with :, @ and / characters appearing there too.)

I think the only sane choice here is (if unencoded input is supported at all) to only provide access to unencoded values at the smallest possible unit: a single parameter or a single path component, or a collection of such, but never several of them joined into a single string. The encoder/decoder itself should still be easily accessible for arbitrary external use.
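
To make the round-trip problem concrete, here is a minimal sketch in Python. It uses only the standard urllib.parse module, not Boost.URL, and the helper names encode_path and decode_path are made up for illustration; treat it as a demonstration of the principle rather than a proposed interface.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode


def encode_path(segments):
    """Percent-encode each segment on its own, then join with '/'."""
    # safe="" makes quote() escape '/' as well, so a slash that is
    # part of a segment's data becomes %2F instead of a separator.
    return "/" + "/".join(quote(s, safe="") for s in segments)


def decode_path(path):
    """Split on the literal separators first, then decode each segment."""
    return [unquote(s) for s in path.split("/")[1:]]


# Hypothetical data: the second segment contains a '/' of its own.
segments = ["reports", "2021/10", "a b"]

encoded = encode_path(segments)

# Per-component handling survives the round trip.
round_tripped = decode_path(encoded)

# Decoding the path as one string and splitting afterwards does not:
# the escaped slash turns into a separator and a segment is split in two.
lossy = unquote(encoded).split("/")[1:]

# Query parameters have the same issue with '&' and '='; encoding each
# key and value separately keeps them intact.
query = urlencode([("q", "a&b=c")])
params = parse_qsl(query)
```

Here round_tripped equals the original three segments, while lossy has four, because the boundary between "2021" and "10" can no longer be told apart from a real separator. That is why a setter taking one unencoded string for the whole path cannot be made reliable, whereas one taking a single segment (or a collection of segments) can.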