Hi Andrzej, Thanks for reviewing the library.
I recommend that Boost.URL docs say that it requires Boost 1.78 or higher.
Definitely. I'll look into the issue in the next few days. https://github.com/CPPAlliance/url/issues/184
A better alternative would be to use the official boost::string_view from Boost.Utility. Or is there a good reason not to?
As others have noted, core::string_view is convertible to std::string_view, which is becoming more and more important. A string_view that is not convertible to std::string_view is problematic. Others have already shared some relevant links.
Now, the name `parse_uri` implies that it will recognize any URI,
It does. URLs and URIs have the same fields. The distinction is only relevant for URNs, which would have some subcomponents we don't consider.
but on the other hand it is impossible that the result will fit into a url_view, because not every URI is an URL.
This is possible because the url_view has all the necessary fields. Perhaps for the same reason, the distinction between URL and URI is becoming less and less meaningful. For instance, JavaScript calls everything a URL.
The synopsis for parse_uri (https://master.url.cpp.al/url/ref/boost__urls__parse_uri.html) says:
Exception safety: throws nothing.
And the line below it says that the function throws std::length_error when the input is too long. It looks like a bug in specs.
Definitely. https://github.com/CPPAlliance/url/issues/185
When can a parsing be non-successful? Is it only because it was not conformant to the grammar?
Yes.
The synopsis says "This function parses a string according to the URI grammar below", but is it a URI grammar or a URL grammar actually?
We should probably explain the difference between URIs, URLs, and URNs better in the docs. There is some content, but it's probably not enough. This is naturally confusing because people use URI and URL interchangeably. They then read that URLs are a subset of URIs and assume that a URL cannot represent every URI. That assumption is incorrect, and it's precisely why people can use the two terms interchangeably: a URL has all the fields required by a URI. Only URNs interpret some URI subcomponents as extra fields. In fact, the distinction between absolute-URI, relative-ref, URI, and URI-reference is much more relevant.
So the class is called URL because that's what everyone calls it. All algorithms are called parse_<component>, where <component> is exactly the name used in the grammar. Thus, we have parse_absolute_uri, parse_relative_ref, parse_uri, and parse_uri_reference, which is what the spec calls them.
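To make the "same fields" point concrete, here is a minimal sketch that does not use Boost.URL at all. It splits a string with the regular expression given in RFC 3986, Appendix B, and shows that a URL and a URN land in the same five generic components. The names `uri_parts`, `split_uri`, and `split_uri_examples` are made up for this illustration, and the regex performs no validation, so this is not how the library parses.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical holder for the five generic components of RFC 3986.
struct uri_parts {
    std::string scheme, authority, path, query, fragment;
};

// Splits a URI reference with the regular expression from RFC 3986, Appendix B.
// It does no validation: it only shows which generic field each part lands in.
inline uri_parts split_uri(const std::string& s) {
    static const std::regex re(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    std::smatch m;
    std::regex_match(s, m, re);  // unmatched groups simply yield empty strings
    return {m[2].str(), m[4].str(), m[5].str(), m[7].str(), m[9].str()};
}

// Usage sketch: a URL and a URN both decompose into the same generic fields.
inline void split_uri_examples() {
    uri_parts url = split_uri("https://example.com/a/b?x=1#top");
    assert(url.scheme == "https" && url.authority == "example.com");
    assert(url.path == "/a/b" && url.query == "x=1" && url.fragment == "top");

    uri_parts urn = split_uri("urn:isbn:0451450523");
    assert(urn.scheme == "urn" && urn.authority.empty());
    assert(urn.path == "isbn:0451450523");  // URN-specific parts stay inside the path
}
```

The URN's namespace identifier and namespace-specific string are not lost. They simply remain inside the path, which is the sense in which a URL container has every field a URI needs.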
That is, any other reason for not being successful (if any resources needed to be allocated and failed) may still be reported via exceptions.
These algorithms don't allocate memory.
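As a rough illustration of why a parser like this needs neither allocation nor exceptions, here is a small standard-library sketch. `parse_scheme` is a hypothetical helper, not a Boost.URL function. It checks the RFC 3986 scheme rule against the caller's buffer and reports a grammar mismatch through the return value.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not part of Boost.URL: extracts the scheme of a URI,
// following RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ).
// It only inspects the caller's buffer, so it never allocates and never throws;
// a grammar mismatch is reported by returning std::nullopt.
inline std::optional<std::string_view> parse_scheme(std::string_view s) noexcept {
    auto is_alpha = [](char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    };
    auto is_scheme_char = [&](char c) {
        return is_alpha(c) || (c >= '0' && c <= '9') ||
               c == '+' || c == '-' || c == '.';
    };
    if (s.empty() || !is_alpha(s[0]))
        return std::nullopt;  // a scheme must start with a letter
    std::size_t i = 1;
    while (i < s.size() && is_scheme_char(s[i]))
        ++i;
    if (i == s.size() || s[i] != ':')
        return std::nullopt;  // no ':' terminator after the scheme characters
    return s.substr(0, i);    // a view into the input, no copy
}
```

The returned view points into the input string, so the caller must keep that string alive for as long as the result is used. The only failure mode is a mismatch against the grammar, which is the same shape of contract described above.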
Now, there is probably a good explanation to the URI vs URL discrepancy. I think it would be good if it was placed in the docs, so that the users don't get confused.
There are some mentions of that in the docs, but we could create a section to discuss the distinction between them more explicitly and provide examples.
Regards, &rzej;
Thanks again! -- Alan Freitas https://github.com/alandefreitas