On 2020-01-22 04:10, Gavin Lambert via Boost wrote:
On 22/01/2020 07:39, Andrey Semashev wrote:
On 2020-01-21 18:51, Vinnie Falco wrote:
On Tue, Jan 21, 2020 at 2:13 AM Andrey Semashev wrote:
I'd be more interested in a more generic URI library. Along with a few associated algorithms, e.g. those described in: https://tools.ietf.org/html/rfc3986
Yes, this library does that. I do not use the term "URI" because it is confusing and pointless. They are all URLs now. My library follows the RFC, except that I have renamed the top level production rules to reflect this preference:
URL           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
URL-reference = URL / relative-ref
absolute-URL  = scheme ":" hier-part [ "?" query ]
I didn't invent this idea; deprecating the word "URI" and using "URL" consistently in its place is recommended by WHATWG.
There is a semantic difference between URI and URL - the former is an identifier and the latter is a locator (i.e. a path to a resource location). You can treat a locator as an identifier but not the other way around. Using the term URL to refer to a URI is confusing.
Notably, all URLs are URIs, but not all URIs are URLs. Some are URNs, for example, which are structured a bit differently (e.g. "urn:oasis:names:specification:docbook:dtd:xml:4.1.2").
A program only dealing with "locations to download from" generally only needs to worry about URLs, but there are other places where all URIs (including URNs) may be encountered (even by such a program) -- for example, as XML namespace identifiers. (Usually these can be treated as opaque, though.)
Still, given that the same parsing rules can apply to both (URNs usually just have a long opaque path after the "urn" scheme), it doesn't seem unreasonable to call it a "URL library" anyway (despite the recommendation in RFC3986). Some people would be confused by calling them "URIs", and those who know better will understand the "URL" name anyway. Having said that, the docs should call out RFC support and URI compatibility explicitly, so that people aren't left wondering.
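To illustrate the point that one set of parsing rules covers both forms, here is a minimal sketch that splits a URI reference with the regular expression given in RFC 3986, Appendix B. The names `uri_parts` and `split_uri` are made up for this example and are not the API of the library under discussion; C++17 is assumed for `std::optional`.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical names throughout; this is not the API of the library under
// discussion.
struct uri_parts {
    std::string scheme;                    // empty if absent
    std::optional<std::string> authority;  // set only when "//" is present
    std::string path;
    std::optional<std::string> query;      // set only when "?" is present
    std::optional<std::string> fragment;   // set only when "#" is present
};

// Splits a URI reference with the regular expression from RFC 3986,
// Appendix B. Returns std::nullopt if the expression does not match the
// whole text. This only splits components; it does not validate them.
inline std::optional<uri_parts> split_uri(const std::string& text) {
    static const std::regex re(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    std::smatch m;
    if (!std::regex_match(text, m, re)) return std::nullopt;

    uri_parts parts;
    parts.scheme = m[2].str();
    if (m[3].matched) parts.authority = m[4].str();
    parts.path = m[5].str();
    if (m[6].matched) parts.query = m[7].str();
    if (m[8].matched) parts.fragment = m[9].str();
    return parts;
}
```

With this sketch, the docbook URN quoted above splits into the scheme "urn", no authority, and the rest as one opaque path, while "https://tools.ietf.org/html/rfc3986" splits into scheme, authority and path.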
From https://tools.ietf.org/html/rfc8141:

  A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI) that is assigned under the "urn" URI scheme and a particular URN namespace, with the intent that the URN will be a persistent, location-independent resource identifier.

So the name URI is very much appropriate when working with URNs, as it is with URLs. But URL definitely is not the appropriate term to use when working with URNs.

"People will understand what you mean" is not the right reasoning. As a programmer, you have every opportunity to pick the right name for an entity in your code, so that a technically educated reader understands what this entity represents. People who aren't programmers or do not know even the basic terms in your technical domain are not your audience.

Personally, I wouldn't be using a `url` type to represent URIs, for documentation purposes alone.
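As a rough sketch of the naming argument, a type called `uri` can hold either kind of identifier and report which one it has. The class and its members below are hypothetical, not taken from any existing library. `is_urn()` only checks the scheme, compared case-insensitively; a full RFC 8141 check would also have to validate the namespace identifier and the namespace-specific string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type, named `uri` because it may hold a URL or a URN.
class uri {
public:
    explicit uri(std::string text) : text_(std::move(text)) {}

    // Returns the scheme in lower case, or an empty string if the text has
    // no valid scheme (RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )).
    std::string scheme() const {
        const std::size_t end = text_.find_first_of(":/?#");
        if (end == std::string::npos || end == 0 || text_[end] != ':') return {};
        if (!std::isalpha(static_cast<unsigned char>(text_[0]))) return {};

        std::string result;
        for (std::size_t i = 0; i < end; ++i) {
            const unsigned char c = static_cast<unsigned char>(text_[i]);
            if (!std::isalnum(c) && c != '+' && c != '-' && c != '.') return {};
            result += static_cast<char>(std::tolower(c));
        }
        return result;
    }

    // RFC 8141: a URN is a URI assigned under the "urn" scheme.
    // Simplification: only the scheme is inspected here.
    bool is_urn() const { return scheme() == "urn"; }

    const std::string& str() const { return text_; }

private:
    std::string text_;
};
```

A function signature such as `void register_namespace(const uri& id)` (again hypothetical) then tells the reader that a URN is an acceptable argument, which a parameter of type `url` would not.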