Hi all, the formal boost review of the boost.url library starts today, the 13th of August and will last until and including the 22nd of this month. The library has been developed by Vinnie Falco and Alan de Freitas. The master branch is frozen during the review and can be found here: https://github.com/CPPAlliance/url The current documentation can be found here: https://master.url.cpp.al/ Boost.URL is a portable C++ library which provides containers and algorithms which model a "URL" and understands the various grammars related to URLs and provides for validating and parsing of strings, manipulation of URL strings, and algorithms operating on URLs such as normalization and resolution. Please explicitly state that you either *accept* or *reject* the inclusion of this library into boost. Also please indicate the time & effort spent on the evaluation and give the reasons for your decision. Some questions to consider: - Does this library bring real benefit to C++ developers for real world use-case? - Do you have an application for this library? - Does the API match with current best practices? - Is the documentation helpful and clear? - Did you try to use it? What problems or surprises did you encounter? - What is your evaluation of the implementation? More information about the Boost Formal Review Process can be found at: http://www.boost.org/community/reviews.html The review is open to anyone who is prepared to put in the work of evaluating and reviewing the library. Prior experience in contributing to Boost reviews is not a requirement. Reviews can also be submitted privately to me, and I will not disclose who sent them. I might however cite passages from those reviews in my conclusion. Thanks to Vinnie & Alan for providing us with a new library & thanks for all the reviews in advance. Klemens
On Fri, Aug 12, 2022 at 7:30 PM Klemens Morgenstern via Boost
...
The library depends on the tip of the current Boost master branch (in particular, constexpr support in boost::core::empty_value) - the documentation does not reflect this, and Boost.URLs master branch is frozen. Thanks
On Fri, Aug 12, 2022 at 7:30 PM Klemens Morgenstern via Boost
There was a hiccup - the library requires a constexpr boost::core::empty_value, which is available in the developmental superproject but not the latest Boost release. Therefore, to ensure ease of installation and usage of the library during the review I have reverted a commit on the Boost.URL master branch so that the library will work with the just-released Boost 1.80. This does not affect the public API or documentation. Thanks
Hi all, Here is my review of Boost.Url. > - Does this library bring real benefit to C++ developers for real world > use-case? Do you have an application for this library? I'm happy to see this library being proposed. I strongly believe we need such functionality in Boost. If it gets accepted, I'm planning on using it in Boost.Mysql to create type-erased connections from connection URLs (such as mysql://user@localhost:3306/database?ssl-mode=disable). I think URL functionality is a must for the C++ for web environment we're developing. - Does the API match with current best practices? The API is clear and concise. It allows parsing and modifying URLs in a pretty nice way. I've found the naming for view containers confusing (e.g. params and params_view). When I saw it, I thought params would take ownership of the query string params buffer, while params_view would not. If I've understood correctly, the difference is that params allows for modification (as it's returned from url containers) and params_view does not (as it's returned from url_view). So it's more about mutability than ownership. If that's the case, I would either emphasize it in the docs, or rename the containers to something that reflects this fact (e.g. const_params and mutable_params, as asio does). I'm happy to see percent-encoding standalone functions being exposed, as this is something present in almost all other languages. I'd consider exposing higher-level functions here, maybe taking a lvalue reference to a std::string, allowing the user to percent-encode strings easier (so they resemble JavaScript encodeURIComponent). Something I've found when playing with mutating URLs is that modifying a query parameter to a certain value is a little bit verbose: url u; // get it from somewhere auto it = u.params().find("k"); if (it == u.params().end()) { u.params().append("k", "value"); } else { u.params().replace_value(it, "value"); } I had this use case in the last web application I wrote (although it wasn't C++). I don't know if this is a common use case, but I'd consider adding a function that simplifies it (i.e. params::set_value(string_view key, string_view value). I'm not very keen on exposing the parsing infrastructure and helper functions (other than percent encoding/decoding) as public API, as I think it violates the principle of API surface minimization. I understand that these parsing facilities may help libraries such as HTTPProto, but I'm not sure if the general public should be using them. I may be missing something, but I haven't found the example on magnet links very compelling - I feel the same result could be achieved in a cleaner way by using the higher level API. If the parsing facilities are to be kept public, I think we need a more compelling example for them. >From reading this comment https://github.com/CPPAlliance/url/issues/379#issuecomment-1212026752 I get the impression that some schemes have different reserved character sets, and that these functions help with that problem - maybe an example of such an scheme could be helpful. I can see other public types in the reference such as recycled, recycled_ptr and aligned_storage, which again have the feeling should not be exposed as public types. - Is the documentation helpful and clear? The documentation is good and the reference is complete, but I miss some examples. We've just got two examples, and one of them is about a very concrete feature that most users will likely not be using day-to-day. In particular, I miss an example (and possibly a page) about composing and mutating URLs (starting with a base URL, add some segments and query parameters, emphasizing on when percent encoding gets applied). This is a pretty typical task when you're consuming a third party REST API. I'd add a page and an example on percent encoding functions, as they are only explained in the reference and the explanation assumes familiarity with the CharSet concept. As there are several container view types, I would also suggest creating a comparison table, like in json::value (https://www.boost.org/doc/libs/1_80_0/libs/json/doc/html/json/dom/value.html) (I've found that one particularly useful). The docstring for segments::insert seems to be wrong (https://master.url.cpp.al/url/ref/boost__urls__segments/insert/overload1.html), as I've passed a non-percent encoded string and it has accepted it, encoding the value. It seems an incorrect copy/paste from segments_encoded::insert. params::insert overloads with key only and with key/value also seem to have swapped docstrings. I'm also curious about the security review that is announced in the main page - when is that going to take place? - Did you try to use it? What problems or surprises did you encounter? I used it in a toy project via CMake FetchContent, with Boost 1.80.0, Linux and clang12. I had no problems building it. I initially tried `mkdir build && cmake -DCMAKE_PREFIX_PATH=path/to/boost ..` but got errors about linking to non existent targets. I only built the source (not the official examples and tests) and my toy programs. I focused on composing and mutating URLs, but also tried parsing and percent encoding. Except for the mismatched docstrings I've already mentioned, no surprises. - What is your evaluation of the implementation? I haven't looked into it. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? I've dedicated half day to build and play with the library, and had read most of the docs (and some code) beforehand. - Are you knowledgeable about the problem domain? I'm not a URL expert, just a regular URL user. I've written several web applications where URL handling is a day-to-day task. My recommendation is to CONDITIONALLY ACCEPT this library as part of Boost, subject to reviewing which types/functions in the lower-level API are exposed and which ones are not. Thanks Vinnie and Alan for your work. Regards, Ruben.
On Tue, Aug 16, 2022 at 11:03 AM Ruben Perez
Here is my review of Boost.Url.
Thank you for graciously contributing your time to review the library!
I've found the naming for view containers confusing (e.g. params and params_view).
Well...yeah. I am not entirely happy about having 8 different containers between the segments and params ranges. However the decision was made to offer the user an interface that resembles standard containers, so that standard algorithms could be composed on them (more or less). I think that the ForwardRange and BidirectionalRange interfaces were the right choice, so we followed the design where it led us. The need for having read-only views, modifiable views, over separate encoded versus unencoded strings left us with the 8 containers.
I would either emphasize it in the docs, or rename the containers to something that reflects this fact (e.g. const_params and mutable_params, as asio does).
I would much rather emphasize it in the docs, because writing "const" and "mutable" over and over is bound to be a chore. Asio gets away with it because the mutability of buffers is central to the role of asynchronous operations but I'm not so sure the same applies here. We are very happy to hear proposals for alternate container names, especially if they are complete proposals (give an alternative for each existing name) and if necessary what the url_view and url member functions would be renamed to.
I'd consider exposing higher-level functions here, maybe taking a lvalue reference to a std::string, allowing the user to percent-encode strings easier (so they resemble JavaScript encodeURIComponent).
Yes, I am not joyous about our current percent-encoding APIs. There's a lot of them, with small variations, and some simple operations are missing (appending or assigning to a MutableString out-param as you pointed out). They are still great but I think they could be greater. We will work on them some more, and if you or anyone else would like to provide input into that process, it is welcomed.
I don't know if this is a common use case, but I'd consider adding a function that simplifies it (i.e. params::set_value(string_view key, string_view value).
Yep, we need that! There are open questions (like what do you do if more than one key exists). Here is an open issue for it: https://github.com/CPPAlliance/url/issues/412
I'm not very keen on exposing the parsing infrastructure and helper functions (other than percent encoding/decoding) as public API, as I think it violates the principle of API surface minimization.
I feel you there but they have to go somewhere because the alternatives are worse. It isn't perfect, I know.
The documentation is good and the reference is complete, but I miss some examples.
Yep. Its not easy to come up with a complete application that just does stuff with URLs.
I'd add a page and an example on percent encoding functions, as they are only explained in the reference and the explanation assumes familiarity with the CharSet concept.
Yes I think these routines need some tidying and docs.
I'm also curious about the security review that is announced in the main page - when is that going to take place?
Once the code settles down a little, if the library is accepted then the version which first appears in Boost would be reviewed. Possibly the second release if the company doing the review is very busy. Regards
It seems like you decided not to handle anything related to non-ASCII characters. Are there plans to do this in a future library, or are there plans to update or add to this library? A lot has happened since RFC 3986.
I would either emphasize it in the docs, or rename the containers to something that reflects this fact (e.g. const_params and mutable_params, as asio does).
I would much rather emphasize it in the docs, because writing "const" and "mutable" over and over is bound to be a chore. Asio gets away with it because the mutability of buffers is central to the role of asynchronous operations but I'm not so sure the same applies here.
We are very happy to hear proposals for alternate container names, especially if they are complete proposals (give an alternative for each existing name) and if necessary what the url_view and url member functions would be renamed to.
I would suggest leaving url and url_view as-is, and replacing the "view" part for "const" for the rest of the containers: - params stays as is - params_view becomes const_params - params_encoded stays as is - params_encoded_view becomes const_params_encoded And so on; I'd leave the member function names as they are.
I don't know if this is a common use case, but I'd consider adding a function that simplifies it (i.e. params::set_value(string_view key, string_view value).
Yep, we need that! There are open questions (like what do you do if more than one key exists). Here is an open issue for it:
Just commented there.
I'm not very keen on exposing the parsing infrastructure and helper functions (other than percent encoding/decoding) as public API, as I think it violates the principle of API surface minimization.
I feel you there but they have to go somewhere because the alternatives are worse. It isn't perfect, I know.
Are the functions intended to be used just by other Boost libraries, or also the general public? If the former is the case, could they not be separated into a BSL-licensed repository shared between libraries and just used as a submodule? I understand you don't want to put them into a separate Boost library because of the review process it would involve and the overlapping it would create, but I'd say a submodule repo could work-around it and not violate any of the Boost guidelines (AFAIK).
The documentation is good and the reference is complete, but I miss some examples.
Yep. Its not easy to come up with a complete application that just does stuff with URLs.
I would find this useful (this builds a QR code using Google's
Infographic API):
#include
On Tue, Aug 16, 2022 at 2:27 PM Ruben Perez
Are the functions intended to be used just by other Boost libraries, or also the general public?
Some enterprising folks are going to want to use them to do things like say, parse cookies, parse URLs-within-URLs (like when you put a whole URL in the fragment), parse a larger grammar that contains URLs, or even just parse things that are not URLs but they are already using Boost.URL (and Boost.Beast and Boost.JSON) for their server and the algorithms do exactly what they want with the performance and runtime characteristics which are ideal for the use-case.
If the former is the case, could they not be separated into a BSL-licensed repository shared between libraries and just used as a submodule?
That is the equivalent of just making them private. Which I could easily do, just remove them from the docs and that's that. Submodules in boost library include/ directories don't work out too well (I'm not sure they are even allowed).
I understand you don't want to put them into a separate Boost library because of the review process it would involve and the overlapping it would create
I think my current trajectory is to leave it the way it is for now, continue building on it in other soon to be proposed libraries to validate the design, and continue to explore new use cases. People keep wanting to parse things, but Beast's offerings are subpar: https://github.com/boostorg/beast/blob/76043dec2cf67a2ba33b32bdcc129f5f0027b... The ideas in this rfc7230.hpp file are ancient and they have over time turned into what you see in Boost.URL now. The code in Beast is showing its age and doesn't do everything people want - I would point them to this new code. We could even update Beast to use Boost.URL and replace the ramshackle parsers with stuff that actually works although I am reluctant to invest heavily in new things in Beast. On the other hand some of the new parsers that I am writing could probably just be copied in so there's that. For example this rule handles the legacy "extra commas and whitespace allowed in lists" grammar which was specified in the errata: This could be ported directly to Beast. At some point I think I will propose it as a separate library, and by then there will be several downstream users so the design will have more evidence supporting it.
I would find this useful (this builds a QR code using Google's Infographic API): ...
IDK.... that's a pretty small example that just does a printf after a few mods to the URL... I feel like an example should have a lot more meat on the bone. On the other hand we do need more examples... Thanks
On Tue, Aug 16, 2022 at 3:15 PM Vinnie Falco
For example this rule handles the legacy "extra commas and whitespace allowed in lists" grammar which was specified in the errata:
Forgot to include the link to this rule implementation: https://github.com/CPPAlliance/http_proto/blob/7cd9b63bbf5924fb7519d6288c160... Thanks
I would find this useful (this builds a QR code using Google's Infographic API): ...
IDK.... that's a pretty small example that just does a printf after a few mods to the URL... I feel like an example should have a lot more meat on the bone. On the other hand we do need more examples...
I disagree, small examples may be even more useful to newcomers. Having this would have saved me some time when coding my use case. Regards, Ruben.
wt., 16 sie 2022 o 20:31 Vinnie Falco via Boost
On Tue, Aug 16, 2022 at 11:03 AM Ruben Perez
wrote: Here is my review of Boost.Url.
Thank you for graciously contributing your time to review the library!
I've found the naming for view containers confusing (e.g. params and params_view).
Well...yeah. I am not entirely happy about having 8 different containers between the segments and params ranges. However the decision was made to offer the user an interface that resembles standard containers, so that standard algorithms could be composed on them (more or less). I think that the ForwardRange and BidirectionalRange interfaces were the right choice, so we followed the design where it led us. The need for having read-only views, modifiable views, over separate encoded versus unencoded strings left us with the 8 containers.
I would either emphasize it in the docs, or rename the containers to something that reflects this fact (e.g. const_params and mutable_params, as asio does).
I would much rather emphasize it in the docs, because writing "const" and "mutable" over and over is bound to be a chore. Asio gets away with it because the mutability of buffers is central to the role of asynchronous operations but I'm not so sure the same applies here.
We are very happy to hear proposals for alternate container names, especially if they are complete proposals (give an alternative for each existing name) and if necessary what the url_view and url member functions would be renamed to.
I'd consider exposing higher-level functions here, maybe taking a lvalue reference to a std::string, allowing the user to percent-encode strings easier (so they resemble JavaScript encodeURIComponent).
Yes, I am not joyous about our current percent-encoding APIs. There's a lot of them, with small variations, and some simple operations are missing (appending or assigning to a MutableString out-param as you pointed out). They are still great but I think they could be greater. We will work on them some more, and if you or anyone else would like to provide input into that process, it is welcomed.
I don't know if this is a common use case, but I'd consider adding a function that simplifies it (i.e. params::set_value(string_view key, string_view value).
Yep, we need that! There are open questions (like what do you do if more than one key exists). Here is an open issue for it:
https://github.com/CPPAlliance/url/issues/412
I'm not very keen on exposing the parsing infrastructure and helper functions (other than percent encoding/decoding) as public API, as I think it violates the principle of API surface minimization.
I feel you there but they have to go somewhere because the alternatives are worse. It isn't perfect, I know.
Maybe an alternative would be to keep them in the public API but give them a separate namespace, like `urls::encoding`. This would indicate more clearly that you have a relatively separate sub-library there, that needs to stay for logistic reasons. Boost.Lexical_cast took a similar approach for try_lexical_convert: https://www.boost.org/doc/libs/1_80_0/doc/html/boost_lexical_cast/synopsis.h... Regards, &rzej;
The documentation is good and the reference is complete, but I miss some examples.
Yep. Its not easy to come up with a complete application that just does stuff with URLs.
I'd add a page and an example on percent encoding functions, as they are only explained in the reference and the explanation assumes familiarity with the CharSet concept.
Yes I think these routines need some tidying and docs.
I'm also curious about the security review that is announced in the main page - when is that going to take place?
Once the code settles down a little, if the library is accepted then the version which first appears in Boost would be reviewed. Possibly the second release if the company doing the review is very busy.
Regards
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On Wed, Aug 17, 2022 at 11:12 AM Andrzej Krzemienski
Maybe an alternative would be to keep them in the public API but give them a separate namespace, like `urls::encoding`. This would indicate more clearly that you have a relatively separate sub-library there, that needs to stay for logistic reasons.
Yes that is precisely the design choice that we made, that namespace is `boost::urls::grammar` Thanks
Here is my quick review of Boost.URL. I vote to ACCEPT this library. It may not be perfect, but it's a good enough solution for a real problem. I do have some concerns: - The lack of IRI support is unfortunate. It's 2022; we should all be writing software with Unicode support by default. However, this can be built on top of Boost.URL, and isn't needed in all cases. - The "string+offsets" implementation means that a lot of characters need to be copied every time something at the left end of a long URL is modified - for example when changing an http URL to use https. This is unlikely to cause a problem in most circumstances, but it's something to be aware of. On 13.08.22 04:29, Klemens Morgenstern via Boost wrote:
Some questions to consider:
- Does this library bring real benefit to C++ developers for real world use-case?
Yes.
- Do you have an application for this library?
Yes.
- Does the API match with current best practices?
It's good enough.
- Is the documentation helpful and clear?
Yes, but I did find a few errors that should be fixed. One of the examples on https://master.url.cpp.al/url/containers/authority.html does not match the text. I have to admit I skipped over the customization section.
- Did you try to use it? What problems or surprises did you encounter?
No.
- What is your evaluation of the implementation?
Like many Boost libraries, I found the implementation fairly opaque. Class url delegates to url_base, which delegates to url_view_base, which delegates to detail::url_impl, with much of the functionality spread out over various view classes. This isn't a problem per se, but it makes casual reading difficult. -- Rainer Deyke (rainerd@eldwood.com)
On Sun, Aug 21, 2022 at 7:15 AM Rainer Deyke via Boost
- The lack of IRI support is unfortunate. It's 2022; we should all be writing software with Unicode support by default. However, this can be built on top of Boost.URL, and isn't needed in all cases.
We will probably add something to parse an IRI but in all likelihood it would convert it to UTF-8 as a regular URL. I don't know if I have the stomach for a total duplication of the existing library except names like iri_view, iri, static_iri, the duplication of all the segments and params containers, and the modification of all those mutating algorithms to support Unicode. I'm not even sure that it is called for, given that IRIs are for more user-facing purposes. Such things cannot be submitted in HTTP requests, and as far as I know, unicode host names would need to be converted to punycode anyway (which we could do) to submit them to a DNS server.
- The "string+offsets" implementation means that a lot of characters need to be copied every time something at the left end of a long URL is modified.
Yep! Thanks for noticing that :) Was it the right implementation choice? I sure hope so, but we won't know for sure without field experience. And I hope we get lots of it, as I am already writing new libraries whose containers use this same model.
Yes, but I did find a few errors that should be fixed. One of the examples on https://master.url.cpp.al/url/containers/authority.html does not match the text. I have to admit I skipped over the customization section.
Happy to have any issues reported in the repository. The Customization section is only aimed for the most advanced users, so no loss there. Thank you for taking the time to review the library. Regards
On 21.08.22 17:13, Vinnie Falco via Boost wrote:
On Sun, Aug 21, 2022 at 7:15 AM Rainer Deyke via Boost
wrote: - The lack of IRI support is unfortunate. It's 2022; we should all be writing software with Unicode support by default. However, this can be built on top of Boost.URL, and isn't needed in all cases.
We will probably add something to parse an IRI but in all likelihood it would convert it to UTF-8 as a regular URL. I don't know if I have the stomach for a total duplication of the existing library except names like iri_view, iri, static_iri, the duplication of all the segments and params containers, and the modification of all those mutating algorithms to support Unicode. I'm not even sure that it is called for, given that IRIs are for more user-facing purposes. Such things cannot be submitted in HTTP requests, and as far as I know, unicode host names would need to be converted to punycode anyway (which we could do) to submit them to a DNS server.
I see IRIs not as a different datatype, but as a specific interpretation of URIs. The transparent percent en-/decoding of Boost.URL already gets us most of the way there. Additional IRI support would mean: - Decoding accessors that perform utf-8 validation. (Arbitrary percent-encoded 8-bit values are legal in URLs, but not in IRIs.) - Additional mutators that perform NFC normalization or validation. - Punycode encoding/decoding. -- Rainer Deyke (rainerd@eldwood.com)
Rainer Deyke wrote:
(Arbitrary percent-encoded 8-bit values are legal in URLs, but not in IRIs.)
Aren't they? I didn't find anything prohibiting them in the RFC, although I might well have missed it. The sections that specify the recommended way to convert between URI and IRI do say how these are handled - a percent-encoded sequence in a URI that doesn't correspond to a valid UTF-8 encoded code point is left alone in the IRI. (Valid UTF-8 percent encodings are percent-decoded.)
On 21.08.22 20:36, Peter Dimov via Boost wrote:
Rainer Deyke wrote:
(Arbitrary percent-encoded 8-bit values are legal in URLs, but not in IRIs.)
Aren't they? I didn't find anything prohibiting them in the RFC, although I might well have missed it.
The sections that specify the recommended way to convert between URI and IRI do say how these are handled - a percent-encoded sequence in a URI that doesn't correspond to a valid UTF-8 encoded code point is left alone in the IRI. (Valid UTF-8 percent encodings are percent-decoded.)
OK, it is possible to partially decode a URL containing a mix of utf-8 and arbitrary 8-bit values to get something that looks like a URL, but with Unicode. And this half-decoded URL is an IRI, so on a technical level you are correct. On the other hand, if calling the 'path' member function of a URL object returns a fully percent-decoded path, then calling the utf-8 equivalent of that member function should return something that is both legal utf-8 and fully percent-decoded. Which is only possible if the path contains no percent-encoded values that are not utf-8. -- Rainer Deyke (rainerd@eldwood.com)
On 21.08.22 22:18, Vinnie Falco via Boost wrote:
On Sun, Aug 21, 2022 at 1:01 PM Rainer Deyke via Boost
wrote: ...IRI...
What are the use-cases for IRI?
I think that anytime a URL is shown to or input by the user, whole or in part, it should be in IRI form by default. -- Rainer Deyke (rainerd@eldwood.com)
This is my review of Boost.Url. Disclaimer I am employed by The C++ Alliance, that is by Vinnie. Nevertheless, I haven't followed development of this library closely, so I did not have much prior knowledge about design choices it makes. My vote is to ACCEPT the library, since it does solve the use cases for URLs that I currently can come up with, it is quite easy to use (and not too easy to use incorrectly) and the flaws that it has doesn't seem to be fundamental.
What is your evaluation of the design?
Overall, I found the API fairly easy to use. I understand the concern about parsing into a view type, but I consider it a lesser evil than allocating on every parse or creating two kinds of functions, one that allocates and one that doesn't. I did also find the naming situation for encoded/decoded a bit confusing. In particular, url::path returns pct_encoded_view, but url::encoded_path returns string_view. I would expect the opposite. Maybe this needs additional explanation in the docs? Finally, I find the use of system::result as the sole method of reporting errors intriguing. I have not yet encountered a library that uses this approach, and I have been thinking about employing it myself in the past. If this library is accepted, it will become a great experiment for this sort of API and will provide us with the experience on whether it's a good replacement for dual APIs using error_code.
What is your evaluation of the implementation?
I haven't looked into that.
What is your evaluation of the documentation?
Documentation seems sparse but ultimately I have been able to successfully find all the information that I needed. Still, a lot more examples and detailed explanations are necessary.
What is your evaluation of the potential usefulness of the library?
The obvious application is networking, but given that URIs were designed as a universal way to identify entities, there are other potential use cases.
Did you try to use the library? With what compiler? Did you have any problems?
I have implemented a very simple program that constructed URLs (including ones with custom schemes), compared them against each other, and deconstructed them for "routing". b2 was used as the build system and GCC 10.3 as the compiler. I did not encounter any problems.
How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
3 hours of coding and 2 hours of reading documentation.
Are you knowledgeable about the problem domain?
I'm not an expert, but I have been dealing with URLs in Python and in C++ using
Qt's QUrl.
сб, 13 авг. 2022 г. в 05:30, Klemens Morgenstern via Boost
Hi all,
the formal boost review of the boost.url library starts today, the 13th of August and will last until and including the 22nd of this month.
The library has been developed by Vinnie Falco and Alan de Freitas. The master branch is frozen during the review and can be found here: https://github.com/CPPAlliance/url
The current documentation can be found here: https://master.url.cpp.al/
Boost.URL is a portable C++ library which provides containers and algorithms which model a "URL" and understands the various grammars related to URLs and provides for validating and parsing of strings, manipulation of URL strings, and algorithms operating on URLs such as normalization and resolution.
Please explicitly state that you either *accept* or *reject* the inclusion of this library into boost. Also please indicate the time & effort spent on the evaluation and give the reasons for your decision.
Some questions to consider:
- Does this library bring real benefit to C++ developers for real world use-case? - Do you have an application for this library? - Does the API match with current best practices? - Is the documentation helpful and clear? - Did you try to use it? What problems or surprises did you encounter? - What is your evaluation of the implementation?
More information about the Boost Formal Review Process can be found at: http://www.boost.org/community/reviews.html
The review is open to anyone who is prepared to put in the work of evaluating and reviewing the library. Prior experience in contributing to Boost reviews is not a requirement.
Reviews can also be submitted privately to me, and I will not disclose who sent them. I might however cite passages from those reviews in my conclusion.
Thanks to Vinnie & Alan for providing us with a new library & thanks for all the reviews in advance.
Klemens
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Hi, everyone. This is a summary of issues we have opened during the review so anyone interested can track them. * Public grammar namespace: https://github.com/CPPAlliance/url/issues/444 I find the creation of a mini-parsing library and the expectation that
users will use this mini-library to write their own parsers to be somewhat problematic.
Regarding the parser interface, I feel this should mature into its own
library.
I'm not very keen on exposing the parsing infrastructure and helper
functions
I can see other public types in the reference such as recycled,
recycled_ptr and aligned_storage, which again have the feeling should not be exposed as public types.
* parse_url vs. parse_uri:
https://github.com/CPPAlliance/url/issues/443
When I call parse_uri(), I get a result of type result
isn't the function spelled parse_url()
In contrast to others, I think the choice of Nomenclature, namely URL
everywhere except in the grammar domain (the parse function names) is a very elegant one
* Documentation issues: https://github.com/CPPAlliance/url/issues/442 (About 20 minor issues with the docs) * Exploring IRIs: https://github.com/CPPAlliance/url/issues/441 The lack of IRI support is unfortunate. It's 2022; we should all
be writing software with Unicode support by default. However, this can be built on top of Boost.URL, and isn't needed in all cases.
* Add lowercase/uppercase in pct_encode_opts: https://github.com/CPPAlliance/url/issues/440 In the past I have had situations where it was necessary to specify the
case for hex letters
* Optional query values: https://github.com/CPPAlliance/url/issues/439 I would be inclined to use a std::optional for the value instead. * Name of the decoding view is confusing: https://github.com/CPPAlliance/url/issues/438 I would expect a "decoded_view" to refer to underlying encoded data pct_encoded_view is an unfortunate choice for a view that actively decodes
URL-encoded characters
I did also find the naming situation for encoded/decoded a bit confusing. I
would expect the opposite.
* Using std::string in url: https://github.com/CPPAlliance/url/issues/437 In your non-view url object, you seem to allocate storage using new char[].
Why not use a std::string?
* url::persist docs: https://github.com/CPPAlliance/url/issues/436 so url_view doesn't own the underlying storage, except when it
does as a result of this mysterious method. What's the rationale for this?
* url lifetimes: https://github.com/CPPAlliance/url/issues/435 There's no indication here that r is a non-owning view and that
the caller is responsible for keeping s alive for the lifetime of r.
* Examples: https://github.com/CPPAlliance/url/issues/419 I miss an example (and possibly a page) about composing and mutating URLs I'd add a page and an example on percent encoding functions small examples may be even more useful to newcomers. I found resolve a bit mystifying. (...) it will be a big help to have
inspirational examples
Perhaps the examples would be better when fleshed out in long form * pct-encoding overloads: https://github.com/CPPAlliance/url/issues/417 I'd consider exposing higher-level functions here, maybe taking a lvalue
reference to a std::string
* container names: https://github.com/CPPAlliance/url/issues/416 I've found the naming for view containers confusing (e.g. params and
params_view).
I would suggest leaving url and url_view as-is, and replacing the "view"
part for "const" for the rest of the containers
Em sex., 12 de ago. de 2022 às 23:30, Klemens Morgenstern via Boost < boost@lists.boost.org> escreveu:
Hi all,
the formal boost review of the boost.url library starts today, the 13th of August and will last until and including the 22nd of this month.
The library has been developed by Vinnie Falco and Alan de Freitas. The master branch is frozen during the review and can be found here: https://github.com/CPPAlliance/url
The current documentation can be found here: https://master.url.cpp.al/
Boost.URL is a portable C++ library which provides containers and algorithms which model a "URL" and understands the various grammars related to URLs and provides for validating and parsing of strings, manipulation of URL strings, and algorithms operating on URLs such as normalization and resolution.
Please explicitly state that you either *accept* or *reject* the inclusion of this library into boost. Also please indicate the time & effort spent on the evaluation and give the reasons for your decision.
Some questions to consider:
- Does this library bring real benefit to C++ developers for real world use-case? - Do you have an application for this library? - Does the API match with current best practices? - Is the documentation helpful and clear? - Did you try to use it? What problems or surprises did you encounter? - What is your evaluation of the implementation?
More information about the Boost Formal Review Process can be found at: http://www.boost.org/community/reviews.html
The review is open to anyone who is prepared to put in the work of evaluating and reviewing the library. Prior experience in contributing to Boost reviews is not a requirement.
Reviews can also be submitted privately to me, and I will not disclose who sent them. I might however cite passages from those reviews in my conclusion.
Thanks to Vinnie & Alan for providing us with a new library & thanks for all the reviews in advance.
Klemens
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
-- Alan Freitas https://github.com/alandefreitas
participants (9)
-
Alan de Freitas
-
Alex Christensen
-
Andrzej Krzemienski
-
Klemens Morgenstern
-
Peter Dimov
-
Rainer Deyke
-
Ruben Perez
-
Vinnie Falco
-
Дмитрий Архипов