[review][JSON] Review of JSON starts today: Sept 14 - Sept 23
Boost formal review of Vinnie Falco and Krystian Stasiowski's library JSON starts today and will run for 10 days, ending on 23 Sept 2020. Both authors have already developed libraries accepted into Boost (Boost.Beast and Boost.StaticString).

This library focuses on a common and popular use-case for JSON. It provides a container to hold parsed and serialised JSON types, and it offers more flexibility and better benchmark performance than its competitors.

JSON highlights the following features in the documentation:

- Fast compilation
- Requires only C++11
- Fast streaming parser and serializer
- Easy and safe API with allocator support
- Constant-time key lookup for objects
- Options to allow non-standard JSON
- Compiles without Boost: define BOOST_JSON_STANDALONE
- Optionally header-only, without linking to a library

(A point I would like to add to the highlights: it has a cool Jason logo 😝)

To quickly understand and get the flavour of the library, take a look at "Quick Look": http://master.json.cpp.al/json/usage/quick_look.html

You can find the source code to be reviewed here: https://github.com/CPPAlliance/json/tree/master

You can find the latest documentation here: http://master.json.cpp.al/

Benchmarks are also given in the documentation and can be found here: http://master.json.cpp.al/json/benchmarks.html

Some early reviews have already been given; the thread can be found here: https://lists.boost.org/Archives/boost/2020/09/249745.php

Please provide in your review information you think is valuable to understand your choice to ACCEPT or REJECT including JSON as a Boost library. Please be explicit about your decision (ACCEPT or REJECT).

Some other questions you might want to consider answering:

- What is your evaluation of the design?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? With which compiler(s)? Did you have any problems?
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
- Are you knowledgeable about the problem domain?

More information about the Boost Formal Review Process can be found here: http://www.boost.org/community/reviews.html

Thank you for your effort in the Boost community.

--
Thank you,
Pranam Lashkari, https://lpranam.github.io/
Because of my direct involvement in the C++ Alliance I feel it would be wrong for me to provide a review that leads to an accept/reject conclusion. However, I have some experience of integrating this library into a private project, and I felt it might be valuable to share my experiences.

- What is your evaluation of the design?

My personal opinion is that the design is sane and well-reasoned. Any areas with which I have previously taken issue have been raised with the authors and the concerns covered. Some effort was made to explore the effect of ideas I presented, and the outcomes were measured. My opinion is that the final design is largely data-driven.

- What is your evaluation of the implementation?

I have found no faults in the implementation during use. There is the slightly off-putting fact that the default text representation of parsed integers that are exact powers of 10 is scientific notation. Unusual as it seems, however, this is strictly conformant with the JSON standard.

- What is your evaluation of the documentation?

The documentation is clear and succinct, and the fact that it takes steps to elucidate the rationale behind design decisions ought to head off a number of "Wait! Why?" questions.

- What is your evaluation of the potential usefulness of the library?

The library has already proven useful to me. For me personally, the ability to map the parser directly to C++ objects without going through the intermediate json::value data structure would offer a minor improvement in performance. I have started exploring the building of such a parse handler, which I intend to offer as something to go into the examples section at some future date, assuming I have the time to finish it. Notwithstanding that, the fact that I can supply a custom arena-style memory resource to the parser/value largely offsets this concern in practice. Essentially, by avoiding the building of the DOM I can avoid one memory allocation and some redundant copies. In practice, neither the one memory allocation nor the memory copies have proven measurably expensive in my uses of the library. Whether this ultimately belongs in the JSON library or in a dependent library is not for me to say. It is worth noting that the separation of concerns between parser and handler is helpful, in that it makes this work possible without having to rewrite any parsing code.

- Did you try to use the library? With which compiler(s)? Did you have any problems?

I have used the library with GCC 9 & 10 and Clang 9 & 10. Standards selected were C++17 and C++20. I chose the boost-dependent (default) option rather than standalone because I was also using the Boost libraries Asio, Beast, Program Options and System.

- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

I have written an application that uses the library: a cryptocurrency market-making bot that faces off to the Deribit websocket/JSON API.

- Are you knowledgeable about the problem domain?

Yes. In a previous market data distribution engine I used Niels Lohmann's JSON (high level but slow), RapidJSON (low level but fast) and JSMN (super low level and blindingly fast, but with no DOM representation; it only provides indexes into the data).

Regards,

R
-- Richard Hodges hodges.r@gmail.com office: +442032898513 home: +376841522 mobile: +376380212
On Mon, Sep 14, 2020 at 12:30 AM Pranam Lashkari via Boost < boost@lists.boost.org> wrote:
https://github.com/CPPAlliance/json/tree/master
Don't forget to STAR the repository, and thanks!
On Mon, Sep 14, 2020 at 8:01 PM Vinnie Falco
Don't forget to STAR the repository, and thanks!
Yes, please star this amazing library; it may help the library reach more people. -- Thank you, Pranam Lashkari, https://lpranam.github.io/
On 14.09.20 09:30, Pranam Lashkari via Boost wrote:
Please provide in your review information you think is valuable to understand your choice to ACCEPT or REJECT including JSON as a Boost library. Please be explicit about your decision (ACCEPT or REJECT).
- What is your evaluation of the design?
Most of it seems fine, but I do have some issues:

- The choice of [u]int64_t seems arbitrary and restrictive. It means that Boost.JSON will not use a 128-bit integer even where one is available, and it means that it cannot compile at all on implementations that don't provide int64_t. It's good enough for my purposes, but I would like to see some discussion about this. The JSON spec allows arbitrarily large integers.
- On a similar note, the use of double restricts floating point accuracy even when a higher-precision type is available. The JSON spec allows arbitrarily precise decimal values.
- boost::json::value_to provides a single, clean way to extract values from json, but it also renders other parts of the library (e.g. number_cast) redundant except as an implementation detail.
- boost::json::value_to provides a single, clean way to extract values from json, but it's syntactically long. The same functionality in a member function of boost::json::value would be nicer to use.
- The omission of binary serialization formats (CBOR et al.) bothers me. Not from a theoretical point of view, but because I have actual code that uses CBOR, and I won't be able to convert this code to Boost.JSON unless CBOR support is provided. (Or I could write my own, but that rather defeats the point of using a library in the first place, especially given that Niels Lohmann's library already provides CBOR support.)
- What is your evaluation of the implementation?
I didn't look at it.
- What is your evaluation of the documentation?
boost::json::value_to provides a single, clean way to extract values from json, but it's actually rather hard to find in the documentation. I was looking for a way to extract a std::string from a boost::json::value, so I looked at the documentation for boost::json::value and found as_string. OK, that returns boost::json::string, which is not implicitly convertible to std::string (in C++14). But it has an implicit conversion to string_view, which is an alias of boost::string_view, which doesn't appear to be documented anywhere but which (looking at the source code) has a member function to_string. So I ended up with this code:

    boost::json::value v = "Hello world.";
    std::string s = static_cast<boost::json::string_view>(v.as_string()).to_string();

...which I think we can all agree is an abomination. /Then/ I found out about boost::json::value_to, and replaced my code with this:

    std::string s = boost::json::value_to<std::string>(v);

Which is definitely nicer, though still not as nice as Niels Lohmann's equivalent:

    std::string s = v.get<std::string>();

boost::json::value_to or its member-function replacement should definitely be front and center in the documentation.
- What is your evaluation of the potential usefulness of the library?
A good json library is very useful for a large number of programs. The question isn't whether Boost should have a json library, but whether this is the json library that Boost should have.
- Did you try to use the library? With which compiler(s)? Did you have any problems?
I converted a small but non-trivial program from Lohmann's json to the proposed Boost.JSON, and compiled it with several different cross-compilers. I did not encounter any problems compiling or running the program, although I did have to add Boost.Container to the set of linked libraries.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
A few hours of work, most of which was spent converting code from Lohmann's json library to the proposed Boost.JSON.
- Are you knowledgeable about the problem domain?
Yes.
Please be explicit about your decision (ACCEPT or REJECT).
For me, the ultimate question is whether I would actually use this library, and my reluctant answer is "not in its current state". I'm basically satisfied with Lohmann's json library, which requires less verbosity to use and which provides CBOR support. I can see the attraction of Boost.JSON's superior performance, and the attraction of incremental parsing and serialization, but for my usage none of this matters. CBOR support, on the other hand, does matter.

I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of code for parsing and serializing boost::json::value as CBOR. I can live with the added verbosity of Boost.JSON (although I'd rather see it reduced where possible), but not without CBOR. (If CBOR support is not added, this vote should count as an abstain and not as a reject. I don't think that Boost.JSON needs CBOR in order to be useful to other people - I just don't want to vote to accept a library that's not useful for me.)

--
Rainer Deyke (rainerd@eldwood.com)
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
I did have to add Boost.Container to the set of linked libraries.
Boost.Container develop branch has a fix for this, so linking with that library will not be necessary if Boost.JSON is released with Boost: https://github.com/boostorg/container/commit/0b297019ec43483f523a3270b632fec... Thank you for your thoughtful review! Regards
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
- The choice of [u]int64_t seems arbitrary and restrictive. It means that Boost.JSON will not use a 128 bit integer even where one is available, and it means that it cannot compile at all on implementations that don't provide int64_t. It's good enough for my purposes, but I would like to see some discussion about this. The json spec allows arbitrarily large integers.
- On a similar note, the use of double restricts floating point accuracy even when a higher-precision type is available. The json spec allows arbitrarily precise decimal values.
The spec gives implementations freedom to place arbitrary upper limits on precision. To be useful as a vocabulary type, the library prefers homogeneity of interface over min/maxing. In other words, more value is placed on having the library use the same integer representation on all platforms than on using the largest integer width available.

Another point is that the sizes of the types in the library are very tightly controlled. `sizeof(value)` is 16 bytes on 32-bit platforms and 24 bytes on 64-bit platforms, and this is for a reason: to keep performance high and memory consumption low. There is a direct, linear falloff in general performance with increasing size of types.
- boost::json::value_to provides a single, clean way to extract values from json, but it also renders other parts of the library (e.g. number_cast) redundant except as an implementation detail.
Well, number_cast isn't redundant since it is the only interface which offers the use of error codes rather than exceptions. We could have gone with error codes in conversion to user-defined types but then the interface would need some kind of expected<> return type and things get messy there. Exceptions are a natural error handling mechanism. However we recognize that in network programs there is a need to convert numbers without using exceptions, thus number_cast is available.
- boost::json::value_to provides a single, clean way to extract values from json, but it's syntactically long. The same functionality in a member function of boost::json::value would be nicer to use.
Algorithms which can be implemented completely in terms of a class' public interface are generally expressed as free functions in separate header files. If we were to make `get` a member function of `json::value`, then users who have no need to convert to and from user-defined types would be unnecessarily including code they never use. Thanks
- boost::json::value_to provides a single, clean way to extract values from json, but it's syntactically long. The same functionality in a member function of boost::json::value would be nicer to use.
Algorithms which can be implemented completely in terms of a class' public interface are generally expressed as free functions in separate header files. If we were to make `get` a member function of `json::value`, then users who have no need to convert to and from user-defined types would be unnecessarily including code they never use.
To add to that: you can use ADL to avoid naming the namespace: `value_to<std::string>(json_val)`, which is not much longer than `json_val.get<std::string>()`. It could have been named `get_as` though: `get_as<std::string>(json_val)`. But one is as good as the other.
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
I can see the attraction of Boost.JSON's superior performance, and the attraction of incremental parsing and serialization, but for my usage none of this matters. CBOR support, on the other hand does matter.
We are researching the topic. If you would like to weigh in, the issue can be tracked here: https://github.com/CPPAlliance/json/issues/342

One thing I will note, however: a Google search for "CBOR" produces 390,000 results, while a Google search for JSON produces 188,000,000 results. There is surely some margin of error in these numbers, but I have to wonder how widespread the use of CBOR really is.

Thanks
On 15.09.20 16:44, Vinnie Falco via Boost wrote:
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
wrote: I can see the attraction of Boost.JSON's superior performance, and the attraction of incremental parsing and serialization, but for my usage none of this matters. CBOR support, on the other hand does matter.
We are researching the topic. If you would like to weigh in, the issue can be tracked here: https://github.com/CPPAlliance/json/issues/342
One thing I will note, however. A Google search for "CBOR" produces 390,000 results while a Google search for JSON produces 188,000,000 results. Now there is surely some margin of error in these numbers but I have to wonder how widespread is the use of CBOR.
That's a valid point, but I would counter that most applications that handle large amounts of JSON would benefit from using CBOR, and it's often lack of library support that's holding them back. -- Rainer Deyke (rainerd@eldwood.com)
Rainer Deyke wrote:
I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of code for parsing and serializing boost::json::value as CBOR.
I find this condition is too strict. You basically say that you'd rather not see the proposed Boost.JSON enter Boost until CBOR is implemented, which may happen six months from now. So people who don't have a need for CBOR will have to wait until Boost 1.77, which doesn't really help anyone. I can understand requiring a firm commitment on the part of the authors to add CBOR support, but postponing the acceptance until this is implemented...
On 15.09.20 16:45, Peter Dimov via Boost wrote:
Rainer Deyke wrote:
I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of code for parsing and serializing boost::json::value as CBOR.
I find this condition is too strict. You basically say that you'd rather not see the proposed Boost.JSON enter Boost until CBOR is implemented, which may happen six months from now. So people who don't have a need for CBOR will have to wait until Boost 1.77, which doesn't really help anyone.
Actually I would like to see Boost.JSON in Boost, with or without CBOR. However, I can't in good conscience vote to accept a library that I am unwilling to use. Boost should not be a graveyard of well-designed but unused libraries. As I explained in my review, if my condition is not met, I would like my vote to be counted as ABSTAIN, not REJECT. I am hoping that Boost.JSON will get enough votes to accept to make it into Boost, with or without CBOR support.
I can understand requiring a firm commitment on the part of the authors to add CBOR support, but postponing the acceptance until this is implemented...
If such a commitment is made, I would also consider my condition for acceptance met. -- Rainer Deyke (rainerd@eldwood.com)
On 16/09/2020 08:31, Rainer Deyke via Boost wrote:
On 15.09.20 16:45, Peter Dimov via Boost wrote:
Rainer Deyke wrote:
I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of code for parsing and serializing boost::json::value as CBOR.
I find this condition is too strict. You basically say that you'd rather not see the proposed Boost.JSON enter Boost until CBOR is implemented, which may happen six months from now. So people who don't have a need for CBOR will have to wait until Boost 1.77, which doesn't really help anyone.
Actually I would like to see Boost.JSON in Boost, with or without CBOR. However, I can't in good conscience vote to accept a library that I am unwilling to use. Boost should not be a graveyard of well-designed but unused libraries.
As I explained in my review, if my condition is not met, I would like my vote to be counted as ABSTAIN, not REJECT. I am hoping that Boost.JSON will get enough votes to accept to make it into Boost, with or without CBOR support.
If the proposed library were called Boost.Serialisation2 or something, I would see your point. But it's called Boost.JSON. It implements JSON. It does not implement CBOR. I don't think it's reasonable to recommend rejection because a library doesn't do something completely different from what it does.

Speaking more widely than this, if the format here were not specifically JSON, I'd be more sympathetic - binary as well as text serialisation/deserialisation is important. But JSON is unique; most users would not choose JSON except that they are forced to do so by needing to talk to other stuff which mandates JSON.

At work we have this lovely very high performance custom DB based on LLFIO. It has rocking stats. But it's exposed to clients via a REST API, and that means everything goes via JSON. So the DB spends most of its time fairly idle compared to what it is capable of, because JSON is so very, very slow in comparison. If we could choose anything but JSON, we would, but the customer spec requires an even nastier and slower text format than JSON. We expect to win the argument to get them to "upgrade" to JSON, but anything better than that is years away. Change is hard for big governmental orgs.

In any case, CBOR is actually a fairly lousy binary protocol, very inefficient compared to the alternatives. But the alternatives would all require you to design your software differently from what JSON's very reference-count-centric design demands.

Niall
Niall Douglas wrote:
In any case, CBOR is actually a fairly lousy binary protocol. Very inefficient compared to alternatives.
I actually quite like CBOR. Of all the "binary JSON" protocols, it's the best. Or the least worst, if you will. (Except for MessagePack, which I like even more, but it's arguably not a binary JSON.)
On 16/09/2020 14:53, Peter Dimov via Boost wrote:
Niall Douglas wrote:
In any case, CBOR is actually a fairly lousy binary protocol. Very inefficient compared to alternatives.
I actually quite like CBOR. Of all the "binary JSON" protocols, it's the best. Or the least worst, if you will. (Except for MessagePack, which I like even more, but it's arguably not a binary JSON.)
I know what you're saying. However, comparing **C++ implementations** of CBOR to JSON ones does not yield much of a differential. For example, simdjson will happily sustain 10 Gbit of textual JSON parsing per session, which is enough for most NICs. CBOR parsers, at least the ones available to C++, are no better than this. Our custom DB will push 20-25 Gb/sec, so we'd need a 250 Gbit NIC and a zero-copy, all-binary network protocol for the DB to become the primary bottleneck. I doubt any CBOR-like design could ever achieve that, because that design is fundamentally anti-performance.

CBOR's primary gain for me is exact bit-for-bit value transfer, so floating point numbers come out exactly as you sent them. That's rarely needed outside scientific niches though, and even then, just send the FP number as a string encoded in hexadecimal in JSON, right? In fact, for any format which looks better than JSON, encoding your values as hexadecimal strings in JSON is an excellent workaround. Hexadecimal string parsing is very, very fast on modern CPUs; you can often add +20% to JSON-bottlenecked performance by using hexadecimal for everything.

Niall
Niall Douglas wrote:
I know what you're saying. However, comparing **C++ implementations** of CBOR to JSON ones does not yield much differential. For example, simdjson will happily sustain 10 Gbit of textual JSON parsing per session which is enough for most NICs. CBOR parsers, at least the ones available to C++, are no better than this.
That's not a fair comparison, because you need fewer Gb to encode the same thing in CBOR.
On 16/09/2020 16:05, Peter Dimov via Boost wrote:
Niall Douglas wrote:
I know what you're saying. However, comparing **C++ implementations** of CBOR to JSON ones does not yield much differential. For example, simdjson will happily sustain 10 Gbit of textual JSON parsing per session which is enough for most NICs. CBOR parsers, at least the ones available to C++, are no better than this.
That's not a fair comparison, because you need fewer Gb to encode the same thing in CBOR.
This is true, but I didn't mention that I was already accounting for that. CBOR had about 15% overhead over raw binary when I last tested it; for the data we were transmitting, JSON was around 50%. However, simdjson and sajson are many, many times faster than the CBOR library I tested, and JSON compresses very easily.

I guess what I'm really saying here is that yes, JSON emits less dense data than CBOR. But a recent JSON parser + snappy compression produces a denser representation than CBOR, and is still faster overall. I'm very sure that a faster CBOR library than what we have is possible. But given what's currently available in the ecosystem, I'm saying a recent JSON library + fast compression is both faster and produces smaller output than the currently available alternatives right now.

This is why I'm not keen on CBOR personally. I don't think it solves a problem anyone actually currently has; rather, it solves a problem people think they have because they haven't considered adding a compression pass to a fast JSON implementation.

Niall
On 16.09.20 15:33, Niall Douglas via Boost wrote:
On 16/09/2020 08:31, Rainer Deyke via Boost wrote:
On 15.09.20 16:45, Peter Dimov via Boost wrote:
Rainer Deyke wrote:
I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of code for parsing and serializing boost::json::value as CBOR.
I find this condition is too strict. You basically say that you'd rather not see the proposed Boost.JSON enter Boost until CBOR is implemented, which may happen six months from now. So people who don't have a need for CBOR will have to wait until Boost 1.77, which doesn't really help anyone.
Actually I would like to see Boost.JSON in Boost, with or without CBOR. However, I can't in good conscience vote to accept a library that I am unwilling to use. Boost should not be a graveyard of well-designed but unused libraries.
As I explained in my review, if my condition is not met, I would like my vote to be counted as ABSTAIN, not REJECT. I am hoping that Boost.JSON will get enough votes to accept to make it into Boost, with or without CBOR support.
If the proposed library were called Boost.Serialisation2 or something, I would see your point.
But it's called Boost.JSON. It implements JSON. It does not implement CBOR. I don't think it's reasonable to recommend a rejection for a library not doing something completely different to what it does.
I see CBOR not as a separate format, but as an encoding for JSON (with some additional features that can safely be ignored). I use it to store and transmit JSON data, and would not use it for anything else. JSON data exists independently of the JSON serialization format. This is in fact a core principle of Boost.JSON: the data representation exists independently of the serialization functions.
Speaking wider that this, if the format here were not specifically JSON, I'd also be more sympathetic - binary as well as text serialisation/deserialisation is important. But JSON is unique, most users would not choose JSON except that they are forced to do so by needing to talk to other stuff which mandates JSON.
At work we have this lovely very high performance custom DB based on LLFIO. It has rocking stats. But it's exposed to clients via a REST API, and that means everything goes via JSON. So the DB spends most of its time fairly idle compared to what it is capable of, because JSON is so very very slow in comparison.
This is exactly the sort of problem that CBOR excels at. The server produces JSON. The client consumes JSON. Flip a switch, and the server produces CBOR instead. Ideally the client doesn't have to be changed at all. One line of code changed in the server, and suddenly you have twice the data throughput.
In any case, CBOR is actually a fairly lousy binary protocol. Very inefficient compared to alternatives. But the alternatives all would require you to design your software differently to what JSON's very reference count centric design demands.
It may be lousy as a general-purpose binary protocol, but it's a fairly good binary JSON representation. Which is why it belongs in a JSON library if it belongs anywhere. -- Rainer Deyke (rainerd@eldwood.com)
Rainer Deyke wrote:
I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of code for parsing and serializing boost::json::value as CBOR.
The one interesting decision that needs to be made here is how to handle CBOR byte strings (major type 2), as they aren't representable in JSON or in the current boost::json::value. I see that Niels Lohmann has added a "binary" 'kind' to the json value for this purpose. Which then invites the opposite question: what's a JSON serializer supposed to do with kind_binary?
On Wed, Sep 16, 2020 at 6:33 AM Peter Dimov via Boost
The one interesting decision that needs to be made here is how to handle CBOR byte strings (major type 3), as they aren't representable in JSON or in the current boost::json::value.
I see that Niels Lohmann has added a "binary" 'kind' to the json value for this purpose. Which would then invite the opposite question, what's a JSON serializer supposed to do with kind_binary.
If I make a library that has a public function which accepts a parameter of type json::value, I don't expect to see binary objects in it nor would I want to have to write code to handle something that is not part of JSON. And I expect that things round-trip correctly, i.e. if I serialize a json::value and then parse it, I get back the same result. This is clearly impossible if the json::value contains a "binary" string. If CBOR was just another serialization format, then I might lean towards implementing it. But CBOR is sounding more and more like it is Not JSON and thus out-of-scope for Boost.JSON. Thanks
Vinnie Falco wrote:
If I make a library that has a public function which accepts a parameter of type json::value, I don't expect to see binary objects in it nor would I want to have to write code to handle something that is not part of JSON. And I expect that things round-trip correctly, i.e. if I serialize a json::value and then parse it, I get back the same result. This is clearly impossible if the json::value contains a "binary" string.
"Binary string" is basically JSON string with UTF-8 validation turned off; it's a common thing to want to send/receive and the binary formats are arguably correct in offering it as a specific type. Of course this doesn't change the fact that it's not representable in standard JSON.
On 17/09/2020 02:26, Peter Dimov wrote:
"Binary string" is basically JSON string with UTF-8 validation turned off; it's a common thing to want to send/receive and the binary formats are arguably correct in offering it as a specific type. Of course this doesn't change the fact that it's not representable in standard JSON.
The standard JSON representation would be a base64 string. Though of course both sides have to agree to that. And you probably wouldn't want a library to non-semantically auto-decode anything that looks vaguely like a base64 string to bytes (but you'd need something like that if you wanted to round-trip CBOR to JSON to CBOR).
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
- The omission of binary serialization formats (CBOR et al) bothers me. Not from a theoretical point of view, but because I have actual code that uses CBOR, and I won't be able to convert this code to Boost.JSON unless CBOR support is provided.
I've looked at the CBOR specification and some implementations in the wild, and these points stick out:

1. CBOR supports extensions, which cannot be represented in boost::json::value.
2. CBOR also supports "binary" strings, which likewise cannot be represented in boost::json::value.
3. If Boost.JSON's value container could hold these things, then it would no longer serialize to standard JSON.

Therefore, it seems to me that CBOR is not just a "binary serialization format for JSON." It is in fact a completely different format that only strongly resembles JSON. Or perhaps you could say it is a superset of JSON. I think the best way to support this is as follows:

1. Fork the Boost.JSON repository, rename it to Boost.CBOR.
2. Add support for binary strings to the cbor::value type.
3. Add support for extensions to the cbor::value type.
4. Replace the parse, parser, serialize, and serializer interfaces with CBOR equivalents.
5. Propose this library as a new Boost library, with a separate review process.

Then we would have a first-class CBOR library whose interface and implementation are optimized specifically for CBOR. Questions such as what happens when you serialize a cbor::value to JSON would be moot. This could be something that Krystian might take on as author and maintainer. Thanks
Vinnie Falco wrote:
I think the best way to support this is as follows:
1. Fork the Boost.JSON repository, rename it to Boost.CBOR
2. Add support for binary strings to the cbor::value type
3. Add support for extensions to the cbor::value type
This misses the point quite thoroughly; it doesn't "support this" at all. The point of "binary JSON" is that people already have a code base that uses JSON for communication and - let's suppose - boost::json::value internally. Now those people want to offer an optional, alternate wire format that is not as wasteful as JSON, so that the other endpoint may choose to use it. But they most definitely don't want to rewrite or duplicate all their code to be cbor::value based, instead of, or in addition to, json::value based. It doesn't matter that CBOR supports extensions, because they don't. And it may not even matter that the hypothetical CBOR-to-json::value parser doesn't support binary values, because their protocol, being originally JSON-based, doesn't. (There is actually a fully compatible way to support "binary values" in json::value, but it will require some hammering out.)
On Wed, Sep 16, 2020 at 8:02 AM Peter Dimov via Boost
The point of "binary JSON" is that people already have a code base that uses JSON for communication and - let's suppose - boost::json::value internally. Now those people want to offer an optional, alternate wire format that is not as wasteful as JSON, so that the other endpoint may choose to use it.
So what is being discussed here is "partial CBOR support?" In other words, only the subset of CBOR that perfectly overlaps with JSON?
(There is actually a fully compatible way to support "binary values" in json::value, but it will require some hammering out.)
Well, let's hear it! Thanks
Vinnie Falco wrote:
So what is being discussed here is "partial CBOR support?" In other words, only the subset of CBOR that perfectly overlaps with JSON?
Well, it's self-evident that parsing a json::value from, and serializing a json::value to, CBOR could only ever support the subset of CBOR that's representable in json::value. Fortunately, that's almost all of CBOR, because that's what CBOR is for.
(There is actually a fully compatible way to support "binary values" in json::value, but it will require some hammering out.)
Well, let's hear it!
You may remember my going on and on about arrays of scalars at one point.
This is what MessagePack gets right - arrays of scalars are an important
special case and stuffing them into the general value[] representation
wastes both memory and performance.
If json::array supported internally arrays of scalars, without changing its
interface at all so it still appeared to clients as a json::array of
json::values, we could represent a binary value as a json::array (which
internally uses a scalar type of unsigned char.)
This magically solves everything - value_to
On Wed, Sep 16, 2020 at 8:20 AM Peter Dimov
If json::array supported internally arrays of scalars, without changing its interface at all so it still appeared to clients as a json::array of json::values, we could represent a binary value as a json::array (which internally uses a scalar type of unsigned char.)
That is easy to say, but what do you do about this function, which returns a reference:

inline value& array::operator[]( std::size_t pos ) noexcept;

What do you do if you have an array of int (scalar) and someone accesses an element in the middle and assigns a string to it? Thanks
Vinnie Falco wrote:
That is easy to say but what do you do about this function which returns a reference:
inline value& array::operator[]( std::size_t pos ) noexcept;
What do you do if you have an array of int (scalar) and someone accesses an element in the middle and assigns a string to it?
The only possible answer is "copy the entire thing into value[]". This may
or may not be acceptable. I would think that if you have an array of 8044
ints, assigning a string somewhere in the middle would be a rare occurrence,
but who knows.
The upside is that an array of ints would consume significantly less memory
than today; this is often important and we've already seen that for a subset
of users, memory consumption is _the_ important metric. (It would also make
value_to
-----Original Message-----
From: Boost On Behalf Of Pranam Lashkari via Boost
Sent: 14 September 2020 08:30
To: boost
Cc: Pranam Lashkari
Subject: [boost] [review][JSON] Review of JSON starts today: Sept 14 - Sept 23

Boost formal review of Vinnie Falco and Krystian Stasiowski's library JSON starts today and will run for 10 days, ending on 23 Sept 2020. Both of these authors have already developed a couple of libraries which are accepted in Boost (Boost.Beast and Static String).
This library focuses on a common and popular use-case for JSON. It provides a container to hold parsed and serialised JSON types. It provides more flexibility and better benchmark performance than its competitors.
JSON highlights the following features in the documentation:
- Fast compilation - Require only C++11 - Fast streaming parser and serializer - Easy and safe API with allocator support - Constant-time key lookup for objects - Options to allow non-standard JSON - Compile without Boost, define BOOST_JSON_STANDALONE - Optional header-only, without linking to a library
What I knew about JSON could have been written on a postage stamp, but I have at least read the documentation. It is an example of how it should be done. It has good examples and good reference info. I could quickly see how to use it, but didn't have a need, and didn't feel it useful to try it as others already have; there are benchmarks too. On that basis alone, my view is ACCEPT, FWIW.
(a point I would like to add in highlight: it has cool Jason logo 😝)
(😝 indeed - My only recommendation is to replace this with a Boost logo ASAP! No - more than that - I make it a condition for acceptance.)

Paul

PS That there are other libraries doing similar (but fairly different) things is no reason to reject this library.
Again I request everyone to spare some time to review this library. The last day to submit an official review is the 23rd of September. There is no way this review will be extended, as there are other reviews lined up, so please submit your review as soon as possible. Every review is really important to Boost.

Thank you very much for your time in advance.
-- Thank you, Pranam Lashkari, https://lpranam.github.io/
Tomorrow (23rd Sept) is going to be the last day to submit an official review for JSON, so if you are willing to submit a review, hurry up.
-- Thank you, Pranam Lashkari, https://lpranam.github.io/
The formal review of JSON has now officially ended. It will take a couple of days to compile and declare the results.

Thank you very much to everyone who has invested time to review this library.
-- Thank you, Pranam Lashkari, https://lpranam.github.io/
> Please provide in your review information you think is valuable to understand your choice to ACCEPT or REJECT including JSON as a Boost library. Please be explicit about your decision (ACCEPT or REJECT).

ACCEPT. A few minor issues to iron out, but nothing holding back acceptance.

> - What is your evaluation of the design?

Sound. The introduction of a new string type raises eyebrows, but it seems to be worth it given the improvements. However, a conversion to std::string should likely be added (or documented as already supported). I'm not convinced about the as_double, value_to, and number_cast methods, their differences, and the exceptions users would intuitively expect from them. I'd expect that documented at "Using numbers", or at least a cross-reference from there.

> - What is your evaluation of the implementation?

After the discussions on Slack and on the ML it improved considerably, especially due to the use of inline namespaces and the is_/as_ function changes. Things left:

- `basic_parser` being declared in a detail header feels wrong. The implementation (which includes the "private" declaration) is "public", which is the opposite of what I expected, and indeed other reviewers found the same - especially as the docs list it as "defined in detail". I'd put both in public, maybe using "basic_parser_impl.hpp".
- There seem to be multiple ways of doing things, and some make great differences in exception handling, conversions, and performance. Sometimes this doesn't seem to be clear (see other reviews and above).

> - What is your evaluation of the documentation?

Very good, with a few possible improvements:

- "Quick Look" should be a top-level link. I struggled to find it on my first pass. I'd expect it to be the very first link when opening the documentation, or even on the front page.
- "This storage is freed when the parser is destroyed, allowing the parser to cheaply re-use this memory on subsequent parses, improving performance." What does this mean? How can a destroyed parser reuse anything? Or why does it need highlighting that freed memory can be reused?
- The example `handler` assigns "-1" to a std::size_t. I guess that deserves a comment, at least for beginners wondering about the signed-to-unsigned conversion.
- "`finish` Parse JSON incrementally." is lacking a clearer summary.
- "parser::release" says "UB if the parser is not done" but also "If ! this->done(), an exception is thrown.", which seems contradictory. I'd remove the UB here, and maybe use an error-code overload if exceptions should be avoided.
- "write/write_some" both say "Parse JSON incrementally.". The summary should make it clearer what they do, and the full description should contain, well, a full description.
- Concepts like "ToMapLike" are used before they are explained, without a link/reference or an explanation of what they are.

> - What is your evaluation of the potential usefulness of the library?

Very useful, especially after reading Peter's review about extending the parser etc.

> - Did you try to use the library? With which compiler(s)? Did you have any problems?

Yes, with some small tests. GCC 10.0, no problems.

> - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

A couple of hours checking the documentation and code.

> - Are you knowledgeable about the problem domain?

OK, I guess - just as anyone who has used JSON anywhere.
On Mon, Sep 21, 2020 at 2:05 AM Alexander Grund via Boost
ACCEPT. Few minor issues to iron out, but nothing holding back acceptance.
Thank you for your thoughtful review and comments. I have broken out each of the problems you raised into separate issues in the repository so they can be tracked. Regards
I've gotten some great feedback on this library; thanks to everyone who spent the time to look at this work and comment on it. We have a little under three days left, and I wanted to address something that is obviously important, as it has been brought up a few times here and elsewhere. It seems there are two desired use-cases for JSON:

1. "JSON-DOM" - parse and serialize to and from a variant-like hierarchical container provided by the library.
2. "JSON Serialization" - parse and serialize directly to and from user-defined types.

My thoughts on this are as follows:

* Both of these use-cases are useful and desirable.
* Most of the time, a user wants one or the other - rarely both.
* Optimized implementations of these use-cases are unlikely to share much code.
* These are really two different libraries.

Boost.JSON is designed to offer 1. above and has no opinion on 2. Some of the less-than-positive reviews argue that both of these use-cases should be crammed into one library, otherwise users should not have access to either. Here are some related facts:

* No one has submitted a JSON library of any kind for review to Boost *ever*.
* The most popular JSON libraries implement JSON-DOM, not JSON Serialization.
* Even one of the most popular serialization libraries, Boost.Serialization, does not offer a JSON archive implementation.
* Boost.PropertyTree supports JSON-DOM out of the box, but not JSON Serialization.

I find it interesting that people are coming out of the woodwork who claim to have written their own JSON libraries, and who say REJECT to Boost.JSON because they feel that conversion between JSON and user-defined types is of the utmost importance and that Boost can't have a JSON library without it.

* If this is so important, why does Boost.Serialization not have it?
* Why is no one submitting a pull request to Boost.Serialization for a JSON archive?
* Why has no one proposed a library to Boost which implements JSON Serialization?
* Why does Boost.PropertyTree have JSON-DOM but no JSON Serialization?
* Where are the immensely popular JSON Serialization libraries?

Meanwhile, the JSON for Modern C++ repository has over 20,000 GitHub stars, and Tencent's RapidJSON repository has almost 10,000 GitHub stars. In a sense these libraries have become standards, which is a shame, since they both have defects which I believe Boost.JSON addresses. Clearly the JSON-DOM use case is popular (and these libraries also do not offer JSON Serialization). Where are the immensely popular JSON Serialization libraries?

Not to put too fine a point on it, but these arguments against Boost.JSON do not withstand any sort of scrutiny; rational readers should find them entirely unconvincing. Thanks
On 9/21/20 11:46 AM, Vinnie Falco via Boost wrote:
I find it interesting that people are coming out of the woodwork who claim to have written their own JSON libraries, that say REJECT to Boost.JSON, because they feel that conversion between JSON and user-defined types is of the utmost importance and that Boost can't have a JSON library without it.
* If this is so important, why does Boost.Serialization not have it?
Good question. FYI - no one has EVER requested or even inquired about this. I don't think it would be very hard (the xml_archive is hard). Maybe everyone who felt they needed it just made their own. It's also possible that no one uses the serialization library anymore; I would have no way of knowing if that were the case. FWIW, I'm hoping this year to do a re-boot of the serialization documentation. The main focus is to convert it from raw HTML to Boost.Book. But I also want to add more examples: using it for "deep copy", using coroutines to convert from one serialization format to another, layering other Boost libraries like encryption and compression, generating "editable" archives, etc. None of these will alter the library itself - just provide more examples. I have always been puzzled why no one has ever asked for any of these things.
* Why is no one submitting a pull request to Boost.Serialization for a JSON archive?
ditto
* Why has no one proposed a library to Boost which implements JSON Serialization?
* Why does Boost.PropertyTree have JSON-DOM but no JSON Serialization?
* Where are the immensely popular JSON Serialization libraries?
ditto Robert Ramey
Robert Ramey via Boost said: (by the date of Mon, 21 Sep 2020 13:01:58 -0700)
* If this is so important, why does Boost.Serialization not have it?
Good question. FYI - no one has EVER requested or even inquired about this. I don't think it would be very hard (the xml_archive is hard). Maybe everyone who felt they needed it just made their own. It's also possible that no one uses the serialization library anymore.
I think that it's so good that there is nothing left to ask about (except for problems with serializing boost::optional ;) #165

Vinnie Falco via Boost said: (by the date of Mon, 21 Sep 2020 11:46:13 -0700)
Not to put too fine a point on it...but these arguments against Boost.JSON do not withstand any sort of scrutiny; rational readers should find them entirely unconvincing.
Agreed. -- # Janek Kozicki http://janek.kozicki.pl/
Em seg., 21 de set. de 2020 às 15:46, Vinnie Falco via Boost
* Where are the immensely popular JSON Serialization libraries?
My gopher friend actually makes fun of us because we don't have json.Unmarshal(): https://blog.golang.org/json It's him making fun of us/me that gave me the idea to integrate Boost.Serialization and Boost.Hana's Struct.

```go
type Bid struct {
	Price     string
	Size      string
	NumOrders int
}

type OrderBook struct {
	Sequence int64 `json:"sequence"`
	Bids     []Bid `json:"bids"`
	Asks     []Bid `json:"asks"`
}

...

var book OrderBook
json.Unmarshal(buffer, &book)
```

But that's a subject that I'll want to discuss with Robert Ramey after Boost.JSON's review. In my job, we have our own in-house serialization framework. I guess that's what many others do as well in C++, but the requirements on the JSON library to be usable in serialization frameworks would be quite similar. Compare Boost.Serialization and QDataStream from Qt. -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/
Vinícius dos Santos Oliveira wrote:
In my job, we have our own in-house serialization framework. I guess that's what many others do as well in C++, but the requirements on the JSON library to be usable in serialization frameworks would be quite similar.
The easiest way to make a JSON input archive for Boost.Serialization is to use Boost.JSON and go through json::value. Boost.Serialization typically wants fields to appear in the same order they were written, but JSON allows arbitrary reordering. Reading a json::value first and then deserializing from that is much easier than deserializing directly from JSON. For output, it's easier to bypass json::value and write JSON directly.
Em seg., 21 de set. de 2020 às 23:02, Peter Dimov via Boost
The easiest way to make a JSON input archive for Boost.Serialization is to use Boost.JSON and go through json::value. Boost.Serialization typically wants fields to appear in the same order they were written, but JSON allows arbitrary reordering. Reading a json::value first and then deserializing from that is much easier than deserializing directly from JSON.
For output, it's easier to bypass json::value and write JSON directly.
It may be easier, but it's also the wrong way. There are libraries whose usage patterns leak some structure into user code itself; we have Lua and its virtual stack, for instance. Boost.Serialization is one such library. That's really a topic whose explanation I'd rather delay.

Boost.Serialization's un-capturable structure makes it impossible to untangle the serialization format from ordered trees. For a JSON archive, this means arrays everywhere, and json::value doesn't really help here.

But there's a catch. The user can have very valid concerns to control serialization to just one archive or another. Here's where he'll overload his types directly for a single archive, and here's where he can use archive extensions from one model. For the JSON iarchive, the only extension we need to expose is the pull parser object. This, plus some accompanying algorithms (json::partial::skip(), json::partial::scanf(), ...), is a much better answer than what you propose. It's not really hard if you understand the pull parser model.

But I'm more excited about Boost.Hana's Struct integration, actually. We can discuss all this in detail after Boost.JSON's review. Please don't misdirect discussions with comments such as "it'd be easier with json::value"; that's hardly a comment from somebody who has explored the subject. -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/
Vinícius dos Santos Oliveira wrote:
But I'm more excited about Boost.Hana's Struct integration actually.
Is this like https://pdimov.github.io/describe/doc/html/describe.html#example_serializati... or do you have something else in mind?
Em ter., 22 de set. de 2020 às 09:22, Peter Dimov
But I'm more excited about Boost.Hana's Struct integration actually.
Is this like
https://pdimov.github.io/describe/doc/html/describe.html#example_serializati...
or do you have something else in mind?
Yes and no. If you rely on universal serialization, you fall back to Boost.Serialization's implied ordered trees. The TMP code could look like the example you linked (e.g. one mp11::mp_for_each() here and there). Are you willing to submit a new reflection library to Boost? I'd be glad to hear all about it after Boost.JSON's review. One such question would be: how does it compare to Boost.Hana's Struct?

The reason why I'm trying to delay this debate is that I know we'll have lots of noise from non-interested parties, and I'd rather not deal with that. A couple of extra days and all the unwanted noise will vanish. There are a few questions to sort out:

- Do you want to enable Boost.Hana by default or use an opt-in mechanism?
- Overload rules to choose the most specific serialization.
- Integration with Boost.Serialization's extra features. This will require a larger time investment, but has nothing to do with Boost.Hana or integration with any other reflection library.
- And of course, concerns raised by interested stakeholders.

The end game would be:

```json
{
  "foo": 42,
  "bar": "hello world"
}
```

(de)serializes effortlessly to:

```cpp
struct Foobar {
    int foo;
    std::string bar;
};
```

just like Go's json.Unmarshal(). -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/
Vinícius dos Santos Oliveira wrote:
The end game would be:
```json
{ "foo": 42, "bar": "hello world" }
```
Switch "foo" and "bar" here for generality.
(de)serializes effortlessly to:
```cpp
struct Foobar { int foo; std::string bar; };
```
It's already possible to make this work using Boost.JSON. https://pdimov.github.io/describe/doc/html/describe.html#example_from_json Just add the `parse` call.
Are you willing to submit a new reflection library to Boost?
Yes, I'm waiting for the review to end to not detract from the Boost.JSON discussions.
On Tue, Sep 22, 2020 at 10:21, Peter Dimov
It's already possible to make this work using Boost.JSON. https://pdimov.github.io/describe/doc/html/describe.html#example_from_json
Just add the `parse` call.
You're missing the point. I don't want the DOM intermediate representation. It's not needed. The right choice for the archive concept is to use a pull parser and read directly from the buffer. The DOM layer adds a high cost here.
Yes, I'm waiting for the review to end to not detract from the Boost.JSON discussions.
Looking forward to it. -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/
Vinícius dos Santos Oliveira wrote:
You're missing the point. I don't want the DOM intermediate representation. It's not needed.
I get that. I also get the appeal and the utility of pull parsers. But my point is that I can make that work today, quite easily, using Boost.JSON. It's 2020. Boost has zero pull JSON parsers. (I counted them, twice.) The "boost" implementation on https://github.com/kostya/benchmarks#json uses PropertyTree and is leading from behind. Maybe Boost.JSON can be refactored and a pull parser can be inserted at the lowest level. But in the meantime, there are people who have actual need for a nlohmann/json (because of speed) or RapidJSON (because of interface) replacement, and we don't have it. It doesn't make much sense to me to wait until 2032 to have that in Boost.
On 22. Sep 2020, at 16:23, Peter Dimov via Boost
wrote: But in the meantime, there are people who have actual need for a nlohmann/json (because of speed) or RapidJSON (because of interface) replacement, and we don't have it. It doesn't make much sense to me to wait until 2032 to have that in Boost.
My rough count of accept votes indicates that Boost.JSON is going to be accepted, so you get what you want, but I feel we gave up on trying to achieve the best possible technical solution for this problem out of a wrong sense of urgency (also, considering the emails by Bjørn and Vinícius, it does not seem like we need to wait until 2032 for a different approach).

This is C++; we strive for "zero overhead" and "maximum efficiency for all reasonable use cases", and achieving that requires careful interface design. I only worry about interfaces, because implementations can be improved at any time. However, if an interface is designed to require more work than absolutely necessary, then this cannot be fixed afterwards. You said yourself that Boost.JSON is not as efficient as it could be during the conversion of "my data type" to JSON, because the existing data has to be copied into the json::value first. I am a young member of the Boost family, but my feeling is that this would have been a reason to reject the design in the past. Designing abstractions that enable users to get maximum performance if they want it is a core value of C++.

As my previous examples of pybind11 and cereal have shown, the lasting legacies of Boost are excellent interfaces. Making good interfaces is very difficult, and that's where the review process really shines. We have not achieved that here, since valid concerns are pushed aside by the argument: we have to offer a solution right now. Best regards, Hans
Hans Dembinski wrote:
You said yourself that Boost.JSON is not as efficient as it could be during the conversion of "my data type" to JSON, because the existing data has to be copied into the json::value first. I am a young member of the Boost family, but my feeling is that this would have been a reason to reject the design in the past.
I don't think so. As a previous reviewer correctly observed, an apple has been submitted, and you're complaining that it isn't an orange. Were it a bad apple, that would have been a reason to reject. If we didn't need apples, if users didn't need apples, if the two most popular fruits weren't apples, that might have been a reason to reject. Not being an orange isn't. One possible objection (that has been used in the past) is that if we accept an apple, nobody will submit an orange anymore. That's where the calendar argument comes in. It's 2020, we didn't have an apple, and nobody has submitted an orange. Ten years should have been enough time for orange proponents. And this has nothing to do with "urgency".
On 23.09.20 11:04, Hans Dembinski via Boost wrote:
You said yourself that Boost.JSON is not as efficient as it could be during the conversion of "my data type" to JSON, because the existing data has to be copied into the json::value first. I am a young member of the Boost family, but my feeling is that this would have been a reason to reject the design in the past. Designing abstractions that enable users to get maximum performance if they want it is a core value of C++.
Actually, some of the oldest Boost libraries have (had) atrocious performance. Early Boost libraries were often more concerned with how to do something at all (within the limitations of C++03) than with how to do it efficiently. -- Rainer Deyke (rainerd@eldwood.com)
On 23/09/2020 10:04, Hans Dembinski via Boost wrote:
On 22. Sep 2020, at 16:23, Peter Dimov via Boost
wrote: But in the meantime, there are people who have actual need for a nlohmann/json (because of speed) or RapidJSON (because of interface) replacement, and we don't have it. It doesn't make much sense to me to wait until 2032 to have that in Boost.
My rough count of accept votes indicates that Boost.JSON is going to be accepted, so you get what you want, but I feel we gave up on trying to achieve the best possible technical solution for this problem out of a wrong sense of urgency (also considering the emails by Bjørn and Vinícius I does not seem like we need to wait for 2032 for a different approach).
For the record, I've had offlist email discussions about the proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case. So basically it would be wasted effort, and they haven't bothered. I haven't looked at the library myself, so I cannot say if the concerns those people raised with it are true, but what you just stated above about the lack of trying for a best possible technical solution is bang on the nail if one were to attempt summarising the feeling of all those correspondences.

Me personally, if I were designing something like Boost.JSON, I'd implement it using a generator-emitting design. I'd make the supply of input destructive and gather-buffer based, so basically you feed the parser arbitrarily sized chunks of input, and the array of pointers to those discontiguous input blocks is the input document. As the generator resumes, emits and suspends during the parse, it would destructively modify those input blocks in place in order to avoid as much dynamic memory allocation and memory copying as possible. I'd avoid all variant storage and all type erasure by separating the input syntax lex from the value parse (which would be on-demand, lazy); that also lets one say "go get me the next five key-values in this dictionary", which would utilise superscalar CPU concurrency to execute those in parallel.

I would also attempt to make the whole JSON parser constexpr, not necessarily because we need to parse JSON at compile time, but because it would force the right kind of design decisions (e.g. all literal types) which generate significant added value for the C++ ecosystem. I mean, what's the point of yet another N+1 JSON parser when we could have a Hana Dusíková all-constexpr-regex-style JSON parser?
Consider this: a Hana Dusíková type all-constexpr JSON parser could let you specify to the compiler at compile time "this is the exact structure of the JSON that shall be parsed". The compiler then bangs out optimum parse code for that specific JSON structure input. At runtime, the parser tries the pregenerated canned parsers first; if none match, it falls back to runtime parsing. Given that much JSON is just a long sequence of identically structured records, this would be a very compelling new C++ JSON parsing library, a whole new and better way of doing parsing. *That* I would get excited about. Niall
On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
Me personally, if I were designing something like Boost.JSON, I'd implement it using a generator emitting design. I'd make the supply of input destructive gather buffer based
So you would implement an orange instead of an apple. Note that the C++ ecosystem already has the flavor of orange you are describing, it is called SimdJSON and it is quite performant. As with your approach, it produces a read-only document. Quite different from Boost.JSON. There's nothing wrong with implementing a library that only offers a parser and a read-only DOM, and there are certainly use-cases for it. Personally I would not try to compete with SimdJSON myself but perhaps you can do better than me, especially in the area of "parallel execution utilising superscalar CPU concurrency." Regards
On 23/09/2020 13:14, Vinnie Falco wrote:
On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
wrote: Me personally, if I were designing something like Boost.JSON, I'd implement it using a generator emitting design. I'd make the supply of input destructive gather buffer based
So you would implement an orange instead of an apple. Note that the C++ ecosystem already has the flavor of orange you are describing, it is called SimdJSON and it is quite performant. As with your approach, it produces a read-only document. Quite different from Boost.JSON. There's nothing wrong with implementing a library that only offers a parser and a read-only DOM, and there are certainly use-cases for it. Personally I would not try to compete with SimdJSON myself but perhaps you can do better than me, especially in the area of "parallel execution utilising superscalar CPU concurrency."
I never said what I'd do is comparable to what you've done. You're absolutely right it's an apple vs orange difference. What I was describing is the sort of design which would get me excited because it's novel and opens up all sorts of new opportunities not currently well served by existing solutions in the ecosystem.

There was a larger point made though, regarding the tradeoff of optimal design vs getting it done. Historically you've favoured getting it done, and have not been a warm recipient of suggestions regarding alternative designs which would require you throwing most or all of what you've done away and starting again. Quite a few people have perceived this about you, and no longer bother to comment on anything you're doing or seeking advice upon.

You asked off list what I meant by "personal retribution". I'd prefer to answer that on list. I am referring to you having, in the past, been perceived as harassing and persecuting individuals whose technical opinion you disagree with, across multiple internet forums over multiple months, especially if they have ever publicly criticised a technical design that you personally believe doesn't deserve that criticism. That has caused those people, who perceive that about you, to not be willing to interact with anything you touch or are involved with, because they aren't willing to be followed all over the internet for the next few months by you.

Now, personally speaking, I think it's more a case of you being rather enthusiastic and passionate in your beliefs and not thinking through other people's perception of you applying those beliefs, rather than you being malevolent. I have stated that opinion about you on several occasions when my opinion was privately sought.
But you also need to accept that you reap what you sow in how you're perceived to treat people, just the same way as I knew I'd permanently make most of the technical Boost leadership hate me for life when I decided to go rattle their cages here so many years ago. You may take this reply personally. It was not meant as a personal attack. It was meant as a statement of facts to my best understanding, because I don't think many people tell you this stuff, and if nobody tells you, then there's no progress possible. Niall
On Wed, 23 Sep 2020 at 15:14, Niall Douglas via Boost
On 23/09/2020 13:14, Vinnie Falco wrote:
On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
wrote: Me personally, if I were designing something like Boost.JSON, I'd implement it using a generator emitting design. I'd make the supply of input destructive gather buffer based
So you would implement an orange instead of an apple. Note that the C++ ecosystem already has the flavor of orange you are describing, it is called SimdJSON and it is quite performant. As with your approach, it produces a read-only document. Quite different from Boost.JSON. There's nothing wrong with implementing a library that only offers a parser and a read-only DOM, and there are certainly use-cases for it. Personally I would not try to compete with SimdJSON myself but perhaps you can do better than me, especially in the area of "parallel execution utilising superscalar CPU concurrency."
I never said what I'd do is comparable to what you've done. You're absolutely right it's an apple vs orange difference.
What I was describing is the sort of design which would get me excited because it's novel and opens up all sorts of new opportunities not currently well served by existing solutions in the ecosystem.
There was a larger point made though, regarding the tradeoff of optimal design vs getting it done. Historically you've favoured getting it done, and have not been a warm recipient to suggestions regarding alternative designs which would require you throwing most or all of what you've done away and starting again. Quite a few people have perceived this about you, and no longer bother to comment on anything you're doing or seeking advice upon.
You asked off list what I meant by "personal retribution". I'd prefer to answer that on list. I am referring to you having, in the past, been perceived as harassing and persecuting individuals whose technical opinion you disagree with, across multiple internet forums over multiple months, especially if they have ever publicly criticised a technical design that you personally believe doesn't deserve that criticism.
That has caused those people, who perceive that about you, to not be willing to interact with anything you touch or are involved with, because they aren't willing to be followed all over the internet for the next few months by you.
Now, personally speaking, I think it's more a case of you being rather enthusiastic and passionate in your beliefs and not thinking through other people's perception of you applying those beliefs, rather than you being malevolent. I have stated that opinion about you on several occasions when my opinion was privately sought. But you also need to accept that you reap what you sow in how you're perceived to treat people, just the same way as I knew I'd permanently make most of the technical Boost leadership hate me for life when I decided to go rattle their cages here so many years ago.
You may take this reply personally. It was not meant as a personal attack. It was meant as a statement of facts to my best understanding, because I don't think many people tell you this stuff, and if nobody tells you, then there's no progress possible.
I think you highlight one of the problems plaguing C++ decision-making in the present day: far too much concern over feelings, ruffled feathers, politics and imagined thought crimes, and not enough technical discussion backed with facts and benchmarks.

I have worked with Vinnie for the past 9 months. During that time he has spoken to me directly concerning design and implementation choices I have made, often indicating strong disagreement in the most uncompromising terms. However, because I am an adult, able to draw upon my experience and not immediately exhibit an emotional reaction to being challenged, I have been quite able to make my view firmly known and understood, without any fear whatsoever of "personal retribution". I have not experienced "being followed all over the internet" in any way whatsoever.

I am also often told by Peter in his deadpan way that "I am wrong". I actually find this quite humorous; for me the comedy is in expecting to hear the reason, which never comes - presumably because he thinks I ought to be bright enough to see why I am wrong without being told (I am often not).

I think there have been some very valid observations made about the design choices of Boost.JSON. I have made some myself, even going so far as to push code to see if I could do better than the existing implementation. I think everyone concerned in its development is quite open that the code is a compromise between "correctness" and performance. In fact, while maintaining Boost.Beast, I have been reminded that even in 2020, sometimes you have to break a few rules to get things to go fast. I have been an advocate of "speed is a hardware problem, elegance is a software problem" all my life - but with imperfect compilers, sometimes brute force is unfortunately the answer.
It seems to me that some of the more scathing criticisms of the library have been made *without offering an alternate implementation.* This might be material to the strength of resistance to these observations. I have personally watched Boost.JSON evolve over the past six (more?) months. The testing, writing, rewriting and retesting has consumed at least 18 man-months (this term used advisedly and unashamedly) of effort. That's 18 months of people's lives dedicated to producing the best possible tool for the most common use-case of a JSON library. There has been an almost messianic effort made to match and exceed the performance of RapidJSON, so that no-one can say that the Boost offering is anything other than best in class where it matters, *at the point of use*.

If people are going to argue that the underlying design choices are wrong, I think they are morally compelled to offer a demonstration of why - preferably as a PR. Knowing Vinnie as I do, I am quite convinced that no matter who submitted the code, if it improved the final result, it would be *welcomed* and the author given full credit, regardless of any previous crossed words spoken in the heat of the moment.

I think it's worth reminding everyone that most of us do what we do because we are passionate about it. It is therefore natural for people to argue strongly for what they believe to be right for the good of all. I personally hope that people who are unwilling to contribute for fear of hurt feelings can find it in themselves to speak up - they may end up improving a useful library. Maybe even discover that, through their shared interest in C++, they can cross boundaries and find interesting friends in unlikely places. R
Niall
-- Richard Hodges hodges.r@gmail.com office: +442032898513 home: +376841522 mobile: +376380212
Niall Douglas wrote:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case.
Personal retribution, really?
Consider this: a Hana Dusíková type all-constexpr JSON parser could let you specify to the compiler at compile time "this is the exact structure of the JSON that shall be parsed". The compiler then bangs out optimum parse code for that specific JSON structure input. At runtime, the parser tries the pregenerated canned parsers first, if none match, then it falls back to runtime parsing.
That's definitely an interesting research project, but our ability to imagine it does not mean that people have no need for what's being submitted - a library with the speed of RapidJSON and the usability of JSON for Modern C++, with some additional and unique features such as incremental parsing. To go on your tangent, I, personally, think that compile-time parsing is overrated because it's cool. Yes, CTRE is a remarkable accomplishment, and yes, Tim Shen, the author of libstdc++'s <regex> also thinks that compile-time regex parsing is the cure for <regex>'s ills. But I don't think so. In my unsubstantiated opinion, runtime parsing can match CTRE's performance, and the only reason current engines don't is because they are severely underoptimized. Similarly, I doubt that a constexpr JSON parser will even match simdjson, let alone beat it.
On 23/09/2020 14:21, Peter Dimov via Boost wrote:
Niall Douglas wrote:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case.
Personal retribution, really?
Some have interpreted it as such, yes. I have a raft of private email that arrived right after my post here, recounting their stories about being on the receiving end of Vinnie's behaviour, and/or thanking me for writing that post.

Richard, I appreciate your "they're being snowflakes" response and standing up for your friend; that was good of you. You should be aware that I've known Vinnie longer than you, possibly as long as anyone here. I think you'll find you're in the "get it done" philosophical camp (at least that's my judgement of you from studying the code you write), so Vinnie's fine with you. I have noticed, from watching him on the internet, that he tends to find most issue with those in the "aim for perfection" philosophical camp. Vinnie particularly dislikes other people's visions of abstract perfection if they make no sense to him, if they're obtuse, or if he doesn't understand them. If you're in that camp, then you might have a very different experience than what you've had.

Nevertheless, I believe Vinnie's opinion is important as representative of a significant minority of C++ users, and I think it ought to continue to be heard. I might add that the said "snowflakes" I've spoken to have all to date agreed with that opinion; we're perfectly capable of withstanding severe technical criticism - indeed, some of us serve on WG21, where every meeting is usually a battering of oneself. Anyway, I have no wish to discuss this further; all I want to say has been said.
Consider this: a Hana Dusíková type all-constexpr JSON parser could let you specify to the compiler at compile time "this is the exact structure of the JSON that shall be parsed". The compiler then bangs out optimum parse code for that specific JSON structure input. At runtime, the parser tries the pregenerated canned parsers first, if none match, then it falls back to runtime parsing.
To go on your tangent, I, personally, think that compile-time parsing is overrated because it's cool. Yes, CTRE is a remarkable accomplishment, and yes, Tim Shen, the author of libstdc++'s <regex> also thinks that compile-time regex parsing is the cure for <regex>'s ills. But I don't think so. In my unsubstantiated opinion, runtime parsing can match CTRE's performance, and the only reason current engines don't is because they are severely underoptimized.
Hana's runtime benchmarks showed her regex implementation far outpacing any of those in the standard libraries. Like, an order of magnitude in absolute terms, linear scaling to load instead of exponential for common regex patterns. A whole new world of performance. Part of why her approach is so fast is because she didn't implement all of regex. But another part is because she encodes the parse into relationships between literal types which the compiler can far more aggressively optimise than complex types. So basically the codegen is way better, because the compiler can eliminate a lot more code.
Similarly, I doubt that a constexpr JSON parser will even match simdjson, let alone beat it.
simdjson doesn't have class leading performance anymore. There are faster alternatives depending on your use case. Niall
On Wed, Sep 23, 2020 at 2:20 PM Niall Douglas wrote:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case.
Personal retribution, really?
Some have interpreted it as such yes. I have a raft of private email that just arrived there after my post here recounting their stories about being on the receiving end of Vinnie's behaviour, and/or thanking me for writing that post.
Niall, I think this might be the first time I've seen you review the submitter instead of just the submission, so I have to assume there is a reason this is being raised in public on the Boost mailing list during this review. What is the genuine concern here - with the submission, the reviews so far, the review manager, etc.? It seems like there's some subtext that I'm not grasping, and it would be better if it were spoken plainly. Mateusz is an active review wizard who can intervene if you want to raise an issue that something is compromised. Otherwise, I'm not sure why the off-list opinions of people about Vinnie matter to a Boost review of this library. Glen
On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
For the record, I've had offlist email discussions... where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case.
Citing "anonymous sources with knowledge of the matter" to disparage someone is bad enough, but then to make the unfounded allegation that the Boost review process is rigged in advance to favor a particular outcome is disrespectful to everyone who has invested time in the process, including the review manager and the review wizards. As was stated, reviews can be submitted anonymously and they will be evaluated on their merit. Acting as a proxy for anonymous attacks on a submitter is unprofessional and entirely inappropriate. We don't do that here.
...I've known Vinnie longer than you... ...Vinnie's fine with you. I have noticed, from watching him... ...he tends to find most issue with... ...Vinnie particularly dislikes other people's visions...
I would appreciate it if you didn't speak for me, or pretend to think you know what I find issues with, or what visions I like or I dislike. You owe the list an apology. Thanks
On Wed, Sep 23, 2020 at 15:44, Vinnie Falco via Boost
[...] You owe the list an apology.
wow, quite intimidating words. Calm down, fella. I know you're under big pressure; a Boost review is no small undertaking, and pressure does blind the best of our judgments. However... you're being a little paranoid here. Just like you were paranoid when you demanded a second review manager, just like you were being paranoid when you opened this whole thread on the premise of "I find it interesting that people are coming out of the woodwork" (and that was your response to a **single** reject vote). There's a pattern here. Niall never said the process is being rigged. You're imagining it. Niall just said people were discouraged from sending any review thanks to your past behaviour. I find that quite easy to believe, actually. I've spent 50 hours on a review for which your answer summed up to cheap baits. It was so depressing that I didn't even reply. And you had my feedback... a long time ago. Your answer? "Let's put it up for review". You never actually considered my feedback. So yeah, you're not exactly welcoming of feedback. I wouldn't be surprised to hear a declaration such as Niall's. There's no need for an apology here. Just move on (all of you - Niall included), and try to learn something from this event. -- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/
Niall Douglas wrote:
Hana's runtime benchmarks showed her regex implementation far outpacing any of those in the standard libraries. Like, an order of magnitude in absolute terms, linear scaling to load instead of exponential for common regex patterns. A whole new world of performance.
Part of why her approach is so fast is because she didn't implement all of regex. But another part is because she encodes the parse into relationships between literal types which the compiler can far more aggressively optimise than complex types. So basically the codegen is way better, because the compiler can eliminate a lot more code.
I've looked at CTRE, I know what it does, how it does it, and how well it performs. It is, as I said, a remarkable piece of engineering, and I respect Hana for her work. https://pdimov.github.io/blog/2020/05/15/its-a-small-world/ Nevertheless, I have a not-entirely-uninformed hunch that a runtime engine can perform on par. Of course, until/unless I can substantiate this more thoroughly, by for example writing a runtime regex engine that exhibits similar performance to CTRE, you can file this under "idle speculation".
On 9/23/20 3:07 PM, Peter Dimov via Boost wrote:
Niall Douglas wrote:
Hana's runtime benchmarks showed her regex implementation far outpacing any of those in the standard libraries. Like, an order of magnitude in absolute terms, linear scaling to load instead of exponential for common regex patterns. A whole new world of performance.
Part of why her approach is so fast is because she didn't implement all of regex. But another part is because she encodes the parse into relationships between literal types which the compiler can far more aggressively optimise than complex types. So basically the codegen is way better, because the compiler can eliminate a lot more code.
I've looked at CTRE, I know what it does, how it does it, and how well it performs. It is, as I said, a remarkable piece of engineering, and I respect Hana for her work. https://pdimov.github.io/blog/2020/05/15/its-a-small-world/
Nevertheless, I have a not-entirely-uninformed hunch that a runtime engine can perform on par. Of course, until/unless I can substantiate this more thoroughly, by for example writing a runtime regex engine that exhibits similar performance to CTRE, you can file this under "idle speculation".
Please let me know when/if evidence becomes available that elevates this beyond "idle speculation". Tom.
On Wed, 23 Sep 2020 at 20:20, Niall Douglas via Boost
On 23/09/2020 14:21, Peter Dimov via Boost wrote:
Niall Douglas wrote:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case.
Personal retribution, really?
Some have interpreted it as such yes. I have a raft of private email that just arrived there after my post here recounting their stories about being on the receiving end of Vinnie's behaviour, and/or thanking me for writing that post.
Richard, I appreciate your "they're being snowflakes" response and standing up for your friend, that was good of you. You should be aware that I've known Vinnie longer than you, possibly as long as anyone here. I think you'll find you're in the "get it done" philosophical camp (at least that's my judgement of you from studying the code you write), so Vinnie's fine with you. I have noticed, from watching him on the internet, that he tends to find most issue with those in the "aim for perfection" philosophical camp. Vinnie particularly dislikes other people's visions of abstract perfection if it makes no sense to him, if it's obtuse, or he doesn't understand it. If you're in that camp, then you might have a very different experience than what you've had. Nevertheless, I believe Vinnie's opinion is important as representative of a significant minority of C++ users, and I think it ought to continue to be heard. I might add that the said "snowflakes" that I've spoken to have all to date agreed with that opinion; we're perfectly capable of withstanding severe technical criticism. Indeed, some of us serve on WG21, where every meeting is usually a battering of oneself.
Anyway, I have no wish to discuss this further, all I want to say has been said.
Niall, It is unfortunate that you decided to discuss in great detail the issues of personal interactions with Vinnie *before* the JSON review curtain drops. Such pursuits just make the review manager's (and wizards') job much harder than it has to be. No, I don't object to the public collective catharsis. I just think the timing for it was very unfortunate. A reminder to all: we are currently trying to answer a simple question, "Do you think the [JSON] library should be accepted as a Boost library?", based on constructive criticism and technical evidence. Best regards, -- The Review Wizard for Boost C++ Libraries Mateusz Loskot, http://mateusz.loskot.net
Niall Douglas wrote:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, [...]
FWIW, the Boost review process allows reviews to be submitted directly to the review manager, in private, without being posted on the list. That's mostly a mechanism to avoid the exact same scenario as described above. I don't think we've ever needed it ("so far"), but it exists and can be taken advantage of, by those who are so inclined.
On 9/23/20 1:55 PM, Niall Douglas via Boost wrote:
Consider this: a Hana Dusíková type all-constexpr JSON parser could let you specify to the compiler at compile time "this is the exact structure of the JSON that shall be parsed". The compiler then bangs out optimum parse code for that specific JSON structure input. At runtime, the parser tries the pregenerated canned parsers first, if none match, then it falls back to runtime parsing. Given that much JSON is just a long sequence of identical structure records, this would be a very compelling new C++ JSON parsing library, a whole new better way of doing parsing. *That* I would get excited about.
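Niall's dispatch idea ("the parser tries the pregenerated canned parsers first, if none match, then it falls back to runtime parsing") can be sketched roughly as follows. Everything here is illustrative: `Record`, `parse_canned`, and the `sscanf`-based matcher are hypothetical stand-ins for the compiler-generated structural parsers he imagines, not anything Boost.JSON or CTRE actually provides.

```cpp
#include <cassert>
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical record matching the "known" JSON shape {"id":<int>,"name":"<str>"}.
struct Record { int id; std::string name; };

// Canned fast path: in Niall's scheme this would be compiler-generated
// constexpr code for one exact structure; sscanf is a crude stand-in.
std::optional<Record> parse_canned(const std::string& s) {
    Record r{};
    char name[64] = {};
    if (std::sscanf(s.c_str(), " { \"id\" : %d , \"name\" : \"%63[^\"]\" }",
                    &r.id, name) == 2) {
        r.name = name;
        return r;
    }
    return std::nullopt;  // input does not have the canned shape
}

// Dispatcher: try the canned parser(s) first; only when none match
// does a general-purpose runtime parser (omitted here) take over.
std::optional<Record> parse(const std::string& s, bool& used_fast_path) {
    if (auto r = parse_canned(s)) { used_fast_path = true; return r; }
    used_fast_path = false;
    return std::nullopt;  // a full runtime JSON parser would run here
}
```

Since much JSON is a long sequence of identically shaped records, the canned path would be taken for almost every record, which is where the claimed speedup would come from.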
Great. There are really quite a lot of things to imagine about future C++ and libraries to be written in future C++ that get me excited. There are also things about C++ that really don't excite me and probably most other people as well. To name a few examples: std::vector, std::string, etc. They are not perfect, they are not fancy, they are not even pretty. But they are useful. Almost every day. To many, if not most C++ developers. And they perform well. In many ordinary use-cases. That's where I could see Boost.JSON: It's not perfect and probably also not pretty in parser-aesthetic terms (judging from some of the review comments). But for me it combines a simple and widely-used user interface (similar to nlohmann's) with decent performance (similar to rapidjson). That gives me 90% of both worlds. And I get it now / soon. As a user of JSON libraries, I find this a worthwhile trade-off. Max
Niall
This was one piece of feedback posted during the Boost.JSON review of September 2020:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case. So basically it would be wasted effort, and they haven't bothered.
Unless an impassioned on-list reply counts as "personal retribution", I think it is safe to say that the aforementioned retribution never took place. However, the false claim that "the library was always going to be accepted in any case" is really harmful to the reputation of the Boost Formal Review process. As I believe that the review process is a vital piece of social technology that has made the Boost C++ Library Collection the best of breed, I'd like to avoid having the review of the upcoming proposed Boost.URL submission tainted with similar aspersions. Therefore let me state unequivocally: I have no interest in persecuting individuals for criticizing my library submissions. In fact I welcome negative feedback, as it affords the opportunity to make the library better - regardless of who is providing the feedback. I am very happy to hear criticisms of my libraries even from those individuals who are actively hostile. However I do have an interest in vigorously opposing bad ideas, such as this one which was tacked on to the end of the message quoted above:
Consider this: a Hana Dusíková type all-constexpr JSON parser could let you specify to the compiler at compile time "this is the exact structure of the JSON that shall be parsed". The compiler then bangs out optimum parse code for that specific JSON structure input. At runtime, the parser tries the pregenerated canned parsers first, if none match, then it falls back to runtime parsing
The totality of the experience gained in developing Boost.JSON suggests that this proposed design is deeply flawed. The bulk of the work in achieving performance comparable to RapidJSON went not into the parsing but into the allocation and construction of the DOM objects during the parse. This necessitated a profound coupling between parsing and creation of json::value objects. I realize of course that this will invite contradictory replies ("all you need to do is...") but as my conclusion was achieved only after months of experimentation culminating in the production of a complete, working prototype, I would just say: show a working prototype, then let's talk. Regards
On 9/08/2022 04:39, Vinnie Falco wrote:
This was one piece of feedback posted during the Boost.JSON review of September 2020:
It does seem a bit peculiar to bring this up again two years later. (Also FWIW because this was a reply it ends up buried deep in the old thread, where some people may overlook it.)
As I believe that the review process is a vital piece of social technology that has made the Boost C++ Library Collection the best of breed, I'd like to avoid having the review of the upcoming proposed Boost.URL submission tainted with similar aspersions. [...] I realize of course that this will invite contradictory replies ("all you need to do is...") but as my conclusion was achieved only after months of experimentation culminating in the production of a complete, working prototype, I would just say: show a working prototype then let's talk.
These two positions seem at odds -- you're inviting and encouraging review, but then trying to set an extremely high bar ("implement at least a skeletal competing library first") for that review to be considered worthwhile. You can't have it both ways. While granted, "why not do it like X?" can be annoying when you did already consider that and found it didn't work for whatever reason (and even more so if you hadn't considered it, it's actually better, but you're a long way down a different path); the proper response is not to dismiss it but to interpret this as feedback that your documentation does not sufficiently clearly explain why you didn't do it like X.
On Mon, Aug 8, 2022 at 8:48 PM Gavin Lambert via Boost
While granted, "why not do it like X?" can be annoying when you did already consider that and found it didn't work for whatever reason
My esteemed colleague Darrell Wright enlightened me as to the meaning of the aforementioned jargonspeaux. It wasn't a matter of "it didn't work" but rather that there are several different flavors of JSON libraries which are mutually incompatible in terms of API: DOM, user-type mapping, and in-situ (SIMDJSON) come to mind. The library I offered is the DOM variety. The other types are perfectly valid and useful; I just felt that I personally was both lacking in knowledge and insufficiently enthusiastic to also deliver the other flavors. There is still plenty of room in Boost for the type of JSON library that Niall describes, which is to go directly to and from user defined types and serialized JSON. I still assert that these different approaches to "implementing JSON" each belong in their own library, because optimizing for one case necessarily disadvantages the others. But the point I was trying to make originally had nothing to do with the particulars of the various JSON approaches. Rather, the point is that we must be ever-vigilant never to conflate vigorous and spirited technical debate with "personal retribution", because this false accusation dilutes the value and assaults the reputation of the review process. Thanks
On 08/08/2022 17:39, Vinnie Falco via Boost wrote:
This was one piece of feedback posted during the Boost.JSON review of September 2020:
For the record, I've had offlist email discussions about proposed Boost.JSON with a number of people where the general feeling was that there was no point in submitting a review, as negative review feedback would be ignored, possibly with personal retribution thereafter, and the library was always going to be accepted in any case. So basically it would be wasted effort, and they haven't bothered.
I'm concerned that folks feel that way: Boost has always had a robust and frankly sometimes bruising review process, but IMO we have ended up with better libraries as a result. So I hope everyone will feel free to submit reviews as they feel fit.
I realize of course that this will invite contradictory replies ("all you need to do is...") but as my conclusion was achieved only after months of experimentation culminating in the production of a complete, working prototype, I would just say: show a working prototype then let's talk.
We are at heart empiricists. Working code always triumphs! Best, John.
On 21. Sep 2020, at 20:46, Vinnie Falco via Boost
wrote:
* Both of these use-cases are useful, and desirable
* Most of the time, a user wants one or the other - rarely both
The point of the proponents of a serialisation solution is that it can potentially also handle the DOM case, but obviously not vice versa. We may end up in a situation where Boost.JSON is accepted, but it does not address all use-cases and so yet another library has to be written. This will lead to code duplication. Perhaps Boost.JSON could build on this hypothetical library once it is there, but are you going to change your implementation if you then have to rely on another Boost library?
* Optimized implementations of these use-cases are unlikely to share much code
I think the whole parsing can be shared.
* These are really two different libraries
I don't think so. If we can find a solution that allows one to deserialize JSON to the dynamic json::value, then this would make reading into the DOM type a special case of normal deserialization. In fact, we could then even deserialize JSON into Boost.Any or std::any.
* No one has submitted a JSON library of any kind for review to Boost *ever*
* The most popular JSON libraries implement JSON-DOM, not JSON Serialization
* Even one of the most popular serialization libraries, Boost.Serialization, does not offer a JSON archive implementation
* Boost.PropertyTree supports JSON-DOM out of the box, but not JSON Serialization
I find it interesting that people are coming out of the woodwork who claim to have written their own JSON libraries and say REJECT to Boost.JSON because they feel that conversion between JSON and user-defined types is of the utmost importance and that Boost can't have a JSON library without it.
* If this is so important, why does Boost.Serialization not have it?
* Why is no one submitting a pull request to Boost.Serialization for a JSON archive?
* Why has no one proposed a library to Boost which implements JSON Serialization?
* Why does Boost.PropertyTree have JSON-DOM but not JSON Serialization?
* Where are the immensely popular JSON Serialization libraries?
The reason is that some Boost libraries have been successfully cloned by outside people and the development is happening there. Boost.Python is replaced by pybind11 and Boost.Serialization by cereal. https://github.com/USCiLab/cereal The most obvious appeal of cereal is that it is C++11. Boost.Serialization has a lot of pre-C++11 code and is accordingly a bit difficult to work with. Both pybind11 and cereal jumped on the opportunity to rewrite popular Boost libraries in C++11. cereal has a JSON archive. cereal also has 2.5k stars on Github, so it is dramatically popular. Best regards, Hans
Hans Dembinski wrote:
The point of the proponents of a serialisation solution is that it can potentially also handle the DOM case, but obviously not vice versa.
I don't think so. In fact the reverse is true. Deserialization from DOM is trivial. Making a deserialization library build a DOM is decidedly nontrivial. I'm not sure how it could be done.
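Peter's claim that deserialization from a DOM is trivial is easy to illustrate: once the tree exists, populating a user type is just lookups. The `value`/`object` aliases and the `Point` type below are toy stand-ins invented for this sketch, not the Boost.JSON API (whose `json::value` additionally covers arrays, bools, null, and nesting).

```cpp
#include <map>
#include <string>
#include <variant>

// Toy stand-in for a parsed DOM node and object.
using value  = std::variant<double, std::string>;
using object = std::map<std::string, value>;

// Hypothetical user-defined type to deserialize into.
struct Point { double x, y; std::string label; };

// "Deserialization from DOM is trivial": the tree is already built,
// so extracting a user type is plain member-by-member lookup.
Point point_from_dom(const object& o) {
    return Point{
        std::get<double>(o.at("x")),
        std::get<double>(o.at("y")),
        std::get<std::string>(o.at("label")),
    };
}
```

The reverse direction, driving a serialization-style archive so that it *builds* a DOM, has no such obvious shape, which is the asymmetry Peter is pointing at.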
On 2020-09-22 14:20, Peter Dimov via Boost wrote:
I don't think so. In fact the reverse is true. Deserialization from DOM is trivial. Making a deserialization library build a DOM is decidedly nontrivial. I'm not sure how it could be done.
It is easy with a pull parser. The following header shows both direct serialization from DOM to JSON, and direct deserialization from JSON to DOM: https://github.com/breese/trial.protocol/blob/develop/include/trial/protocol...
Bjorn Reese wrote:
On 2020-09-22 14:20, Peter Dimov via Boost wrote:
I don't think so. In fact the reverse is true. Deserialization from DOM is trivial. Making a deserialization library build a DOM is decidedly nontrivial. I'm not sure how it could be done.
It is easy with a pull parser. The following header shows both direct serialization from DOM to JSON, and direct deserialization from JSON to DOM:
https://github.com/breese/trial.protocol/blob/develop/include/trial/protocol...
Yes, you're right, it's indeed easy with a pull parser, if the value and the archive cooperate. This still doesn't make one approach a superset of the other, though. You can feed Boost.JSON's push parser incrementally. A pull parser, being a pull one, reverses the flow of control and asks you (its stream) for data when you pull from it. Yes, it makes things like the above possible, but I don't think it entirely supersedes push parsers. I remember this being brought up before, so you may have a solution for the incremental case. A pull parser could f.ex. return token::need_input or something like that when it's starved, like a nonblocking socket. There'll still be a disconnect between its buffer size and the amount of incoming data, though, unless I'm missing something.
On 2020-09-22 15:17, Peter Dimov via Boost wrote:
This still doesn't make one approach a superset of the other though. You can feed Boost.JSON's push parser incrementally. A pull parser, being a pull one, reverses the flow of control and asks you (its stream) for data when you pull from it. Yes, it makes things like the above possible, but I don't think it entirely supersedes push parsers.
I remember this being brought up before, so you may have a solution for the incremental case. A pull parser could f.ex. return token::need_input or something like that when it's starved, like a nonblocking socket. There'll still be a disconnect between its buffer size and the amount of incoming data though, unless I'm missing something.
https://github.com/breese/trial.protocol/tree/develop/example/json/chunked_p... The current limitation is that the buffer must be large enough to hold the largest string or number.
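The `token::need_input` idea Peter floats, together with the limitation Bjorn notes (a complete string or number must fit in the buffer), can be illustrated with a toy pull-style lexer. Everything here, including the `pull_lexer` class and its `write`/`finish`/`next` interface, is invented for illustration and is not the trial.protocol or Boost.JSON API; it tokenizes only unsigned integers to keep the sketch short.

```cpp
#include <cctype>
#include <string>

// Token kinds; need_input plays the role of a nonblocking socket's
// EWOULDBLOCK: the caller must write() more data before pulling again.
enum class token { number, end, need_input };

class pull_lexer {
    std::string buf_;
    bool eof_ = false;
public:
    void write(const std::string& chunk) { buf_ += chunk; }
    void finish() { eof_ = true; }  // no more chunks will arrive

    // Pull the next token; on token::number the value is stored in out.
    token next(long& out) {
        std::size_t i = 0;
        while (i < buf_.size() &&
               (std::isspace((unsigned char)buf_[i]) || buf_[i] == ','))
            ++i;  // skip separators
        std::size_t j = i;
        while (j < buf_.size() && std::isdigit((unsigned char)buf_[j]))
            ++j;  // scan the digits of one number
        if (j == buf_.size() && !eof_) {
            // The number might continue in the next chunk: starve the
            // caller rather than emit a possibly truncated token.
            buf_.erase(0, i);
            return token::need_input;
        }
        if (i == j) return token::end;
        out = std::stol(buf_.substr(i, j - i));
        buf_.erase(0, j);
        return token::number;
    }
};
```

Note that this exhibits exactly the disconnect under discussion: a whole token must be buffered before `next()` can emit it, which is the "largest string or number" limitation; string_part-style partial tokens would lift it at the cost of a less pure pull interface.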
Bjorn Reese wrote:
https://github.com/breese/trial.protocol/tree/develop/example/json/chunked_p...
The current limitation is that the buffer must be large enough to hold the largest string or number.
You can lift it if you introduce string_part tokens. This will bring it even closer to Boost.JSON. It will make the pull interface less pure though.
On Tue, Sep 22, 2020 at 4:22 AM Hans Dembinski via Boost < boost@lists.boost.org> wrote:
* Where are the immensely popular JSON Serialization libraries?
The reason is that some Boost libraries have been successfully cloned by outside people and the development is happening there. Boost.Python is replaced by pybind11 and Boost.Serialization by cereal.
https://github.com/USCiLab/cereal
The most obvious appeal of cereal is that it is C++11. Boost.Serialization has a lot of pre-C++11 code and is accordingly a bit difficult to work with. Both pybind11 and cereal jumped on the opportunity to rewrite popular Boost libraries in C++11.
cereal has a JSON archive. cereal also has 2.5k stars on Github, so it is dramatically popular.
I believe you're correct -- we moved to cereal for JSON serialization because it wasn't in Boost. In our case it gets extensive use to marshal objects in and out of Mongo databases, web service/socket requests, and config files. Jeff
Am 24.09.2020 um 13:02 schrieb Jeff Garland via Boost:
On Tue, Sep 22, 2020 at 4:22 AM Hans Dembinski via Boost < boost@lists.boost.org> wrote:
cereal has a JSON archive. cereal also has 2.5k stars on Github, so it is dramatically popular.
I believe you're correct -- we moved to cereal for JSON serialization because it wasn't in Boost. In our case it gets extensive use to marshal objects in and out of Mongo databases, web service/socket requests, and config files.
We switched almost completely from Boost.Serialization over to cereal, for similar reasons. Pre-C++11 libs get phased out from projects in active maintenance (wherever possible) and are discouraged for new projects where C++17 is the targeted standard. Ciao Dani -- PGP/GPG: 2CCB 3ECB 0954 5CD3 B0DB 6AA0 BA03 56A1 2C4638C5
participants (21)
- Alexander Grund
- Bjorn Reese
- Daniela Engert
- Gavin Lambert
- Glen Fernandes
- Hans Dembinski
- Janek Kozicki
- Jeff Garland
- John Maddock
- Mateusz Loskot
- Maximilian Riemensberger
- Niall Douglas
- pbristow@hetp.u-net.com
- Peter Dimov
- Pranam Lashkari
- Rainer Deyke
- Richard Hodges
- Robert Ramey
- Tom Honermann
- Vinnie Falco
- Vinícius dos Santos Oliveira