Hi everybody, I am studying for an MSc in Computer Science at Oxford Brookes University. I have taken a compiler construction course this year and I am therefore interested in the JSON Parser idea for the Boost library. Open Source JSON parsers have already been implemented in C++ and in Java. Examples of Java libraries are: Gson, quick-json. Even if other libraries already exist, developers who are using Boost for their project will appreciate having a JSON parser within Boost.
From a compiler construction background, writing a JSON parser is not difficult. The JSON grammar is simple and the specification is easy to find. Potential difficulties can exist with robustness of error handling and recovery.
The question is what data structure we should use to represent JSON objects, and how the user can access key/value pairs in those objects. (examples: Boost.PropertyTree, pre-existing C++ object, ...) Ideally we should offer validating and non-validating implementations. We should also offer JSON generation as well as parsing. Let me know what you think. Kind regards, Stephan.
----- Original Message -----
From: "Stephan Bourgeois"
From a compiler construction background, writing a JSON parser is not difficult. The JSON grammar is simple and the specification is easy to find. Potential difficulties can exist with robustness of error handling and recovery.
The question is what data structure we should use to represent JSON objects, and how the user can access key/value pairs in those objects. (examples: Boost.PropertyTree, pre-existing C++ object, ...) Ideally we should offer validating and non-validating implementations. We should also offer JSON generation as well as parsing. Let me know what you think. Kind regards, Stephan.

Hi Stephan,

Have you looked at boost.property_tree? That includes a JSON parser among other things, as well as an appropriate data structure to hold the tree. Are you envisioning something different?

Kind regards,
Philip Bennefall
On Thu, Apr 11, 2013 at 12:23 AM, Philip Bennefall
Have you looked at boost.property_tree? That includes a JSON parser among other things, as well as an appropriate data structure to hold the tree. Are you envisioning something different?

It has been pointed out several times that boost::property_tree isn't appropriate if you want a JSON library: it only provides a JSON-like serialization and doesn't support all valid JSON syntax/values. The same goes for XML. See http://boost.2283326.n4.nabble.com/Using-property-tree-as-json-reader-writer...
11.04.2013 1:46, Stephan Bourgeois wrote:
I am studying for an MSc in Computer Science at Oxford Brookes University. I have taken a compiler construction course this year and I am therefore interested in the JSON Parser idea for the Boost library.

There already is a JSON parser example in http://svn.boost.org/svn/boost/trunk/libs/spirit/example/qi/json/
I think it would be good to move it to the boost::spirit repository and to the release branch. It may also be a good idea to use this parser in boost::property_tree instead of the current one.
The question is what data structure we should use to represent JSON objects, and how the user can access key/value pairs in those objects. (examples: Boost.PropertyTree, pre-existing C++ object, ...)

The structure used in spirit/example/qi/json/ looks reasonable: the JSON object is represented as a map and the value is represented as a boost::variant (with some wrapper around it).
I don't like the boost::property_tree approach because it loses all the type information for the values. -- Best regards, Sergey Cheban
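As a rough illustration of the map-plus-variant layout described in this message, here is a minimal sketch. It assumes C++17 and uses std::variant as a stand-in for boost::variant. The names json_value, json_array, json_object, kind_of and make_example are hypothetical and are not taken from the Spirit example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// All names here are hypothetical; this is not the Spirit example's code.
struct json_value;
using json_array  = std::vector<json_value>;
using json_object = std::map<std::string, json_value>;

struct json_value {
    // std::variant stands in for boost::variant; one alternative per JSON type.
    std::variant<std::nullptr_t, bool, double, std::string, json_array, json_object> data;

    json_value() = default;  // holds null
    json_value(bool b) : data(b) {}
    json_value(int i) : data(static_cast<double>(i)) {}
    json_value(double d) : data(d) {}
    json_value(const char* s) : data(std::string(s)) {}
    json_value(std::string s) : data(std::move(s)) {}
    json_value(json_array a) : data(std::move(a)) {}
    json_value(json_object o) : data(std::move(o)) {}
};

// Reports which JSON type a value holds; the type survives storage in the tree.
inline const char* kind_of(const json_value& v) {
    static const char* const names[] = {"null", "boolean", "number",
                                        "string", "array", "object"};
    assert(!v.data.valueless_by_exception());
    return names[v.data.index()];
}

// Builds {"data": {"a": "hello", "c": 3, "widget": 3.5, "tags": [true, null]}}.
inline json_value make_example() {
    json_object data;
    data["a"] = json_value("hello");
    data["c"] = json_value(3);
    data["widget"] = json_value(3.5);
    data["tags"] = json_value(json_array{json_value(true), json_value()});

    json_object root;
    root["data"] = json_value(std::move(data));
    return json_value(std::move(root));
}
```

Unlike a tree that stores every leaf as a string, the variant keeps "hello" as a string and 3.5 as a number, so a caller can query the stored type with kind_of or std::get.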
On 04/10/2013 05:19 PM, Sergey Cheban wrote:
There already is a json parser example in http://svn.boost.org/svn/boost/trunk/libs/spirit/example/qi/json/
FWIW, the json example parser in Spirit is based on our json library that we finally pushed to github this week. Links and docs (soon) can be found here: http://cierelabs.org The goal was to create a json library that allows usage similar to javascript or Python. I believe the parser is fully compliant and we would love to get some feedback from users. Hopefully it is something useful to the community. michael -- Michael Caisse ciere consulting ciere.com
12.04.2013 4:37, Michael Caisse wrote:
FWIW, the json example parser in Spirit is based on our json library that we finally pushed to github this week. Links and docs (soon) can be found here: http://cierelabs.org
The goal was to create a json library that allows usage similar to javascript or Python. I believe the parser is fully compliant and we would love to get some feedback from users.
Hopefully it is something useful to the community.

I'm glad to hear it, but I have to say that in software development it is important to minimize the number of external libraries.

Every external dependency (i.e. library) leads to additional costs:
- the license compatibility must be checked
- the version compatibility must be checked
- the library must be built and installed
- if the library is abandoned, it's a problem
- etc.

One of the benefits of Boost is that it is licensed, tested and distributed as one thing. And I think it would be good for your JSON parser if it were included in Boost.

PS. For now, I'm using my own implementation of a JSON parser. I would be happy to switch to an external one if it were included in one of the libraries I'm already using.

--
Best regards, Sergey Cheban
On 04/10/2013 11:46 PM, Stephan Bourgeois wrote:
Open Source JSON parsers have already been implemented in C++ and in Java. Examples of Java libraries are: Gson, quick-json. Even if other libraries already exist, developers who are using Boost for their project will appreciate having a JSON parser within Boost.
Agreed.
The question is what data structure we should use to represent JSON objects, and how the user can access key/value pairs in those objects. (examples: Boost.PropertyTree, pre-existing C++ object, ...)
I would like to have several different interfaces:

1. Tokenizer API, which reads the next token from an input string. This is important for streaming data.
2. Iterator API, which iterates to the next token in the input string and, unlike the tokenizer, remembers its parent scopes. This is similar to the XmlTextReader API.
3. Tree API, which parses the entire input string into a tree structure. This is a bit like the DOM API, and this is what the Spirit example and Boost.PropertyTree provide.
4. Serialization API, which provides a Boost.Serialization input archive without going through an intermediate tree representation.

For each of the above there should be corresponding generation interfaces.

I have already created the tokenizer and serialization APIs for JSON (and several other encoding formats) at: http://protoc.sourceforge.net/

I have not had the opportunity to look into the iterator and tree APIs yet, so this may be a good candidate for a GSoC project. As there is no mentor for the JSON parsing library, I am willing to mentor it if it is based on the protoc code. However, I am only a Boost hang-around, so I do not know the proper procedures for this.

Unfortunately, the code is currently undocumented, so the best place to start is the code itself:

http://sourceforge.net/p/protoc/code/ci/master/tree/include/protoc/json/
http://sourceforge.net/p/protoc/code/ci/master/tree/src/json/

- decoder.hpp contains the tokenizer API.
- encoder.hpp contains the token generator API.
- iarchive.hpp contains the serialization input archive.
- oarchive.hpp contains the serialization output archive.
Ideally we should offer validating and non-validating implementations. We should also offer JSON generation as well as parsing.
Agreed. It is mainly the string validation that is going to be a (minor) challenge.
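To make the first interface in this message more concrete, here is a minimal sketch of a pull-style tokenizer that hands out one token per call. It assumes C++17 and is not the protoc decoder.hpp interface; token_kind, token and tokenizer are hypothetical names. To stay short it deliberately skips the hard parts: escape sequences are stepped over without being decoded or validated, and numbers are collected without checking their grammar.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <utility>

enum class token_kind {
    begin_object, end_object, begin_array, end_array, colon, comma,
    string, number, true_, false_, null_, end, error
};

struct token {
    token_kind kind;
    std::string_view text;  // slice of the input; string tokens exclude the quotes
};

class tokenizer {
public:
    explicit tokenizer(std::string_view input) : in_(input) {}

    // Returns the next token, token_kind::end at end of input, or
    // token_kind::error (after which only end is reported).
    token next() {
        assert(pos_ <= in_.size());
        while (pos_ < in_.size() && std::isspace(static_cast<unsigned char>(in_[pos_]))) ++pos_;
        if (pos_ == in_.size()) return {token_kind::end, {}};
        switch (in_[pos_]) {
            case '{': return punct(token_kind::begin_object);
            case '}': return punct(token_kind::end_object);
            case '[': return punct(token_kind::begin_array);
            case ']': return punct(token_kind::end_array);
            case ':': return punct(token_kind::colon);
            case ',': return punct(token_kind::comma);
            case '"': return read_string();
            default:  break;
        }
        const char c = in_[pos_];
        if (c == '-' || std::isdigit(static_cast<unsigned char>(c))) return read_number();
        return read_literal();
    }

private:
    token punct(token_kind kind) {
        const token t{kind, in_.substr(pos_, 1)};
        ++pos_;
        return t;
    }

    token fail(std::size_t start) {
        pos_ = in_.size();  // give up: later calls report end
        return {token_kind::error, in_.substr(start)};
    }

    token read_string() {
        const std::size_t open = pos_++;
        while (pos_ < in_.size() && in_[pos_] != '"') {
            pos_ += (in_[pos_] == '\\') ? 2 : 1;  // step over escapes without decoding
        }
        if (pos_ >= in_.size()) return fail(open);  // unterminated string
        const token t{token_kind::string, in_.substr(open + 1, pos_ - open - 1)};
        ++pos_;  // closing quote
        return t;
    }

    static bool is_number_char(char c) {
        return std::isdigit(static_cast<unsigned char>(c)) ||
               std::string_view("+-.eE").find(c) != std::string_view::npos;
    }

    token read_number() {  // collects number characters; grammar is not validated
        const std::size_t start = pos_;
        while (pos_ < in_.size() && is_number_char(in_[pos_])) ++pos_;
        return {token_kind::number, in_.substr(start, pos_ - start)};
    }

    token read_literal() {
        const std::string_view rest = in_.substr(pos_);
        const std::pair<std::string_view, token_kind> words[] = {
            {"true", token_kind::true_}, {"false", token_kind::false_},
            {"null", token_kind::null_}};
        for (const auto& [word, kind] : words) {
            if (rest.substr(0, word.size()) == word) {
                pos_ += word.size();
                return {kind, rest.substr(0, word.size())};
            }
        }
        return fail(pos_);
    }

    std::string_view in_;
    std::size_t pos_ = 0;
};
```

Because the tokenizer keeps no nesting state, it cannot tell whether a closing brace matches an opening one. Tracking that is what the iterator interface in point 2 would add on top.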
Bjorn Reese wrote:
On 04/10/2013 11:46 PM, Stephan Bourgeois wrote:
Open Source JSON parsers have already been implemented in C++ and in Java. Examples of Java libraries are: Gson, quick-json. Even if other libraries already exist, developers who are using Boost for their project will appreciate having a JSON parser within Boost.
Agreed.
The question is what data structure we should use to represent JSON objects, and how the user can access key/value pairs in those objects. (examples: Boost.PropertyTree, pre-existing C++ object, ...)
I would like to have several different interfaces:
1. Tokenizer API which reads the next token from an input string. This is important for streaming data.
2. Iterator API which iterates to the next token in the input string. Remembers its parent scopes (unlike tokenizer.) This is similar to the XmlTextReader API.
3. Tree API which parses the entire input string into a tree structure. This is a bit like the DOM API, and this is what the Spirit example and Boost.PropertyTree provides.
4. Serialization API which provides a Boost.Serialization input archive without going through an intermediate tree representation.
I would suggest adding a Fusion API as well, which should allow usage similar to that of JsonFX in C# for any Fusion-adapted type. http://codepad.org/mu0VD9LG
In JSON we typically deal with maps and arrays. The arrays themselves could
have arbitrary types (string, object, array, numeric, boolean, null) as
elements. The key types in the maps are always strings and the value types
in the maps could be anything that can appear in an array, including
another map or array.
Due to this, I'd imagine being able to use Boost.Variant or Boost.Any in a
list and as a value_type in a map would help.
Regards,
Arindam.
Arindam Mukherjee wrote:
In JSON we typically deal with maps and arrays. The arrays themselves could have arbitrary types (string, object, array, numeric, boolean, null) as elements. The key types in the maps are always strings and the value types in the maps could be anything that can appear in an array, including another map or array.
Due to this, I'd imagine being able to use Boost.Variant or Boost.Any in a list and as a value_type in a map would help.
Probably.
I find the most useful interface is to just provide a datatype you're
expecting and let the json parser try its best to do the right thing.
For example:
string raw_json = R"({
data:{
a:"hello",
b:"world",
c:3,
widget:3.5
}
})";
struct my_type1
{
map
2013/4/12 Michael Marcin
Arindam Mukherjee wrote:
In JSON we typically deal with maps and arrays. The arrays themselves could have arbitrary types (string, object, array, numeric, boolean, null) as elements. The key types in the maps are always strings and the value types in the maps could be anything that can appear in an array, including another map or array.
Due to this, I'd imagine being able to use Boost.Variant or Boost.Any in a list and as a value_type in a map would help.
Probably.
I find the most useful interface is to just provide a datatype you're expecting and let the json parser try its best to do the right thing.
Very much like what I experienced with Spirit before; yes, it just works. I let the user specify the desired types through traits specialization. The old code is here: https://github.com/jamboree/jsume/blob/master/example/pretty_printer/json_co...
For example:
string raw_json = R"({ data:{ a:"hello", b:"world", c:3, widget:3.5
} })";
This doesn't seem like valid JSON; the key must be a quoted string.
struct my_type1 { map
data; }; struct my_type2 { map
> data; }; struct my_type3 { map
data; }; struct my_type4 { unordered_map
data; }; struct my_type5 { struct my_data { string a; string b; int c; float widget; } data; };
It's impossible for my_type5 without more advanced Fusion adaption.
On 4/12/13 11:21 AM, TONGARI wrote:
2013/4/12 Michael Marcin
For example:
string raw_json = R"({ data:{ a:"hello", b:"world", c:3, widget:3.5
} })";
This doesn't seem like a valid json, the key must be a quoted string.
In practice you often find json that does not quote keys and I would prefer a useful library to a pedantic one for this task. You could have a strict mode I suppose. I pretty much just go by what http://jsonviewer.stack.hu/ accepts.
It's impossible for my_type5 without more advanced Fusion adaption.
I would hope that if you adapt my_type5 and my_type5::my_data it could work. I guess you would need a little more than basic fusion adaption to get strings for the member variable names in order to parse json pairs associatively.
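The point about needing member names as strings can be sketched without Fusion. The following is a minimal illustration under stated assumptions, not a Fusion adaptation: the name table in describe is written by hand where a Fusion-based library would presumably generate it, the parsed input is reduced to a flat map of scalars, and field, make_field, describe, assign, from_object, scalar and flat_object are all hypothetical names. It assumes C++17.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for a parsed JSON object holding scalar values only.
using scalar = std::variant<double, std::string>;
using flat_object = std::map<std::string, scalar>;

// The inner struct of my_type5 from the thread.
struct my_data { std::string a; std::string b; int c; float widget; };

// One descriptor per member: the JSON key and a pointer to the member.
template <class Struct, class Member>
struct field { const char* name; Member Struct::* ptr; };

template <class Struct, class Member>
field<Struct, Member> make_field(const char* name, Member Struct::* ptr) {
    return {name, ptr};
}

// Hand-written name table; this is the information plain adaptation lacks.
inline auto describe(const my_data*) {
    return std::make_tuple(make_field("a", &my_data::a),
                           make_field("b", &my_data::b),
                           make_field("c", &my_data::c),
                           make_field("widget", &my_data::widget));
}

// Stores a scalar into a member; numbers are cast to the member's arithmetic type.
template <class T>
bool assign(T& out, const scalar& in) {
    if constexpr (std::is_arithmetic_v<T>) {
        if (const double* d = std::get_if<double>(&in)) { out = static_cast<T>(*d); return true; }
    } else {
        if (const std::string* s = std::get_if<std::string>(&in)) { out = *s; return true; }
    }
    return false;  // type mismatch
}

// Fills every described member by key; fails on a missing key or a type mismatch.
template <class Struct>
bool from_object(const flat_object& obj, Struct& out) {
    bool ok = true;
    std::apply([&](const auto&... f) {
        auto one = [&](const auto& fld) {
            const auto it = obj.find(fld.name);
            return it != obj.end() && assign(out.*(fld.ptr), it->second);
        };
        ok = (one(f) && ...);
    }, describe(static_cast<const Struct*>(nullptr)));
    return ok;
}

inline my_data example() {
    const flat_object obj{{"a", std::string("hello")}, {"b", std::string("world")},
                          {"c", 3.0}, {"widget", 3.5}};
    my_data d{};
    const bool ok = from_object(obj, d);
    assert(ok);
    (void)ok;
    return d;
}
```

Lookup is by key rather than by position, so the order of pairs in the JSON text does not matter. A nested type such as my_type5 would need from_object to recurse into members that are themselves described structs, which this sketch leaves out.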
On 04/12/2013 07:18 PM, Michael Marcin wrote:
In practice you often find json that does not quote keys and I would prefer a useful library to a pedantic one for this task. You could have a strict mode I suppose.
I pretty much just go by what http://jsonviewer.stack.hu/ accepts.
If we are to accept an extended syntax, then it should be absolutely clear what those extensions are. I have no idea what the URL above accepts. Apart from unquoted keys, two other extensions I have seen are C-style comments, and support for the floating-point values of infinity and NaN. Are there other potential extensions?
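One way to keep extensions explicit is to name each one as an opt-in flag and default to strict behaviour. The sketch below shows this for unquoted keys only; it assumes C++17, and parse_options and read_key are hypothetical names rather than part of any library mentioned in this thread. Comments and NaN/infinity would be further flags of the same kind.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical option set: each extension to strict JSON is named and off by default.
struct parse_options {
    bool allow_unquoted_keys = false;
};

// Reads one object key from the front of `in`.
// Returns std::nullopt if no key is acceptable under `opt`.
inline std::optional<std::string> read_key(std::string_view in, parse_options opt = {}) {
    if (!in.empty() && in.front() == '"') {
        const std::size_t close = in.find('"', 1);  // escape sequences ignored for brevity
        if (close == std::string_view::npos) return std::nullopt;
        return std::string(in.substr(1, close - 1));
    }
    if (!opt.allow_unquoted_keys) return std::nullopt;  // strict mode: keys must be quoted
    std::size_t n = 0;
    while (n < in.size() &&
           (std::isalnum(static_cast<unsigned char>(in[n])) || in[n] == '_')) {
        ++n;
    }
    if (n == 0) return std::nullopt;
    return std::string(in.substr(0, n));
}

inline void example() {
    parse_options lenient;
    lenient.allow_unquoted_keys = true;
    assert(!read_key("widget: 3.5").has_value());          // rejected by default
    assert(read_key("widget: 3.5", lenient).has_value());  // accepted once enabled
}
```

With this shape, the set of accepted extensions is exactly the set of flags, which answers the question of what a lenient parser accepts without reference to an external tool.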
On 4/12/2013 10:18 AM, Michael Marcin wrote:
On 4/12/13 11:21 AM, TONGARI wrote:
2013/4/12 Michael Marcin
For example:
string raw_json = R"({ data:{ a:"hello", b:"world", c:3, widget:3.5
} })";
This doesn't seem like a valid json, the key must be a quoted string.
In practice you often find json that does not quote keys and I would prefer a useful library to a pedantic one for this task. You could have a strict mode I suppose.
I pretty much just go by what http://jsonviewer.stack.hu/ accepts.
According to both the RFC and json.org, keys must be strings and therefore must be quoted.

RFC: http://www.ietf.org/rfc/rfc4627.txt
JSON.org: http://json.org/

Thank you,
Ilya Bobyr
participants (10)
- Arindam Mukherjee
- Bjorn Reese
- Ilya Bobyr
- Klaim - Joël Lamotte
- Michael Caisse
- Michael Marcin
- Philip Bennefall
- Sergey Cheban
- Stephan Bourgeois
- TONGARI