On 03/20/2014 04:34 AM, Bjorn Reese wrote:
On 03/18/2014 04:46 PM, Stefan Seefeld wrote:
I don't see any reason why such an XML API wouldn't be usable by other Boost libraries.
It should be part of the GSoC project to verify this for the most common use cases (XML serialization is the most obvious one.)
I don't entirely understand your point. The goal is to define an XML API, and implement it, which complies with all related standards. As long as the existing Boost components (e.g. Boost.Serialization) work with standard XML tools, we should be compatible. I don't think, however, that we should be constrained to be API-compatible with existing tools, as otherwise the whole exercise to define a new API would be pointless. On the other hand, making minor adjustments to those libraries to work with Boost.XML would be fine. I just don't think we should make this part of the proposal, as it isn't even clear what existing Boost components would be affected, whether they are actively maintained / developed, etc.
What is the purpose of the S template argument?
To keep the concern for Unicode (or any other string type) orthogonal to the XML library, i.e. to allow Boost.XML to interact with different Unicode implementations. In the existing demos I restrict content to ASCII, so I can get away with using std::string. This is a good example of the "modularity" design goal I mentioned above: don't force anything on users that they don't actually need.
I agree with the goal, but I am not sure that the S type solves the problem. I must admit that I am having difficulty understanding exactly how you envision it should work for other encodings, because std::string is orthogonal to encoding (locale is usually attached to the I/O stream.)
You are right, encoding and string type are (mostly) orthogonal. I have never said anything else. :-)
What encoding is used for std::string? ASCII, UTF-8, or "whatever the XML library gives me"? This should be documented as part of the API regardless of the answer.
Yes.
Should I define a new string type if I want to use Latin-1 or another encoding in my application? What if the rest of my application uses std::string for Latin-1 encodings? (I am wondering how this will work with the current convert trait specialization for std::string.)
How does the convert trait know the XML document encoding so that it is able to convert between this and the application encoding?
I suggest that you adopt the libxml2 design decision to always use UTF-8 for std::string (and UTF-16 for std::wstring if needed.) See the design rationale here:
http://xmlsoft.org/encoding.html
Any backend that does not provide UTF-8 will have to be wrapped.
With such a design decision, the S template parameter becomes superfluous (or should be changed to CharT if you wish to support both std::string and std::wstring.)
Conversion between UTF-8 and application encodings would have to be done explicitly in the application.
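As an illustration of what "explicit conversion in the application" could look like, here is a minimal sketch for the Latin-1 case raised above. The helper names (latin1_to_utf8, utf8_to_latin1, latin1_round_trip_demo) are hypothetical and are not part of Boost.XML or libxml2; the sketch only assumes that the XML API hands out and accepts UTF-8 in std::string.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical application-side helpers; not part of Boost.XML or libxml2.

// Latin-1 -> UTF-8. Every Latin-1 byte equals the Unicode code point of the
// same value, so bytes >= 0x80 are encoded as a two-byte UTF-8 sequence.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// UTF-8 -> Latin-1. Throws if the input is malformed or contains a code
// point above U+00FF, which Latin-1 cannot represent.
std::string utf8_to_latin1(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size()) {
            // Lead bytes 0xC2 and 0xC3 cover exactly U+0080..U+00FF.
            unsigned char d = static_cast<unsigned char>(in[++i]);
            if ((d & 0xC0) != 0x80)
                throw std::runtime_error("malformed UTF-8");
            out.push_back(static_cast<char>(((c & 0x03) << 6) | (d & 0x3F)));
        } else {
            throw std::runtime_error("not representable in Latin-1");
        }
    }
    return out;
}

// Usage: convert at the boundary between application and XML library.
void latin1_round_trip_demo()
{
    const std::string app_text = "caf\xE9";             // Latin-1 in the app
    const std::string xml_text = latin1_to_utf8(app_text); // UTF-8 for the XML API
    assert(utf8_to_latin1(xml_text) == app_text);
}
```

With this approach the rest of the application can keep using std::string for Latin-1, at the cost of a conversion call at every point where text crosses into or out of the XML library.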
At any rate, encoding should be addressed in the GSoC project.
I agree, and this is in fact part of the proposal. To be specific, one of the first steps is to add tests that instantiate the XML classes with existing Unicode string classes (such as Glib::ustring or Qt's QString), and demonstrate how to use them.
What is the purpose of the convert trait?
To allow conversion between the backend's own string representation and the string type that is used with Boost.XML.
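To make the role of such a trait concrete, the following is a rough sketch of the general pattern, not the actual Boost.XML code. The namespace sketch, the backend_char alias, the element class, and the exact in()/out() signatures are all assumptions made for illustration; only the idea of a per-string-type convert trait with an in() function comes from the discussion.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Stand-in for the backend's character type. libxml2's xmlChar is an
// unsigned char holding UTF-8; using the same shape here is an assumption.
using backend_char = unsigned char;

// Primary template is left undefined: each string type S must opt in
// by providing a specialization.
template <typename S> struct converter;

// Pass-through specialization, assuming std::string carries UTF-8.
template <> struct converter<std::string>
{
    // Backend buffer -> user string (a null buffer yields an empty string).
    static std::string in(const backend_char* buf)
    {
        return buf ? std::string(reinterpret_cast<const char*>(buf))
                   : std::string();
    }
    // User string -> backend buffer; valid only while s is alive.
    static const backend_char* out(const std::string& s)
    {
        return reinterpret_cast<const backend_char*>(s.c_str());
    }
};

// A toy element parameterized on the user's string type S. All traffic
// between S and the backend representation goes through converter<S>.
template <typename S>
class element
{
public:
    explicit element(const S& name)
      : name_(reinterpret_cast<const char*>(converter<S>::out(name))) {}

    S name() const
    {
        return converter<S>::in(
            reinterpret_cast<const backend_char*>(name_.c_str()));
    }

private:
    std::string name_; // backend-side copy, UTF-8
};

// Usage: the XML class never mentions a concrete string type itself.
inline void demo()
{
    element<std::string> e(std::string("root"));
    assert(e.name() == "root");
}

} // namespace sketch
```

Supporting another string class would then mean adding one more converter specialization, which is also the place where any encoding conversion would have to be documented.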
Ok. You should, however, make sure that the strings are converted correctly:
http://xmlsoft.org/html/libxml-xmlstring.html
For instance, convert::in() does not take libxml2 custom allocators into account:
Good point. As I said, the existing Boost.XML was meant to be a proof-of-concept.

Thanks for your feedback,
Stefan

--
...ich hab' noch einen Koffer in Berlin...