On 2013-5-1, at 11:15 PM, Stefan Seefeld wrote:
On 2013-04-28 14:08, Amos Ji wrote:
I've scanned the ideas page. The ideas on the page are all very interesting and challenging, but what interests me most is the XML library, which is at the bottom of the page. I think XML is the most popular standard for storing information, so it's important for Boost to have a good XML library.
However, I know that Boost already includes RapidXML in the property_tree library for parsing XML files. So I want to confirm that this project is about implementing a new XML parser rather than enhancing RapidXML. Am I correct? If so, I have some ideas to share with you.
I don't think your assumptions are entirely correct. First, I think the "XML" project on the ideas page is mis-classified, as it implies a misunderstanding. XML isn't a parser, nor is it a file format; it is in fact quite a bit more.
As I have argued many times before on this list, I think it would be foolish to try to reimplement all the functionality to support XML. There already are quite a few decent implementations available, written in different languages (mostly C and Java), so it might be more appropriate to reuse them.
I agree with others that in the context of Boost this should be about defining a good XML API and then mapping it to existing libraries. In fact, I did that a long time ago by wrapping libxml2. You can still see the code in the sandbox at http://svn.boost.org/svn/boost/sandbox/xml/.
Thanks for your comments. I agree that it isn't clever to "reinvent the wheel". libxml2 and expat are both great XML libraries, and it will be much easier to offer an API on top of the existing libraries than to implement a new XML library. But what if users don't have libxml2 on their computer? Do we just tell them to download libxml2 before they can use boost::xml? Or do we include libxml2 in the release? I think neither of these is friendly to users.
In my opinion, an XML parser must be able to do these things:
1. To iterate over the DOM node tree;
2. To access the values of nodes and their attributes quickly;
3. To insert or delete nodes, or the attributes of a specific node, easily;
4. To generate new XML from the structure in which the library stores the XML.
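
The four required operations above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `Element`, `append_child`, `remove_child`, `count_nodes` and `serialize` are hypothetical names, not part of Boost, RapidXML or libxml2, and the serializer does no character escaping or encoding handling. It assumes C++14 for `std::make_unique`.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory element node, for illustration only.
struct Element {
    std::string name;
    std::string text;
    std::map<std::string, std::string> attributes;   // (2) direct attribute access
    std::vector<std::unique_ptr<Element>> children;

    explicit Element(std::string n) : name(std::move(n)) {}

    // (3) Insert a new child element and return a reference to it.
    Element& append_child(const std::string& child_name) {
        children.push_back(std::make_unique<Element>(child_name));
        return *children.back();
    }

    // (3) Remove the first direct child with the given name.
    // Returns true if a child was removed.
    bool remove_child(const std::string& child_name) {
        for (auto it = children.begin(); it != children.end(); ++it) {
            if ((*it)->name == child_name) {
                children.erase(it);
                return true;
            }
        }
        return false;
    }
};

// (1) Depth-first traversal: count this node and all of its descendants.
std::size_t count_nodes(const Element& e) {
    std::size_t n = 1;
    for (const auto& child : e.children) n += count_nodes(*child);
    return n;
}

// (4) Generate XML text from the stored structure.
// Assumption: names, attribute values and text need no escaping.
std::string serialize(const Element& e) {
    std::string out = "<" + e.name;
    for (const auto& kv : e.attributes)
        out += " " + kv.first + "=\"" + kv.second + "\"";
    if (e.text.empty() && e.children.empty()) return out + "/>";
    out += ">" + e.text;
    for (const auto& child : e.children) out += serialize(*child);
    return out + "</" + e.name + ">";
}
```

A real library would also have to cover namespaces, mixed content, escaping and encodings, which is where most of the effort lies.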
There are also some optional features:
1. To support XPath;
2. To validate XML files;
3. To support various encodings;
4. To manage memory efficiently;
5. To support regular expressions.
I agree with all of the above. Still, I think trying to reimplement this as a "pure" boost library is the wrong approach. Focus on the API, then map it to an existing library.
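
One way to read "focus on the API, then map it to an existing library" is an adapter layout, sketched below. This is not the code in the sandbox mentioned earlier. `Backend`, `StubBackend` and `Document` are hypothetical names, and the stub only scans for the first tag name so the example runs without any third-party dependency; it does not handle an XML declaration, comments or a doctype. A real adapter would forward the same call to an existing parser such as libxml2 or expat.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical backend interface: one adapter per wrapped library.
class Backend {
public:
    virtual ~Backend() = default;
    // Return the name of the root element of the given XML text.
    virtual std::string root_name(const std::string& xml) const = 0;
};

// Stand-in backend for illustration only. It is NOT a parser: it simply
// reads the characters after the first '<' up to whitespace, '/' or '>'.
class StubBackend : public Backend {
public:
    std::string root_name(const std::string& xml) const override {
        const std::string::size_type open = xml.find('<');
        if (open == std::string::npos) return "";
        const std::string::size_type start = open + 1;
        const std::string::size_type end = xml.find_first_of(" \t\r\n/>", start);
        if (end == std::string::npos) return "";
        return xml.substr(start, end - start);
    }
};

// User-facing API: callers depend on Document only, never on the backend.
class Document {
public:
    Document(std::string xml, std::unique_ptr<Backend> backend)
        : xml_(std::move(xml)), backend_(std::move(backend)) {}

    std::string root_name() const { return backend_->root_name(xml_); }

private:
    std::string xml_;
    std::unique_ptr<Backend> backend_;
};
```

With this split, the question of which library is installed on the user's machine becomes a question of which backend is compiled in, while user code written against `Document` stays the same.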
If third-party libraries are allowed in Boost, I'll be glad to follow your advice. I hope to hear more opinions.
These ideas are not mature yet; I'll refine them in my proposal. It's not easy to implement a perfect XML parser, but I'll do my best.
And I have another question. Who will mentor this XML library project?
I would be happy to mentor this.
Thank you very much! I'm glad to work with you if I'm accepted. Sincerely, Mingchao