On 2021-04-01 4:30 a.m., Dominique Devienne via Boost wrote:
> But different people have different tradeoffs. libxml2, Xerces, and Expat may be complete, and as close to bug-free as it gets in C/C++ XML, but they are certainly not modern C++,
Stylistic questions ("modern C++") are secondary to functional correctness.
> often don't offer incremental parsing, and certainly don't allow the kind of allocator support Boost.JSON introduced. Nor are they the fastest.
libxml2 offers streaming APIs ("incremental parsing") and is among the fastest implementations you can get. As I said in the FFT thread: thinking that you can match such a library (both in functionality and performance) with a GSoC project is foolish, so it seems wiser to focus on the interface, then bind that to existing implementations.
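To make "focus on the interface, then bind that to existing implementations" a little more concrete, here is a minimal sketch of the shape I have in mind. Every name in it (`xml_api`, `node_impl`, `element`, `toy_node`, `make_sample`, `count_elements`) is hypothetical and invented for illustration; the toy in-memory backend only stands in for a real binding to an existing library and does no parsing at all.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace xml_api {  // hypothetical namespace, for illustration only

// What a backend has to provide. A real binding would implement this on top
// of an existing library's node type; that part is not shown here.
class node_impl {
public:
    virtual ~node_impl() = default;
    virtual std::string name() const = 0;
    virtual std::string text() const = 0;
    virtual std::vector<std::shared_ptr<node_impl>> children() const = 0;
};

// The user-facing handle: the only type client code ever sees.
class element {
public:
    explicit element(std::shared_ptr<node_impl> impl) : impl_(std::move(impl)) {}

    std::string name() const { return impl_->name(); }
    std::string text() const { return impl_->text(); }

    std::vector<element> children() const {
        std::vector<element> out;
        for (const auto& child : impl_->children()) out.emplace_back(child);
        return out;
    }

private:
    std::shared_ptr<node_impl> impl_;
};

}  // namespace xml_api

// Toy backend (assumption: a stand-in for a binding to an existing parser).
class toy_node : public xml_api::node_impl {
public:
    explicit toy_node(std::string name, std::string text = {})
        : name_(std::move(name)), text_(std::move(text)) {}

    void add(std::shared_ptr<toy_node> child) { kids_.push_back(std::move(child)); }

    std::string name() const override { return name_; }
    std::string text() const override { return text_; }
    std::vector<std::shared_ptr<xml_api::node_impl>> children() const override {
        return kids_;
    }

private:
    std::string name_;
    std::string text_;
    std::vector<std::shared_ptr<xml_api::node_impl>> kids_;
};

// Builds a small tree by hand: <book><title/><author/></book>.
xml_api::element make_sample() {
    auto root = std::make_shared<toy_node>("book");
    root->add(std::make_shared<toy_node>("title", "XML in C++"));
    root->add(std::make_shared<toy_node>("author", "unknown"));
    return xml_api::element(root);
}

// Client code written purely against the interface, with no backend types.
std::size_t count_elements(const xml_api::element& e) {
    std::size_t n = 1;  // count this element, then recurse into its children
    for (const auto& child : e.children()) n += count_elements(child);
    return n;
}
```

Swapping `toy_node` for a class that wraps an existing library's node type would leave `count_elements` untouched, which is the whole point. Whether the dispatch should be virtual, as here, or compile-time is exactly the kind of interface question worth spending a project on.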
> The main issue with XML is all the little things to get right, like character entities, entity includes inherited from DTDs, DTDs themselves (for validation and default values), whitespace normalization, namespace support, and related technologies like XSD, XPath, XLink, XInclude, XQuery, etc. Proper PSVI (post-schema-validation infoset) is also often problematic, but that assumes a validating parser (via DTD or XSD) in the first place.
Exactly. How do you propose to handle all of the questions above?
> There's definitely space to explore a Boost.JSON-like low-level modern parser building only a DOM with value semantics and allocator support, with a modern API. Much could be built on such a foundation, and that's an interesting GSoC project, even if it never "graduates".
> In any case, besides the three mentioned above, there's also rapidxml and pugixml, the latter still actively maintained. Perhaps they are not as complete, but they are definitely quite a bit faster than the "old" ones.
>
> --DD
This is not about which XML library is better. Quite the opposite, in fact: I want to make an argument for establishing a modern C++ API that can be bound to any such library. We don't need more half-baked partial XML implementations; we need a standard C++ API for XML.

Stefan

-- 
...ich hab' noch einen Koffer in Berlin...