[serialization] new portable binary archive based on HDF5
Dear Boost developers,
I would like to share a contribution to the boost::serialization module. It provides an answer to the challenge of supporting portable binary archives by using the well-known and stable HDF5 library (http://www.hdfgroup.org/HDF5). After first signals of interest from other users I have decided to fully integrate my work with the current boost::serialization module. My suggestion would be for Robert Ramey and other interested developers to consider the inclusion of my work in the official serialization code. I am not sure about the exact procedure in this case.
I have forked from the Git serialization archive. All supplied tests have passed for gcc on Linux and MSVC 10 on Windows. My work can be pulled from
https://github.com/dk1978/serialization/tree/development
To compile the new archive code, define the paths to the HDF5 installation using the variables HDF5_INCLUDE_PATH, HDF5_LIB_PATH, and HDF5_BIN_PATH either as environment variables or in "project-config.jam". An example using the new archive is also provided in the "example" subdirectory.
Motivation for this work:
The hdf5_archive project provides a new serialization archive format based on HDF5 to complement the ones already included in the boost::serialization library: plain text, XML, and native binary. By building on the established and well-known boost::serialization framework, application code can use HDF5 or switch to any of the other established archive formats with only minimal changes to the code involved.
HDF5 has become a popular format to store scientific data. It is open and well-documented. Further advantages of HDF5 are the following:
- the format is self-describing and portable across computing platforms
- efficient storage of large arrays; parallel IO using MPI is possible (though not yet implemented in this archive)
- hierarchical description of stored data
- several low-level storage drivers, including single file or multiple-directory layouts
- APIs to C, C++, and Fortran
Best regards,
Daniel Koester
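To illustrate the point in the announcement that application code can switch archive formats with only minimal changes, here is a minimal sketch of the pattern boost::serialization relies on: a class provides a single serialize() template, and the archive type decides the on-disk format. The two archive classes below (toy_text_oarchive, toy_line_oarchive), the Measurement struct, and the save_with helper are hypothetical stand-ins written with the standard library only; they are not Boost classes and not the HDF5 archive from the fork, whose class names are not given in this thread.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for one archive format: values separated by spaces.
class toy_text_oarchive {
public:
    explicit toy_text_oarchive(std::ostream& os) : os_(os) {}
    // Boost-style operator&: each member is handed to the archive in turn.
    template <class T>
    toy_text_oarchive& operator&(const T& value) {
        os_ << value << ' ';
        return *this;
    }
private:
    std::ostream& os_;
};

// Hypothetical stand-in for a second format: one value per line.
class toy_line_oarchive {
public:
    explicit toy_line_oarchive(std::ostream& os) : os_(os) {}
    template <class T>
    toy_line_oarchive& operator&(const T& value) {
        os_ << value << '\n';
        return *this;
    }
private:
    std::ostream& os_;
};

// Application type: it knows nothing about any concrete archive format.
struct Measurement {
    std::string name;
    int samples;
    double mean;

    // A single serialize() template serves every archive type.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & name & samples & mean;
    }
};

// Switching formats means changing only the Archive template argument.
template <class Archive, class T>
std::string save_with(T value) {
    std::ostringstream os;
    Archive ar(os);
    value.serialize(ar, 0u);
    return os.str();
}

// Small self-check: same object, same serialize(), two different formats.
inline void demo() {
    const Measurement m{"probe", 3, 1.5};
    assert(save_with<toy_text_oarchive>(m) == "probe 3 1.5 ");
    assert(save_with<toy_line_oarchive>(m) == "probe\n3\n1.5\n");
}
```

With the real library, the same idea would presumably apply by constructing a different archive object (for example boost::archive::text_oarchive versus the archive type provided by the fork) and leaving the serialize() member untouched.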
Two questions:
1) What license do you use for your contribution? The source files don't mention it.
2) Do you have performance data comparing the new archive with the existing ones?
I'd be very interested in using this archive for HPX (https://github.com/STEllAR-GROUP/hpx) as an alternative to our homegrown derivative of the original portable binary archive available as an example in Boost.Serialization. But for this I'd like to know what performance advantages or disadvantages this might bring.
Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu
On 13 December 2013 19:20, Daniel Koester wrote:
Dear Boost developers,
I would like to share a contribution to the boost::serialization module. It provides an answer to the challenge of supporting portable binary archives by using the well-known and stable HDF5 library (http://www.hdfgroup.org/HDF5). After first signals of interest by other users I have decided to fully integrate my work with the current boost::serialization module.
FYI, I'd be interested in such an addition.
My suggestion would be for Robert Ramey and other interested developers to consider the inclusion of my work in the official serialization code. I am not sure about the exact procedure in this case.
I *guess*, it may qualify for the Fast Track Review: http://www.boost.org/community/reviews.html
I have forked from the Git serialization archive. All supplied tests have passed for gcc on Linux and MSVC 10 on Windows. My work can be pulled from
Hmm, have you updated the development branch with the upstream lately, as well as the master in your fork? AFAICT both are a few months behind the current upstream, so it seems impossible to generate a diff usable enough to review your additions.
Best regards,
--
Mateusz Łoskot, http://mateusz.loskot.net
participants (3)
- Daniel Koester
- Hartmut Kaiser
- Mateusz Loskot