Compression libraries such as zlib and zstd ought to be first class boost libraries
Hello all,

I've been having a fairly hard time enabling zlib and zstd for boost 1.85.0 so we can make use of the compression support in the iostreams module. We do have some specifics: static linking, non-standard file names, and not system-installed. Windows and Linux.

For various reasons it looks like we're going to copy certain files into our modules, rather than puzzle over making this work. It's not ideal, but workable.

But along the way it's become clear that this functionality is not a first-class boost citizen. Witness the broken links in the documentation, and various historical questions via google reporting similar issues. There are also other copies of zlib in the beast tree, because it's obviously needed for HTTP stuff too. zstd too?

So I did think that I should at least mention that zlib, zstd, bzip2 and xz-utils all ought to be promoted to first-class boost libraries even though the majority of the maintenance is coming from upstream. They ought to be part of the "big tarball" and enabled and built by default regardless of what happens to be installed on the system. The cmake stuff ought to "just work" the way it does for other boost modules.

I'm not quite volunteering myself, but it seems like a start to pose the question. The developer experience could be significantly improved for this particular corner of things.

- Nigel Stewart
Nigel Stewart wrote:
Hello all,
I've been having a fairly hard time enabling zlib and zstd for boost 1.85.0 so we can make use of the compression support in iostreams module. We do have some specifics - static linking, non-standard file names and not system-installed. Windows and Linux.
Were you using b2 or CMake to build Boost?
I've been having a fairly hard time enabling zlib and zstd for boost 1.85.0 so we can make use of the compression support in iostreams module.
Were you using b2 or CMake to build Boost?
All the instructions pointed to b2, and that's what's entrenched here. Will google for cmake howto, thanks. - Nigel
Nigel Stewart wrote:
I've been having a fairly hard time enabling zlib and zstd for boost 1.85.0 so we can make use of the compression support in iostreams module.
Were you using b2 or CMake to build Boost?
All the instructions pointed to b2, and that's what's entrenched here. Will google for cmake howto, thanks.
Instructions for building with CMake are here: https://github.com/boostorg/cmake but I actually had something different in mind.

The b2 build supports compiling zlib from source; you can set the environment variable ZLIB_SOURCE or put the path in user-config.jam as in

    using zlib : 1.2.7 : <source>/home/steven/zlib-1.2.7 ;

This is also supported for bzip2 - using BZIP2_SOURCE or

    using bzip2 : 1.0.6 : <source>/home/sergey/src/bzip2-1.0.6 ;

but unfortunately not for lzma or zstd - for those you have to point to the prebuilt binaries:

    using zstd : : <search>/path/to/libraries <include>/path/to/headers <name>library-name ;

If you have separate zstd binaries for different build variants, you'd need more than one "using" line here, with the appropriate requirements.

If you tell us how your binaries are named and what directories they are in, we'll be able to tell you the exact lines to put into user-config.
On Tue, Apr 30, 2024 at 11:14 PM Nigel Stewart via Boost < boost@lists.boost.org> wrote:
...zlib, zstd, bzip2 and xz-utils all ought to be promoted to first-class boost libraries
Boost.Beast author here. When I wrote Beast I decided to port zlib to header-only and inline the sources directly in Beast. This made things easier for people that want a header-only library and don't want the hassle of dealing with adjusting their build scripts to compile and link to the zlib library. In retrospect, this was a mistake for the following reasons:

1. The port may have defects
2. Improvements to zlib have to be ported to my version
3. The port can't be linked like a normal zlib library and thus is less reusable

As I now have a lot of experience with both header-only and regular flavors of libraries I am no longer enthusiastic about header-only libraries. They make things take longer to build, they expose a lot of implementation details in the header files, and they often indirectly cause large executables because heavy use of templates seems to go hand in hand with header-only.

The more sustainable solution I think is to require that users are able to incorporate third party libraries into their build scripts. This is made easier with package managers of course, and now there are enough solutions that we do not need to be treating users like infants incapable of putting together a non-trivial program.

It should be obvious that the pattern of "port linkable libraries into header-only Boost libraries that don't require separate compilation" is completely unsustainable, if for no other reason than - what do we do with OpenSSL?

Thanks
On Wed, May 1, 2024 at 9:37 AM Vinnie Falco via Boost wrote:
The more sustainable solution I think is to require that users are able to incorporate third party libraries into their build scripts. This is made easier with package managers of course, and now there are enough solutions that we do not need to be treating users like infants incapable of putting together a non-trivial program.
+100 -- -- René Ferdinand Rivera Morell -- Don't Assume Anything -- No Supone Nada -- Robot Dreams - http://robot-dreams.net
For what it's worth, I overwhelmingly agree that Boost should have compression as a first-class library. We've lost our way, we keep amalgamating Asio-wrapping libraries but we don't provide the bread-and-butter functionality that developers actually need on a day-to-day basis. Kind of similar for libcrypto. If you want something like cryptographic hashing, there's nothing in Boost to help. - Christian
On Wed, May 1, 2024 at 9:06 AM Christian Mazakas via Boost < boost@lists.boost.org> wrote:
For what it's worth, I overwhelmingly agree that Boost should have compression as a first-class library.
We've lost our way, we keep amalgamating Asio-wrapping libraries but we don't provide the bread-and-butter functionality that developers actually need on a day-to-day basis.
zlib comes preinstalled on most Linux distros. If anything, zlib is more popular than the Boost library collection in terms of the number of installs. So I'm not sure what you're going on about. The "bread-and-butter functionality that developers actually need on a day-to-day basis" is already abundantly available. What is gained by incurring the permanent, per-Boost-release costs that come with introducing the hypothetical Boost.ZLib which cannot be currently obtained simply by saying (for example) "vcpkg install zlib"? Thanks
On Wed, May 1, 2024 at 1:16 PM Vinnie Falco via Boost wrote:
On Wed, May 1, 2024 at 9:06 AM Christian Mazakas via Boost < boost@lists.boost.org> wrote:
For what it's worth, I overwhelmingly agree that Boost should have compression as a first-class library.
We've lost our way, we keep amalgamating Asio-wrapping libraries but we don't provide the bread-and-butter functionality that developers actually need on a day-to-day basis.
zlib comes preinstalled on most Linux distros. If anything, zlib is more popular than the Boost library collection in terms of the number of installs. So I'm not sure what you're going on about. The "bread-and-butter functionality that developers actually need on a day-to-day basis" is already abundantly available.
What is gained by incurring the permanent, per-Boost-release costs that come with introducing the hypothetical Boost.ZLib which cannot be currently obtained simply by saying (for example) "vcpkg install zlib"?
I think Christian meant adding a compression abstraction library above zlib, bz2, zip, etc? -- -- René Ferdinand Rivera Morell -- Don't Assume Anything -- No Supone Nada -- Robot Dreams - http://robot-dreams.net
On Wed, May 1, 2024 at 11:30 AM René Ferdinand Rivera Morell < grafikrobot@gmail.com> wrote:
I think Christian meant adding a compression abstraction library above zlib, bz2, zip, etc?
That is a different question, yes, and one worth answering. My experience integrating zlib as a header-only C++ library into beast has also informed me about the utility of bolting a "modern API" onto an ancient library that has a C interface. And that is, that it is not worth it. At least not for ZLib. The current API for ZLib is about as perfect as you can get. Fill in a struct with the input and output buffers, call a function, and then when the function returns you are informed about the amounts consumed from the respective buffers. Maybe this could be wrapped up in some kind of popular C++ "stream" type interface, or a DSL of sorts as showcased by Ranges and now Sender/Receiver. But doing so incurs a cost: users must learn a new interface. ZLib is already so well understood and entrenched that wrapping it would in a single stroke make obsolete all of the blog posts, tutorials, example code, documentation, and user experience with the current ZLib API, forcing people to learn another New Thing. Cue xkcd regarding "15 competing standards." Thanks
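As an aside, the buffer-in, buffer-out model described above can be sketched with Python's standard-library zlib module, which binds the same library. This is only an illustration under stated assumptions: the helper names deflate_in_chunks and inflate_bounded, the chunk size and the output limit are made up for the example, and Python's unconsumed_tail attribute stands in for the "amount consumed" bookkeeping that the C API reports through the avail_in and avail_out fields of its z_stream struct.

```python
import zlib


def deflate_in_chunks(data: bytes, chunk_size: int = 1024) -> bytes:
    """Feed input to one compressor a chunk at a time and collect its output."""
    comp = zlib.compressobj(level=6)
    pieces = []
    for start in range(0, len(data), chunk_size):
        # Each call may return anything from zero bytes to a full block.
        pieces.append(comp.compress(data[start:start + chunk_size]))
    # flush() finishes the stream and returns whatever is still buffered.
    pieces.append(comp.flush())
    return b"".join(pieces)


def inflate_bounded(blob: bytes, out_limit: int = 512) -> bytes:
    """Inflate with a cap on output per call (hypothetical limit of 512 bytes)."""
    decomp = zlib.decompressobj()
    pieces = []
    pending = blob
    while pending:
        # At most out_limit bytes come back; input that was not read yet
        # is handed back in unconsumed_tail for the next call.
        pieces.append(decomp.decompress(pending, out_limit))
        pending = decomp.unconsumed_tail
    # Collect any output still held inside the decompressor.
    pieces.append(decomp.flush())
    return b"".join(pieces)
```

The caller owns both buffers and decides how much to feed and how much to accept per call, which is the property of the zlib interface being praised in the message above.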
René Ferdinand Rivera Morell wrote:
I think Christian meant adding a compression abstraction library above zlib, bz2, zip, etc?
Pretty sure he meant our own implementations of "well known" compression algorithms. But if he meant the above, this isn't going to change the status quo. It will only move the build problems from iostreams to the new library.
On Wed, May 1, 2024 at 12:52 PM Peter Dimov via Boost wrote:
René Ferdinand Rivera Morell wrote:
I think Christian meant adding a compression abstraction library above zlib, bz2, zip, etc?
Pretty sure he meant our own implementations of "well known" compression algorithms.
That's even worse, as we would be throwing away twenty five years of optimizations in the decompressor, much of which is written in assembly language for various architectures. Is he proposing we rewrite all of that? This is not a sound engineering practice to me. Thanks
I think Christian meant adding a compression abstraction library above zlib, bz2, zip, etc?
I actually really meant this lol. I will say, I too even struggled getting b2 to pick up zlib on my system but that's not something that's impossible to solve. You just log into slack and complain in #boost until Peter solves your build problems for you. It's 2024 and the best solution we have for adding compression to a user's application is: "just learn zlib" or other similar C-like APIs. I didn't have anything particularly concrete in mind. I was just vocalizing my feelings.
That's even worse, as we would be throwing away twenty five years of optimizations in the decompressor, much of which is written in assembly language for various architectures. Is he proposing we rewrite all of that?
This is not a sound engineering practice to me.
Well, get prepared to have your mind blown! https://github.com/zlib-ng/zlib-ng Seems like someone already beat us to the punch and went ahead and re-implemented all of zlib for some decent gains, it seems. - Christian
On Thu, May 2, 2024 at 4:01 PM Christian Mazakas via Boost < boost@lists.boost.org> wrote:
https://github.com/zlib-ng/zlib-ng
Seems like someone already beat us to the punch and went ahead and re-implemented all of zlib for some decent gains, it seems.
Yes, this library is well known. It is a fork of zlib with performance improvements added. It is not "reimplemented all of zlib," and it is not "adding a compression abstraction library above zlib" (which you alluded to earlier). The best analogy is to think of zlib-ng as what asio would be if the people who depend on asio got tired of issues and pull requests being ignored and decided to fork their own version called asio-ng. Thanks
For what it's worth, I overwhelmingly agree that Boost should have compression as a first-class library.
zlib comes preinstalled on most Linux distros. If anything, zlib is more popular than the Boost library collection in terms of the number of installs. So I'm not sure what you're going on about. The "bread-and-butter functionality that developers actually need on a day-to-day basis" is already abundantly available.
Our specific requirement is to statically link _our_ preferred version of zlib built from _our_ pristine (and possibly patched) repo. There is no build reproducibility via system packages that get patched over time, or are pinned to a historical version for the purpose of system stability. But, building and linking a boost-bundled zlib would be pretty close to that, and pretty workable in terms of bringing our patches on top. - Nigel
Nigel Stewart wrote:
Our specific requirement is to statically link _our_ preferred version of zlib built from _our_ pristine (and possibly patched) repo. There is no build reproducibility via system packages that get patched over time, or are pinned to a historical version for the purpose of system stability.
As I said, this is accomplished by putting the following line in one of the b2 configuration files, for example user-config.jam:

    using zlib : 1.3.1 : <include>/opt/zlib-1.3.1/include <search>/opt/zlib-1.3.1/lib <name>our-zlib-name-1.3.1 ;

if we assume that your zlib headers are in /opt/zlib-1.3.1/include and your prebuilt binary is /opt/zlib-1.3.1/lib/our-zlib-name-1.3.1.a. Same for zstd (or bzip2 or lzma).
Our specific requirement is to statically link _our_ preferred version of zlib built from _our_ pristine (and possibly patched) repo.
As I said, this is accomplished by putting the following line in one of the b2 configuration files, for example user-config.jam:
using zlib : 1.3.1 : <include>/opt/zlib-1.3.1/include <search>/opt/zlib-1.3.1/lib <name>our-zlib-name-1.3.1 ;
Peter, Thanks for the information and encouragement. Your name is all over the cmake infrastructure, and I would like to thank you (collectively) for how nice it all is. I'm not a huge fan of cmake, but I accept that it's the go-to for C++ building, generally speaking. Some familiarity can go a long way. This is looking like a great fit for what we want to do - essentially statically link everything from our own repos of everything from the ground up - both Windows and Linux. I will also follow up to respond to some of the points about boost-bundling of these building-block compression libraries. I admit that OpenSSL is a bigger kettle of fish than say zlib or zstd. Thanks! - Nigel Stewart
On 02.05.2024 at 15:07, Nigel Stewart via Boost wrote:
Our specific requirement is to statically link _our_ preferred version of zlib built from _our_ pristine (and possibly patched) repo. As I said, this is accomplished by putting the following line in one of the b2 configuration files, for example user-config.jam:
using zlib : 1.3.1 : <include>/opt/zlib-1.3.1/include <search>/opt/zlib-1.3.1/lib <name>our-zlib-name-1.3.1 ;
Peter,
Thanks for the information and encouragement. Your name is all over the cmake infrastructure, and I would like to thank you (collectively) for how nice it all is. I'm not a huge fan of cmake, but I accept that it's the go-to for C++ building, generally speaking. Some familiarity can go a long way. This is looking like a great fit for what we want to do - essentially statically link everything from our own repos of everything from the ground up - both Windows and Linux.
Here on my machine, I don't even need to modify my user-config.jam. b2 is just picking up the directories from the -s options I give on the commandline. The rest is done by b2 and msvc.jam, like here (VS2022):

    ...
    libboost_atomic-vc140-mt-x64-1_83.pdb
    libboost_bzip2-vc140-mt-gd-x64-1_83.lib
    libboost_bzip2-vc140-mt-gd-x64-1_83.pdb
    libboost_bzip2-vc140-mt-x64-1_83.lib
    libboost_bzip2-vc140-mt-x64-1_83.pdb
    libboost_chrono-vc140-mt-gd-x64-1_83.lib
    ...

Thanks
Dani
-- PGP/GPG: 2CCB 3ECB 0954 5CD3 B0DB 6AA0 BA03 56A1 2C4638C5
On Wed, May 1, 2024 at 7:20 PM Nigel Stewart wrote:
But, building and linking a boost-bundled zlib would be pretty close
It is better for you to clone the zlib repository locally, as this eliminates a middleman.
and pretty workable in terms of bringing our patches on top.
If you have patched zlib, why not contribute them to the upstream repository? Thanks
and pretty workable in terms of bringing our patches on top.
If you have patched zlib, why not contribute them to the upstream repository?
Hello Vinnie, Indeed that would be ideal for everyone. We did have a cxxopts patch accepted recently. To bring that more into the land of modern C++ using std::optional. https://github.com/jarro2783/cxxopts/pull/421 But we were able to refine that and test it quite a bit on our side, first. Similarly for boost::sort https://github.com/boostorg/sort/pull/79 - Nigel
Vinnie Falco via Boost:
The more sustainable solution I think is to require that users are able to incorporate third party libraries into their build scripts. This is made easier with package managers of course, and now there are enough solutions that we do not need to be treating users like infants incapable of putting together a non-trivial program.
Nigel via Boost:
Our specific requirement is to statically link _our_ preferred version of zlib built from _our_ pristine (and possibly patched) repo. [...] But, building and linking a boost-bundled zlib would be pretty close to that, and pretty workable in terms of bringing our patches on top.

I have a strong background in building software for HPC systems from source with some software using recipes (similar to Docker files). A major pain point are libraries who keep bundling well known libraries in their builds, or even worse: Downloading them during configure/build steps. Especially when the version they use is a patched variant of some upstream source. This leads to a) duplication of libraries on the system and b) complications using different libraries together as the dependencies may conflict with each other -> ODR violations and similar.

Hence we (the community behind the "build software") spend considerable effort in making such software use the already installed (sometimes called "system") libraries.

So bundling something like zlib with Boost not only increases the maintenance cost of Boost as now it would be our responsibility to make sure all security patches are applied (at the same time security patches to the system or otherwise installed libraries are not included until a new Boost release) but also may lead to subtle failures in consumers.

For use cases like using a specific version of some specific library in an environment where compatibility between libs is otherwise handled it is nowadays trivial to get the source of that library as most are available on e.g. GitHub using version tags. In fact that is how it is done on those HPC systems: Sources get downloaded from known, fixed locations and verified against checksums to ensure reproducibility and authenticity.

Alex
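The download-and-verify step mentioned at the end of the message above could look roughly like the following. It is a minimal sketch, not how any particular HPC build tool does it: the function names sha256_of_file and verify_source_archive are hypothetical, SHA-256 is an assumed choice of checksum, and the expected digest is assumed to come from a pinned recipe kept alongside the build scripts.

```python
import hashlib
import hmac


def sha256_of_file(path: str, block_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size blocks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_source_archive(path: str, expected_sha256: str) -> bool:
    """Return True only if the archive matches the pinned checksum."""
    actual = sha256_of_file(path)
    # Normalise case and whitespace, since recipe files often carry
    # upper-case digests or a trailing newline.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A build recipe would call verify_source_archive on the downloaded tarball and refuse to unpack it when the result is False.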
A major pain point are libraries who keep bundling well known libraries in their builds, or even worse: Downloading them during configure/build steps. Especially when the version they use is a patched variant of some upstream source.
Alex, Historically that's been typical for supporting Windows builds. I'm guilty of that, but it turns out that Windows users are important. The paradigm of using system-installed libraries is pretty much out the window there. I've done a lot of cross-platform C++ over the years which means trying to minimise platform-specifics where possible. The less we depend on the customer's system, the better chance we have of running there, out of the box.
Hence we (the community behind the "build software") spend considerable effort in making such software use the already installed (sometimes called "system") libraries.
Yes. And typically at work I'm coaxing builds to use _our_ static libraries instead. I don't think it's the most common use case, but I do consider it legitimate, even if the tools are defaulting otherwise. Disabling shared libtiff target breaks the build, for example. Needs patching. So it goes.
So bundling something like zlib with Boost not only increases the maintenance cost of Boost
Yes, that's a consideration. A broader difficulty I'm having with zlib in particular is poor cmake support. Rolling that into the boost cmake scheme is appealing, at least we'd have confidence with that. Possibly I can swap zlib out for zlib-ng, purely because the cmake is done right. But I agree that typically on Linux boost ought to use the shared zlib that is always there, by default.
For use cases like using a specific version of some specific library in an environment where compatibility between libs is otherwise handled
We want our Linux binaries to match our Windows binaries, including dependencies. A new Ubuntu might have notions of modernisation, but that is our choice and responsibility. On the other side of that coin we can use the current version of boost on older Ubuntus, and that is good. Still using Ubuntu 20.04 for production, but have no desire to be tied to boost 1.71.0. Which is all to say we have reasons to do things a little differently. And it would be nice for boost to be "batteries included" for static linking in particular. But it's hard to say how broadly that would be valued or appreciated. - Nigel Stewart
On Sat, May 4, 2024 at 4:37 PM Nigel Stewart via Boost < boost@lists.boost.org> wrote:
A broader difficulty I'm having with zlib in particular is poor cmake support.
ZLib doesn't need to support cmake, you can do it yourself by listing the .c files that the zlib library uses and compiling them together into your own specified target. Thanks
A broader difficulty I'm having with zlib in particular is poor cmake support.
ZLib doesn't need to support cmake, you can do it yourself by listing the .c files that the zlib library uses and compiling them together into your own specified target.
Vinnie, Technically true. But in practice various things need to consume zlib via cmake, and that's a bit broken. It's not just me that thinks so: https://github.com/madler/zlib/issues/831 I empathise with the maintainer, it's thankless work that I try to avoid in my spare time too. - Nigel Stewart
On Sat, May 4, 2024 at 7:40 PM Nigel Stewart via Boost wrote:
A broader difficulty I'm having with zlib in particular is poor cmake support.
ZLib doesn't need to support cmake, you can do it yourself by listing the .c files that the zlib library uses and compiling them together into your own specified target.
Vinnie,
Technically true. But in practice various things need to consume zlib via cmake, and that's a bit broken. It's not just me that thinks so: https://github.com/madler/zlib/issues/831 I empathise with the maintainer, it's thankless work that I try to avoid in my spare time too.
Use a package manager that has the cmake worked out. -- -- René Ferdinand Rivera Morell -- Don't Assume Anything -- No Supone Nada -- Robot Dreams - http://robot-dreams.net
participants (7)
- Alexander Grund
- Christian Mazakas
- Daniela Engert
- Nigel Stewart
- Peter Dimov
- René Ferdinand Rivera Morell
- Vinnie Falco