[iostreams] Interest in multithreaded lzma filter?
Accidentally only sent this to James and not the ML:
Gesendet: Donnerstag, 23. Mai 2019 um 15:19 Uhr Von: "James E. King III"
On Thu, May 23, 2019 at 8:39 AM Mike via Boost
wrote: Gesendet: Mittwoch, 22. Mai 2019 um 23:40 Uhr Von: "Jeff Bonyun via Boost"
... How will we detect whether liblzma is capable? If it is not capable but specified, will we allow the compression anyway?
I've been investigating this. liblzma defines a version macro, LZMA_VERSION. It looks like liblzma obtained its multithreaded function in 5.1.1 (in 2011). So the idea would be to branch on this macro, and fall back to the currently-used single-threaded function for an older liblzma version. In that case, requesting any number of threads will carry out the operation, but will only use a single thread. This makes it tolerant of different versions, but does mark up the code with a few #if branches. I'm assuming boost goes for tolerance in this case.
Does a 2019 Boost release really have to be compatible with a Pre 2011 release of liblzma? Is this combination even tested? I.e. are we sure this would be the only incompatibility?
How about just adding this functionality unconditionally and see if someone complains?
So basically if someone has an older lzma they are limited to boost 1.70.0 or earlier. That's perhaps reasonable as long as it is documented, but it hasn't been the typical way I have seen things added to existing code.
Well, my second question was if you are confident that 1.70 does actually work with liblzma versions older than 5.1.1? If this isn't tested, it is not supported and if it is not supported it imho makes little sense to complicate the source code to achieve potential compatibility with (what I assume to be) a very, very small number of setups.
Another way to handle it is to keep the old path in a macro like BOOST_IOSTREAMS_NO_MULTITHREADED_LZMA. This is common through boost. Ideally, detecting the liblzma version from the header could then set BOOST_IOSTREAMS_NO_MULTITHREADED_LZMA and it would always do the right thing, but the simplest way to do it is to let the consumer of the library set this if they need it.
Personally I'm not a fan of configuration macros when they are not absolutely necessary because it effectively means you introduce yet another variant of boost that may or may not be tested and may or may not be compatible with the variant another dependency in my project was tested with (The total configuration space of boost is already far too large for my taste). As Robert Schumacher said in his cppcon talk: It's not just about you and the user, but also every library between you and the user. All of them now either have to support both choices or have to make one choice and hope they don't become incompatible with a different 3rd party library. Yet another reason, why package & dependency management is so hard in c++. However, as long as the interface is untouched by this macro I guess it would be fine. Best Mike
So basically if someone has an older lzma they are limited to boost 1.70.0 or earlier. That's perhaps reasonable as long as it is documented, but it hasn't been the typical way I have seen things added to existing code. Well, my second question was if you are confident that 1.70 does actually work with liblzma versions older than 5.1.1? If this isn't tested, it is not supported and if it is not supported it imho makes little sense to complicate the source code to achieve potential compatibility with (what I assume to be) a very, very small number of setups.
I will be testing, with and without my additions, with every numbered version of liblzma before making my pull request. So I'll be able to answer that soon. But that doesn't deal with your real concern: that testing will not be automated and ongoing, and it could mysteriously break in the future. My production systems are RHEL 6. They have a 4.x version of liblzma installed by the package manager, which is incompatible with the changes I am making. I get that RHEL 6 is old, but some industries have to deal with that sort of thing.
However, as long as the interface is untouched by this macro I guess it would be fine.
The design I have right now would leave the interface the same between the macro on and the macro off. I.e. both would have a new "threads" argument and member. The macro would be used only in the cpp file. Working code would work in all cases. Requests for different numbers of threads would just be ignored if it isn't supported. The deeper questions here are above my pay grade. I will bend to the whims of those with more experience maintaining boost.
Jeff Bonyun wrote:
The deeper questions here are above my pay grade. I will bend to the whims of those with more experience maintaining boost.
There's nothing wrong with your original plan of testing LZMA_VERSION. WRT testing, packages.ubuntu.com no longer lists Trusty, so I can't verify the liblzma version there, but Xenial has 5.1.1 alpha, which leads me to think that Trusty has an older one. So we can test it on Travis, in principle, and in fact have been testing it. (The Travis default has switched to Xenial, no now we test 5.1.1, but that's a matter of one dist: trusty line.) If we could avoid the need for a configuration macro altogether, that would be best; how common is liblzma built with --disable-threads? Or, can we detect that somehow?
On 5/24/19 3:39 AM, Mike via Boost wrote:
Well, my second question was if you are confident that 1.70 does actually work with liblzma versions older than 5.1.1? If this isn't tested, it is not supported and if it is not supported it imho makes little sense to complicate the source code to achieve potential compatibility with (what I assume to be) a very, very small number of setups.
Aha! I have an answer. Boost.iostreams 1.70 works with liblzma 4.999.7beta or later. The function signatures are different before that. So worrying about multithreaded compatibility before that is irrelevant, but it still leaves the 4.999.7beta to 5.2.0 interval to handle. Luckily, the auto-detection of a sufficiently new version of liblzma seems to be working fine. For those keeping track, anything before 5.2.0 will be built without threads, because it wasn't supported in the default build of liblzma. It will work fine as a single threaded library. Anything 5.2.0 or after will enable threads, and will be able to auto-detect the number of cores too. On 5/24/19 11:27 AM, Peter Dimov via Boost wrote:
If we could avoid the need for a configuration macro altogether, that would be best; how common is liblzma built with --disable-threads? Or, can we detect that somehow?
Unfortunately the --disable-threads (or --enable-threads=no) ./configure arg to liblzma doesn't change the header files of liblzma at all. That would have been the easiest for us, if they defined a macro to reflect the lack of threading, but they don't. By building them side-by-side, I can see the differences are in: - the .so files (--disable-threads means it doesn't have `stream_encode_mt` or `lzma_cputhreads`, both of which I use). - the .la file, which isn't typically part of a packaged release, but is installed if built from source. - the .pc file for pkgconfig, which might be installed by packaged releases (my Ubuntu 16.04 has it, my RHEL 6 doesn't). So unless we wanted to try finding the .so file that will be used, and search its symbols, I would say that we should default to assuming that threads are enabled on any sufficiently-new version, and provide a define flag to turn that off if the user is having trouble building. (And to the future developers searching the web for reasons why their boost won't build, I'm sorry. Try `<define>BOOST_IOSTREAMS_LZMA_NO_MULTITHREADED`.) I'm finishing my pull request. Should be available in the next day or so. Jeff
participants (3)
-
Jeff Bonyun
-
Mike
-
Peter Dimov