[cmake] Problem with dependencies for configure checks

Hi, I've been having this problem with CMake in Boost.Filesystem, where I need to have the Boost::library targets available before Boost::filesystem target is defined. This is needed in order to perform configuration checks that will affect how Boost::filesystem will be defined. For example, see the check_cxx_source_compiles calls: https://github.com/boostorg/filesystem/blob/af6ac28b5785820b93a2af7dd6e9b801... For the check_cxx_source_compiles calls, I would have liked to add targets like Boost::config and Boost::winapi to CMAKE_REQUIRED_LIBRARIES so that the checks are compiled with include directories of those dependencies added in the command line. This doesn't work because these targets may not be defined at the point when the configuration checks are performed. My current workaround, as you can see, is to add the Boost root to CMAKE_REQUIRED_INCLUDES, where the boost directory with the common headers tree is located. However, this will not work if we're going to make that directory optional or remove it. Also, you can see that I'm reusing checks implemented in Boost.Config, for which I also had to hard code its relative location from Boost.Filesystem. If we're going to allow different libraries located in separate directory trees, this will also break. My question is, can this be fixed and how? I was wondering if it is possible to make the target generation two-stage, where the first stage identifies library locations, including directories with their headers, and the second stage would define build targets. This way all library directories would be available to the second stage. Although I can already see the problem with second order dependencies with this approach...

Andrey Semashev wrote:
I add ../config/include to CMAKE_REQUIRED_INCLUDES.
https://github.com/boostorg/stacktrace/blob/79aff77771a8f3a3d62852de6619a955...
I don't think we should mandate add_subdirectory calls to come in a specific
order. Even if we can hack something in the superproject to do it, users who use
add_subdirectory manually will need to get the order right.
Incidentally,
#include <${CMAKE_CURRENT_SOURCE_DIR}/config/has_stat_st_blksize.cpp>
doesn't work on Cygwin with the Windows CMake; this ends up generating
#include

On 6/3/21 6:43 PM, Peter Dimov via Boost wrote:
But that would mean one has to specify second order dependencies as well for configure checks. This might be fine for Boost.Config, but not for other libraries that have dependencies. Boost.WinAPI is an example; I wouldn't want to replicate its dependencies in all downstream libraries that have checks involving it.
I'm not sure I understand. Is it that Cygwin doesn't accept angle brackets with Windows paths?

Andrey Semashev wrote:
But that would mean one has to specify second order dependencies as well for configure checks.
¯\_(ツ)_/¯
I'm not sure I understand. Is it that Cygwin doesn't accept angle brackets with Windows paths?
The Cygwin compiler just doesn't accept Windows paths, as far as I can see.
Performing C++ SOURCE FILE Test BOOST_FILESYSTEM_HAS_STAT_ST_BLKSIZE failed with the following output:
Change Dir: C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp
Run Build Command(s):C:/cygwin64/bin/make.exe -f Makefile cmTC_8f9df/fast && /usr/bin/make -f CMakeFiles/cmTC_8f9df.dir/build.make CMakeFiles/cmTC_8f9df.dir/build
make[1]: Entering directory '/cygdrive/c/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_8f9df.dir/src.cxx.obj
C:/cygwin64/bin/c++.exe -DBOOST_FILESYSTEM_HAS_STAT_ST_BLKSIZE @CMakeFiles/cmTC_8f9df.dir/includes_CXX.rsp -o CMakeFiles/cmTC_8f9df.dir/src.cxx.obj -c C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp/src.cxx
C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp/src.cxx:1:10: fatal error: C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp: No such file or directory
1 | #include

On 6/3/21 9:30 PM, Peter Dimov via Boost wrote:
Ok, so it seems I'll have to add a check to enforce that the unified headers directory exists. This means that we are still dependent on `b2 headers`, at least until there is a CMake equivalent.
Ok, I think I understand. You're using native Windows CMake that generates Windows paths and a Cygwin compiler that doesn't understand them. That looks like an incorrect config to me; you should be using Cygwin CMake, which should generate Cygwin paths both in source files and command lines.

On 6/3/21 9:59 PM, Peter Dimov via Boost wrote:
Well, the only other alternative I see is globbing through all libraries to collect their include paths. And that would have to be done in every library, unless a common solution exists. I just looked at Boost.Atomic and it also relies on the unified headers dir, and I'm pretty sure I'm going to need it in Boost.Log. So the scanning will happen at least three times per build, which is already not good.

On 6/4/21 1:05 AM, Peter Dimov via Boost wrote:
Probably because most of them don't do configure checks or don't use Boost while doing so. In the cases I mentioned, I do use Boost for configure time checks, and globbing is a way to avoid having to hardcode the dependency tree for those checks.

Andrey Semashev wrote:
I don't think globbing would be a particularly good practice in this case, but
whatever floats your boat, I suppose. I will continue hardcoding the required
include directories. That's not that great a practice either (*), but I don't see
anything better.
Why is that even necessary? Configure checks should be standalone. There's
no problem using

On 6/4/21 3:29 AM, Peter Dimov via Boost wrote:
Sure, I don't like it either. That's why I asked if something better could be done in Boost.CMake, or however it is called.
In the particular case of Boost.WinAPI I want it (Boost.WinAPI) to define all the version macros it defines. This way I ensure that the config check is compiling in the same environment as the library will. I actually include a special header from Boost.Filesystem (platform_config.hpp) to define all platform-specific macros, which indirectly includes Boost.Config and Boost.WinAPI. In Boost.Log I have a config check that uses Boost.Atomic to test if it supports the required lock-free operations on a given target platform.
I'm inclined to simply not support incomplete source trees.

Andrey Semashev wrote:
While on the subject of Filesystem configure checks, I don't know if it's been reported already, but the statx kernel call is causing problems; Filesystem operations failing with ENOTSUPP. You shouldn't be assuming that the machine where the library is built will be the same as the one the program will run on. (I think this has come up in the context of Docker, where Filesystem is probably built in the host environment but the program that uses it runs in Docker and apparently the supported kernel calls aren't the same. Or perhaps it's built when the Docker image is created, not sure.)

On 6/4/21 4:07 AM, Peter Dimov via Boost wrote:
ENOSYS, perhaps? Yes, there were reports. https://github.com/boostorg/filesystem/issues/172
My stance on this is that Boost.Filesystem must be built on a system with kernel headers matching or older than the systems that will run the binary. This matches the same requirements for Windows. I don't support configs with headers newer than runtime kernel.

On 04/06/2021 10:12, Andrey Semashev via Boost wrote:
It's trivially easy to do a runtime check for statx kernel support and fall back on older syscalls if needed. This is LLFIO doing so, and it works on Linux kernel 2.6: https://github.com/ned14/llfio/blob/develop/include/llfio/v2.0/detail/impl/p... Niall

On 6/4/21 2:03 PM, Niall Douglas via Boost wrote:
The code you pointed to is not what I would call trivial. There are quite a few places where statx is used in Boost.Filesystem, in some places exclusively (i.e. there is no fallback). I could add a runtime check, but I definitely don't want that check to happen on every call. This means the result must be stored globally in a thread-safe manner. Then there are other syscalls, on Linux and other systems, should I also add runtime checks for those? All that to support what I consider improperly configured systems. It just doesn't look like a worthwhile effort while I have other things to do, including in Boost.Filesystem too.

On 04/06/2021 13:04, Andrey Semashev via Boost wrote:
There are only a **very** few things which statx provides which older syscalls cannot. statx's big gain is it is a snapshot not subject to races, but that hardly affects Filesystem.
At work we've got llfio::stat_t::fill() being called in some places millions of times per second. It is implemented by attempting statx first, and if it is not supported, then stat or other syscalls to emulate the same. You cannot cache statx support/non-support because only some filing systems support statx, and others do not. So depending on the path used, it may work, but at other times, not. Right now at work we're on kernels which don't support statx, so we are always getting back ENOSYS and then falling back to emulation. We have not found any performance issues.
A very great deal of Boost users are either on older Linux kernels, or are running Boost inside Docker containers where various newer syscalls are disabled e.g. where I work. In my opinion, if you like your users, you need to implement syscall fallback for all recent Linux syscalls, same as you'd do on Windows. I agree that you can reasonably cut off all Linux kernels past EOL, so you only need to support syscall fallback for kernels since EOL. Niall

On 04/06/2021 15:54, Peter Dimov via Boost wrote:
Well, that depends. If you're on a kernel that has no statx syscall, then the userspace syscall dispatcher will spot the unknown syscall number, and you'll get back ENOSYS very quickly, as you point out. But if you're on a kernel with statx syscall, and instead it is the filing system which doesn't support statx, then that's a full fat syscall just to get back ENOSYS. I don't personally think anyone who cares about metadata retrieval performance would ever use Filesystem or std::filesystem because both tend to do dynamic memory allocations, which easily swamp a statx call. So, in that sense, even many syscalls can be done and it'll be unmeasurable compared to say a filesystem path construction. If this were Windows, because Filesystem calls many Win32 functions to build a stat, that is a very different situation. But not on POSIX. Niall

Niall Douglas wrote:
Sure, but the alternative is to do that same fat syscall just to fail, as Filesystem currently does. The performance concerns in our case are about the scenario where the syscall doesn't fail, because that's the case that works. The performance of something that doesn't work is irrelevant.

On 6/4/21 6:17 PM, Niall Douglas via Boost wrote:
Kernel should not return ENOSYS for a syscall that is implemented. This error code is reserved for this specific purpose. What you should get is probably ENOTSUP or EINVAL. Are you sure you're getting specifically ENOSYS?
Boost.Filesystem doesn't implement stat, it uses it as a means to implement other, more fine grained operations. On Windows, these operations tend to be implemented with a single WinAPI call, in some cases with a pair open/close file handle calls.

On 6/4/21 5:44 PM, Niall Douglas via Boost wrote:
Isn't statx supposed to complete successfully through the stat route in the kernel, if the filesystem does not support statx natively?
My point is that if you're running a kernel that doesn't support statx then you should compile Boost.Filesystem with headers from that kernel. It will not have statx syscall and Boost.Filesystem will compile to use the old stat syscall. <rant> Docker people seem to think of Docker images as of VMs while they are not. Hence they consider it normal to build the image on one system and then run it on another, with possibly older kernel. Well, this is wrong. It doesn't work that way, and Boost.Filesystem is not the only piece of software that will break because of that. </rant>

On 6/4/21 7:20 PM, Peter Dimov via Boost wrote:
That's a fair point. Damn, it looks like I'll have to make a runtime check after all...
If this check is unreliable, it should be off by default and opt-in. It does more harm than good.
Stuff like this never gets enabled. And besides, I do need statx unconditionally for some of the operations.

To put this in perspective, since winapi only depends on config and predef, and those depend on nothing else, you only need three include directories: "../config/include" "../predef/include" "../winapi/include" Globbing, in contrast, would add 141 include directories. I fail to see how the latter can in any way be considered a superior solution.

On 6/4/21 5:26 PM, Peter Dimov via Boost wrote:
The number of include directories doesn't really matter, as long as they don't cause conflicts (they shouldn't). The performance hit is unfortunate, and as I said, I would have liked to avoid it. This is ugly, I get it, and I fully agree, but I don't see a better way.
I fail to see how the latter can in any way be considered a superior solution.
It is superior as it does not require active maintenance. Especially given that a failed configure check because of a missing dependency might not be noticeable in CI. Of all resources we have in Boost, I consider maintainer's time and effort the most valuable one. This is evident by the number of libraries without active maintainance we have. Any solution that has zero maintenance cost in the long run, without drastic downsides in other areas, is a worthy solution.

Andrey Semashev wrote:
We're talking about one of your libraries depending on another one of your libraries. The dependencies of winapi aren't going to suddenly change. And no, config and predef aren't going to acquire secondary dependencies either. Checking that winapi can't be included isn't hard; add another check that only includes it and checks nothing and message(AUTHOR_WARNING when it fails.

On 6/4/21 6:27 PM, Peter Dimov via Boost wrote:
I can't be held to remember which of my libraries depend on what. My memory isn't that good, sorry. And I'm not immortal, you know. One day there will be another maintainer (hopefully), and the less work he has to do the better.
The dependencies of winapi aren't going to suddenly change.
Well, I wouldn't bet on that. I'm not going to add dependencies just because, but I want to be able to should I need it without silently breaking someone's configure checks.
Eh, that's still brittle and requires intervention...

For users who add_subdirectory the boost libs I admit this is an issue. But to me this only means that Boost authors should avoid relying on configure-checks which need other libs and where required use the Boost::* aliases so the error in case it was not added is quite clear. Users then need to make sure to e.g. add Boost::predef and Boost::config first. I don't think this is too bad.

On 6/3/21 8:42 PM, Niall Douglas via Boost wrote:
I don't think that helps in this case. The point is not to skip configuration checks if the target is not available. The checks must execute, I just want to have a way to specify include directories even when the targets are not yet defined.
participants (5)
-
Alexander Grund
-
Andrey Semashev
-
Edward Diener
-
Niall Douglas
-
Peter Dimov