[all] Request for out of the box visibility support
I'd like to draw attention to a problem with Boost binaries for Linux.

There's an awesome -fvisibility=hidden flag for Linux compilers that improves load times, performance and size of binaries, and reduces the chance of symbol collisions. More info at https://gcc.gnu.org/wiki/Visibility .

Unfortunately, most of the Boost libraries do not set it by default:
- atomic
- chrono
- container
- context
- contract
- coroutine
- date_time
- exception
- fiber
- filesystem
- graph
- graph_parallel
- iostreams
- locale
- mpi
- program_options
- python
- random
- regex
- signals
- system
- test
- thread
- timer
- type_erasure
- wave

Moreover, a minority of the above libraries just do not work with the flag. Users cannot simply run ./b2 cxxflags="-fvisibility=hidden" because there's a chance that some library could stop linking.

Actually things are even uglier. Linux distributions usually do not tune the build flags for each package, so at least Debian-based distributions build Boost with default flags. Users get suboptimal builds.

If you're a maintainer of one of the above libraries *please do* the following steps:
* Make sure that all the public symbols are marked with the appropriate BOOST_SYMBOL_* macro. Instructions are available here: https://www.boost.org/doc/libs/1_68_0/libs/config/doc/html/boost_config/boos...
* Turn on visibility=hidden by default:
  * by adding <target-os>linux:<cxxflags>"-fvisibility=hidden" to the Jamfile if you do not care much for antique compilers (Example https://github.com/boostorg/stacktrace/blob/819f2b1c861dec7530372a990ecabab7... )
  * by using a more advanced technique for detecting the flag availability (For example see https://github.com/boostorg/math/blob/develop/build/Jamfile.v2#L20 or https://github.com/boostorg/log/blob/develop/build/Jamfile.v2#L24 )

P.S.: I would appreciate any comments or updates on the feature request.
P.P.S.: Log, Math, Serialization (and Stacktrace) libraries already use that flag by default.

Many thanks!
-- Best regards, Antony Polukhin
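For readers who have not done the markup step before, here is a minimal sketch of what annotating public symbols looks like. It deliberately avoids Boost.Config so that it stays self-contained: `MYLIB_EXPORT`, `MYLIB_IMPORT`, `MYLIB_DECL`, `MYLIB_SOURCE`, `MYLIB_DYN_LINK` and the `mylib` functions are all hypothetical names. In a real Boost library, BOOST_SYMBOL_EXPORT and BOOST_SYMBOL_IMPORT would take the place of the first two macros.

```cpp
// Hypothetical stand-ins for BOOST_SYMBOL_EXPORT / BOOST_SYMBOL_IMPORT.
#if defined(_WIN32)
#  define MYLIB_EXPORT __declspec(dllexport)
#  define MYLIB_IMPORT __declspec(dllimport)
#elif defined(__GNUC__)
#  define MYLIB_EXPORT __attribute__((visibility("default")))
#  define MYLIB_IMPORT
#else
#  define MYLIB_EXPORT
#  define MYLIB_IMPORT
#endif

// MYLIB_DYN_LINK: the library is built or used as a shared library.
// MYLIB_SOURCE:   defined only while compiling the library itself.
#if defined(MYLIB_DYN_LINK)
#  if defined(MYLIB_SOURCE)
#    define MYLIB_DECL MYLIB_EXPORT
#  else
#    define MYLIB_DECL MYLIB_IMPORT
#  endif
#else
#  define MYLIB_DECL
#endif

// --- what would be the public header ---
namespace mylib {
MYLIB_DECL int public_answer();   // part of the public interface
namespace detail {
int helper();                     // internal, intentionally not annotated
}
}

// --- what would be the library source file ---
namespace mylib {
namespace detail {
int helper() { return 21; }
}
int public_answer() { return 2 * detail::helper(); }
}
```

Compiled with GCC or Clang into a shared library with -fvisibility=hidden and both hypothetical macros defined, only `public_answer` would be expected to stay in the dynamic symbol table, while `detail::helper` becomes hidden.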
On 08/17/18 09:21, Antony Polukhin via Boost wrote:
<...>
I wonder if we should update Boost.Build instead and set visibility to hidden by default. For libraries that need other visibility we could offer a property.
On Fri, Aug 17, 2018, 11:10 Andrey Semashev via Boost
I wonder if we should update Boost.Build instead and set visibility to hidden by default. For libraries that need other visibility we could offer a property.
I'm not experienced with b2 internals but I'd be glad to help with testing/documenting/... this feature. Is there a champion to actually implement it?
On Sun, Aug 19, 2018 at 7:48 AM Antony Polukhin via Boost < boost@lists.boost.org> wrote:
On Fri, Aug 17, 2018, 11:10 Andrey Semashev via Boost < boost@lists.boost.org> wrote: <...>
I wonder if we should update Boost.Build instead and set visibility to hidden by default. For libraries that need other visibility we could offer a property.
I think a better solution is to remove link libraries where possible. For example in date_time there isn't much of a case to keep the linked library. If we look at the issue it causes like https://github.com/boostorg/date_time/pull/34, one of the solutions to that issue of visibility is to remove the link library entirely. I'd have to recommend all libraries attempt to become header-only to avoid these issues. Obviously that won't work for all libraries, but it's possible for a number of them. I opened https://github.com/boostorg/date_time/issues/85 to track that for date_time. - Jim
On 8/19/18 9:31 AM, James E. King III via Boost wrote:
On Sun, Aug 19, 2018 at 7:48 AM Antony Polukhin via Boost < boost@lists.boost.org> wrote:
I think a better solution is to remove link libraries where possible. For example in date_time there isn't much of a case to keep the linked library. If we look at the issue it causes like https://github.com/boostorg/date_time/pull/34, one of the solutions to that issue of visibility is to remove the link library entirely. I'd have to recommend all libraries attempt to become header-only to avoid these issues. Obviously that won't work for all libraries, but it's possible for a number of them. I opened https://github.com/boostorg/date_time/issues/85 to track that for date_time.
TL;DR; The appeal of using header only libraries is a sign that we have bigger problems not being addressed.

Basically this amounts to deprecating the separate compilation model that is a historical feature of C++. In my view it's an essential feature. Many complain of compile times that are "too long" for their 1000000 line programs. More than anything this is an indicator that their code hasn't been factored into separately compilable modules.

Not that I'm totally unsympathetic. The C++ separate compilation model was extended to shared libraries in an ad hoc way by each vendor. Now we have issues with symbol visibility and ABI compatibility. The tools vendors haven't really addressed this and so C++ programs are a pain to configure and build. It's also unwieldy to incorporate library code and especially pre-compiled modules. Just using header only - solves the major convenience problems - but the build doesn't scale anymore creating 1,000,000 compilations.

Now we're looking at adding concepts, contracts and modules. I'm really concerned that C++ may collapse under its own weight. What is really needed is for someone to step back and re-address the issues of ODR, visibility, ABI compatibility and versioning, build tooling and anything else that this ripples to. It's a huge task. C++ growth is O(n^2) where n is the number of features added.

Basically, C++ has become a victim of its own success. In the past, this has often resulted in new programming languages. But now I see the C++ universe as being too big for this.

Robert Ramey
On 08/19/18 14:47, Antony Polukhin via Boost wrote:
On Fri, Aug 17, 2018, 11:10 Andrey Semashev via Boost
wrote: <...> I wonder if we should update Boost.Build instead and set visibility to hidden by default. For libraries that need other visibility we could offer a property.
I'm not experienced with b2 internals but I'd be glad to help with testing/documenting/... this feature. Is there a champion to actually implement it?
https://github.com/boostorg/build/pull/331 I've only tested it on Linux, with gcc and clang. Please, test on other platforms and see if it causes problems.
Looks like all the issues in submodules were fixed and the PR https://github.com/boostorg/boost/pull/190 now can be merged. Restarting the CI before the merge should prove that everything is fine. The sooner the PR is merged the more time for testing we'll have.
Firm reminder that the PR https://github.com/boostorg/boost/pull/190
should be rebased on top of the boostorg/boost for the successful CI run
(and merge).
On Mon, Aug 27, 2018, 09:54 Antony Polukhin
Looks like all the issues in submodules were fixed and the PR https://github.com/boostorg/boost/pull/190 now can be merged.
Restarting the CI before the merge should prove that everything is fine.
The sooner the PR is merged the more time for testing we'll have.
On 09/06/18 08:17, Antony Polukhin via Boost wrote:
On Mon, Aug 27, 2018, 09:54 Antony Polukhin
wrote: Looks like all the issues in submodules were fixed and the PR https://github.com/boostorg/boost/pull/190 now can be merged.
Restarting the CI before the merge should prove that everything is fine.
The sooner the PR is merged the more time for testing we'll have.
Firm reminder that the PR https://github.com/boostorg/boost/pull/190 should be rebased on top of the boostorg/boost for the successful CI run (and merge).
Done, sorry for the delay.
Firm reminder that it's preferable to merge https://github.com/boostorg/boost/pull/190 sooner and test for more time in develop, rather than merge later and find issues in beta.
On 8/16/18 11:21 PM, Antony Polukhin via Boost wrote:
This is a worthy idea. Having a shared library/dll export only those symbols which are required is a big improvement for the user.

But, there's much more to it than meets the eye. I implemented this for the serialization library and it was a huge PITA.

a) The syntax for marking up exported functions is different for Microsoft and other compilers.

b) When a library is built on a lot of nested/inherited classes it becomes non-obvious what to export.

c) When a library depends upon another library, it can become even more problematic.

d) When doing stuff like running the gcc compiler to create a shared library under Windows, things can get confused.

e) Our testing matrix doesn't display or maintain different results based on whether a test is being run with shared or static linking. So the value of the test results is much diminished. There are so many combinations that one must really depend upon the test results but these don't have the required information.

f) There was a paper for the standardization committee which included a proposal to help address this stuff - but it failed to gain enough interest to get the problem addressed. In fact, the whole issue of dynamically loaded shared libraries raises a bunch of interesting problems - are global static variables unique?, etc. The concept can be very, very powerful in creating large extendable applications, but the C++ language is silent on the whole issue. It's a serious problem but too unpleasant to actually address.

g) In the absence of any guidance from the standard and compilers not being on the same page, there's an opportunity to create a boost library of macros, documentation, and who knows what else. We have a great starting point with John Maddock's work on the subject - which is what I depend on. But still, the situation is so messed up, I would hope that can be expanded upon.
g) Actually it's unclear to me how the -fvisibility switch enters into this. VC doesn't have it. I don't think it's actually necessary and I'm not sure how it alters the functioning of the visibility. I know, read the rules. But like a lot of stuff in C++ the understanding provided by the baroque rules evaporates the minute one has to investigate the next ball of yarn. It's just not possible to understand all of C++ at one time. Robert Ramey
On Fri, Aug 17, 2018, 19:38 Robert Ramey via Boost
On 8/16/18 11:21 PM, Antony Polukhin via Boost wrote:
This is a worthy idea. Having a shared library/dll export only those symbols which are required is a big improvement for the user.
But, there's much more to it than meets the eye. I implemented this for the serialization library as it was a huge PITA.
a) The syntax for marking up exported functions is different for microsoft and other compilers.
Yes, but the BOOST_SYMBOL_* macros help a lot.
b) when a library is built on a lot of nested/inherited classes it becomes non-obvious what to export.
Yes, which is not the point for most of the Boost libraries from the above list. <...>
f) There was a paper for the standardization committee which included a proposal to help addressing this stuff - but it failed to gain enough interest to get the problem addressed. In fact, the whole issue of dynamically loaded shared libraries raises a bunch of interesting problems - are global static variables unique?, etc. . The concept can be very, very powerful in creating large extendable applications, but the C++ language is silent on the whole issue. It's a serious problem but too unpleasant to actually address.
Yes, that was my paper. I was hoping to update it and present it again to the committee next year. Things do not go smoothly with shared libraries in the EWG, so for this year I've changed the approach and made a proposal on dynamic loading (based on Boost.DLL) for LEWG. <...>
g) Actually it's unclear to me how the -fvisibility switch enters into this. VC doesn't have it. I don't think it's actually necessary and I'm not sure how it alters the functioning of the visibility. I know, read the rules. But like a lot of stuff in C++ the understanding provided by the baroque rules evaporates the minute one has to investigate the next ball of yarn. It's just not possible to understand all of C++ at one time.
You may assume that VC has it enabled by default and there's no way to disable it. There are actually many differences in border cases, but trivial cases work in the same way.
On 8/17/2018 11:14 AM, Robert Ramey via Boost wrote:
On 8/16/18 11:21 PM, Antony Polukhin via Boost wrote:
This is a worthy idea. Having a shared library/dll export only those symbols which are required is a big improvement for the user.
But, there's much more to it than meets the eye. I implemented this for the serialization library as it was a huge PITA.
a) The syntax for marking up exported functions is different for microsoft and other compilers.
b) when a library is built on a lot of nested/inherited classes it becomes non-obvious what to export.
c) when a library depends upon another library, it can become even more problematic.
d) when doing stuff like running gcc compiler creating shared library under windows things can get confused.
e) Our testing matrix doesn't display or maintain different results base on whether a test is being run with shared or static linking. So the value of the test results is much diminished. There are so many combinations that one must really depend upon the test results but these don't have the required information.
f) There was a paper for the standardization committee which included a proposal to help addressing this stuff - but it failed to gain enough interest to get the problem addressed. In fact, the whole issue of dynamically loaded shared libraries raises a bunch of interesting problems - are global static variables unique?, etc. . The concept can be very, very powerful in creating large extendable applications, but the C++ language is silent on the whole issue. It's a serious problem but too unpleasant to actually address.
I do not see any difference, vis-a-vis the visibility problem discussed, between dynamically loaded shared libraries or statically loaded shared libraries. I agree with you that the problem of visibility is often more complicated than what is supposed, given different compilers having their own rules regarding visibility on different platforms.
On 8/18/18 6:58 AM, Edward Diener via Boost wrote:
I do not see any difference, vis-a-vis the visibility problem discussed, between dynamically loaded shared libraries or statically loaded shared libraries.
visibility isn't really an issue with static libraries. visibility decreases the number of externally visible symbols. In linking a static library this might decrease linking times - but I haven't noticed it and I never received complaints about it. I think that the BOOST visibility macros are defined to nothing for static builds. But for shared libraries it's a whole 'nuther issue. The huge number of symbols in libraries can make the shared libraries much, much larger and slow down dynamic linking time considerably. This is why this is a worthy project. From personal experience in implementing this for the serialization library (admittedly a more complex example), anyone who embarks upon this will be disappointed at the amount of time it ends up consuming. Robert Ramey
On 08/18/18 18:19, Robert Ramey via Boost wrote:
On 8/18/18 6:58 AM, Edward Diener via Boost wrote:
I do not see any difference, vis-a-vis the visibility problem discussed, between dynamically loaded shared libraries or statically loaded shared libraries.
visibility isn't really an issue with static libraries. visibility decreases the number of externally visible symbols. In linking a static library this might decrease linking times - but I haven't noticed it and I never received complaints about it.
Visibility in static libraries doesn't affect linking times (i.e. linking as the building stage, not the binary loading stage) because hidden symbols are used in symbol resolution. But visibility markup is still important in static libraries because it is preserved when the final executable is produced. So, for example, if you mark a type with default visibility and define it in a static library, you can link that library into a shared library and the type would still be default-visible*. This is useful when you want to ensure that a certain type, like an exception, is public regardless of how it is compiled into the final executable. * I'm ignoring here linker scripts, which can further hide symbols on the linking stage. AFAIK, we don't use linker scripts in Boost, but some applications do.
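To illustrate the point about keeping a type public regardless of how it is compiled, here is a minimal sketch of an exception class that carries default visibility on the type itself. `MYLIB_VISIBLE`, `mylib::parse_error` and `mylib::parse_digit` are hypothetical names; the macro is only a stand-in for what BOOST_SYMBOL_VISIBLE provides on GCC-like compilers.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for BOOST_SYMBOL_VISIBLE.
#if defined(__GNUC__) && !defined(_WIN32)
#  define MYLIB_VISIBLE __attribute__((visibility("default")))
#else
#  define MYLIB_VISIBLE
#endif

namespace mylib {

// The attribute sits on the type, so its typeinfo and vtable stay
// default-visible even when the translation unit is compiled with
// -fvisibility=hidden and later linked into some other shared library.
class MYLIB_VISIBLE parse_error : public std::runtime_error {
public:
    explicit parse_error(const std::string& what) : std::runtime_error(what) {}
};

// Throws parse_error for anything that is not a decimal digit.
int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw parse_error(std::string("not a digit: ") + c);
    }
    return c - '0';
}

}  // namespace mylib
```

The intent is that a `catch (const mylib::parse_error&)` in another module can still match the thrown object, which on GCC-like toolchains depends on the type information not being hidden.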
I think that the BOOST visibility macros are defined to nothing for static builds.
No, BOOST_SYMBOL* macros are defined the same way regardless of the build.
But for shared libraries it's a whole 'nuther issue. The huge number of symbols in libraries can make the shared libraries much, much larger and slow down dynamic linking time considerably. This is why this is a worthy project.
More importantly, hiding internal symbols ensures there won't be a symbol clash. This is also a security matter.
From personal experience in implementing this for the serialization library (admittedly a more complex example), anyone who embarks upon this will be disappointed at the amount of time it ends up consuming.
I suspect there may be a lot of specifics in Boost.Serialization. However, most of the time supporting hidden visibility is quite straightforward and goes in line with supporting dllexport/dllimport on Windows.
On 8/18/18 10:26 AM, Andrey Semashev via Boost wrote:
So, for example, if you mark a type with default visibility and define it in a static library, you can link that library into a shared library and the type would still be default-visible*. This is useful when you want to ensure that a certain type, like an exception, is public regardless of how it is compiled into the final executable.
Hmmm - don't you run into problems linking static libraries into shared libraries. I'm pretty sure that this is a problem on VC compilers as there are different versions of C runtime - one for dynamic linking and another for static linking. It's all very confusing to me.
On Sat, Aug 18, 2018 at 10:54 PM Robert Ramey via Boost
On 8/18/18 10:26 AM, Andrey Semashev via Boost wrote:
So, for example, if you mark a type with default visibility and define it in a static library, you can link that library into a shared library and the type would still be default-visible*. This is useful when you want to ensure that a certain type, like an exception, is public regardless of how it is compiled into the final executable.
Hmmm - don't you run into problems linking static libraries into shared libraries.
There is no problem linking static libraries into shared libraries, I do it every day.
I'm pretty sure that this is a problem on VC compilers as there are different versions of C runtime - one for dynamic linking and another for static linking. It's all very confusing to me.
MSVC issues with its runtime incompatibilities are fairly specific to MSVC. And those issues are not related to static libraries per se since the same problems can occur with shared libraries. Basically, you should always link with the shared runtime, unless you dead sure know what you're doing and really have to link with the static one. I'm not even sure they still ship the static runtime.
On 19/08/2018 08:17, Andrey Semashev wrote:
There is no problem linking static libraries into shared libraries, I do it every day.
This is not entirely true -- while you can do that, and it can work, it comes with a very big caveat that will trap the unwary.

Let's say you have a library A and another library B, and an executable C.

Compile everything as shared, now C will dynamically load B and A, and everything will Just Work™.

Compile everything as static, now C will statically contain B and A, and everything will Just Work™.

Compile A as static and B as shared. Now things *might* work, or you might have a problem -- it depends on how the libraries are actually used; in particular whether A is used by B or C or both.

If it's used only by one or the other (and in particular is *not* exposed in the public API of B at all), then everything is fine.

If B does a poor job of hiding its use of A (ie. it's included in public header files, even if not part of public API), and if C uses A as well, you now have an ODR problem. And the problem is worse if A's objects are used in the public API of B.

The key thing to realise is that "A as statically linked into B" and "A as statically linked into C" are technically separate copies of the library. And passing objects around between separate copies of the same library is highly perilous -- a lot of the time you can get away with it (provided that the same compiler settings were used in both cases, so you don't get different memory layouts) -- but some things will be "wrong" such as separate copies of static variables and the like, and this can cause misbehaviour.

This is why you're always encouraged to link to the runtime library as shared -- because since it's a dependency of every other library, the only time it's safe to use it statically is if you don't use *any* shared libraries at all. Otherwise you end up with multiple copies of the runtime, and thus multiple separate heaps, and hilarity ensues.
Having multiple copies of other libraries is perhaps less dramatic than that, and you can make it "safe" if you're very careful about segregation, but it's a highly effective source of potential bugs.
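The "separate copies of static variables" effect is hard to show without actually building two binaries, so the following is only a single-file simulation of it. The template parameter stands in for "which binary this copy of A was linked into"; `LibraryA`, `SharedLibB`, `ExecutableC`, `id_from_b` and `id_from_c` are hypothetical names, and a real duplicated static library would of course not be a template.

```cpp
// Each instantiation models one physical copy of "library A".
template <class LinkedInto>
struct LibraryA {
    // A function-local static: one instance per copy of the library.
    static int next_id() {
        static int counter = 0;
        return ++counter;
    }
};

struct SharedLibB {};   // tag: the copy of A statically linked into B
struct ExecutableC {};  // tag: the copy of A statically linked into C

// B and C each believe they are talking to "the" library A.
int id_from_b() { return LibraryA<SharedLibB>::next_id(); }
int id_from_c() { return LibraryA<ExecutableC>::next_id(); }
```

With a single shared copy of A, three calls in total would be expected to yield the ids 1, 2 and 3. In this simulation two calls through B yield 1 and 2, and the first call through C starts again at 1, which is the kind of silent state duplication described above.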
On Tue, 21 Aug 2018 at 02:59, Gavin Lambert via Boost
On 19/08/2018 08:17, Andrey Semashev wrote:
There is no problem linking static libraries into shared libraries, I do it every day.
This is not entirely true -- while you can do that, and it can work, it comes with a very big caveat that will trap the unwary.
I'm happy you are commenting on this problem again.
Let's say you have a library A and another library B, and an executable C.
Compile everything as shared, now C will dynamically load B and A, and everything will Just Work™.
Compile everything as static, now C will statically contain B and A, and everything will Just Work™.
Compile A as static and B as shared. Now things *might* work, or you might have a problem -- it depends on how the libraries are actually used; in particular whether A is used by B or C or both.
If it's used only by one or the other (and in particular is *not* exposed in the public API of B at all), then everything is fine.
If B does a poor job of hiding its use of A (ie. it's included in public header files, even if not part of public API), and if C uses A as well, you now have an ODR problem. And the problem is worse if A's objects are used in the public API of B.
The key thing to realise is that "A as statically linked into B" and "A as statically linked into C" are technically separate copies of the library. And passing objects around between separate copies of the same library is highly perilous -- a lot of the time you can get away with it (provided that the same compiler settings were used in both cases, so you don't get different memory layouts) -- but some things will be "wrong" such as separate copies of static variables and the like, and this can cause misbehaviour.
This is why you're always encouraged to link to the runtime library as shared -- because since it's a dependency of every other library, the only time it's safe to use it statically is if you don't use *any* shared libraries at all. Otherwise you end up with multiple copies of the runtime, and thus multiple separate heaps, and hilarity ensues.
This is, I think, not what happens in practice. In vcpkg, for example, creation of a static library will by default also imply that that library is statically linked to the run-time. I also do this when building Boost, while you are actually saying that that option/possibility should not even exist, as I presume that the combination of a dynamic library statically linked to the run-time crt makes even less sense.
Having multiple copies of other libraries is perhaps less dramatic than that, and you can make it "safe" if you're very careful about segregation, but it's a highly effective source of potential bugs.
You can only find out that you weren't careful when you are finding it out, which might be a case of Seems To Work™, until it doesn't, as it's a full moon.
degski
-- *“If something cannot go on forever, it will stop" - Herbert Stein* *“No, it isn’t truth. Truth isn’t truth" - Rudolph W. L. Giuliani*
On 21/08/2018 15:35, degski wrote:
This is why you're always encouraged to link to the runtime library as shared -- because since it's a dependency of every other library, the only time it's safe to use it statically is if you don't use *any* shared libraries at all. Otherwise you end up with multiple copies of the runtime, and thus multiple separate heaps, and hilarity ensues.
This is I think not what happens in practice, in vcpkg f.e. creation of a static library will by default also imply that that library is statically linked to the run-time. I also do this when building Boost, while you are actually saying that that option/possibility should not even exist as I presume that the combination of a dynamic library statically linked to the run-time crt makes even less sense.
I've never used vcpkg, so I can't really speak to its choices, but whenever you create a new project in VS (regardless if it is an executable, static library, or dynamic library), it will select the dynamic CRT by default, because this is the safest choice in all cases.

And a cursory glance at the vcpkg repository shows that this script: https://github.com/Microsoft/vcpkg/blob/961cd9effd9a5230f211875bb0e9a6773e0e... attempts to prevent you compiling a dynamic library while linking to the static CRT. (I don't know how well it is actually applied; perhaps I'm misreading something.)

As I said before, there is no problem with linking to the static CRT if you are *only* using static libraries.

There is also less problem if you are only using DLLs that practice strict memory segregation (eg. using an agreed memory allocator for all exchanged objects, such as the COM or Shell allocators), or ensuring that objects are only ever allocated or deallocated on exactly one "side".

It's a lot harder (though not impossible, with very careful use of smart pointers and allocators) to achieve this segregation with C++ libraries.

(Having said that, this only discusses one aspect of library duplication, that of separate heaps. There can be other issues such as duplicate singletons and other statics; these can also ruin your day, but are more library-specific.)

Other "shared" CRT state (locale, exceptions, standard I/O streams, etc) also won't actually be shared if you are using multiple static CRTs.
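One common way to keep allocation and deallocation on exactly one "side" is to have the library expose a create/destroy pair and let the consumer wrap it in a smart pointer with a custom deleter. The sketch below shows that idea in a single file; `widgetlib`, `Widget`, `widget_create`, `widget_destroy`, `live_widgets` and `make_widget` are hypothetical names, not part of any library mentioned in this thread.

```cpp
#include <memory>
#include <string>

// --- what would live inside the shared library ---
namespace widgetlib {

struct Widget {
    std::string name;
};

int live_widgets = 0;  // bookkeeping for the example only

// Allocation happens inside the library, on the library's heap.
Widget* widget_create(const char* name) {
    ++live_widgets;
    return new Widget{std::string(name)};
}

// Deallocation happens inside the library too, so the matching
// allocator is always the one that releases the memory.
void widget_destroy(Widget* w) noexcept {
    if (w) {
        --live_widgets;
        delete w;
    }
}

}  // namespace widgetlib

// --- what would live in the consumer ---
struct WidgetDeleter {
    void operator()(widgetlib::Widget* w) const noexcept {
        widgetlib::widget_destroy(w);  // never a plain `delete` on this side
    }
};

using WidgetPtr = std::unique_ptr<widgetlib::Widget, WidgetDeleter>;

WidgetPtr make_widget(const char* name) {
    return WidgetPtr(widgetlib::widget_create(name));
}
```

The consumer never calls `delete` on a `Widget` itself, so it should not matter which CRT heap the consumer uses, as long as the object's layout is agreed on by both sides.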
On Tue, 21 Aug 2018 at 07:45, Gavin Lambert via Boost
I've never used vcpkg, so I can't really speak to its choices, but whenever you create a new project in VS (regardless if it is an executable, static library, or dynamic library), it will select the dynamic CRT by default, because this is the safest choice in all cases.
And a cursory glance at the vcpkg repository shows that this script:
https://github.com/Microsoft/vcpkg/blob/961cd9effd9a5230f211875bb0e9a6773e0e...
attempts to prevent you compiling a dynamic library while linking to the static CRT. (I don't know how well it is actually applied; perhaps I'm misreading something.)
Yes, that's correct, by default a dynamic library links to the dynamic crt and a static library will be statically linked to the crt. There are some cases/packages where this is overridden however, where the dynamic crt linking is enforced.
As I said before, there is no problem with linking to the static CRT if you are *only* using static libraries.
I re-read and see that now, this is what I always try to do, which precludes using intel-tbb, which is a bit of a bummer.
There is also less problem if you are only using DLLs that practice strict memory segregation (eg. using an agreed memory allocator for all exchanged objects, such as the COM or Shell allocators), or ensuring that objects are only ever allocated or deallocated on exactly one "side".
It's a lot harder (though not impossible, with very careful use of smart pointers and allocators) to achieve this segregation with C++ libraries.
(Having said that, this only discusses one aspect of library duplication, that of separate heaps. There can be other issues such as duplicate singletons and other statics; these can also ruin your day, but are more library-specific.)
Other "shared" CRT state (locale, exceptions, standard I/O streams, etc) also won't actually be shared if you are using multiple static CRTs.
I will make some notes of all this, because it seems highly relevant, not widely discussed and frankly quite a mine-field regardless of whether one opts to be fully dynamic or fully static. You are saying there are quite a lot of caveats either way, what is your basic advice, dll's or static linking (static or dynamic crt?)? I would like to try and get to some rule-set, guidance principle (if even just for myself)
degski
On 21/08/2018 17:02, degski wrote:
I will make some notes of all this, because it seems highly relevant, not widely discussed and frankly quite a mine-field regardless of whether one opts to be fully dynamic or fully static.
Yep. I don't speak in any official capacity, just as one who has experienced biJ.
You are saying there are quite a lot of caveats either way, what is your basic advice, dll's or static linking (static or dynamic crt?)? I would like to try and get to some rule-set, guidance principle (if even just for myself)
As a library creator, you don't really get to choose (unless you want to force the user's choice by only supporting dynamic linking).

I don't think it is a safe choice for a C++ library to only support static linking, unless it can make certain guarantees about its consumers (eg. that there will only be one, or that its multiple consumers will only be statically linked themselves). (Related: Boost.Exception makes me nervous in that regard, though I haven't looked into it.) I don't mean that it's always unsafe, just that it's a lot harder, and thus easier to accidentally mess it up.

As an application creator, you have a bit more free rein to choose whether to go all-static or all-dynamic (unless you're forced to use a dynamic library).

Note that licenses can also force your hand; for example you can only use an LGPL library if you dynamically link to it or if you use GPL or LGPL yourself.

Using dynamic libraries is nice because they're more modular, and at least in the case of an actually shared library can reduce system memory usage. They can also aid patch deployment if you know you only need to replace a subset of files, or for installing optional plugins. And they usually Just Work™.

But they also increase the security attack surface of your application, both due to exposing symbol names and addresses of corresponding code, and because it's usually trivial to impersonate an external library.

Static libraries also have a possible advantage of compiling everything into a single binary, which might make it easier to create a portable application or one that otherwise doesn't require installation.

In ye olde bog standard user application I personally tend to default to using shared libraries for everything (notably, also using BOOST_ALL_DYN_LINK to help enforce this). However it might not be the best choice when interfacing with a library like OpenSSL, for example.
On Tue, 21 Aug 2018 at 09:23, Gavin Lambert via Boost
On 21/08/2018 17:02, degski wrote:
I will make some notes of all this, because it seems highly relevant, not widely discussed and frankly quite a mine-field regardless of whether one opts to be fully dynamic or fully static.
Yep. I don't speak in any official capacity, just as one who has experienced biJ.
Thank you for that write-up, I'll print and hang in bath-room :-). degski -- *“If something cannot go on forever, it will stop" - Herbert Stein* *“No, it isn’t truth. Truth isn’t truth" - Rudolph W. L. Giuliani*
On Tue, Aug 21, 2018 at 8:22 AM, Gavin Lambert via Boost
As an application creator, you have a bit more free rein to choose whether to go all-static or all-dynamic (unless you're forced to use a dynamic library).
Note that licenses can also force your hand; for example you can only use an LGPL library if you dynamically link to it or if you use GPL or LGPL yourself.
Using dynamic libraries is nice because they're more modular, and at least in the case of an actually shared library can reduce system memory usage. They can also aid patch deployment if you know you only need to replace a subset of files, or for installing optional plugins. And they usually Just Work™.
But they also increase the security attack surface of your application, both due to exposing symbol names and addresses of corresponding code, and because it's usually trivial to impersonate an external library.
Static libraries also have a possible advantage of compiling everything into a single binary, which might make it easier to create a portable application or one that otherwise doesn't require installation.
IMO dynamic libs make sense if they're managed by the system; if you have to distribute / install them yourself, chances are they're not going to be shared anyway.
Full static, or something that's not easily possible today (AFAIK), building the library files directly as part of your project, ensures you always link to the same code, and in the latter case ensures the lib is built with exactly the same settings as your project. The latter is like header-only libs.
Static CRT is mostly (only?) an issue on Windows, isn't it? If only MS / Windows itself would take care of installing it.
-- Olaf
On Tue, 21 Aug 2018 at 10:04, Olaf van der Spek via Boost < boost@lists.boost.org> wrote:
Static CRT is mostly (only?) an issue on Windows isn't it? If only MS / Windows itself would take care of installing it.
I lost you here. MSVC installs it on the host machine; on a target machine you won't need it. A "normal" Windows machine only has the dynamic CRT; a development machine has both.
degski
On Tue, Aug 21, 2018 at 9:48 AM, degski
On Tue, 21 Aug 2018 at 10:04, Olaf van der Spek via Boost
wrote: Static CRT is mostly (only?) an issue on Windows isn't it? If only MS / Windows itself would take care of installing it.
I lost you here. MSVC installs it on the host-machine, on a target machine you won't need it.
A "normal" Windows machine only has the dynamic CRT, a
That CRT isn't there by default is it? That's the (only?) reason one opts for the static CRT. -- Olaf
On Tue, 21 Aug 2018 at 10:50, Olaf van der Spek
A "normal" Windows machine only has the dynamic CRT, a
That CRT isn't there by default is it?
Maybe we have different interpretations of what I meant by "normal" in this context. By "normal" I mean the man in the street's Windows, a non-development machine. It comes with the DLLs of the dynamic CRT (i.e. without the link stubs) that was used to build that Windows version; otherwise it couldn't work. Then there is something called SxS, which allows various dynamic CRTs to co-exist on the machine. This is necessary iff one wants to run apps that were built with CRTs different from the one that came with the system. Yes, if that is what you mean by "is not installed", you are right: as an app distributor you'll have to make sure somehow (by asking the user to install it, by downloading and installing it on the fly, or by including it in your installer) that the target machine has the required run-time.
That's the (only?) reason one opts for the static CRT.
Yes, that takes away that bother. Additionally, this allows for a "portable" installation. I have a little utility I still use once in a while; it was statically built in 1995, still works very well, and is obviously much faster than when it was released originally :-).
degski
On 21/08/2018 19:04, Olaf van der Spek wrote:
IMO dynamic libs make sense if they're managed by the system, if you have to distribute / install them yourself chances are they're not going to be shared anyway.
If your application consists of multiple separate executable binaries, then you might still want to distribute a library that they all share dynamically, even if it's private. And it is still possible to install libraries system-wide, although that tends to be more frowned on nowadays due to long years of people doing it wrong and installing something that breaks other applications.
Full static, or something that's not easily possible today (AFAIK), building the library files directly as part of your project, ensures you always link to the same code and in the latter case ensures the lib is build with exactly the same settings as your project. The latter is like header-only libs..
Yes, although the latter often brings its own headaches with the build environment (include paths and defined symbols, mostly), which is probably why it's not done as often. And also people often want to be able to compile a library once and then just use it rather than having to recompile it periodically.
Static CRT is mostly (only?) an issue on Windows isn't it? If only MS / Windows itself would take care of installing it.
I think you have some wires crossed.
The static CRT is compiled into an application, so by definition doesn't require installation at all. And the dynamic CRT is already installed as part of Windows -- every Windows version always includes all the previously-released CRTs bundled into it.
But until they invent a handy time machine, they can't bundle the runtimes that are released after the OS is released. Sometimes they get included in Windows Update, sometimes not (traditionally only security fixes were included, not general feature releases).
But in general if you're releasing some software that uses a runtime newer than the oldest version of Windows you're expecting it to run on, then you need to include the runtime in your installer, just in case. Or use the static CRT instead.
This is not a Windows-only issue either -- it's just more common to use the latest Windows dev tools than it is to use the latest gcc/clang/stdlib on Linux, since the latter require complex building from source. And then if you do that you would again either have to use the static runtime or distribute a dynamic runtime along with your binary if you expect it to be able to run on a system that only has the older runtime installed. (Again, Linux seems less prone to this in general only due to the culture of compiling from source rather than distributing binaries, unless using binaries precompiled and distributed by the system vendor.)
This is getting a bit off-topic anyway.
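To make the static-versus-dynamic CRT distinction above concrete, here is a minimal sketch of how a translation unit can tell which runtime flavour it was compiled against. It relies on the `_MSC_VER` and `_DLL` macros that MSVC predefines (`_DLL` is set for /MD and /MDd). The function name `crt_linkage` and the opt-in macro `MYAPP_REQUIRE_DYNAMIC_CRT` are made up for illustration and are not part of any library.

```cpp
#include <string>

// Hypothetical project policy switch: if the build defines
// MYAPP_REQUIRE_DYNAMIC_CRT, refuse to compile against the static CRT.
#if defined(_MSC_VER) && defined(MYAPP_REQUIRE_DYNAMIC_CRT) && !defined(_DLL)
#  error "This project expects /MD or /MDd (dynamic CRT)."
#endif

// Reports which C runtime flavour this translation unit was compiled for.
// On toolchains other than MSVC the /MD versus /MT distinction does not
// apply, so the function only says so.
std::string crt_linkage() {
#if defined(_MSC_VER)
#  if defined(_DLL)
    return "dynamic CRT (/MD or /MDd)";
#  else
    return "static CRT (/MT or /MTd)";
#  endif
#else
    return "not MSVC";
#endif
}
```

A check like this only describes the current translation unit; it cannot tell whether the matching runtime DLLs are present on the target machine, which is the deployment question discussed above.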
On 08/21/2018 05:35 PM, Gavin Lambert via Boost wrote:
This is getting a bit off-topic anyway.
All of these technical issues requiring so much effort and expertise now and for the foreseeable future. I would prefer to at least have the choice of pure header only for all libs, because I don't care how long my compile takes.
On Wed, 22 Aug 2018 at 22:22, Jeffrey Graham via Boost < boost@lists.boost.org> wrote:
All of these technical issues requiring so much effort and expertise now and for the foreseeable future. I would prefer to at least have the choice of pure header only for all libs, because I don't care how long my compile takes.
From what I remember, this was tried for Boost.Random not so long ago, which is a limited problem, and it was considered a failure (unless I'm misinterpreting). I guess that some other libs would equally fail if attempted. Not all libs are under active maintenance, so that is also an issue, as in: who is gonna do it? It would also create an unnecessary risk of introducing bugs in the process. I don't mind compile times that much either, but making everything header-only seems, even to me, a radical, a pipe-dream.
If you are on Windows, I suggest you adopt vcpkg https://github.com/Microsoft/vcpkg; they provide boost-1.68 (and many other libs) without hassle. If you are on *nix, I believe there is a plethora of package managers around; Conan or Chocolatey come highly recommended, from what I read.
degski
On 8/18/2018 11:19 AM, Robert Ramey via Boost wrote:
On 8/18/18 6:58 AM, Edward Diener via Boost wrote:
I do not see any difference, vis-a-vis the visibility problem discussed, between dynamically loaded shared libraries or statically loaded shared libraries.
visibility isn't really an issue with static libraries. visibility decreases the number of externally visible symbols. In linking a static library this might decrease linking times - but I haven't noticed it and I never received complaints about it. I think that the BOOST visibility macros are defined to nothing for static builds.
I was not referring to static libraries. You used the phrase "dynamically loaded shared libraries". Did you just mean "shared libraries" as opposed to "static libraries" ? Whether a shared library is dynamically loaded or statically loaded it is still a shared library. My point is that visibility is not affected by whether a shared library is dynamically loaded or statically loaded.
On 8/18/18 10:46 AM, Edward Diener via Boost wrote:
You used the phrase "dynamically loaded shared libraries". Did you just mean "shared libraries" as opposed to "static libraries" ? Whether a shared library is dynamically loaded or statically loaded it is still a shared library. My point is that visibility is not affected by whether a shared library is dynamically loaded or statically loaded.
To me, all shared libraries can be considered dynamically loaded. If not done explicitly they are loaded "automatically" at start up or perhaps on first use. The situation gets very interesting when they have static variables in them. There's a lot going on here. Then there are issues related to contracts and modules. And when one shared library refers to another. etc. ... Robert Ramey
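As a rough illustration of the point about visibility macros being defined to nothing for static builds, here is a sketch of the usual export-macro pattern. All names (`MYLIB_API`, `MYLIB_STATIC`, `MYLIB_BUILDING`, `mylib`) are hypothetical; Boost.Config's `BOOST_SYMBOL_EXPORT` and `BOOST_SYMBOL_IMPORT` serve a similar purpose, but this is not their implementation.

```cpp
// Assumption: the build system defines MYLIB_BUILDING while compiling the
// library itself, and MYLIB_STATIC for static builds. It is defined here
// only so that the sketch compiles as a single file.
#define MYLIB_BUILDING 1

#if defined(MYLIB_STATIC)
#  define MYLIB_API                                   // static build: nothing to mark
#elif defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)           // building the DLL
#  else
#    define MYLIB_API __declspec(dllimport)           // consuming the DLL
#  endif
#elif defined(__GNUC__)
#  define MYLIB_API __attribute__((visibility("default")))  // stays visible under -fvisibility=hidden
#else
#  define MYLIB_API
#endif

namespace mylib {
// Public entry point: explicitly marked, so it remains exported.
MYLIB_API int answer();

namespace detail {
// Implementation detail: unmarked, so -fvisibility=hidden keeps it out of
// the shared library's dynamic symbol table.
int helper();
}  // namespace detail
}  // namespace mylib

int mylib::detail::helper() { return 41; }
int mylib::answer() { return detail::helper() + 1; }
```

The marking is the same whether the shared library is loaded at start-up or explicitly at run time; only the static build changes what the macro expands to.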
On 17.08.18 08:21, Antony Polukhin via Boost wrote:
I'd like to draw attention to a problem with Boost binaries for Linux.
There's an awesome -fvisibility=hidden flag for Linux compilers that improves load times, performance, size of binaries and reduces the chance of symbol collisions. More info at https://gcc.gnu.org/wiki/Visibility .
Unfortunately, most of the Boost libraries do not set it by default: - atomic - chrono - container - context - contract - coroutine - date_time - exception - fiber - filesystem - graph - graph_parallel - iostreams - locale - mpi - program_options - python - random - regex - signals - system - test - thread - timer - type_erasure - wave
Moreover minority of the above libraries just do not work with the flag. Users just can not run ./b2 cxxflags="-fvisibility=hidden" because there's a chance that some library could stop linking. Actually things are even more ugly. Linux distributions usually do not tune the build flags for each package so at least Debian based distributions build Boost with default flags. Users get suboptimal builds.
If you're a maintainer of one of the above libraries *please do* the following steps:
* Make sure that all the public symbols are accordingly marked with appropriate BOOST_SYMBOL_* macro. Instruction is available here: https://www.boost.org/doc/libs/1_68_0/libs/config/doc/html/boost_config/boos...
* Turn on the visibility=hidden by default:
  * by adding <target-os>linux:<cxxflags>"-fvisibility=hidden" to the Jamfile if you do not care much for antique compilers (Example https://github.com/boostorg/stacktrace/blob/819f2b1c861dec7530372a990ecabab7... )
  * by using a more advanced technique for detecting the flag availability (For example see https://github.com/boostorg/math/blob/develop/build/Jamfile.v2#L20 or https://github.com/boostorg/log/blob/develop/build/Jamfile.v2#L24 )
P.S.: I would appreciate any comments or updates on the feature request. P.P.S.: Log, Math, Serialization (and Stacktrace) libraries already use that flag by default. Many thanks!
Hi all,
I personally was unable to reduce the chances of symbol clashes with the -fvisibility=hidden flag alone: I had to combine it with a map file { global: mexFunction; local: *; }; and the option "-Wl,--exclude-libs,ALL" to achieve some sense of isolation.
On Linux, all symbols from all shared libraries are merged into one unique namespace for a given process, whether they come from a directly loaded shared library or an indirect dependency loaded later (shared or static if the shared lib links to static libs).
If we consider only the issue of symbol clashes, in my experience hiding symbols does not help much, as some symbols are still pulled in and merged with little to no control over it. I found that rather counter-intuitive, as one may think that hiding symbols gives the same namespace isolation for shared libraries as we have on Windows, while that is not the case at all. In case of a symbol clash, we still have a hard time debugging.
Also, at the time I had all those issues, it seemed that all libraries were compiled with weak symbol definitions, which made the problem even more difficult to debug.
I've found this "STV_PROTECTED" attribute on the visibility (see https://www.ibm.com/developerworks/aix/library/au-aix-symbol-visibility/inde...) but I do not know how to use it properly.
All in all,
- Linux is a nightmare for symbol clashes
- I'd love to learn a good way of doing things to avoid clashes, please educate me :D
- I believe this terrain of discussion is in the grey area of C++, and left so far to package managers
Raffi
PS: The context: I was creating a .mex file for Matlab (from https://github.com/MPI-IS/Grassmann-Averages-PCA). The C++ shared library (the .mex) is using boost.thread, and the problem was that Matlab has its own version of boost.thread. If I remember correctly, even with the same version of boost as the one shipped with Matlab, I was having issues (while technically, all the symbols that were merged should have been equivalent).
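The combination described above (hidden visibility, a version script exporting one entry point, and --exclude-libs) might look roughly like the sketch below for a plugin-style shared library. The names `plugin_entry`, `PLUGIN_EXPORT`, `plugin.cpp` and `exports.map` are placeholders; `plugin_entry` stands in for an entry point such as mexFunction. Whether this is enough to avoid a clash like the Matlab one is not established here.

```cpp
// Assumed build line (GCC and GNU ld; file names are placeholders):
//   g++ -shared -fPIC -fvisibility=hidden \
//       -Wl,--version-script=exports.map -Wl,--exclude-libs,ALL \
//       plugin.cpp -o plugin.so
// Assumed contents of exports.map, mirroring the map file quoted above:
//   { global: plugin_entry; local: *; };
// The result can be inspected with: nm -D --defined-only plugin.so

#if defined(__GNUC__)
#  define PLUGIN_EXPORT __attribute__((visibility("default")))
#else
#  define PLUGIN_EXPORT
#endif

namespace {
// Internal linkage: this helper never reaches the dynamic symbol table,
// regardless of visibility flags.
int scale(int x) { return 2 * x; }
}  // namespace

// The single symbol meant to be visible to the host process.
extern "C" PLUGIN_EXPORT int plugin_entry(int x) { return scale(x) + 1; }
```

The version script and --exclude-libs act at link time on top of the compiler flag, which is why the flag alone leaves symbols from statically linked dependencies exported.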
participants (10)
- Andrey Semashev
- Antony Polukhin
- degski
- Edward Diener
- Gavin Lambert
- James E. King III
- Jeffrey Graham
- Olaf van der Spek
- Raffi Enficiaud
- Robert Ramey