linking to boost thread exposes static initialisation order fiasco in ublas?
Hi
This is a weird issue, and I'm by no means sure that my description is
correct. It might be a newbie misunderstanding, but it is weird
behaviour and I would appreciate any help in getting to the bottom of
it.
Here is a minimal example that exhibits the issue ("zeromain.cpp"):

    #include
    #include
    int main (int argc, char * argv[])
    {
        boost::numeric::ublas::coordinate_matrix<double> cm;
        cm.resize(3,3, false);
        return 0;
    }

When compiled [with bjam release] and linked to libboost_thread it gives a "bus error". The backtrace is appended to this email. It works fine in debug mode.

You can see that I have introduced boost.thread but not yet actually used the thread library in any way. If I take out either the include, or don't link to the lib, the bug goes away.

Thanks in advance for any help or suggestions! Details below.

David

Mac OS X 10.5.5, boost 1.36. Intel CPU. The boost libraries are in $DYLD_LIBRARY_PATH.

Jamroot contains

    exe zm : zeromain.cpp /sage//boost_thread : <include>boost ;

("sage" is a separate project whose Jamroot contains

    lib boost_thread : : <file>local/lib/boost/libboost_thread-xgcc40-mt-1_36.a ;

It is a regular build of the boost libraries.)

Backtrace:

    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
    0x00001f24 in boost::numeric::ublas::coordinate_matrix , 0ul, boost::numeric::ublas::unbounded_array , boost::numeric::ublas::unbounded_array ::coordinate_matrix ()
    (gdb) bt
    #0 0x00001f24 in boost::numeric::ublas::coordinate_matrix , 0ul, boost::numeric::ublas::unbounded_array , boost::numeric::ublas::unbounded_array ::coordinate_matrix ()
    #1 0x8fe12e76 in __dyld__ZN16ImageLoaderMachO18doModInitFunctionsERKN11ImageLoader11LinkContextE ()
    #2 0x8fe0e723 in __dyld__ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEj ()
    #3 0x8fe0e809 in __dyld__ZN11ImageLoader15runInitializersERKNS_11LinkContextE ()
    #4 0x8fe04102 in __dyld__ZN4dyld24initializeMainExecutableEv ()
    #5 0x8fe07b5f in __dyld__ZN4dyld5_mainEPK11mach_headermiPPKcS5_S5_ ()
    #6 0x8fe01872 in __dyld__ZN13dyldbootstrap5startEPK11mach_headeriPPKcl ()
    #7 0x8fe01037 in __dyld__dyld_start ()

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Hi,

this is probably not very helpful, but: It works fine on my machine.

Linux 2.6.24-19-generic x86_64 GNU/Linux
gcc 4.2.3

I linked to boost_thread dynamically:

    /usr/bin/c++ -Wall -Wreorder -Wnon-virtual-dtor -Wno-non-template-friend -Woverloaded-virtual -Wsign-promo -Wextra -fvisibility=hidden -D_GNU_SOURCE -fPIC "CMakeFiles/DateTime.dir/DateTime.o" "CMakeFiles/DateTime.dir/__/generated/revision.o" -o DateTime -rdynamic -L/home/rbock/Software/Sources/BoostTests/trunk/src -L/home/rbock/Software/Binaries/boost/1.36/lib -lboost_date_time-gcc42-mt-1_36 -lboost_thread-gcc42-mt-1_36 -lboost_date_time-gcc42-mt-1_36 -lboost_system-gcc42-mt-1_36 -lpthread -Wl,-rpath,/home/rbock/Software/Sources/BoostTests/trunk/src:/home/rbock/Software/Binaries/boost/1.36/lib

In my previous company, we had some weird problems with the first 4.x versions of gcc, so maybe you should try to upgrade? Of course, I cannot promise that this would help.

If valgrind is available for your system, you might also try to use it to gather more details of what is happening with the memory. The report on my machine is clean, but on your machine something should be shown.

Regards,
Roland
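For readers unfamiliar with the term in the subject line: the backtrace in the original post shows the crash underneath dyld's doModInitFunctions, that is, while initialisers are being run before main() starts. The following is only a generic sketch of that phase and of the usual "construct on first use" workaround. It is not taken from uBLAS or Boost.Thread, it does not claim to reproduce the crash, and every name in it (event_log, Announcer, early, ran_during_static_init) is made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical event log. The function-local static is constructed the first
// time event_log() is called ("construct on first use"), so it is safe to
// call from constructors that run before main().
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor records that it ran.
struct Announcer {
    explicit Announcer(const char* name) { event_log().push_back(name); }
};

// Namespace-scope object: its constructor runs during static initialisation,
// the phase in which the backtrace shows dyld running initialisers.
// If this constructor instead touched a namespace-scope object defined in a
// different translation unit, the relative order of the two initialisations
// would be unspecified; that is the "static initialisation order fiasco".
Announcer early("static-init");

// True if the constructor of `early` has already recorded its entry.
bool ran_during_static_init() {
    return !event_log().empty() && event_log().front() == "static-init";
}
```

Calling ran_during_static_init() as the first statement of main() should report that the constructor has already run on the usual toolchains. Replacing event_log() with a plain global vector defined in another source file is the kind of change that makes the outcome depend on link order.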
Dear Roland

Thanks for trying it out, and thanks also because I assume you glanced at the code and there isn't an obvious mistake!

The problem is the same whether I link statically or dynamically.

> In my previous company, we had some weird problems with the first 4.x
> versions of gcc, so maybe you should try to upgrade? Of course, I
> cannot promise that this would help.

I just used bjam release -d+2 and ran the commands step-by-step myself. It turns out that the /strip/ command exposes the problem. If I take the "debug" executable and run strip on it, it all crashes. That suggests to me that it's not GCC. (OS X 10.5.5 uses GCC 4.0.1, which is not the notorious 4.0.0.)

As I understand it, valgrind is a Linux-only package.

Thanks for your suggestions. I'm convinced that this is a bug, but I don't have anything like the expertise to chase it down.

David

==================================
David J Philp
Postdoctoral Fellow
National Centre for Epidemiology and Population Health
Building 62, cnr Mills Rd & Eggleston Rd
The Australian National University
Canberra ACT 0200 Australia
T: +61 2 6125 8260 F: +61 2 6125 0740 M: 0423 535 397 W: http://nceph.anu.edu.au/
CRICOS Provider #00120C
On Fri, Sep 19, 2008 at 09:08:21AM +1000, David Philp wrote:
> myself. It turns out that the /strip/ command exposes the problem. If
> I take the "debug" executable and run strip on it, it all crashes.

Do you run plain strip, or strip -g? Without the -g switch (or an equivalent on your platform), strip removes _more_ than just debug symbols, which in turn might cause RTTI or other symbol lookups to fail.
Interesting. I just ran "plain strip", as that is what boost.build does. If you are correct, that makes this a bug in boost.build (i.e. boost.build is not calling strip with the correct options).

strip has no -g option on OS X. Unfortunately I don't understand what strip does well enough to figure out what the correct options are. If anyone cares to look at the documentation and send me suggestions, I'm happy to try them out and add them to the report I put in trac. The manpage is here:
http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/stri...

Thank you very much for your time.

D
On Fri, Sep 19, 2008 at 04:20:25PM +1000, David Philp wrote:
> strip has no -g option on OS X. Unfortunately I don't understand what

It seems that -S is the correct option to use on OS X:

    -S  Remove the debugging symbol table entries (those created by the -g option to cc(1) and other compilers).
    -x  Remove all local symbols (saving only global symbols).

Try running first strip -S, and then strip -S -x, and see whether it has any effect on the working of your program. (strip -S -x will remove more than just strip -S.)
I ran both commands on the debug binary. Both run without problems and both produce working binaries. I have added a comment about this to trac.

This seems to be a likely solution to the problem. Thanks everyone for your help.

D
Hi David,

just for completeness, I stripped my binary, too. No problem on my machine, sorry.

Regarding gcc: If upgrading is an option on OS X, personally, I would try it. With language support for C and C++ only, it is not that hard to compile gcc (at least on Linux).

Regarding valgrind: If valgrind is not available for OS X, there have to be other tools to do similar analysis. I read that efence is available, for instance.

Anyway, my suggestion is to get tool support (and try to upgrade).

Regards,
Roland
participants (4)
- David Philp
- Dr. Roland Bock
- Roland Bock
- Zeljko Vrba