Back on May 2, 2016 I made what I thought was an innocuous change to the serialization library build. I then forgot about it. I ran my tests as normal on my OS X setup for the compilers that I use there - clang and g++ 5.1 in various modes (03, 11, etc.). Things seemed fine. I uploaded the develop branch as normal. Some time later I discovered that all the gcc tests on Linux platforms were failing with a segfault. Unfortunately time had passed and I couldn't run the tests myself, so I never could figure out what the breaking change was.

Lately I set aside time to get to the bottom of this and installed Ubuntu 14 via Parallels on my Mac as a guest operating system. After the usual back and forth getting stuff to work I was able to reproduce the problem. Running gdb indicated that things crashed with a call to boost::io::state_saver, which led me to investigate THAT library, which as it turned out is also failing all its tests on the platform. So thought I - no problem - all I have to do is complain to the author of that library. But this didn't work out:

a) I posted a message and no one responded.

b) I checked the trac items and there was what seemed a related item. But it seemed a little hard to understand/fix and, in any case, no one is addressing it. I looked into just fixing it myself, but it looked to entail delving into the bowels of libstdc++. This dissuaded me, as I have my own bowels to delve into.

c) So I commented out references to boost::io::state_saver and ran some tests - and damn, they still failed.

d) So I rolled back my source to the master dated 24 April 2016, then incrementally updated and ran some tests until I found the culprit.

Needless to say, this is quite a tedious procedure. The issue seems to be that I added the following to the jamfile:

    <toolset>gcc:<define>_STLP_DEBUG
    <toolset>gcc:<define>_GLIBCXX_DEBUG

The second adds a lot of checking. Commenting it out permitted the tests to pass.

Lesson of all this is:

a) I'm thinking we should run all tests on the develop branch with _GLIBCXX_DEBUG defined. That is, our jam files should contain?:

    <toolset>gcc:<variant>debug><define>_GLIBCXX_DEBUG

b) It looks like when an error in libstdc++ is encountered, things just segfault. I couldn't find the corresponding source in libstdc++. GDB just bailed when I made a call to state_saver.

c) boost::io::state_saver should be maintained.

d) It would be helpful to be able to roll back the test matrix to any previous git version. That is, the results would be put into a database or datacube and we could scroll back and forth to track down problems in test results.

Robert Ramey
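For reference, item a) above appears to be aiming at a Boost.Build conditional requirement that applies the define only to gcc debug builds. A minimal sketch of how such a requirement is usually spelled follows, assuming the standard comma-separated condition syntax. Placing it in a `project` rule is an assumption made for illustration and is not taken from the serialization library's actual Jamfile.

```
# Hypothetical Jamfile fragment (not the serialization library's real one).
# Adds _GLIBCXX_DEBUG only when the toolset is gcc AND the variant is debug.
project
    : requirements
      <toolset>gcc,<variant>debug:<define>_GLIBCXX_DEBUG
    ;
```

Conditions are joined with commas and separated from the resulting property by a colon. Per this thread, adding this define is what made the gcc tests segfault, so the fragment illustrates the syntax only.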
On 10/30/2016 6:16 PM, Robert Ramey wrote:
> Back on May 2, 2016 I made what I thought was an innocuous change to the serialization library build.
> I then forgot about it. I ran my tests as normal on my OS X setup for the compilers that I use there - clang and g++ 5.1 in various modes (03, 11, etc.). Things seemed fine. I uploaded the develop branch as normal. Some time later I discovered that all the gcc tests on Linux platforms were failing with a segfault. Unfortunately time had passed and I couldn't run the tests myself, so I never could figure out what the breaking change was.
> Lately I set aside time to get to the bottom of this and installed Ubuntu 14 via Parallels on my Mac as a guest operating system. After the usual back and forth getting stuff to work I was able to reproduce the problem. Running gdb indicated that things crashed with a call to boost::io::state_saver, which led me to investigate THAT library, which as it turned out is also failing all its tests on the platform. So thought I - no problem - all I have to do is complain to the author of that library. But this didn't work out:
> a) I posted a message and no one responded.
I responded, although I am not a maintainer of boost::io::state_saver. I have subsequently issued a PR to boost::io::state_saver to fix the test situations where it is failing under gcc and clang. I have no idea if anyone is maintaining the library. I believe that Daryle Walker is the original creator of boost::io::state_saver.
> b) I checked the trac items and there was what seemed a related item. But it seemed a little hard to understand/fix and, in any case, no one is addressing it. I looked into just fixing it myself, but it looked to entail delving into the bowels of libstdc++. This dissuaded me, as I have my own bowels to delve into.
> c) So I commented out references to boost::io::state_saver and ran some tests - and damn, they still failed.
> d) So I rolled back my source to the master dated 24 April 2016, then incrementally updated and ran some tests until I found the culprit.
> Needless to say, this is quite a tedious procedure. The issue seems to be that I added the following to the jamfile:
>     <toolset>gcc:<define>_STLP_DEBUG
>     <toolset>gcc:<define>_GLIBCXX_DEBUG
> The second adds a lot of checking. Commenting it out permitted the tests to pass. Lesson of all this is:
> a) I'm thinking we should run all tests on the develop branch with _GLIBCXX_DEBUG defined. That is, our jam files should contain?:
You did not explain why removing <toolset>gcc:<define>_GLIBCXX_DEBUG caused your tests to pass.
>     <toolset>gcc:<variant>debug><define>_GLIBCXX_DEBUG
> b) It looks like when an error in libstdc++ is encountered, things just segfault. I couldn't find the corresponding source in libstdc++. GDB just bailed when I made a call to state_saver.
> c) boost::io::state_saver should be maintained.
Agreed. If it is not maintained, either somebody needs to volunteer to maintain it or it needs to be added to CMT.
> d) It would be helpful to be able to roll back the test matrix to any previous git version. That is, the results would be put into a database or datacube and we could scroll back and forth to track down problems in test results.
> Robert Ramey
On 10/30/16 3:29 PM, Edward Diener wrote:
>> Lesson of all this is:
>> a) I'm thinking we should run all tests on the develop branch with _GLIBCXX_DEBUG defined. That is, our jam files should contain?:
> You did not explain why removing <toolset>gcc:<define>_GLIBCXX_DEBUG caused your tests to pass.
I don't know why. By binary search I isolated the change at which the tests started to fail. The only change was the addition of the _GLIBCXX_DEBUG switch. So I commented it out and everything started to pass.
Robert Ramey
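The binary search over revisions described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure actually used: `first_failing`, `revisions`, and `passes` are hypothetical names, the oldest revision is assumed to pass, the newest is assumed to fail, and there is assumed to be a single point where the tests go from passing to failing.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost or its test tooling.
// Returns the index of the first revision whose tests fail.
// Preconditions (assumed, not checked):
//   - revisions.size() >= 2, ordered oldest to newest
//   - passes(revisions.front()) is true
//   - passes(revisions.back()) is false
//   - once a revision fails, every later revision fails too
std::size_t first_failing(
    const std::vector<std::string>& revisions,
    const std::function<bool(const std::string&)>& passes)
{
    std::size_t good = 0;                     // newest revision known to pass
    std::size_t bad = revisions.size() - 1;   // oldest revision known to fail

    // Halve the interval (good, bad] until the two indices are adjacent.
    while (bad - good > 1) {
        const std::size_t mid = good + (bad - good) / 2;
        if (passes(revisions[mid])) {
            good = mid;   // breaking change is after mid
        } else {
            bad = mid;    // breaking change is at or before mid
        }
    }
    return bad;
}
```

Here `passes` would stand for "check out this revision, build, and run the tests". With N revisions this needs about log2(N) test runs rather than N. `git bisect` automates the same search directly on a repository.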
participants (3): Edward Diener, Peter Dimov, Robert Ramey