[serialization] Massive failures in develop?
Local testing seems to have badly failed: http://www.boost.org/development/tests/develop/developer/serialization.html I've only looked at the msvc failures, and there's an issue in the iostream class destructor - which is to say after the archive classes have been destructed, the iostream class they were using is left in a non-sane state. Beyond that I haven't been able to get anything sensible out of the debugger (looks like memory corruption actually, but I can't be sure). Any ideas? Thanks, John.
On 10/17/15 5:04 AM, John Maddock wrote:
Local testing seems to have badly failed: http://www.boost.org/development/tests/develop/developer/serialization.html
I've only looked at the msvc failures, and there's an issue in the iostream class destructor - which is to say after the archive classes have been destructed, the iostream class they were using is left in a non-sane state. Beyond that I haven't been able to get anything sensible out of the debugger (looks like memory corruption actually, but I can't be sure).
Indeed. This is where I've been working. I've been trying to address an issue whereby things go awry while using the utf8 codecvt facet. My GCC system complains that a deleted memory allocation was being written to. It came up in only one test in one compiler, but I took it as a real error which was just almost never detected. I've spent a lot of time trying to isolate this. I came to suspect the gcc library of having some issue in its management of codecvt facets (whose design is rather kludgy in the first place). I moved the management of codecvt lifetime out of the standard library and into the library code. This addressed this failure. Of course it might have had repercussions somewhere else. Sounds like Whack-a-Mole.
The above is the short version. While this has been going on, I also needed to make some corrections in a place where a wstring is converted to an mbstring and vice versa. Again, this turns out to be trickier than meets the eye. Also, a few issues related to visibility of some virtual functions needed to be tracked down. I believe that these are the source of some of the gcc failures (139) which occasionally show up in the test matrix.
Addressing all this has resulted in more tests being added, and some tests being enhanced to improve coverage. It has also resulted in my bjam files being enhanced so that visibility=hidden is enforced on my local development system for Clang.
In addition, I addressed a number of items in the trac database. It turns out that sometimes this takes a lot more time than one would think. I think that as time goes on, the remaining bugs turn out to be more and more obscure and harder to isolate and fix. It's a very odd paradox to me that an obsession with a hopeless goal of perfection is necessary to motivate one to constant improvement, which in turn keeps hope alive.
Currently there are a couple of pending issues I hope to resolve before next Wednesday.
a) refinement of codecvt facet lifetime issue.
b) one test - test_iterators - depends upon that utf8_codecvt object. This object needs to be tweaked so that the function is visible. Oddly this shows up in my gcc tests but not in the clang ones.
c) perhaps this broke the msvc tests
Robert Ramey
Any ideas? Thanks, John.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
c) perhaps this broke the msvc tests
Reduced test case for msvc is just:
#include <sstream>
#include
a) refinement of codecvt facet lifetime issue.
I think this is the issue, you:
* Create the facet as a data member of the archive.
* Imbue it in a locale and then install that locale in the iostream.
Now on destruction:
* The archive is destroyed first, this destructs the locale facet *that's still in the iostream*.
* The iostream then calls the destructor on an object that's already been destroyed.
Depending on the virtual function / destructor implementation, you may or may not see this as an error at runtime.
So I think you need to back up the iostream's locale and restore it in the archive destructor if you want to go this way.
Apologies if I've misread your code,
HTH, John.
Update: I think I have a fix, testing now...
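A minimal sketch of the back-up-and-restore approach described above, for illustration only. It is neither the patch being tested nor Boost.Serialization code: `toy_archive` and `passthrough_codecvt` are hypothetical names, and the facet is a do-nothing stand-in for utf8_codecvt. The facet is a data member constructed with refs = 1, so no locale ever deletes it, and the destructor puts the stream's original locale back before the member is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <sstream>

// Hypothetical stand-in for a real conversion facet such as utf8_codecvt.
struct passthrough_codecvt : std::codecvt<wchar_t, char, std::mbstate_t> {
    explicit passthrough_codecvt(std::size_t refs = 0)
        : std::codecvt<wchar_t, char, std::mbstate_t>(refs) {}
};

// Hypothetical archive-like class that owns its facet as a data member.
class toy_archive {
    passthrough_codecvt facet_;  // refs = 1: the locale will not delete it
    std::wostream& os_;
    std::locale saved_;          // the stream's locale before we touched it
public:
    explicit toy_archive(std::wostream& os)
        : facet_(1), os_(os), saved_(os.getloc()) {
        // Install a locale that points at our member facet.
        os_.imbue(std::locale(saved_, &facet_));
    }
    ~toy_archive() {
        // Restore first, so the stream no longer refers to facet_
        // by the time facet_ is destroyed.
        os_.imbue(saved_);
    }
};

// Returns true if the stream has its original locale back after the
// archive-like object has gone out of scope.
bool locale_restored_after_archive() {
    std::wostringstream os;
    const std::locale before = os.getloc();
    bool changed_while_alive = false;
    {
        toy_archive ar(os);
        changed_while_alive = (os.getloc() != before);
    }
    return changed_while_alive && os.getloc() == before;
}
```

Without the `imbue(saved_)` call in the destructor, the stream would keep a locale that points at a destroyed member, which is the dangling-facet situation described in the message.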
On 10/17/15 11:07 AM, John Maddock wrote:
a) refinement of codecvt facet lifetime issue.
I think this is the issue, you:
* Create the facet as a data member of the archive.
* Imbue it in a locale and then install that locale in the iostream.
Now on destruction:
* The archive is destroyed first, this destructs the locale facet *that's still in the iostream*.
* The iostream then calls the destructor on an object that's already been destroyed.
Depending on the virtual function / destructor implementation, you may or may not see this as an error at runtime.
So I think you need to back up the iostream's locale and restore it in the archive destructor if you want to go this way.
Apologies if I've misread your code,
This sounds correct - as far as it goes.
Previously, I used stream_buffer_saver - which is now not in there. This saved the current locale when created and restored it upon termination of the archive. The motivation was to permit a user to have his ostream settings unchanged if the serialization i/o was introduced in the middle of the stream.
This worked as advertised - except on some gcc configurations. I tracked the problem (in my mind anyway) to the standard system whereby one creates a new codecvt facet on the heap and passes it to the streambuffer, which then manages the lifetime of this object. I spent a lot of time trying to make this work but couldn't get to the bottom of it. Thinking about it, I built the facet into the archive class so that the lifetime management issues would be resolved. In the course of making this work, I dropped the streambuffer_saver. My intention is to put that back in now that I think I've got everything else sorted out.
It sounds simple now - but the original problem I was trying to solve only appeared when using the utf8_codecvt facet to convert strings with Russian characters in them. Somewhere it seems there is a coupling that is hard to discern. But anyway, I think I've got it almost where I want it to be.
HTH, John.
Update: I think I have a fix, testing now...
Thanks, I'll take a look at it. Robert Ramey
This sounds correct - as far as it goes.
Previously, I used stream_buffer_saver - which is now not in there. This saved the current locale when created and restored it upon termination of the archive. The motivation was to permit a user to have his ostream settings unchanged if the serialization i/o was introduced in the middle of the stream.
I think this is the correct way to do things (and is what my patch does, but I'm fine if you revert to the old way). If you do any other thing then you introduce facet-lifetime issues as the archive is destroyed before the iostream leaving the iostream's locale with a dangling pointer (even making the facet lifetime global in the serialization lib doesn't work, because if someone writes an archive to cout, then the serialization lib is unloaded before the std lib and cout is left with a dangling pointer again).
This worked as advertised - except on some gcc configurations. I tracked the problem (in my mind anyway) to the standard system whereby one creates a new codecvt facet on the heap and passes it to the streambuffer, which then manages the lifetime of this object. I spent a lot of time trying to make this work but couldn't get to the bottom of it. Thinking about it, I built the facet into the archive class so that the lifetime management issues would be resolved. In the course of making this work, I dropped the streambuffer_saver. My intention is to put that back in now that I think I've got everything else sorted out.
It sounds simple now - but the original problem I was trying to solve only appeared when using the utf8_codecvt facet to convert strings with Russian characters in them. Somewhere it seems there is a coupling that is hard to discern. But anyway, I think I've got it almost where I want it to be.
Let me know if you have a test case you want me to try and debug, Best, John.
On 10/18/15 12:47 AM, John Maddock wrote:
I think this is the correct way to do things (and is what my patch does, but I'm fine if you revert to the old way).
I've included the original stream_buffer_saver. I had originally taken it out to simplify things while tracking down the original issue. So things should be fine now. I've tested it on my machine with gcc and clang, under both C++03 and C++11. I've also got my bjam settings arranged so that I think any visibility failures should be detectable on my system. I've uploaded these changes to develop and hopefully things should start looking good. We'll see.
If you do any other thing then you introduce facet-lifetime issues as the archive is destroyed before the iostream leaving the iostream's locale with a dangling pointer (even making the facet lifetime global in the serialization lib doesn't work, because if someone writes an archive to cout, then the serialization lib is unloaded before the std lib and cout is left with a dangling pointer again).
I get this - that is the point of the stream_buffer_saver. When it's destroyed, it puts the streambuffer back into its original state. Things were made more complicated by the standard codecvt idiom of allocating with new and letting the stream buffer manage the lifetime. Also it's very hard to figure out the design and motivation of the standard library's usage of facets etc. Also, it seems I'm not the only one with this problem. I don't have confidence that all the standard library implementations have this exactly right.
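For reference, the standard idiom mentioned above can be sketched as follows. The facet is allocated with new and constructed with the default refs = 0, so the std::locale objects that reference it share ownership and the last one to go away deletes it. `counting_codecvt` and `live_facets` are hypothetical names used only to make that ownership visible; this is not Boost.Serialization code.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <sstream>

// Number of counting_codecvt objects currently alive (illustration only).
static int live_facets = 0;

// Hypothetical facet that records its own construction and destruction.
struct counting_codecvt : std::codecvt<wchar_t, char, std::mbstate_t> {
    explicit counting_codecvt(std::size_t refs = 0)
        : std::codecvt<wchar_t, char, std::mbstate_t>(refs) { ++live_facets; }
    ~counting_codecvt() { --live_facets; }
};

// Imbues a heap-allocated facet into a stream, lets the stream die, and
// reports how many facets are still alive afterwards.
int facets_alive_after_stream_dies() {
    {
        std::wostringstream os;
        // refs = 0: the locale takes ownership of the new'ed facet.
        os.imbue(std::locale(os.getloc(), new counting_codecvt));
        assert(live_facets == 1);  // kept alive by the stream's locale
    }
    // Every locale that referenced the facet is gone by now, so the
    // facet is expected to have been deleted on our behalf.
    return live_facets;
}
```

Under this idiom nobody outside the locale machinery calls delete on the facet, which is why its lifetime is tied to whichever stream or stream buffer still holds a copy of that locale.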
Let me know if you have a test case you want me to try and debug,
The original test case which set me on this path is contained in the test_z test, which is what I use to hold the current test case I'm working on. FWIW - my procedure is:
a) select a trac item which looks the simplest from the list and has a reasonable test case.
b) copy the test case into the test_z source code.
c) make sure it fails on my machine
d) often I discover that the user has made some mistake. In this case I'll inform the user on the trac item - which keeps a whole thread of dialog between myself and the complainer. Usually I'm right, and then I can just be done. Sometimes I might tweak the documentation in the hope of not getting another similar bogus complaint
e) If it's legitimate, I'll debug it and make a change
f) about half the time, this has an unintended and maybe obscure side effect which turns the process into a much bigger operation than anticipated.
g) I check the changes into develop and usually find that I've broken something in some environment I don't have. So the loop continues.
h) By this time, I want to give up and just roll things back and live with the bug. But I can't do that because I've already invested too much to just do that. I know that this is sort of irrational as it's a "sunk cost". But I always convince myself that "I'm almost there - just one more iteration".
i) finally it's done and everything works everywhere.
j) so I check into master.
k) then the cycle continues a little longer.
l) things stop happening and then I guess it's done.
I only relate the above for those interested in what it means to be a boost library maintainer. It means you've got a borderline personality disorder which you attempt to address by indulging it. Guess it works for me.
So I do appreciate your help and participation in cases such as this.
Robert Ramey
participants (2)
- John Maddock
- Robert Ramey