"peer reviewed" - Rights and responsibilities of maintainers - Boost

newer
[release][predef] Permission to...

"peer reviewed" - Rights and responsibilities of maintainers

Alexander Grund

16 Oct 2018 16 Oct '18

7:59 a.m.

I'm having trouble with a maintainer of a boost library over a PR and am seeking for clarification. Background (may be skipped): A bug was introduced into the library which I noticed in one of "my" (company) projects leading to a reproducible crash. However the scenario is not that common (although others have the same problem) so it wasn't noticed earlier. I analyzed the problem and created a minimum working example to reproduce this and then opened a PR with a fix. The maintainer did not comment on this for multiple months so given it IS complicated, I factored out additional PRs with tests I created to show the problem, so each PR is fairly small. In the result there were PRs with tests that failed on travis and the fix-PR which included the test-PRs and the fix and succeeded on travis (and all other CIs) and users reported success using this PR. Still no response from the maintainer on any PR. He then added an own test claiming to be enough, while I (as the author of the original tests) knew it wasn't as e.g. it didn't fail. (so what's it worth then?) Furthermore he added a commit changing unrelated stuff and ADDITIONALLY changing code I touched in the fix claiming that it would be the essence of my patch and fix the problems. Again I (as the author of the original patch) know that this is not true especially as the test-PRs still fail with his patch applied. So my point/problem is: Is the maintainer allowed to simply do changes as he wishes without any review at all? My expected procedure is: - Someone creates a PR - Maintainer (and optionally the community) reviews (and comments on) it - It eventually gets merged or rejected with comments why OR: - Someones else (including the maintainer) creates a contra-PR fixing the same problem - That author explains, how and why this fixes the same problem and why it is superior to the original PR - This gets reviewed and either PR gets merged Arguments were made, that this doesn't apply to accepted boost libraries as e.g. "it's [the maintainers] heads that got chopped off if something gets broken". I would challenge this by the simple example at hand: This boost library IS broken and it crashes MY application (and others). But I see no heads rolling. Furthermore: If the maintainer is allowed to patch and change as he wants without ANY review, how can Boost still be called "peer reviewed"? My trust in Boost stems from the extensive review a library has to undergo before it is included in Boost. If afterwards no review is happening or reviews are actively avoided (as in the current example), then the older a Boost library gets, the less trust I would have in it as more and more unreviewed changes get included. So again: What is the Boost course on reviews of existing libraries and the statement on the rights and responsibilities of maintainers? Thanks, Alexander Grund

Attachments:

smime.p7s (application/pkcs7-signature — 5.1 KB)

Show replies by date

Peter Dimov

16 Oct 16 Oct

2:23 p.m.

Alexander Grund wrote:

...

So my point/problem is: Is the maintainer allowed to simply do changes as he wishes without any review at all?

In short, yes. I understand your frustration, but the procedure is to square it off with the maintainer, then if you feel you're getting nowhere, escalate to the list. Basically what you've done, except that you're trying to discuss not the issue at hand, but "rights and responsibilities of maintainers", which is not going to help your issue, or anyone. The test in question is https://github.com/boostorg/serialization/pull/111/files which does indeed crash https://travis-ci.org/boostorg/serialization/jobs/441654483#L2546 That's easy enough to follow and does show a legitimate problem, in my opinion, although Robert refuses to acknowledge it for some reason. Your further PRs, #110 and #105, are more convoluted and I haven't had the time to look into them in detail; the "fix" PR was failing on Travis, so I assumed it was still a work in progress. Looking at it now, https://travis-ci.org/boostorg/serialization/builds/442032113 the only failures are with Clang and they happen without your patch as well. In either case, it has to be Robert who approves and applies the patch, as "the community" doesn't understand the library as well as he does. (I certainly do not.)

Robert Ramey

2:49 p.m.

On 10/16/18 7:23 AM, Peter Dimov via Boost wrote:

...

Alexander Grund wrote:

...
https://github.com/boostorg/serialization/pull/111/files

which does indeed crash

https://travis-ci.org/boostorg/serialization/jobs/441654483#L2546

That's easy enough to follow and does show a legitimate problem, in my opinion, although Robert refuses to acknowledge it for some reason.

The test does not show anything at all. It's ill conceived. A similar test could be made and it might be useful, but if I do all I'll get is a lot of heat from people who haven't take the time to understand what is actually going on.

...

the only failures are with Clang and they happen without your patch as well.

My version of the patch captures the essence of the PR while retaining the original intent of the code. It's much simpler and alters many less files. I have no idea why I am being criticized for making this patch.

...

In either case, it has to be Robert who approves and applies the patch, as "the community" doesn't understand the library as well as he does. (I certainly do not.)

FWIW - I often forget how the library works. Sometimes an issue comes up with code that hasn't been visited in over a decade. In such cases I have to review the documentation, tests and source code. I think I'm the only one who actually does this. Robert Ramey

Peter Dimov

3:11 p.m.

Robert Ramey wrote:

...

On 10/16/18 7:23 AM, Peter Dimov via Boost wrote:

...
https://github.com/boostorg/serialization/pull/111/files

which does indeed crash

https://travis-ci.org/boostorg/serialization/jobs/441654483#L2546

That's easy enough to follow and does show a legitimate problem, in my opinion, although Robert refuses to acknowledge it for some reason.

The test does not show anything at all. It's ill conceived.

What is ill-conceived about it? It's a minimal example of two shared libraries each using Serialization. As Alexander says, it's a simplified version of a crash they had in one of their projects.

Robert Ramey

3:21 p.m.

On 10/16/18 8:11 AM, Peter Dimov via Boost wrote:

...

Robert Ramey wrote:

...
On 10/16/18 7:23 AM, Peter Dimov via Boost wrote:

...
https://github.com/boostorg/serialization/pull/111/files

which does indeed crash

https://travis-ci.org/boostorg/serialization/jobs/441654483#L2546

That's easy enough to follow and does show a legitimate problem, in my > opinion, although Robert refuses to acknowledge it for some reason.

The test does not show anything at all. It's ill conceived.

What is ill-conceived about it? It's a minimal example of two shared libraries each using Serialization. As Alexander says, it's a simplified version of a crash they had in one of their projects.

First of all, the test changes every time I look at it. It's hard to follow. On the last version I looked at - since no there is not BOOST_CLASS_EXPORT(class name) invoked, there are no indexed entries in the extended type info tables so get_key() should return null under any circumstances. It tests nothing. Worse, it betrays the fact that the author of the test, and anyone who doesn't see a problem with it, have read neither the document nor the source code. Rather than asking "What's ill conceived about it" better to ask - "What is this supposed to be testing?" Robert Ramey

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Peter Dimov

3:32 p.m.

Robert Ramey wrote:

...

On the last version I looked at - since no there is not BOOST_CLASS_EXPORT(class name) invoked, ...

There are no class names in the test; it uses int and float.

...

Rather than asking "What's ill conceived about it" better to ask - "What is this supposed to be testing?"

The use case of two shared libraries each using Serialization, as already explained.

Robert Ramey

4:01 p.m.

On 10/16/18 8:32 AM, Peter Dimov via Boost wrote:

...

Robert Ramey wrote:

...
On the last version I looked at - since no there is not BOOST_CLASS_EXPORT(class name) invoked, ...

There are no class names in the test; it uses int and float.

...
Rather than asking "What's ill conceived about it" better to ask - "What is this supposed to be testing?"

The use case of two shared libraries each using Serialization, as already explained.

Well it's not testing that.

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Alexander Grund

3:33 p.m.

First: This would have been the thing I was looking for in the MR.

...

On the last version I looked at - since no there is not BOOST_CLASS_EXPORT(class name) invoked, there are no indexed entries in the extended type info tables so get_key() should return null under any circumstances. Ok, but this is defined, correct? It is not undefined behaviour or anything else. Besides: And it is using standard types: "int" and "float", no classes. So what export is missing there? It tests nothing. It tests the construction and destruction of Boost.Serialization library types in a specific context.|||| Worse, it betrays the fact that the author of the test, and anyone who doesn't see a problem with it, have read neither the document nor the source code. I was going for a MWE: Minimal code to show the problem. https://www.boost.org/doc/libs/1_67_0/libs/serialization/doc/tutorial.html does NOT show, that BOOST_CLASS_EXPORT is required. I can change the code to something like: int i = 42;| boost::archive::text_oarchive(std::cout) & i;

And | |float i = 42;| boost::archive::text_oarchive(std::cout) & i; || ||And it would still crash.|| ||||

...

Rather than asking "What's ill conceived about it" better to ask - "What is this supposed to be testing?" From the description: This adds a test (from #105 <https://github.com/boostorg/serialization/pull/105>) which uses singletons in shared libraries linked against static boost. This crashes on termination. It simply **uses** singletons and crashes.

Mike Dev

3:49 p.m.

...

-----Original Message----- From: Boost <boost-bounces@lists.boost.org> On Behalf Of Robert Ramey via Boost Sent: Tuesday, October 16, 2018 5:22 PM

On 10/16/18 8:11 AM, Peter Dimov via Boost wrote:

...
Robert Ramey wrote:

...
On 10/16/18 7:23 AM, Peter Dimov via Boost wrote:

...
https://github.com/boostorg/serialization/pull/111/files

which does indeed crash

https://travis-ci.org/boostorg/serialization/jobs/441654483#L2546

That's easy enough to follow and does show a legitimate problem, in my > opinion, although Robert refuses to acknowledge it for some reason.

The test does not show anything at all. It's ill conceived.

What is ill-conceived about it? It's a minimal example of two shared libraries each using Serialization. As Alexander says, it's a simplified version of a crash they had in one of their projects.

First of all, the test changes every time I look at it. It's hard to follow.

If you keep saying that the test is bad / irrelevant, it is no wonder the author changes the code to improve it.

...

On the last version I looked at - since no there is not BOOST_CLASS_EXPORT(class name) invoked, there are no indexed entries in the extended type info tables so get_key() should return null under any circumstances.

If I understand this correctly, the purpose of the test is to show that the application crashes. In that case, the return value is completely irrelevant. If you think that crash is not the fault of Boost.Serialization (e.g. because it is used wrongly), then it would be helpful to explain what you believe to be the actual reason for the crash. What I don't know: Is there a travis build that shows that the test from Flamefire (https://github.com/boostorg/serialization/pull/111/files ) still fails with the changes introduced by Robert (I'm not sure when they were introduced)? Mike

Alexander Grund

4:48 p.m.

...

If you keep saying that the test is bad / irrelevant, it is no wonder the author changes the code to improve it. +1 And there have been no major changes. Just comments and return value check changes as per PR comments.

If I understand this correctly, the purpose of the test is to show that the application crashes. In that case, the return value is completely irrelevant. +1 What I don't know: Is there a travis build that shows that the test from Flamefire (https://github.com/boostorg/serialization/pull/111/files ) still fails with the changes introduced by Robert (I'm not sure when they were introduced)? I rebased the changes to develop yesterday. So yes: https://travis-ci.org/boostorg/serialization/jobs/441654483, https://travis-ci.org/boostorg/serialization/jobs/441654485

Robert Ramey

5:12 p.m.

On 10/16/18 8:49 AM, Mike Dev via Boost wrote:

...

...
-----Original Message----- From: Boost <boost-bounces@lists.boost.org> On Behalf Of Robert Ramey via Boost Sent: Tuesday, October 16, 2018 5:22 PM

On 10/16/18 8:11 AM, Peter Dimov via Boost wrote:

...
Robert Ramey wrote:

...
On 10/16/18 7:23 AM, Peter Dimov via Boost wrote:

...
https://github.com/boostorg/serialization/pull/111/files

which does indeed crash

https://travis-ci.org/boostorg/serialization/jobs/441654483#L2546

That's easy enough to follow and does show a legitimate problem, in my > opinion, although Robert refuses to acknowledge it for some reason.

The test does not show anything at all. It's ill conceived.

What is ill-conceived about it? It's a minimal example of two shared libraries each using Serialization. As Alexander says, it's a simplified version of a crash they had in one of their projects.

First of all, the test changes every time I look at it. It's hard to follow.

If you keep saying that the test is bad / irrelevant, it is no wonder the author changes the code to improve it.

If the test is bad/irrelevant, what else should I say. I have no problem if the author want's to improve his test. It's just not happening.

...

...
On the last version I looked at - since no there is not BOOST_CLASS_EXPORT(class name) invoked, there are no indexed entries in the extended type info tables so get_key() should return null under any circumstances.

If I understand this correctly, the purpose of the test is to show that the application crashes. In that case, the return value is completely irrelevant.

I haven't seen the application so I don't know why it crashes. I'm guessing that it might be related to the current (and recently improved state) singleton. I DO know test_exported_dll also crashes on one compiler configuration. I would like to fix this but so far no one has been able to provide a convincing change nor a test which points out where the problem might be.

...

If you think that crash is not the fault of Boost.Serialization (e.g. because it is used wrongly), then it would be helpful to explain what you believe to be the actual reason for the crash.

Which crash: in his app, in test_exported_dll, or in his test?

...

What I don't know: Is there a travis build that shows that the test from Flamefire (https://github.com/boostorg/serialization/pull/111/files ) still fails with the changes introduced by Robert (I'm not sure when they were introduced)?

Right - that are the exact same results which result from Alexander's original patch. That is, I merged in a simpler, less intrusive, less fragile version of Alexander's patch. And it "fixes" everything that Alexander's patch fixed. Clearly there are still some problems left - which are likely to be even more difficult to nail down. Note that to get this far has required 9 months of effort - part of which entailed including a breaking patch in to the release. Under ordinary circumstances, I would just fix Alexander's test and be done with it just to save time. But I've come to understand that it won't actually save time so I haven't done so. Robert Ramey

...

Mike

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Peter Dimov

5:22 p.m.

Robert Ramey wrote:

...

Which crash: in his app, in test_exported_dll, or in his test?

His app crashes. His test, a simplified version of his app, crashes.

Alexander Grund

3:15 p.m.

...

In short, yes. I understand your frustration, but the procedure is to square it off with the maintainer, then if you feel you're getting nowhere, escalate to the list. Basically what you've done, except that you're trying to discuss not the issue at hand, but "rights and responsibilities of maintainers", which is not going to help your issue, or anyone. I already escalated the issue to the list to avoid getting the crashing version into the last release. This obviously failed. My post here was after in the discussion of/in PR I got the impression

that I'm expecting to much of the maintainer: A review of my changes, I developed, tested and submitted. Someones else agreed to my point of view while others opposed. Hence I wanted clarification on the meta issue with a reference which I could bring on. Seems like the facts are against me in this case so I (or my changes) are in the mercy of the maintainer. As you mentioned it: #105 adds more CI #110 includes #105 and tests that singletons are eventually destructed and `is_destroyed` returns true in both use cases of the singleton #111 Is the condensed version of my MWE that shows the crash. Essence: Use 2 shared libraries linked against static boost and each uses a part of Boost.Serialization. The crash comes from a destruction order problem, that is currently unsolved and could be avoided if `is_destroyed` would return true, but it doesn't But Robert insists that #111 "does not show anything at all. It's ill conceived." He does not acknowledge that Boost.Serialization makes the application crash and does not bring any argument why it would be "ill conceived". I do not understand this. It obviously crashes. This is a fact easily reproducible by anyone on linux. If the test misuses C++ (UB, ...) or the library, then where and how? If not, why is it seen as "ill conceived"?

...

My version of the patch captures the essence of the PR while retaining the original intent of the code.

This is simply wrong. If it would #110 and #111 would pass. It fails to fix the `is_destroyed` function or the core problem with destruction order

...

It's much simpler and alters many less files.

My fix alters 1 file to fix `is_destroyed` and 4 more to remove a single line with an assertion which doesn't hold as shown in #111. Most of the changes are added reverts to an earlier, working version and comments on why and how stuff works, so it doesn't get accidentally broken by optimization/reduction efforts as happened in 1.65

...

I have no idea why I am being criticized for making this patch.

Because a) it combines unrelated changes, hiding the fix and b) pushing it w/o review by the original patch author who might have reasonable arguments why it isn't enough or "the essence of the PR". I learned it is your right to do that. But then again it is my right to to tell you that your claim is wrong. As I as the author of the patch might understand the patch better, it is quite possible, that I'm right, isn't it?

Robert Ramey

5:03 p.m.

On 10/16/18 8:15 AM, Alexander Grund via Boost wrote:

...

But Robert insists that #111 "does not show anything at all. It's ill conceived." True He does not acknowledge that Boost.Serialization makes the application crash False and does not bring any argument why it would be "ill conceived". False I do not understand this. It obviously crashes. This is a fact easily reproducible by anyone on linux.

If "It" refers to the test_dll_exported on clang - True else False If the test misuses C++

...

(UB, ...) or the library, then where and how? If not,

If the test refers to your recent test - you test doesn't test anything. If the test refers to the serialization library test_export_dll on clang (with which version of standard library - who knows?) there is a problem. That is why I made such a test after all. So True why is it seen as

...

"ill conceived"?

It's based on a mis-understanding of how the serialization uses the singleton and of how the singleton works.

...

...
My version of the patch captures the essence of the PR while retaining the original intent of the code.

This is simply wrong.

False If it would #110 and #111 would pass. False It fails to

...

fix the `is_destroyed` function or the core problem with destruction order

I see no evidence of this.

...

...
It's much simpler and alters many less files.

Verifiably true by inspection.

...

My fix alters 1 file to fix `is_destroyed` and 4 more to remove a single line with an assertion which doesn't hold as shown in #111. Most of the changes are added reverts to an earlier, working version and comments on why and how stuff works, so it doesn't get accidentally broken by optimization/reduction efforts as happened in 1.65

...
I have no idea why I am being criticized for making this patch.

Because a) it combines unrelated changes,

If a) refers to my patch -> Verifiably True by inspection hiding the fix and b) pushing

...

it w/o review by the original patch author who might have reasonable arguments why it isn't enough or "the essence of the PR". I learned it is your right to do that.

For good reason. But then again it is my right to to tell you

...

that your claim is wrong.

True. As I as the author of the patch might

...

understand the patch better, it is quite possible, that I'm right, isn't it?

It's possible - but after having considered that possibility, I've concluded that it's False Robert Ramey

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Alexander Grund

17 Oct 17 Oct

7:27 a.m.

I really wonder how someone can show that much ignorance if all facts are against him. "#111 doesn't show anything", " I have no problem if the author want's to improve his test. It's just not happening. " EVERYONE sees how it does show a crash. I explained that it is a MWE reproducer of a real application crash. So while the test may SEEM to not be doing much useful it DOES use Boost.Serialization and it DOES crash as described. There were NO arguments on WHY this test is invalid C++ code. Based on his comments (after months) I slightly changed the test and now I even created #128 which shall show a small, but valid and regular use of Boost.Serialization and it shows the exact same behaviour. Isn't it important to add tests for bugs found? Isn't it important to have these tests be as minimal as possible, so side effects are avoided? How can he still claim the test "doesn't show anything" if it clearly shows a construction/destruction problem IN the library? He also still fails to acknowledge the broken `is_destroyed` function. A good C++ programmer can see this from the code (just ask: When is is_destroyed set to false? When can this fail to happen?) and there is a test case where CI says "|Assertion failed: boost::serialization::singleton<plainSingleton>::is_destroyed()|" (see also comment https://github.com/boostorg/serialization/pull/110#issuecomment-429561420) Note how this was independently noticed by others too: https://github.com/boostorg/serialization/pull/110#issuecomment-417601919 And again and again he lacks reasoning:

...

It's based on a mis-understanding of how the serialization uses the singleton and of how the singleton works.

If it was, it would be easy to point out what mis-understanding this is. And it all boils down THAT the singleton is used. So even if the values in the singleton are bogus, all that is tested is the construction and destruction of the singleton. And the library DOES do that too. #127 clearly shows this.

...

...
As I as the author of the patch might understand the patch better, it is quite possible, that I'm right, isn't it?

It's possible - but after having considered that possibility, I've concluded that it's False

Again: No reasons, just claims. My claims are all backed by references to actual executions on CI and findings of other people coming to the same conclusion as me. Maybe it is time to put Boost.Serialization under CMT...

James E. King III

1:11 p.m.

Perhaps working towards removing the singleton from the implementation would be a worthy course of action. It has come up often in pull requests over the last few years, correct? What problem(s) does the singleton solve in the library, and could those issues be solved in a different way? - Jim

Vinnie Falco

1:59 p.m.

On Wed, Oct 17, 2018 at 6:12 AM James E. King III via Boost <boost@lists.boost.org> wrote:

...

Perhaps working towards removing the singleton from the implementation would be a worthy course of action. It has come up often in pull requests over the last few years, correct? What problem(s) does the singleton solve in the library, and could those issues be solved in a different way?

Great minds think alike! There was a nice discussion yesterday on Cpplang Slack of why the singletons exist, in the #boost channel (between me and Peter D) you should sign up and access the history: http://slack.cpp.al Regards

Robert Ramey

3:41 p.m.

On 10/17/18 12:27 AM, Alexander Grund via Boost wrote:

...

I really wonder how someone can show that much ignorance if all facts are against him.

"#111 doesn't show anything", " I have no problem if the author want's to improve his test. It's just not happening. " EVERYONE sees how it does show a crash.

When I looked at the test my attention was immediately drawn to the following: const char* impossible = "impossible"; return impossible != boost::serialization::extended_type_info_typeid<float>::get_const_instance().get_key(); I memory serves me, get_key() will return the address of string of characters inside the the address space of the DLL. This address is compared to the address of "impossible" which is an address inside the address space of the calling mainline application. I could not and cannot understand what is being tested here. Honestly, when I see this, I felt it unproductive to continue. Given that EVERYONE has looked at this and no one has seen any problem with it, I wondering if I'm missing something really obvious and I would be grateful if anyone could explain this to me. On a different note, I would point out that this whole issue appeared when support for visibility for gcc was introduced last year. This is kind of suspicious to me. An interesting test would be to: a) find a platform on which test_dll_exported fails b) rebuild/test with out support for gcc visibility - (by removing "visibility=default") the the command line. And see if the change. Robert Ramey

Peter Dimov

3:52 p.m.

Robert Ramey wrote:

...

When I looked at the test my attention was immediately drawn to the following:

const char* impossible = "impossible"; return impossible != boost::serialization::extended_type_info_typeid<float>::get_const_instance().get_key();

I memory serves me, get_key() will return the address of string of characters inside the the address space of the DLL. This address is compared to the address of "impossible" which is an address inside the address space of the calling mainline application. I could not and cannot understand what is being tested here.

It's a dummy comparison whose only purpose is to make use of the return value to prevent the optimizer from removing the call entirely.

Robert Ramey

4:25 p.m.

On 10/17/18 8:52 AM, Peter Dimov via Boost wrote:

...

Robert Ramey wrote:

...
When I looked at the test my attention was immediately drawn to the following:

const char* impossible = "impossible"; return impossible != boost::serialization::extended_type_info_typeid<float>::get_const_instance().get_key();

I memory serves me, get_key() will return the address of string of characters inside the the address space of the DLL. This address is compared to the address of "impossible" which is an address inside the address space of the calling mainline application. I could not and cannot understand what is being tested here.

It's a dummy comparison whose only purpose is to make use of the return value to prevent the optimizer from removing the call entirely.

Hmmm - but the value of the comparison is used in the determination of the test results. Very unintuitive. Why not compare to nullptr (or 0 or whatever) which is what get_key() expected to return since no entries have been added to the extended type info table. There is also an interesting issue of using float and int as type arguments. This is never been done before since no user of the serialization library would do this. It's an interesting new variable in the mix and offhand I don't really know if there are any consequences to this. Robert Ramey

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Peter Dimov

4:32 p.m.

Robert Ramey wrote:

...

There is also an interesting issue of using float and int as type arguments. This is never been done before since no user of the serialization library would do this. It's an interesting new variable in the mix and offhand I don't really know if there are any consequences to this.

In response to these concerns, the author has since reformulated the test so that it only uses documented Serialization functionality: https://github.com/boostorg/serialization/pull/128/files

Alexander Grund

18 Oct 18 Oct

7:03 a.m.

...

const char* impossible = "impossible"; return impossible != boost::serialization::extended_type_info_typeid<float>::get_const_instance().get_key();

I memory serves me, get_key() will return the address of string of characters inside the the address space of the DLL. This address is compared to the address of "impossible" which is an address inside the address space of the calling mainline application. I could not and cannot understand what is being tested here. Honestly, when I see this, I felt it unproductive to continue. Given that EVERYONE has looked at this and no one has seen any problem with it, I wondering if I'm missing something really obvious and I would be grateful if anyone could explain this to me.

As Peter noted it is a dummy comparison, its only purpose is to instantiate 2 different singletons in 2 different shared libraries. #128 does the same but in a more non-obvious way, but in a "valid use" of Boost.Serialization. #111 is the condensed version of this.

...

Hmmm - but the value of the comparison is used in the determination of the test results. Very unintuitive. Why not compare to nullptr (or 0 or whatever) which is what get_key() expected to return since no entries have been added to the extended type info table.

In response to those concerns raised recently I changed the code and comments: - use "impossible !=..." to make it clear that the key cannot have this value and hence the result is always true - Thats why I haven't used NULL anymore (I DID use it, but still you complained, so I made it more explicit that I don't care about WHAT get_key returns as long as it is called) - I have to admit, I wasn't aware what get_key is supposed to return which is why i return "0" from main if they are NOT "NULL" and the default "0" if they are. You objected this as it is non-intuitive that main has an implicit "return 0" and I changed this noticing that get_key returned "NULL" when I did. To save time trying to find out what `get_key` is supposed to return I added "impossible" - add a comment for this: "// Make sure to return true." - There already was a hint, that different types are required: "// Use a different(!) type than multi_shared1" - The purpose is in written in the doc of the cpp file: "library simply using extended_type_info_typeid" - All of this is even explained in DETAIL in the main test file (https://github.com/boostorg/serialization/blob/0e2a9aa95b3ec6f7dde1db7ddfa4c...) - It is also explained in the PR and the testcase that "This crashes on termination" Given all this I don't understand what was left so unclear, that you expected a "useful" test result. The whole purpose of this is to test IF it runs AT ALL which it currently doesn't. If any description is unclear and caused the confusion, simply tell me and I can amend it. That would be constructive and much appreciated.

...

There is also an interesting issue of using float and int as type arguments. This is never been done before since no user of the serialization library would do this. It's an interesting new variable in the mix and offhand I don't really know if there are any consequences to this.

Again: They are just used to instantiate the singletons whose destructions cause the crash. The same can be achieved with custom classes (see #127) but that makes the code more convoluted than required for a MWE. If there is an inherent failure in the lib why primitive types cannot be used, then I didn't found it. I don't see a difference when passing `int` as opposed to `Foo` from `struct Foo{};` and #127 seems to support this. If this is what bothers you, please comment on #111 what types I shall pass instead (e.g. Foo and Bar both being simple structs), but I didn't deem it necessary.

Gavin Lambert

16 Oct 16 Oct

11:29 p.m.

On 17/10/2018 04:15, Alexander Grund wrote:

...

As you mentioned it: #105 adds more CI #110 includes #105 and tests that singletons are eventually destructed and `is_destroyed` returns true in both use cases of the singleton #111 Is the condensed version of my MWE that shows the crash. Essence: Use 2 shared libraries linked against static boost and each uses a part of Boost.Serialization. The crash comes from a destruction order problem, that is currently unsolved and could be avoided if `is_destroyed` would return true, but it doesn't

Granted that I have not looked at the code or the PR in any detail (and thus this might be entirely off base): Conceptually "is_destroyed" is usually a bad idea. By definition, after the destructor has run the memory is free to get stomped on by other objects and thus you can't rely on is_destroyed to return any particular value. Trying to rely on it is UB. Applying this to singletons at global scope is both a blessing and a curse -- it reduces the likelihood that something else would actually stomp over the memory in practice, but also introduces a static destruction order indeterminacy issue. Usually the motivation to introduce this sort of thing is a side effect of having dangling pointers to deleted objects. In which case usually the best solution is to recognise that you actually have a shared ownership issue (where the object's actual lifetime is not what you thought it was) and start using shared_ptr/weak_ptr and/or having an object keep track of which singleton it is registered with rather than assuming that it will find the same ambient singleton later. Having said *that*, as previously discussed on the list, using a static library from multiple shared libraries in itself introduces an aliasing problem, where sometimes singletons aren't actually singletons -- and woe betide you if you try to pass an object that references one singleton across the library divide into the domain of the other singleton, because then you can have issues such as registering with one singleton and then trying to deregister from the other (which all by itself can be a source of dangling pointers). (FWIW, introducing dynamic libraries also increases the risk of dangling pointers, if a library can be unloaded while references to it survive outside.) As far as I am aware (when you are using static-from-dynamic), you will get these duplicated singletons at all times on Windows, and on Linux it will happen whenever you are compiling with -fvisibility=hidden (which appears to be true in this case given the build output) and if the singleton definitions don't use BOOST_SYMBOL_VISIBLE. I leave it to the rest of you to figure out whether this points at an underlying issue in Boost.Serialization or whether it's a usage error. Or both.

Peter Dimov

17 Oct 17 Oct

12:22 a.m.

Gavin Lambert wrote:

...

Conceptually "is_destroyed" is usually a bad idea. By definition, after the destructor has run the memory is free to get stomped on by other objects and thus you can't rely on is_destroyed to return any particular value. Trying to rely on it is UB.

It would be undefined to access the "is_destroyed" bool if it were a member of the singleton, but it's a function local static in Alexander's code, and it looks OK to me in that formulation. Static bools are never destroyed.

Gavin Lambert

12:54 a.m.

On 17/10/2018 13:22, Peter Dimov wrote:

...

Gavin Lambert wrote:

...
Conceptually "is_destroyed" is usually a bad idea. By definition, after the destructor has run the memory is free to get stomped on by other objects and thus you can't rely on is_destroyed to return any particular value. Trying to rely on it is UB.

It would be undefined to access the "is_destroyed" bool if it were a member of the singleton, but it's a function local static in Alexander's code, and it looks OK to me in that formulation. Static bools are never destroyed.

They are if they're in a shared library which is unloaded. :) Technically it's UB to access a global bool after "destruction" (just like any other object) and it could possibly be "destroyed" before the singleton due to unspecified global destruction order. Though you're correct that since bool has a trivial destructor it is *usually* not actually a problem in practice. (AFAIK a compiler is within its rights to zero or otherwise stomp the memory after "destruction", though most probably wouldn't bother, except perhaps as a diagnostic/sanitiser.) It's still a sign of a fundamental lifetime mismatch somewhere, though.

Peter Dimov

1:08 a.m.

Gavin Lambert wrote:

...

Technically it's UB to access a global bool after "destruction" (just like any other object)...

The only way to destroy a bool is to placement-new some other object over it, if I'm not mistaken. :-) (Or, if it's in a union, you can assign to some other member, I suppose.)

Gavin Lambert

2:16 a.m.

On 17/10/2018 14:08, Peter Dimov wrote:

...

Gavin Lambert wrote:

...
Technically it's UB to access a global bool after "destruction" (just like any other object)...

The only way to destroy a bool is to placement-new some other object over it, if I'm not mistaken. :-)

(Or, if it's in a union, you can assign to some other member, I suppose.)

The quotes were significant, not merely for emphasis.

Steven Watanabe

1:10 a.m.

AMDG On 10/16/2018 06:54 PM, Gavin Lambert via Boost wrote:

...

<snip> Technically it's UB to access a global bool after "destruction" (just like any other object) and it could possibly be "destroyed" before the singleton due to unspecified global destruction order.

That's not correct. If an object has static storage duration and a trivial destructor, then it is never "destroyed." Its storage exists "for the lifetime of the program" [basic.stc.static] and its lifetime ends when "the storage which the object occupies is reused or released" [basic.life]

...

Though you're correct that since bool has a trivial destructor it is *usually* not actually a problem in practice. (AFAIK a compiler is within its rights to zero or otherwise stomp the memory after "destruction", though most probably wouldn't bother, except perhaps as a diagnostic/sanitiser.)

It's still a sign of a fundamental lifetime mismatch somewhere, though.

In Christ, Steven Watanabe

Gavin Lambert

2:05 a.m.

On 17/10/2018 14:10, Steven Watanabe wrote:

...

That's not correct. If an object has static storage duration and a trivial destructor, then it is never "destroyed." Its storage exists "for the lifetime of the program" [basic.stc.static] and its lifetime ends when "the storage which the object occupies is reused or released" [basic.life]

Which occurs when the shared library that contains its storage is unloaded. The standard completely ignores the existence of shared libraries, so you're not on particularly stable ground. :)

Peter Dimov

2:16 a.m.

Gavin Lambert wrote:

...

On 17/10/2018 14:10, Steven Watanabe wrote:

...
That's not correct. If an object has static storage duration and a trivial destructor, then it is never "destroyed." Its storage exists "for the lifetime of the program" [basic.stc.static] and its lifetime ends when "the storage which the object occupies is reused or released" [basic.life]

Which occurs when the shared library that contains its storage is unloaded.

How, specifically, are you planning to read a function local static from an unloaded library? You've already crashed way before that.

Gavin Lambert

2:25 a.m.

On 17/10/2018 15:16, Peter Dimov wrote:

...

Gavin Lambert wrote:

...
...
That's not correct. If an object has static storage duration and a trivial destructor, then it is never "destroyed." Its storage exists "for the lifetime of the program" [basic.stc.static] and its

On 17/10/2018 14:10, Steven Watanabe wrote: lifetime > ends when "the storage which the object occupies is reused or released" > [basic.life]

Which occurs when the shared library that contains its storage is unloaded.

How, specifically, are you planning to read a function local static from an unloaded library? You've already crashed way before that.

It's not inherently impossible for it to happen; the code can be inlined with only that variable being used as an external symbol pointing into unloaded memory. Which might not even crash when run on an OS that doesn't free the pages immediately (or at all). Again, though, this is a side track. My main point is that if you find yourself asking "has this object been destroyed?" then it's usually a sign of a more fundamental lifetime mismatch, and it's probably worthwhile exploring that.

Gavin Lambert

2:36 a.m.

Mere moments ago, quoth I:

...

It's not inherently impossible for it to happen; the code can be inlined with only that variable being used as an external symbol pointing into unloaded memory. Which might not even crash when run on an OS that doesn't free the pages immediately (or at all).

FWIW (and again I have to confess ignorance of the specific code in question, and of how the Linux dynamic linker behaves), if some symbol *has* been marked with BOOST_SYMBOL_VISIBLE then I wonder if something like this could explain why things apparently work on Windows but not Linux: Code is inlined with separate (hidden) instantiations in each shared library, which internally refers to some shared (visible) external symbol. During load, the dynamic linker happens to pick the one in the library which you later unload first; then the method in the still-loaded library is called and tries to access this no-longer-existing symbol. Although I assume that this would probably only occur if you tried to unload the libraries in the wrong order. It shouldn't be possible if you unload strictly in reverse order of load.

Alexander Grund

7:06 a.m.

...

Again, though, this is a side track. My main point is that if you find yourself asking "has this object been destroyed?" then it's usually a sign of a more fundamental lifetime mismatch, and it's probably worthwhile exploring that.

This is true. But: a) is_destroyed is part of the public interface of Boost.Serialization (see the docs) and it doesn't return true when it should as shown by #110 even in the case of a simple executable. b) In the case I described and tested with #111 (and now even with #128) There are 2 shared libraries with their own instances of the singletons (this can be checked by debugging the ctor/dtor of them). The lifetime mismatch happens due to "some linux mechanism" destroying same types from different libraries at the same time invalidating the expectation of each library. Example (actually witnessed by printf-debugging) singleton<A> (0xlib1) singleton (0xlib1) singleton<A> (0xlib2) singleton (0xlib2) ~singleton (0xlib2) ~singleton<A> (0xlib2) ~singleton<A> (0xlib1) ~singleton (0xlib1) Note the switched order of the lib1 destruction. You can verify this yourself with #111 or #128. The problem arises because singleton accesses singleton<A> in both its ctor and dtor but that has been destroyed and due to a) above this is not detected. We CAN try to find the reason for the mixup, OR we can rely on the public interface of singleton and detect it and handle it gracefully. The former is VERY hard, the latter is easy and actually it *is currently being done*!!! See https://github.com/boostorg/serialization/pull/105/commits/89d0910033527c34f... All that is missing to avoid the crash is the fix for `is_destroyed`

Robert

7:51 p.m.

On 10/17/2018 2:06 AM, Alexander Grund via Boost wrote:

...

...
Again, though, this is a side track. My main point is that if you find yourself asking "has this object been destroyed?" then it's usually a sign of a more fundamental lifetime mismatch, and it's probably worthwhile exploring that.

This is true. But:

a) is_destroyed is part of the public interface of Boost.Serialization (see the docs) and it doesn't return true when it should as shown by #110 even in the case of a simple executable. b) In the case I described and tested with #111 (and now even with #128) There are 2 shared libraries with their own instances of the singletons (this can be checked by debugging the ctor/dtor of them). The lifetime mismatch happens due to "some linux mechanism" destroying same types from different libraries at the same time invalidating the expectation of each library. Example (actually witnessed by printf-debugging)

singleton<A> (0xlib1) singleton (0xlib1) singleton<A> (0xlib2) singleton (0xlib2) ~singleton (0xlib2) ~singleton<A> (0xlib2) ~singleton<A> (0xlib1) ~singleton (0xlib1) Note the switched order of the lib1 destruction. You can verify this yourself with #111 or #128. The problem arises because singleton accesses singleton<A> in both its ctor and dtor but that has been destroyed and due to a) above this is not detected.

We CAN try to find the reason for the mixup, OR we can rely on the public interface of singleton and detect it and handle it gracefully. The former is VERY hard, the latter is easy and actually it *is currently being done*!!! See https://github.com/boostorg/serialization/pull/105/commits/89d0910033527c34f... All that is missing to avoid the crash is the fix for `is_destroyed`

Hello All, I am the Robert who posts the static int counting via "printf" debugging output on Windows. Please do not confuse my first name with Mr. Ramey. I am not him. FWIW, the ongoing outstanding issues within Serialization, and how the maintainer responds to them, has caused zero use of Boost.Serialization, within the professional development team I am part of. In fact, Boost.Beast is the only one in use. I hope all of Alexander Grund's PRs are added/merged. If I have the time available, I will test his PRs on the 1.68 Serialization on Windows, one executable calling/using one or more DLLs, containing the Serialization and its multiple Singletons. At least, I believe I can obtain PR changes at any time. Please tell me if I misunderstand how github works obtaining a PR. Thank You, Robert

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Olaf van der Spek

18 Oct 18 Oct

7:52 a.m.

On Wed, Oct 17, 2018 at 9:06 AM, Alexander Grund via Boost <boost@lists.boost.org> wrote:

...

b) In the case I described and tested with #111 (and now even with #128) There are 2 shared libraries with their own instances of the singletons (this can be checked by debugging the ctor/dtor of them). The lifetime mismatch happens due to "some linux mechanism" destroying same types from different libraries at the same time invalidating the expectation of each library. Example (actually witnessed by printf-debugging)

Using static libs in shared libs is a recipe for disaster isn't it? It's undefined behavior.. Does it make sense to try to 'work around' it on specific implementations? -- Olaf

Alexander Grund

8:57 a.m.

...

Using static libs in shared libs is a recipe for disaster isn't it? It's undefined behavior.. Aren't shared libs per se undefined behavior? As far as I remember the standard does not say much about them and it happens easily "in the wild": You might use Boost.Serialization yourself but also use a 3rd-party library which does use it for its own stuff. Does it make sense to try to 'work around' it on specific implementations?

This is exactly what is happening: Observed behavior on "specific implementations" is: - Destruction order on Windows+OSX is as expected, so no problems - On linux the order gets messed up. This gets detected by a dedicated `is_destroyed` flag Hence the "work around" is to detect and handle this: ~singletonB(){ if(!singletonA::is_destroyed()) singletonA::get_instance().unregister(this); } There are only 2 problems: `is_destroyed` has a bug, breaking it and the code is actually: ~singletonB(){ BOOST_ASSERT(!singletonA::is_destroyed()); if(!singletonA::is_destroyed()) singletonA::get_instance().unregister(this); } My PR fixes both problems which is shown by tests.

Olaf van der Spek

9:35 a.m.

On Thu, Oct 18, 2018 at 10:57 AM, Alexander Grund <alexander.grund@tu-dresden.de> wrote:

...

...
Using static libs in shared libs is a recipe for disaster isn't it? It's undefined behavior..

Aren't shared libs per se undefined behavior? As far as I remember the

Implementation defined

...

standard does not say much about them and it happens easily "in the wild":

True

...

You might use Boost.Serialization yourself but also use a 3rd-party library which does use it for its own stuff.

...
Does it make sense to try to 'work around' it on specific implementations?

This is exactly what is happening:

Observed behavior on "specific implementations" is: - Destruction order on Windows+OSX is as expected, so no problems - On linux the order gets messed up. This gets detected by a dedicated

What does Linux 'say' about using static libs in shared libs?

...

Hence the "work around" is to detect and handle this:

My point is that it might not make sense to work around this as other issues are likely to pop up. -- Olaf

Andrey Semashev

10:25 a.m.

On 10/18/18 10:52 AM, Olaf van der Spek via Boost wrote:

...

On Wed, Oct 17, 2018 at 9:06 AM, Alexander Grund via Boost <boost@lists.boost.org> wrote:

...
b) In the case I described and tested with #111 (and now even with #128) There are 2 shared libraries with their own instances of the singletons (this can be checked by debugging the ctor/dtor of them). The lifetime mismatch happens due to "some linux mechanism" destroying same types from different libraries at the same time invalidating the expectation of each library. Example (actually witnessed by printf-debugging)

Using static libs in shared libs is a recipe for disaster isn't it? It's undefined behavior..

No, there's nothing undefined about it, as long as you know what you're doing. Linking static Boost libraries into shared user's libraries is actually rather common. It is often done in attempt to conceal Boost usage or bundle a particular, possibly modified Boost version with the user's app. I didn't inspect this particular issue to tell if this specific case is valid, though.

John Maddock

11:12 a.m.

On 18/10/2018 08:52, Olaf van der Spek via Boost wrote:

...

On Wed, Oct 17, 2018 at 9:06 AM, Alexander Grund via Boost <boost@lists.boost.org> wrote:

...
b) In the case I described and tested with #111 (and now even with #128) There are 2 shared libraries with their own instances of the singletons (this can be checked by debugging the ctor/dtor of them). The lifetime mismatch happens due to "some linux mechanism" destroying same types from different libraries at the same time invalidating the expectation of each library. Example (actually witnessed by printf-debugging) Using static libs in shared libs is a recipe for disaster isn't it?

I was about to agree with you, but there is one very important use case - that of the application pluggin. In that situation one would expect that as long as everything is built with -fvisibility-hidden so that each shared library is entirely self contained, then everything should really be fine. John. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus

Robert Ramey

3:17 p.m.

On 10/18/18 4:12 AM, John Maddock via Boost wrote:

...

I was about to agree with you, but there is one very important use case - that of the application pluggin. In that situation one would expect that as long as everything is built with -fvisibility-hidden so that each shared library is entirely self contained, then everything should really be fine.

Hmm - I'm not sure about this. Consider the C++ runtime library on windows with msvc. The app specifies dynamic linking of some boost library. I believe that will of necessity use the dynamically linked version of the C++ runtime library. Now we call a plugin which includes another boost library - statically linked this time. The plugin will use the statically linked version of the C++ runtime library. Now as the program executes, it might be calling different versions of the C++standard library and who knows what else. Sooooo - I would be pretty reluctant to start down this path. Perhaps this is why microsoft, in it's wisdom, has separate "visibility" attributes for import / export. In any case, this is one aspect that should really be solved at the committee level - but they're occupied with more pressing issues such as the spaceship operator. Robert Ramey

Dominique Devienne

3:26 p.m.

On Thu, Oct 18, 2018 at 5:17 PM Robert Ramey via Boost < boost@lists.boost.org> wrote:

...

[...] Now we call a plugin which includes another boost library - statically linked this time. The plugin will use the statically linked version of the C++ runtime library.

I don't think so. The plugin can link statistically against its dependencies, while still linking against the dynamically linked C++ runtime. I used to do that for Java extensions. The JNI library linked with all its C++ dependencies statically, so it's a single self-contained shared-library one could then java.lang.System.load() w/o having to mess with PATH or LD_LIBRARY_PATH, yet still linked against the dynamic runtime if I recall correctly (that's over 10 years ago). This is a legitimate use case IMHO... --DD

John Maddock

5:10 p.m.

On 18/10/2018 16:17, Robert Ramey via Boost wrote:

...

On 10/18/18 4:12 AM, John Maddock via Boost wrote:

...
I was about to agree with you, but there is one very important use case - that of the application pluggin. In that situation one would expect that as long as everything is built with -fvisibility-hidden so that each shared library is entirely self contained, then everything should really be fine.

Hmm - I'm not sure about this. Consider the C++ runtime library on windows with msvc. The app specifies dynamic linking of some boost library. I believe that will of necessity use the dynamically linked version of the C++ runtime library. Now we call a plugin which includes another boost library - statically linked this time. The plugin will use the statically linked version of the C++ runtime library.

No not true, just because a Boost library is static, it can still be built against the dll runtime and then everything is just fine. In fact static linking to Boost (even when using the dll runtime) is actually the default behaviour on windows. Best, John.

...

Now as the program executes, it might be calling different versions of the C++standard library and who knows what else. Sooooo - I would be pretty reluctant to start down this path. Perhaps this is why microsoft, in it's wisdom, has separate "visibility" attributes for import / export.

In any case, this is one aspect that should really be solved at the committee level - but they're occupied with more pressing issues such as the spaceship operator.

Robert Ramey

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

--- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus

Niall Douglas

8:19 p.m.

New subject: Solving Robert's singleton problem (was: Re: "peer reviewed" - Rights and responsibilities of maintainers)

...

On 18/10/2018 08:52, Olaf van der Spek via Boost wrote:

...
On Wed, Oct 17, 2018 at 9:06 AM, Alexander Grund via Boost <boost@lists.boost.org> wrote:

...
b) In the case I described and tested with #111 (and now even with #128) There are 2 shared libraries with their own instances of the singletons (this can be checked by debugging the ctor/dtor of them). The lifetime mismatch happens due to "some linux mechanism" destroying same types from different libraries at the same time invalidating the expectation of each library. Example (actually witnessed by printf-debugging) Using static libs in shared libs is a recipe for disaster isn't it?

Rather than criticising Robert's maintainership, can we maybe spend more effort instead on how to best to solve his problem? I in the past have faced the problem where: 1. There must be a complex singleton instantiated in unknowable compilation units. 2. I don't control what weird linker flags the end user uses, or what combination of shared and static libraries they might use with me in each of them. 3. Code may use me during the shared library bootstrap and eviction process i.e. during static data init/deinit of plugins being dynamically loaded and evicted. The way I have solved this problem is brutal, but it works. In my singleton constructor, I create a named shared memory area unique to my process. In that named shared memory area, I store a pointer to myself, and nothing else. When future singletons get constructed, they reopen the same named shared memory area, realise they are a second or third or fourth instance, and redirect themselves to use the original singleton. They add their own address to the shared memory area. When a shared library containing the original singleton gets dynamically kicked out, it asks the next singleton in the list of singletons in the shared memory area to take over. It transfers all its contents to the new master singleton, and the shared memory list is updated. I appreciate how brutal and overkill this solution is. However, it solves the problem well, in perpetuity, and under whatever crazy weird settings or use cases any user could think of. Niall

Robert Ramey

8:47 p.m.

New subject: Solving Robert's singleton problem

On 10/18/18 1:19 PM, Niall Douglas via Boost wrote:

...

I appreciate how brutal and overkill this solution is. However, it solves the problem well, in perpetuity, and under whatever crazy weird settings or use cases any user could think of.

I'm quite confident that the problem will be solved in the near future just by doing the work to fix it. It's not rocket science, it just requires investment of time and effort by someone who understands the library and C++ Robert Ramey

Peter Dimov

9:58 p.m.

New subject: Solving Robert's singleton problem

Robert Ramey wrote:

...

I'm quite confident that the problem will be solved in the near future just by doing the work to fix it. It's not rocket science, it just requires investment of time and effort by someone who understands the library and C++

The problem has already been fixed, by Alexander Grund. The fix, in its latest incarnation, is in #131. To recap, - the Serialization library crashes on exit in certain scenarios, as demonstrated by the tests in #111 and #128. - this is caused by the fact that singleton<X>::is_destroyed() always returns false, as demonstrated by the test in #129. (At extended_type_info_typeid.cpp:99, the code tests singleton<tkmap>::is_destroyed(), and if false, proceeds to use singleton<tkmap>::get_mutable_instance(). But since is_destroyed always returns false, the check doesn't work and doesn't prevent the library from accessing the destroyed tkmap in the scenario in question.) - a fix for is_destroyed() is proposed in #131. It makes the tests from #111/#128/#129 pass.

Robert Ramey

10:24 p.m.

New subject: Solving Robert's singleton problem

On 10/18/18 2:58 PM, Peter Dimov via Boost wrote:

...

Robert Ramey wrote:

...
I'm quite confident that the problem will be solved in the near future just by doing the work to fix it. It's not rocket science, it just requires investment of time and effort by someone who understands the library and C++

The problem has already been fixed, by Alexander Grund. The fix, in its latest incarnation, is in #131.

To recap,

- the Serialization library crashes on exit in certain scenarios, as demonstrated by the tests in #111 and #128.

- this is caused by the fact that singleton<X>::is_destroyed() always returns false, as demonstrated by the test in #129.

(At extended_type_info_typeid.cpp:99, the code tests singleton<tkmap>::is_destroyed(), and if false, proceeds to use singleton<tkmap>::get_mutable_instance(). But since is_destroyed always returns false, the check doesn't work and doesn't prevent the library from accessing the destroyed tkmap in the scenario in question.)

- a fix for is_destroyed() is proposed in #131. It makes the tests from #111/#128/#129 pass.

Hmmm - OK - I'll take yet another look at this. Robert Ramey

Alexander Grund

30 Oct 30 Oct

5:17 p.m.

New subject: Solving Robert's singleton problem

...

...
The problem has already been fixed, by Alexander Grund. The fix, in its latest incarnation, is in #131.

To recap,

- the Serialization library crashes on exit in certain scenarios, as demonstrated by the tests in #111 and #128.

- this is caused by the fact that singleton<X>::is_destroyed() always returns false, as demonstrated by the test in #129.

(At extended_type_info_typeid.cpp:99, the code tests singleton<tkmap>::is_destroyed(), and if false, proceeds to use singleton<tkmap>::get_mutable_instance(). But since is_destroyed always returns false, the check doesn't work and doesn't prevent the library from accessing the destroyed tkmap in the scenario in question.)

- a fix for is_destroyed() is proposed in #131. It makes the tests from #111/#128/#129 pass.

Hmmm - OK - I'll take yet another look at this.

Robert Ramey

With the upcoming release of 1.69 can we get those PRs finally merged now? What is still missing?

Robert Ramey

7:53 p.m.

New subject: Solving Robert's singleton problem

On 10/30/18 10:17 AM, Alexander Grund via Boost wrote:

...

...
...
The problem has already been fixed, by Alexander Grund. The fix, in its latest incarnation, is in #131.

To recap,

- the Serialization library crashes on exit in certain scenarios, as demonstrated by the tests in #111 and #128.

- this is caused by the fact that singleton<X>::is_destroyed() always returns false, as demonstrated by the test in #129.

(At extended_type_info_typeid.cpp:99, the code tests singleton<tkmap>::is_destroyed(), and if false, proceeds to use singleton<tkmap>::get_mutable_instance(). But since is_destroyed always returns false, the check doesn't work and doesn't prevent the library from accessing the destroyed tkmap in the scenario in question.)

- a fix for is_destroyed() is proposed in #131. It makes the tests from #111/#128/#129 pass.

Hmmm - OK - I'll take yet another look at this.

Robert Ramey

With the upcoming release of 1.69 can we get those PRs finally merged now? What is still missing?

I merged your latest version of your "test" and it failed no where. I also managed to reproduce the remaining test failure - test_dll_exported - on my own machine and have traced it to an related issue which has now been fixed. I've run all tests on clang and several gcc compilers on my own system. I'm waiting for final results on the test matrix and appveyor for windows. These were failing but I need to upload a jamfile tweak. Sooo - I believe that that there are no pending issues regarding singletons, visibility and export. Robert Ramey

...

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Alexander Grund

1 Nov 1 Nov

7:55 a.m.

New subject: Solving Robert's singleton problem

...

I merged your latest version of your "test" and it failed no where.

Can you point me to the merge or commit and the passing travis build? All I have seen is some of my PR manually put into other commits 65f05dc2099e376f5306cc0c16c29ec0a1a1d2d4 and 237ff81ac62cb9cf8206a050647a1b47601ce4bd and due to not using the full MR it fails to even link on linux: https://travis-ci.org/boostorg/serialization/jobs/446809035 which is probably why you removed it in c5549b2d6ba79a60a98ddceb854954c090c70cb1 Why don't you just merge the open PR which does work as intended and DOES fail on travis?

...

I also managed to reproduce the remaining test failure - test_dll_exported - on my own machine and have traced it to an related issue which has now been fixed. I've run all tests on clang and several gcc compilers on my own system. I'm waiting for final results on the test matrix and appveyor for windows. These were failing but I need to upload a jamfile tweak. Good to hear another issue is resolved. Sooo - I believe that that there are no pending issues regarding singletons, visibility and export.

Unfortunately not true. 1) The crash-producing test is not or wrongly included 2) The is_destroyed test is not merged and the bug not fixed Again: Please simply merge my PRs (at least the "simple ones" with the tests) or tell me, what you don't understand or like about them so I can change them without breaking them. We are running in circles here where you do not seem to understand my intentions and hence the bugs, change things "based on my changes" but without using the important parts. Others have already confirmed the validity of my tests and that my fix works, so it can't be bad.

Alexander Grund

7 Nov 7 Nov

11:41 a.m.

New subject: Solving Robert's singleton problem

Am 30.10.18 um 20:53 schrieb Robert Ramey via Boost:

...

Sooo - I believe that that there are no pending issues regarding singletons, visibility and export.

I rebased my PRs on current develop and as written in the previous mail they still show both bugs (crash and broken is_destroyed). This can be validated by everyone: https://github.com/boostorg/serialization/pull/128 https://github.com/boostorg/serialization/pull/129 Also the fix PR (https://github.com/boostorg/serialization/pull/131) contains both and the fix and passes all CIs

Alexander Grund

19 Oct 19 Oct

7:14 a.m.

New subject: Solving Robert's singleton problem

...

The problem has already been fixed, by Alexander Grund. The fix, in its latest incarnation, is in #131.

To recap,

- the Serialization library crashes on exit in certain scenarios, as demonstrated by the tests in #111 and #128.

- this is caused by the fact that singleton<X>::is_destroyed() always returns false, as demonstrated by the test in #129.

(At extended_type_info_typeid.cpp:99, the code tests singleton<tkmap>::is_destroyed(), and if false, proceeds to use singleton<tkmap>::get_mutable_instance(). But since is_destroyed always returns false, the check doesn't work and doesn't prevent the library from accessing the destroyed tkmap in the scenario in question.)

- a fix for is_destroyed() is proposed in #131. It makes the tests from #111/#128/#129 pass.

Thanks for voicing this again. From the comments of you (and a few others) I condensed the fix code, and (tried to) make the tests and fix more easily understandable. As such the code comments in the fix are absolutely essential so future developers won't break it again. These comments and the discussion about it were the reason to bring it up in this way. The tests and fix were already there but not merged for reasons absolutely unclear to me as there was no discussion. This discussion brought some of them up so I could work on them. Thanks again. Minor note: "singleton<X>::is_destroyed() always returns false" is not completely true. There is 1 use case where it does return true: "struct Foo: singleton<Foo>" (as shown in the test). So the correct statement would be "singleton<X>::is_destroyed() always returns false when X does not inherit from singleton<X>". Note that this is the 2nd official use case.

Vinnie Falco

2:20 p.m.

New subject: Solving Robert's singleton problem

On Fri, Oct 19, 2018 at 12:15 AM Alexander Grund via Boost <boost@lists.boost.org> wrote:

...

Thanks for voicing this again.

Things were a bit tense for a while. Are we all back to being friends now? Robert? Alexander?

Bjorn Reese

20 Oct 20 Oct

6:33 p.m.

New subject: Solving Robert's singleton problem

On 10/18/18 10:19 PM, Niall Douglas via Boost wrote:

...

The way I have solved this problem is brutal, but it works. In my singleton constructor, I create a named shared memory area unique to my process. In that named shared memory area, I store a pointer to myself, and nothing else.

Since you are using shared memory, another (untested) solution could be to let the singleton data be instantiated in shared memory together with a reference count. A singleton is simply a proxy for this data, and the reference count is use to track which singleton should create or destroy the data.

2429

Age (days ago)

2451

Last active (days ago)

List overview

Download

52 comments

15 participants

participants (15)

Alexander Grund
Andrey Semashev
Bjorn Reese
Dominique Devienne
Gavin Lambert
James E. King III
John Maddock
Mike Dev
Niall Douglas
Olaf van der Spek
Peter Dimov
Robert
Robert Ramey
Steven Watanabe
Vinnie Falco