Aliased parameters in boost::bind
Hello, The tutorial for Boost.Bind library contains the following sample code: bind can handle functions with more than two arguments, and its argument substitution mechanism is more general: bind(f, _2, _1)(x, y); // f(y, x) bind(g, _1, 9, _1)(x); // g(x, 9, x) bind(g, _3, _3, _3)(x, y, z); // g(z, z, z) bind(g, _1, _1, _1)(x, y, z); // g(x, x, x) As can be seen, all but the first of those samples alias the placeholder parameters, substituting a single passed argument several times. What is not mentioned in the tutorial (http://www.boost.org/doc/libs/1_63_0/libs/bind/doc/html/bind.html#bind.purp ose.using_bind_with_functions_and_fu) is the fact that such aliasing for any parameter with nontrivial move behavior is unexpected and possibly dangerous. For instance, given the function: void f(std::string x, std::string y, std::string z) { std::cout << x << y << z; } One could following the tutorial naively write: auto triple = boost::bind(f, _1, _1, _1); and this would even work for: std::string x = "a pretty long string"; triple(x); printing the string thrice, as expected. However the result of triple(std::string("a pretty long string")) would be the string only printed once, which is hardly what the user would expect. In fact, any bind expression produced with aliasing would be sensitive to the type of the reference passed, violating the functor abstraction badly. So, the question is: 1. This behavior seems to match the std::bind behavior documented in the standard. However, would it not be wise to warn against parameter aliasing in the tutorial then? (instead of demonstrating it as one of the first samples) 2. Maybe we can actually check if a parameter is aliased in boost::bind and copy it if it is - I have not tried it, but I think it can be done. Would it be a more sensible thing to do? The sample above might seem contrived, but I've actually encountered this bug in production code when porting it to boost 1.63 (the boost.signals code implicitly does boost::forward, so it was not outwardly apparent what was happening) Sincerely yours, Evgeny
On 30/03/2017 06:46, Evgeny Fraimovitch via Boost-users wrote:
For instance, given the function: void f(std::string x, std::string y, std::string z) { std::cout << x << y << z; } One could following the tutorial naively write: auto triple = boost::bind(f, _1, _1, _1);
and this would even work for: std::string x = "a pretty long string"; triple(x); printing the string thrice, as expected. However the result of triple(std::string("a pretty long string")) would be the string only printed once, which is hardly what the user would expect. In fact, any bind expression produced with aliasing would be sensitive to the type of the reference passed, violating the functor abstraction badly.
Why would you declare f as copying its parameters by value instead of taking them by const& instead? const& parameters would be safe in this case. const& parameters should be your default for anything with non-trivial copy/move behaviour anyway, so you'd only run into this issue in already-bad code. Although it would be nice if there were some way to emit a compiler warning for this sort of construct, I agree.
For instance, given the function: void f(std::string x, std::string y, std::string z) { std::cout << x << y << z; } One could following the tutorial naively write: auto triple = boost::bind(f, _1, _1, _1);
and this would even work for: std::string x = "a pretty long string"; triple(x); printing the string thrice, as expected. However the result of triple(std::string("a pretty long string")) would be the string only printed once, which is hardly what the user would expect. In fact, any bind expression produced with aliasing would be sensitive to the type of the reference passed, violating the functor abstraction badly.
Why would you declare f as copying its parameters by value instead of taking them by const& instead? const& parameters would be safe in this case.
const& parameters should be your default for anything with non-trivial copy/move behaviour anyway, so you'd only run into this issue in already-bad code.
Although it would be nice if there were some way to emit a compiler warning for this sort of construct, I agree.
Although the code is suboptimal, I do not agree it is bad. This lack of pass by reference in this case should not be affecting correctness, since we are not going to modify the object. Also, it might not be std::string. Shared points are often passed by value and they exhibit the same issue. They can be passed as const reference too, as an optimization, however whether this justifies additional clutter in the parameter list is a matter of personal taste. I can think of no other situation where there would be a semantic difference in passing by (const) value vs passing by const reference. Sincerely yours, Evgeny
On 30/03/2017 11:59, Evgeny Fraimovitch wrote:
Although the code is suboptimal, I do not agree it is bad. This lack of pass by reference in this case should not be affecting correctness, since we are not going to modify the object. Also, it might not be std::string. Shared points are often passed by value and they exhibit the same issue. They can be passed as const reference too, as an optimization, however whether this justifies additional clutter in the parameter list is a matter of personal taste.
Perhaps this is a domain-specific concern, but where I work you get severely beaten* if you pass strings *or* shared_ptrs by value. (Or in general anything with non-trivial copy semantics.) (* or at least mocked at the water cooler and code reviews) Neither memory allocations nor atomic refcount adjustments are free, and the function call boundary is rarely the correct place for them to occur. But yes, ideally this would produce correct output despite being suboptimal. It's one of the unfortunate side effects of perfect forwarding. Having said that, with the existence of perfect forwarding it means you're using C++11, which means you could use lambdas instead of using bind. They don't suffer the same issue because they don't use perfect forwarding: auto triple = [](std::string x) { f(x, x, x); }; auto triple = [](std::string const& x) { f(x, x, x); }; auto triple = [](std::string && x) { f(x, x, x); }; Substituting any of these produces correct output; you would only provoke the incorrect output if you explicitly std::move(x) in the last version, which is an obvious programmer error. It can be most problematic for large codebases with existing pre-C++11 bind code, of course, that is later updated to C++11. Hopefully you have unit tests to catch these sorts of things. :) There's some other corner cases with bind that are a bit weird, for example, given: void g(std::string&& x) { std::string y(std::move(x)); std::cout << y << std::endl; } (which is even worse than pass-by-value in this case, of course.) This compiles and works as expected: auto fn = boost::bind(g, _1); fn(std::string("some string")); As does this: std::string x("some string"); auto fn = boost::bind(g, _1); fn(std::move(x)); But this fails to compile: std::string x("some string"); auto fn = boost::bind(g, std::move(x)); fn(); Even though logically they ought to be the same. (The reason this happens AFAIK is because fn could be called more than once, but wouldn't be in a valid state after the first time; I think someone decided this should be a compile error rather than throwing an exception if you do happen to call it twice. It does compile if g doesn't have an rvalue-reference parameter.) Ironically this is actually one of the scenarios where bind was still superior to lambdas in C++11. Given void g(std::string const& x), you can compile that same last block of code just fine (even with the move), but you couldn't write an equivalent lambda (without contortions) since they lacked move-capturing. This has been fixed in C++14 though, albeit a bit wordy.
-----Original Message----- From: Boost-users [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Gavin Lambert via Boost-users Sent: Thursday, March 30, 2017 6:09 AM To: boost-users@lists.boost.org Cc: Gavin Lambert
Subject: Re: [Boost-users] Aliased parameters in boost::bind On 30/03/2017 11:59, Evgeny Fraimovitch wrote:
Although the code is suboptimal, I do not agree it is bad. This lack of pass by reference in this case should not be affecting correctness, since we are not going to modify the object. Also, it might not be std::string. Shared points are often passed by value and they exhibit the same issue. They can be passed as const reference too, as an optimization, however whether this justifies additional clutter in the parameter list is a matter of personal taste.
Perhaps this is a domain-specific concern, but where I work you get severely beaten* if you pass strings *or* shared_ptrs by value. (Or in general anything with non-trivial copy semantics.)
(* or at least mocked at the water cooler and code reviews) I agree it is very important for performance critical code, much less so for utility code which won't be invoked in tight loops. Anyway, I consider being mocked at water cooler being better than the compiler silently corrupting
the code on boost version transition.
Neither memory allocations nor atomic refcount adjustments are free, and the function call boundary is rarely the correct place for them to occur.
But yes, ideally this would produce correct output despite being suboptimal. It's one of the unfortunate side effects of perfect forwarding.
Having said that, with the existence of perfect forwarding it means you're using C++11, which means you could use lambdas instead of using bind. They don't suffer the same issue because they don't use perfect forwarding:
auto triple = [](std::string x) { f(x, x, x); }; auto triple = [](std::string const& x) { f(x, x, x); }; auto triple = [](std::string && x) { f(x, x, x); };
Substituting any of these produces correct output; you would only provoke the incorrect output if you explicitly std::move(x) in the last version, which is an obvious programmer error.
Using lambdas is exactly how I worked around for it in the place that originally aliased parameters. What I am saying is, though, the Boost.Bind tutorial should not promote parameter aliasing, since it obviously produces extremely fragile code. I believe forwarding twice anything is a *mistake*, it's just a harmless one, when the type has trivial move semantics.
It can be most problematic for large codebases with existing pre-C++11
bind
code, of course, that is later updated to C++11. Hopefully you have unit tests to catch these sorts of things. :)
Well, this instance got caught by a unit test. However, when something as fundamental as parameter passing is concerned you start wondering how well you cover that.
There's some other corner cases with bind that are a bit weird, for
given:
void g(std::string&& x) { std::string y(std::move(x)); std::cout << y << std::endl; }
(which is even worse than pass-by-value in this case, of course.) This compiles and works as expected:
auto fn = boost::bind(g, _1); fn(std::string("some string"));
As does this:
std::string x("some string"); auto fn = boost::bind(g, _1); fn(std::move(x));
But this fails to compile:
std::string x("some string"); auto fn = boost::bind(g, std::move(x)); fn();
Even though logically they ought to be the same. (The reason this happens AFAIK is because fn could be called more than once, but wouldn't be in a valid state after the first time; I think someone decided this should be a compile error rather than throwing an exception if you do happen to call it twice. It does compile if g doesn't have an rvalue-reference parameter.)
Ironically this is actually one of the scenarios where bind was still superior to lambdas in C++11. Given void g(std::string const& x), you can compile
example, that
same last block of code just fine (even with the move), but you couldn't write an equivalent lambda (without contortions) since they lacked move- capturing. This has been fixed in C++14 though, albeit a bit wordy.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
On 30.03.2017 00:12, Gavin Lambert via Boost-users wrote:
const& parameters should be your default for anything with non-trivial copy/move behaviour anyway, so you'd only run into this issue in already-bad code.
Passing by const& can actually force extra copies to be created that would not be created with pass-by-value. Consider: std::string f(); struct X { void by_value(std::string s) { this->some_member = std::move(s); } void by_const_ref(std::string const& s) { // Cannot move from const& this->some_member = s; } std::string some_member; }; X x; // The following line creates no copies. x.by_value(f()); // The following line must create an extra copy. x.by_const_ref(f()); -- Rainer Deyke - rainerd@eldwood.com
On 30/03/2017 20:34, Rainer Deyke wrote:
On 30.03.2017 00:12, Gavin Lambert wrote:
const& parameters should be your default for anything with non-trivial copy/move behaviour anyway, so you'd only run into this issue in already-bad code.
Passing by const& can actually force extra copies to be created that would not be created with pass-by-value.
Consider:
std::string f();
struct X { void by_value(std::string s) { this->some_member = std::move(s); }
void by_const_ref(std::string const& s) { // Cannot move from const& this->some_member = s; }
std::string some_member; };
X x;
// The following line creates no copies. x.by_value(f());
// The following line must create an extra copy. x.by_const_ref(f());
True, but in my experience it's a rare case. It replaces a single copy with two moves, and since moves can decay to copies (in types that implement a copy constructor without a move constructor, which is quite a lot of older code), it's quite likely to be a pessimisation in the general case. Passing by value as a means to take advantage of rvalue move semantics and copy elision is *only* a good choice if you can prove that all of the following are true: 1. The type has a move constructor which is at least twice as fast as the copy constructor. (This is an easy bar to meet if the copy constructor needs to allocate heap memory, and borderline for a shared_ptr copy.) 2. The body of the method *always* (and I do mean really always) needs a copy of the parameter (eg. to initialise a member field or create a lambda/binding -- this is an easy bar to meet for constructors and setters, but very uncommon for any other method). If there is an execution path that doesn't need to make that copy, then it's a pessimisation whenever that path is taken (an avoidable copy -- which you can usually ignore for exception-throw paths, but not other conditions). 3. The method is very likely to be called with both lvalue and rvalue parameters, either equally or with a higher incidence of rvalues. (If most calls are with lvalues with few or no rvalues, it's typically not worth it, because that would be replacing a single copy with a copy+move.) Note that you *can't* meet the first condition for generic templated-type code -- type traits tell you nothing about performance and you can't even detect whether a type actually *has* a "real" move constructor (rather than decaying to a copy constructor). Using multiple overloads or perfect forwarding is generally the best choice for generic code instead. And it's hard (though not impossible) to justify the last condition in library code.
participants (3)
-
Evgeny Fraimovitch
-
Gavin Lambert
-
Rainer Deyke