[multi_array] operator= requires resize - why?
Hi! IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required: template <typename ConstMultiArray> multi_array_ref& operator=(const ConstMultiArray& other) { [...] // make sure the dimensions agree assert(other.num_dimensions() == this->num_dimensions()); assert(std::equal(other.shape(),other.shape()+this->num_dimensions(), this->shape())); Besides the fact that I get mad about asserts instead of exceptions (please change this to a way users may choose) why do we have such an odd behaviour here?
From page 1, line 3 of the docs I cite: "The interface design is in line with the precedent set by the C++ Standard Library containers" - this is not true for operator=, since std::<container>::operator= works fine for me all the time _without_ any resize.
Could we please change the behaviour of multi_array::operator= to take care of resize where in need to? IMHO operator= should result in a true copy ignoring previous state. Markus
Markus Werle
Hi!
IMHO the behaviour of operator= for multi_array is a little bit broken.
And: I cannot figure out from the docs how to prepare a multi_array for
assignment.
I tried:
#include
Markus Werle
Markus Werle
writes: Hi!
IMHO the behaviour of operator= for multi_array is a little bit broken.
And: I cannot figure out from the docs how to prepare a multi_array for assignment.
not very elegant, but try m2.resize(boost::extents[m1.shape()[0]][m1.shape()[1]]); Does this interface need a redesign?
On Jan 16, 2007, at 9:54 AM, Markus Werle wrote:
Markus Werle
writes: Hi!
IMHO the behaviour of operator= for multi_array is a little bit broken.
And: I cannot figure out from the docs how to prepare a multi_array for assignment.
For this purpose, you will want to use the resize() operation, not reshape. resize is documented in the library reference documentation. Regards, ron
Hello Markus, On Jan 16, 2007, at 9:06 AM, Markus Werle wrote:
IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required:
Indeed operator= for multi_array requires the sizes of the multi_arrays to match, hence the assertions. The multi_array type is not meant to automatically resize when assigned to. All resizing is intentionally meant to be explicit. For your application, you will need to resize your array prior to assignment.
Besides the fact that I get mad about asserts instead of exceptions (please change this to a way users may choose) why do we have such an odd behaviour here?
I'm afraid that we do not agree about the place for asserts versus exceptions. The assertions in operator=() enforce library preconditions regarding assignment. See the document http:// boost.org/more/error_handling.html under "What about programmer errors?" Regards, ron
On Jan 16, 2007, at 4:59 PM, Ronald Garcia wrote:
Hello Markus,
On Jan 16, 2007, at 9:06 AM, Markus Werle wrote:
IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required:
Indeed operator= for multi_array requires the sizes of the multi_arrays to match, hence the assertions. The multi_array type is not meant to automatically resize when assigned to. All resizing is intentionally meant to be explicit. For your application, you will need to resize your array prior to assignment.
This is awkward since it means that any class with a multi_array member cannot use the default operator= but needs to implement a special operator=. We just ran into a subtle bug with this because of the semantics of the valarray operator=, and I would strongly argue for an operator = that makes the semantics of A x = y; equivalent to A x; x = y; Matthias
On Jan 16, 2007, at 3:51 PM, Matthias Troyer wrote:
On Jan 16, 2007, at 4:59 PM, Ronald Garcia wrote:
Hello Markus,
On Jan 16, 2007, at 9:06 AM, Markus Werle wrote:
IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required:
Indeed operator= for multi_array requires the sizes of the multi_arrays to match, hence the assertions. The multi_array type is not meant to automatically resize when assigned to. All resizing is intentionally meant to be explicit. For your application, you will need to resize your array prior to assignment.
This is awkward since it means that any class with a multi_array member cannot use the default operator= but needs to implement a special operator=. We just ran into a subtle bug with this because of the semantics of the valarray operator=, and I would strongly argue for an operator = that makes the semantics of
A x = y;
equivalent to
A x; x = y;
Hi Matthias, multi_array's operator=() was designed to behave in a manner consistent with the other array objects provided by the library (subarrays, views, multi_array_ref). In particular, assignment between any of these other types requires that the dimensions of the arrays match, and it performs an assignment to the analogous locations in the array. I am loath to alter these semantics. One possible approach, which I'm not completely happy with, is to provide special behavior for default constructed multi_arrays. In particular, assignment to a default-constructed multi_array would cause it to resize during assignment. This would make such default- constructed multi_arrays special. Cheers, ron
Ronald Garcia
On Jan 16, 2007, at 3:51 PM, Matthias Troyer wrote:
On Jan 16, 2007, at 4:59 PM, Ronald Garcia wrote:
Hello Markus,
On Jan 16, 2007, at 9:06 AM, Markus Werle wrote:
IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required:
Indeed operator= for multi_array requires the sizes of the multi_arrays to match, hence the assertions. The multi_array type is not meant to automatically resize when assigned to. All resizing is intentionally meant to be explicit. For your application, you will need to resize your array prior to assignment.
This is awkward since it means that any class with a multi_array member cannot use the default operator= but needs to implement a special operator=. We just ran into a subtle bug with this because of the semantics of the valarray operator=, and I would strongly argue for an operator = that makes the semantics of
A x = y;
equivalent to
A x; x = y;
Hi Matthias,
multi_array's operator=() was designed to behave in a manner consistent with the other array objects provided by the library (subarrays, views, multi_array_ref). In particular, assignment between any of these other types requires that the dimensions of the arrays match, and it performs an assignment to the analogous locations in the array. I am loath to alter these semantics.
To Ron's credit, the documentation does not say that multi_array models Assignable. If it did, it would have to obey the semantics listed above for any two multi_arrays that have the same template parameters. I am curious to know the source of Ron's loathing, though. But I agree with Markus' point that modeling Assignable is a good thing. It means multi_array would become Regular since it already has equality and a total ordering (assuming the element type does). This would make multi_array a more useful class, in my opinion.
One possible approach, which I'm not completely happy with, is to provide special behavior for default constructed multi_arrays. In particular, assignment to a default-constructed multi_array would cause it to resize during assignment. This would make such default- constructed multi_arrays special.
I don't think creating more special cases is the way to proceed. The question seems to be: are two multi_arrays with the same template parameters the same type or not? Right now the answer is "sometimes" because of assignment. I think everyone would be better served if the answer was either "yes" or "no." I will vote "yes" if given a chance, because it will simplify the class instead of making it more complex. If the answer is "no", the dimensions might as well be set at compile time. Sincerely, Mark
Mark Ruzon
To Ron's credit, the documentation does not say that multi_array models Assignable.
I disagree and this is why I started the thread, since I was really, really surprised it did not: <cite from="http://boost.org/libs/multi_array/doc/user.html"> The interface design is in line with the precedent set by the C++ Standard Library containers. </cite> So either we remove that sentence, mention the exceptions or change the code.
If it did, it would have to obey the semantics listed above for any two multi_arrays that have the same template parameters. I am curious to know the source of Ron's loathing, though.
I think this is a performance issue (numerical simulation people tend to erase any extra overhead, because their code must compete with braind dead f77 nerd code that had too much time to evolve), but as I said in my other post, we can get around that one with a free unchecked assign function: I found it by now: std::copy. If my profiler points me to multi_array::operator= being the bottleneck I still can get along with std::copy(v1.data(), v1.data() + v1.num_elements(), v2.data()); So please, Mr. Garcia, reconsider my proposal. You write in another post:
multi_array's operator=() was designed to behave in a manner consistent with the other array objects provided by the library (subarrays, views, multi_array_ref).
but since multi_array is special in that it is not a view on data, but a container it is really OK to have different assignment semantics, too.
The question seems to be: are two multi_arrays with the same template parameters the same type or not? Right now the answer is "sometimes" because of assignment. I think everyone would be better served if the answer was either "yes" or "no." I will vote "yes" if given a chance, because it will simplify the class instead of making it more complex. If the answer is "no", the dimensions might as well be set at compile time.
Good point. Markus
On Jan 25, 2007, at 10:15 AM, Markus Werle wrote:
Mark Ruzon
writes: To Ron's credit, the documentation does not say that multi_array models Assignable.
I disagree and this is why I started the thread, since I was really, really surprised it did not:
<cite from="http://boost.org/libs/multi_array/doc/user.html"> The interface design is in line with the precedent set by the C++ Standard Library containers. </cite>
So either we remove that sentence, mention the exceptions or change the code.
I would strongly encourage to make multi_array Assignable.
If it did, it would have to obey the semantics listed above for any two multi_arrays that have the same template parameters. I am curious to know the source of Ron's loathing, though.
I think this is a performance issue (numerical simulation people tend to erase any extra overhead, because their code must compete with braind dead f77 nerd code that had too much time to evolve),
I doubt that for any reasonable size the check for equal sizes and optional resize will hurt performance
So please, Mr. Garcia, reconsider my proposal.
You write in another post:
multi_array's operator=() was designed to behave in a manner consistent with the other array objects provided by the library (subarrays, views, multi_array_ref).
but since multi_array is special in that it is not a view on data, but a container it is really OK to have different assignment semantics, too.
I agree. Matthias
Ronald Garcia
On Jan 16, 2007, at 9:06 AM, Markus Werle wrote:
IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required:
Indeed operator= for multi_array requires the sizes of the multi_arrays to match, hence the assertions. The multi_array type is not meant to automatically resize when assigned to. All resizing is intentionally meant to be explicit. For your application, you will need to resize your array prior to assignment.
Please explain why this is good design. An operator= that has preconditions for the object state which will change through the assignment is somehow unexpected and AFAICS useless. I learned from Herb Sutter (http://www.gotw.ca/gotw/059.htm) that operator= is 'using the "create a temporary and swap" idiom' which is based on the copy consructor and clearly indicates what normal users might expect from operator= (and yes, this behaviour makes sense to me). There have to be very good reasons for not to follow that "convention" for multi_array. Please convince me. As I stated before: the whole STL has a different behaviour. This is one of the rare cases where I agree with its design :-) I do not catch the advantage of m2.resize(boost::extents[m1.shape()[0]][m1.shape()[1]]); m2 = m1; over m2 = m1; At least change the docs: If there are good reasons for an esoteric copy assignment, the docs of multi_array should not claim to be as close to STL as can be. They are not and I felt kind of betrayed when I found out. Also I cannot understand why the interface fails to at least support m2.resize(m1.extents()); and m2.swap(array_type(m1)); Was the lack of swap discussed during peer review?
Besides the fact that I get mad about asserts instead of exceptions (please change this to a way users may choose) why do we have such an odd behaviour here?
I'm afraid that we do not agree about the place for asserts versus exceptions.
Who is "we"?
The assertions in operator=() enforce library preconditions regarding assignment. See the document http:// boost.org/more/error_handling.html under "What about programmer errors?"
OK, this is another battlefield. Just a comment: Unfortunately it is _NOT_ a programmer's error, but a user's error. I have written a rather elaborate application with a command-line interface similar to matlab based on boost::spirit. So it is the _user_ who defines objects containing arrays of different sizes. It is a matter of QOI that I catch anything that may go wrong and send information about the error and possible reasons to a text output window. So now instead of having a throwing lib (which I'd prefer even for NDEBUG which is another case against assert) I have to duplicate error checks already contained in the library just to make sure not to have any uncaught error path that leads to a crash. So instead of try-call-catch I have check-then-call which I find odd. But as I said before: we are on a side track here. Exceptions clearly are a matter of religion and your mileage may vary. This is why I prefer libs where the user can select the behaviour. The root of my problem with multi_array is the (IMHO) wrong assignment behaviour which I really like to see changed and I'd rather narrow down the discussion to that issue. best regards, Markus
Ronald Garcia
writes: On Jan 16, 2007, at 9:06 AM, Markus Werle wrote:
IMHO the behaviour of operator= for multi_array is a little bit broken. A lot of preconditions required:
Indeed operator= for multi_array requires the sizes of the multi_arrays to match, hence the assertions. The multi_array type is not meant to automatically resize when assigned to. All resizing is intentionally meant to be explicit. For your application, you will need to resize your array prior to assignment.
Please explain why this is good design. An operator= that has preconditions for the object state which will change through the assignment is somehow unexpected and AFAICS useless. multi_array is primarily designed so that the parts of the state that are checked, in particular the shape and number of dimensions, will not change. Resizing functionality is special for the multi_array class, and is not available for the other array types provided by the
Also I cannot understand why the interface fails to at least support
m2.resize(m1.extents()); This looks like an oversight on my part. I will see about supporting
On Jan 17, 2007, at 5:52 AM, Markus Werle wrote: library (multi_array_ref, views, and subarrays). Resize was added to the multi_array() long after the library was introduced to boost. The class is primarily intended to keep the same shape throughout its lifetime, but resizing is supported primarily to enable default construction of multi_arrays. reshape() is particular to multi_array and multi_array_ref. this. Thanks for pointing it out.
m2.swap(array_type(m1));
Was the lack of swap discussed during peer review?
Is there something insufficient about using std::swap?
I'm afraid that we do not agree about the place for asserts versus exceptions.
Who is "we"?
We = you and I. Sorry for the confusion.
OK, this is another battlefield. Just a comment: Unfortunately it is _NOT_ a programmer's error, but a user's error. I have written a rather elaborate application with a command-line interface similar to matlab based on boost::spirit. So it is the _user_ who defines objects containing arrays of different sizes. It is a matter of QOI that I catch anything that may go wrong and send information about the error and possible reasons to a text output window. So now instead of having a throwing lib (which I'd prefer even for NDEBUG which is another case against assert) I have to duplicate error checks already contained in the library just to make sure not to have any uncaught error path that leads to a crash.
So instead of try-call-catch I have check-then-call which I find odd.
But as I said before: we are on a side track here. Exceptions clearly are a matter of religion and your mileage may vary. This is why I prefer libs where the user can select the behaviour.
Yes, this is a use that I hadn't considered before. Furthermore, at least resizing is not something likely to be present in an inner loop. I am willing to reconsider the use of asserts in light of your explanation above.
The root of my problem with multi_array is the (IMHO) wrong assignment behaviour which I really like to see changed and I'd rather narrow down the discussion to that issue.
I hope that my comments above have provided some insight into the design decisions underlying the library. Cheers, ron
Ronald Garcia
[...] multi_array is primarily designed so that the parts of the state that are checked, in particular the shape and number of dimensions, will not change. Resizing functionality is special for the multi_array class, and is not available for the other array types provided by the library (multi_array_ref, views, and subarrays).
which makes perfect sense to me, since the semantics are "containment" for multi_array and "reference" for all others. I am in accordance with this design. Also note that besides the defect of operator= I consider multi_array to be a perfect lib.
Resize was added to the multi_array() long after the library was introduced to boost.
.. because I missed the review ;-)
The class is primarily intended to keep the same shape throughout its lifetime,
Isn't this a somewhat artificial usage constraint?
but resizing is supported primarily to enable default construction of multi_arrays.
So this is water on my mills: since you allow default construction it is intuitive beaviour that an assignment following default construction will not fail. You require a resize between default construction and assignment and I still do not get the advantage. What Matthias pointed out earlier weighs even more: being forced not to rely on compiler generated copy constructor renders multi_array useless for many applications I can think of. Think again: - AFAICS ublas assignment semantics differ from yours. - STL assignment semantics differ from yours (!!!!) What stands against changing operator=? Do we break existing code? Do we get inferior performance? Does a possible performace loss outweigh the pain of having to write copy constructors and assignment operators by hand?
reshape() is particular to multi_array and multi_array_ref.
Seems OK for me, since shape is another word for "N-D way to look at 1-D data"
Also I cannot understand why the interface fails to at least support
m2.resize(m1.extents());
This looks like an oversight on my part. I will see about supporting this. Thanks for pointing it out.
Thanks for taking this serious.
m2.swap(array_type(m1));
Was the lack of swap discussed during peer review?
Is there something insufficient about using std::swap?
The extra temporary perhaps? AFAICS the following will be performed: template<class _Ty> inline void swap(_Ty& _Left, _Ty& _Right) { // exchange values stored at _Left and _Right _Ty _Tmp = _Left; _Left = _Right, _Right = _Tmp; } Swapping base_ directly and updating the extents would speed up the game for large extents - or am I missing something?
I'm afraid that we do not agree about the place for asserts versus exceptions.
Who is "we"?
We = you and I. Sorry for the confusion.
OK, this is another battlefield. Just a comment: [lenghty discussion from markus snipped]
So instead of try-call-catch I have check-then-call which I find odd. [...]
Yes, this is a use that I hadn't considered before. Furthermore, at least resizing is not something likely to be present in an inner loop. I am willing to reconsider the use of asserts in light of your explanation above.
Thank you very much.
The root of my problem with multi_array is the (IMHO) wrong assignment behaviour which I really like to see changed and I'd rather narrow down the discussion to that issue.
I hope that my comments above have provided some insight into the design decisions underlying the library.
Yes, but I still do not agree with the design and kindly ask you to consider implementing swap and then operator= in terms of copy construction and swap, unless performance considerations bring us to another conclusion. Or in order to serve both worlds, especially the performance paranoids: what about having an operator= in terms of copy-construction and swap _and_ the free function friend unchecked_assign(multi_array & me, multi_array const & other); ? for all inner loops. regards, Markus
Hi Markus, My apologies for the delay in replying. I have been quite busy.
Resize was added to the multi_array() long after the library was introduced to boost.
.. because I missed the review ;-)
:)
The class is primarily intended to keep the same shape throughout its lifetime,
Isn't this a somewhat artificial usage constraint?
Multiarray was meant in particular to be a Generic array library. In particular, all the array data structures were designed to model the MultiArray concept, which is specified in the reference documentation. Based on this concept, a programmer can write generic algorithms that operate on arrays (are templated on arrays) and so long as they adhere to the concept, any of the array types (subarrays, multi_array, views, refs, and compliant user-defined array types) would be usable with the arrays. The concept covers assignment between arrays, specifically treating them as values: assignment between arrays results in a deep copy of the elements, and it requires them to have the same shape. If multi_array's assignment had different semantics from the rest of the arrays, then assignment would behave very differently generic algorithms depending on whether you pass a multi_array or a different function.
but resizing is supported primarily to enable default construction of multi_arrays.
So this is water on my mills: since you allow default construction it is intuitive beaviour that an assignment following default construction will not fail. You require a resize between default construction and assignment and I still do not get the advantage.
Yes, I see how this can be confusing. Taking multi_array alone and its support for default construction, the library's behavior seems arbitrarily strange. However, my intent to support generic array programming (including assignment between arrays) guided the original design, and resizing was added later because it is very useful particularly in non-generic contexts.
What Matthias pointed out earlier weighs even more: being forced not to rely on compiler generated copy constructor renders multi_array useless for many applications I can think of.
I think you mean "compiler generated assignment operator" here (at least, that is what Matthias Troyer was pointing to). Copy construction should work fine. The auto-generated operator= works fine, so long as it is understood that assignment can only happen between arrays that have the same shape.
Think again: - AFAICS ublas assignment semantics differ from yours.
However, MultiArray has been part of boost longer than ublas has.
- STL assignment semantics differ from yours (!!!!)
That is true. However, I hope that my discussion of generic arrays above might explain why I intentionally differed from the STL in this regard. I did not intend to mislead when I compared this library to the STL in the documentation. The comparison is not exact, but for specific reasons (that at the least are worth documenting. I appreciate that you have pointed out this inconsistency)
m2.swap(array_type(m1));
Was the lack of swap discussed during peer review?
Is there something insufficient about using std::swap?
The extra temporary perhaps?
Good point :).
On the whole, there seems to be a tension between my goals for
genericity (in the sense of generic programming) and the needs of
boosters that need multi_array to auto-resize for non-generic usage.
One possible solution (I am interested in feedback), is to add a
policy option to the multi_array type that would enable resizing
assignment at compile-time, as in:
multi_array
I have a couple of questions about this topic that seems to keep with beeing mind-bugging for several boost users (see more recent posts on this). Ronald Garcia wrote
Multiarray was meant in particular to be a Generic array library. In particular, all the array data structures were designed to model the MultiArray concept, which is specified in the reference documentation. [...] The concept covers assignment between arrays
I do not find that in the reference documentation. The assignment is covered when talking about multi_array, not when defining the MultiArray concept. Ronald Garcia wrote
If multi_array's assignment had different semantics from the rest of the arrays, then assignment would behave very differently generic algorithms depending on whether you pass a multi_array or a different function.
I cannot get why introducing automatically resizing would cause any trouble for any generic algorithm, since it would not prevent from calling the assignment for objects that do have the same shape. Do you have any precise example to explain why it would be such a trouble? Regards, Etienne -- View this message in context: http://boost.2283326.n4.nabble.com/multi-array-operator-requires-resize-why-... Sent from the Boost - Users mailing list archive at Nabble.com.
Ronald Garcia wrote
If multi_array's assignment had different semantics from the rest of the arrays, then assignment would behave very differently generic algorithms depending on whether you pass a multi_array or a different function.
I cannot get why introducing automatically resizing would cause any trouble for any generic algorithm, since it would not prevent from calling the assignment for objects that do have the same shape. Do you have any precise example to explain why it would be such a trouble?
A total guess... You have two 4D multi_arrays used in an algorithm. Algorithm checks that their fastest dimension has the same size. It does. The algorithm is called recursively on a 3D subproblem. The algorithm checks that the two 3D sub-multi_arrays have a fastest dimension with the same size. They don't. Now what should the generic algorithm do? If it resizes the 3D subproblem bad things happen to the original 4D multi_array. - Rhys
Rhys Ulerich-2 wrote:
A total guess... You have two 4D multi_arrays used in an algorithm. Algorithm checks that their fastest dimension has the same size. It does. The algorithm is called recursively on a 3D subproblem. The algorithm checks that the two 3D sub-multi_arrays have a fastest dimension with the same size. They don't. Now what should the generic algorithm do? If it resizes the 3D subproblem bad things happen to the original 4D multi_array.
I completely understand that many generic algorithms require that sizes match. And we agree on that they should not resize any of the (sub-)multi_array. But I'm afraid this is missing the point. My feeling is that generic algorithm should check that the sizes match if it requires so, NOT the assignment operator. What the assignment operator should require seems largely independent to me. Etienne -- View this message in context: http://boost.2283326.n4.nabble.com/multi-array-operator-requires-resize-why-... Sent from the Boost - Users mailing list archive at Nabble.com.
Now what should the generic algorithm do? If it resizes the 3D subproblem bad things happen to the original 4D multi_array.
I completely understand that many generic algorithms require that sizes match. And we agree on that they should not resize any of the (sub-)multi_array. But I'm afraid this is missing the point. My feeling is that generic algorithm should check that the sizes match if it requires so, NOT the assignment operator. What the assignment operator should require seems largely independent to me.
The assignment operator is implemented using a generic, dimension-agnostic, iterator-based copy. The implementation does not know when resizing is safe and therefore does not resize. - Rhys
Rhys Ulerich-2 wrote:
The assignment operator is implemented using a generic, dimension-agnostic, iterator-based copy. The implementation does not know when resizing is safe and therefore does not resize.
In my opinion, it is of the developer busyness to know whether or not it is safe to resize. The implementation should not impose that it is not safe. Etienne -- View this message in context: http://boost.2283326.n4.nabble.com/multi-array-operator-requires-resize-why-... Sent from the Boost - Users mailing list archive at Nabble.com.
participants (6)
-
Etienne Bresciani
-
Mark Ruzon
-
Markus Werle
-
Matthias Troyer
-
Rhys Ulerich
-
Ronald Garcia