Re: [boost] Final benchmark graphs for Colony vs std:: containers now available

4 May 2016

      ...
Downloading a zip is a higher barrier of entry than what is usually
expected. It would help the feedback if the source code would be browsable
online. Boost uses github and other libraries being submitted do as well
but that is not the only alternative. This would probably increase the
amount of feedback.
You said you were concerned about people's ability to replicate the 
tests. I pointed out they have that ability. I don't consider a zip file 
a barrier but I accept it may be easier for some people.
...
The text just says that N elements are inserted into a container. Is the
container empty before the insertion? If not, how many elements does it
have? Is it the same number of elements for all N?
You say you've read the source code, so I take it you're using this as a 
way of pointing out what you think is necessary to be communicated, and 
that's fine. But for the record we are starting with empty containers 
and inserting singly on the fly. This replicates typical game engine 
code and the scenario which colony is made for, described throughout the 
benchmarks as fast insertion/erasure in realtime without invalidating 
pointers/iterators/indexes.
Single insertion is also the only insertion criteria for any insertion 
benchmark I've seen, so I personally feel it's a logical leap to say it 
needs a large amount of explanation, but I accept that you feel that 
should be made more clear.
...
...
Regards insertion, it is all the fastest method available to that class,
No, it is not. From the source code it looks like you use push_back for
vector when inserting a range of N elements at the end instead of
vector::insert(end, N, T()) or vector::insert(end, begin, end). This is why
I would like to see a table in the text indicating which method it is used
because this is relevant performance-wise.
That doesn't typically replicate game engine code, as it assumes you 
know the number of elements in advance and have all data in advance, in 
which case, all the AAA dev's I have talked to or listened to have said, 
if you know the size in advance, and already have all the data, you 
would use an C-style array for performance/vectorizing/compatibility 
reasons. Watch the SG14's and mike acton's cppcon talks for more info.
...
Since the text is omitting relevant details to me as a reader it feels like
it is trying to hide something.
I don't feel this is a polite or appropriate thing to say. There is a 
lot of information and a LOT of writing in between those benchmarks, I'm 
sorry that it doesn't include the Specific information you are 
interested in, but that's not a reason to go making assumptions.
...
You haven't answered my question about erasure, but from the source code it
seems that the benchmarking code is erasing single elements of a vector and
a deque which is N^2 and something nobody would do in a real
performance-sensitive application.
Actually this happens in game engines a lot, but a raw unmodified vector 
or deque wouldn't be used in this context because of the performance hit 
you've mentioned. But what I'm doing in those first tests is measuring 
raw performance for comparison in later situations where the vectors etc 
are modified to avoid pointer/index invalidation - as described in the 
text. In the situations where colony would be used - where pointer/index 
validity matters ie. the bulk of situations for game engine code - 
remove_if will invalidate pointers and indexes to elements within the 
container, and is not an appropriate solution by itself.

In the context of modified containers, a remove_if-*like* command might 
be used, but only when the modification prevents invalidation via the 
erase command, when more than a few erasures were happening per-frame, 
the data set is reasonably large, and the erase command was expensive. 
In that case, an additional flag would have to be added to the stored 
structure or the modified container, as conditions for erasure are not 
likely to be the same between elements. Then a remove_if-like command 
would fire at the end-of-frame.

As mentioned in detail, the first, raw benchmarks are ONLY to measure 
which containers might be useful, modified or otherwise, in comparative 
tests. But I grant that the above information is probably something that 
most people would not know, and should possibly be included.

At some point in the future, if I have the time, I may construct a 
variant of one of the modification tests featuring extremely high 
modification and compare using a remove_if-like command. That is the 
only context within which it might be useful.
...
In a nutshell, the thing that I think would benefit the benchmark text the
most is including more details about what exactly are the benchmarks doing
and how (a table would probably suffice).
I think this is a reasonable point, but I think for the most part this 
is pretty clear. I take your point that a differentiation as to what 
erasure method is being used and why is appropriate, but I don't agree 
with you for the insertion details. There's one benchmark where data is 
re-inserted back in, and that is clearly labelled, and toward the end of 
all tests. Regardless if I update the page, which I might, then I will 
include the specification that we are inserting into empty containers, 
though I fail to see how the initial insertion benchmarks would be 
useful or rational if we weren't.

Making the code browsable online
...
would certainly help with getting more feedback and improving the
benchmarks to further pinpoint the best use cases of colony. Choosing the
range insert methods for the containers that offer them as well as
erase_remove_if for the erasure benchmarks is something that is worth
exploring (since in the use cases considered nobody would use vector and
deque that way).
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [boost] Final benchmark graphs for Colony vs std:: containers now available

Soul Studios