Boost range - Add variadic join/zip

Code available here: https://gist.github.com/gnzlbg/5547905
Let,
std::array

However, boost's zip_iterators are not writable. I don't really think it would be possible to make them writable, but that would allow code like this:
<snip> </snip>
template
auto join(C&& c, D&& d, Args&&... args) -> decltype(boost::join(boost::join(boost::make_iterator_range(std::forward<C>(c)),
boost::make_iterator_range(std::forward<D>(d))), join(std::forward<Args>(args)...))) { return boost::join(boost::join(boost::make_iterator_range(std::forward<C>(c)),
boost::make_iterator_range(std::forward<D>(d))), join(std::forward<Args>(args)...)); }
A variadic zip is more complicated. If we want write access we cannot use boost zip_iterator. Still one can use Anthony Williams' TupleIterator:
template
auto zip(T&&... c) -> boost::iterator_range< decltype(iterators::makeTupleIterator(std::begin(std::forward<T>(c))...))> { return boost::make_iterator_range (iterators::makeTupleIterator(std::begin(std::forward<T>(c))...), iterators::makeTupleIterator(std::end(std::forward<T>(c))...)); }
For read-only access one could use boost::zip_iterator, but I think write-access is _really_ important (e.g. sort wouldn't work).
Would it be possible to add similar functionality to boost range?
Yes and I shall do so when I find time. If you find time to create a patch it would be even quicker of course, but I understand if this is not possible. It seems we are all running with less and less time to work on these extra-curricula projects!
This code wouldn't work without the help of everyone who participated in the following SO discussions: -
http://stackoverflow.com/questions/14366576/boostrangejoin-for-multiple-rang... -
http://stackoverflow.com/questions/13840998/sorting-zipped-locked-containers... and without Anthony Williams's tupleIterator. See also http://www.justsoftwaresolutions.co.uk/articles/pair_iterators.pdf
I'll have to see if I can persuade Anthony to be charitable enough to allow us to include his tuple iterator. Ideally this would be made a public feature of Boost.Iterator I think.
Bests, Gonzalo BG
Thank you for your feedback. It's great to have positive suggestions that are actionable like this. Regards, Neil Groves

On 9 May 2013 16:01, Gonzalo BG wrote:
A variadic zip is more complicated. If we want write access we cannot use boost zip_iterator. Still one can use Anthony Williams' TupleIterator:
template
auto zip(T&&... c) -> boost::iterator_range< decltype(iterators::makeTupleIterator(std::begin(std::forward<T>(c))...))> { return boost::make_iterator_range (iterators::makeTupleIterator(std::begin(std::forward<T>(c))...), iterators::makeTupleIterator(std::end(std::forward<T>(c))...)); }
For read-only access one could use boost::zip_iterator, but I think write-access is _really_ important (e.g. sort wouldn't work).
It looks as though it's undefined behaviour to zip ranges of different lengths, because you'll walk off the end of the shorter ranges. My variadic zip stops at the end of the shortest range, which seems to be consistent with zip functions in most other languages I've looked at. Your adaptors also get dangling references if used with rvalue ranges, although this is a problem with the existing boost range adaptors too.

On Fri, May 10, 2013 at 11:28 AM, Jonathan Wakely
On 9 May 2013 16:01, Gonzalo BG wrote:
A variadic zip is more complicated. If we want write access we cannot use boost zip_iterator. Still one can use Anthony Williams' TupleIterator:
template
auto zip(T&&... c) -> boost::iterator_range< decltype(iterators::makeTupleIterator(std::begin(std::forward<T>(c))...))> {
return boost::make_iterator_range (iterators::makeTupleIterator(std::begin(std::forward<T>(c))...), iterators::makeTupleIterator(std::end(std::forward<T>(c))...)); }
For read-only access one could use boost::zip_iterator, but I think write-access is _really_ important (e.g. sort wouldn't work).
It looks as though it's undefined behaviour to zip ranges of different lengths, because you'll walk off the end of the shorter ranges.
My variadic zip stops at the end of the shortest range, which seems to
be consistent with zip functions in most other languages I've looked at.
I like being able to avoid the cost of checking for the end of every item in the zip especially for non-random access iterators. In anything I put into Boost.Range I think it of paramount importance to obey the zero overhead principle. It seems that it would be simple to allow both end detection mechanisms.
Your adaptors also get dangling references if used with rvalue ranges, although this is a problem with the existing boost range adaptors too.
Yes, this has come up numerous times. It's a problem far beyond just ranges and range adaptors. Knowing you a little, I suspect you have a solution I have not thought of to better deal with the issue. Is the variadic zip iterator you implemented public? Thanks, Neil Groves

On 10 May 2013 12:55, Neil Groves wrote:
On Fri, May 10, 2013 at 11:28 AM, Jonathan Wakely
It looks as though it's undefined behaviour to zip ranges of different lengths, because you'll walk off the end of the shorter ranges.
My variadic zip stops at the end of the shortest range, which seems to
be consistent with zip functions in most other languages I've looked at.
I like being able to avoid the cost of checking for the end of every item in the zip especially for non-random access iterators. In anything I put into Boost.Range I think it of paramount importance to obey the zero overhead principle. It seems that it would be simple to allow both end detection mechanisms.
Hi Neil, That makes sense. My implementation always truncates the ranges to the shortest length but I should make it unchecked and then provide a second interface to do the checking and truncating if needed.
Your adaptors also get dangling references if used with rvalue ranges, although this is a problem with the existing boost range adaptors too.
Yes, this has come up numerous times. It's a problem far beyond just ranges and range adaptors. Knowing you a little, I suspect you have a solution I have not thought of to better deal with the issue.
Is the variadic zip iterator you implemented public?
I don't have my own zip_iterator (well, I do, but I'm still working on
it :-) but my zip() function is at
https://gitorious.org/redistd/redistd/blobs/master/include/redi/zip.h
and just uses boost::zip_iterator.
The solution to the dangling reference problem is surprisingly simple in C++11:
template<typename Traversable>
struct adaptor
{
explicit adaptor(Traversable&& r) : range(r) { }
Traversable range;
auto begin() const -> decltype(range.begin()) { return range.begin(); }
auto end() const -> decltype(range.end()) { return range.end(); }
};
// When called with lvalue, Traversable deduces to R&
// When called with rvalue, Traversable deduces to R
template<typename Traversable>
adaptor<Traversable>
adapt(Traversable&& t)
{
return adaptor<Traversable>(std::forward<Traversable>(t));
}
When you call adapt(lvalue) you get an adaptor

On 5/10/2013 8:44 AM, Jonathan Wakely wrote:
On 10 May 2013 12:55, Neil Groves wrote:
On Fri, May 10, 2013 at 11:28 AM, Jonathan Wakely
It looks as though it's undefined behaviour to zip ranges of different lengths, because you'll walk off the end of the shorter ranges.
My variadic zip stops at the end of the shortest range, which seems to
be consistent with zip functions in most other languages I've looked at.
I like being able to avoid the cost of checking for the end of every item in the zip especially for non-random access iterators. In anything I put into Boost.Range I think it of paramount importance to obey the zero overhead principle. It seems that it would be simple to allow both end detection mechanisms.
The closest existing practice I see is the std::mismatch algorithm which requires the first of the two input sequences to be the longest, and only checks first1 != last1. boost::range::mismatch pays the overhead of checking both input sequence iterators against their corresponding end iterators. Perhaps the std::mismatch approach is appropriate for zip iterator as well. In accommodating std::mismatch requirements when I don't know that the first sequence is longest I've a wrapped mismatch that check's the sizes and swaps argument references, though this is limited in it's genericity. Jeff

On 10 May 2013 15:31, Jeff Flinn wrote:
The closest existing practice I see is the std::mismatch algorithm which requires the first of the two input sequences to be the longest, and only checks first1 != last1.
As an aside, see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3607.html which was voted into C++14

On Fri, May 10, 2013 at 3:31 PM, Jeff Flinn
On 5/10/2013 8:44 AM, Jonathan Wakely wrote:
On 10 May 2013 12:55, Neil Groves wrote:
On Fri, May 10, 2013 at 11:28 AM, Jonathan Wakely
It looks as though it's undefined behaviour to zip ranges of different lengths, because you'll walk off the end of the shorter ranges.
My variadic zip stops at the end of the shortest range, which seems to
be consistent with zip functions in most other languages I've looked at.
I like being able to avoid the cost of checking for the end of every item in the zip especially for non-random access iterators. In anything I put into Boost.Range I think it of paramount importance to obey the zero overhead principle. It seems that it would be simple to allow both end detection mechanisms.
The closest existing practice I see is the std::mismatch algorithm which requires the first of the two input sequences to be the longest, and only checks first1 != last1.
boost::range::mismatch pays the overhead of checking both input sequence iterators against their corresponding end iterators.
It does indeed pay the price under all circumstances with no option to opt-out. This is my design error. I have a renewed effort on obeying the zero-overhead principle in the last year or so. I've found design decisions in libraries that I have used to not provide zero-overhead options extremely limiting. I shall go back and review all of the Boost.Range code and provide zero-overhead options wherever I have failed to do so. For a zip iterator, of course, the overhead could be considerably greater depending on how many ranges were zipped together. I don't believe there is a need for me to choose the right solution for my clients. It's trivial to allow both.
Perhaps the std::mismatch approach is appropriate for zip iterator as well. In accommodating std::mismatch requirements when I don't know that the first sequence is longest I've a wrapped mismatch that check's the sizes and swaps argument references, though this is limited in it's genericity.
Ah yes this comes back to the boost::size modifications I also have to do! I could utilise an optimised boost::size() that provides O(1) for containers such as list to provide optimised implementations under more scenarios. It looks like I've got some work to do!
Jeff
Thank you for pointing this out since I had honestly forgotten that I'd made this design decision for mismatch. Regards, Neil Groves

On 12 May 2013 20:33, Dave Abrahams wrote:
on Thu May 09 2013, Gonzalo BG
wrote: However, boost's zip_iterators are not writable.
That's surprising to hear. IIRC, at least, they used to be writable.
They still are. The docs even make note of it: http://www.boost.org/doc/libs/1_53_0/libs/iterator/doc/zip_iterator.html#zip... "The fact that the zip_iterator models only Readable Iterator does not prevent you from modifying the values that the individual iterators point to."
participants (5)
-
Dave Abrahams
-
Gonzalo BG
-
Jeff Flinn
-
Jonathan Wakely
-
Neil Groves