Re: [boost] Boost Digest, Vol 7351, Issue 3
Hi Andrzej, Thanks for the continued conversation.
From: Andrzej Krzemienski
Subject: [boost] [multi] A successor to Boost.MultiArray?
Did I get it right that Multi can be considered a superior alternative to Boost.MultiArray?
yes
If so, it triggers a couple of points:
* Can Boost.MultiArray be upgraded instead?
Multi is a superset of BMA. I have a test in the CI to show that this library fulfills all the concepts BMA defines. There are still semantic differences in BMA, such as quality-of-implementation issues or plain semantic bugs.
Before going down this road, I tried many times to overhaul BMA. The architecture of BMA is beyond repair. Some obstacles even seem to be purposeful features of the library, such as disallowing the assignment of arrays of different sizes. Layout information is heap allocated. Iterators are broken and too basic. Subblock views have a complicated, inefficient implementation. The lack of responsiveness of the original authors was also an issue.
Don't get me wrong, BMA was a tour-de-force at the time. It even spearheaded C++ concept checking in C++98, which I have delayed integrating into the library until C++20 Concepts.
* The same questions, "is it useful at all", apply also to Boost.MultiArray.
BMA is useful in principle and was a great leap forward then. In practice, it is unfortunately almost unusable. You can tour Stackoverflow questions on #boost-multi-array and see all its pain points.
* Whatever the motivation for using Boost.MultiArray is, it should also be a good motivation for Multi.
yes
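I read in the Boost.MultiArray docs that it is a better replacement for vector<vector<T>>.
Yes, but this is a simplistic description. It is mainly used to explain the multi-index element access A[i][j], etc., but the analogy ends there.
Of course it is a completely different data structure; vector<vector<T>> is more general (at a price), the "rows" can be staggered. In any case, it is, at worst, a replacement for vector<vector<vector<T>>> *recursively*.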
So, maybe a good use case is the simple one, where you want to store and perform simple access to relatively small data sets, where performance is important but not super-critical, and changing from nested vectors to Multi is already a good and sufficient optimization.
I don't know how you conclude this is for small-sized data?
Do you mean that vector<vector<T>> would be ok for large-size data in contrast? I don't think so.
As a side note, when comparing Multi with std::mdspan: the latter requires C++23, while the former requires only C++17. So, when someone has C++17, mdspan is not an option.
yes
BTW, why do you require C++17?
- The main reason was for ease of development:
- implementing allocator propagation without if-constexpr is a nightmare.
- C++17 provides polymorphic allocators, and I wanted to be compatible with them (at the time, I didn't know about __has_include, it was much simpler to just use C++17)
- If I remember correctly, I could use some constexpr algorithms and make the arrays fully constexpr.
These are not hard reasons to require C++17. I can port it to C++14 as soon as any user asks for it, but it hasn't happened yet. In fact, Vinnie Falco promised a gazillion dollars from Microsoft to port it back to C++14. ;P (https://cpplang.slack.com/archives/C27KZLB0X/p1682786897070269)
Thank you,
Alfredo
Thu, 26 Sep 2024 at 04:25, Alfredo Correa wrote:
Hi Andrzej,
Thanks for the continued conversation.
From: Andrzej Krzemienski
Subject: [boost] [multi] A successor to Boost.MultiArray?
Did I get it right that Multi can be considered a superior alternative to Boost.MultiArray?
yes
If so, it triggers a couple of points:
* Can Boost.MultiArray be upgraded instead?
Multi is a superset of BMA. I have a test in the CI to show that this library fulfills all the concepts BMA defines. There are still semantic differences in BMA, such as quality-of-implementation issues or plain semantic bugs.
Before we go down this road, I did try many times to overhaul BMA. The architecture of BMA is beyond repair. Some obstacles seem to be even purposeful features of the library, such as disallowing the assignment of arrays of different sizes. Layout information is heap allocated. Iterators are broken and too basic. Subblock views have a complicated, inefficient implementation. The lack of responsiveness of the original authors was also an issue.
Don't get me wrong, BMA was a tour-de-force at the time. It even spearheaded C++ concept checking in C++98, which I have delayed integrating into the library until C++20 Concepts.
C++98 compatibility may be a good reason. Other than that, the lack of responsiveness could be an indication that a library needs a new maintainer, and then you could replace the implementation. But it would surely not be C++98-compliant.
* The same questions, "is it useful at all", apply also to Boost.MultiArray.
BMA is useful in principle and was a great leap forward then. In practice, it is unfortunately almost unusable. You can tour Stackoverflow questions on #boost-multi-array and see all its pain points.
I am suggesting this still in the context of documenting a solid motivation for why Multi is useful. If the users of BMA have their story to tell, you could use it as well for your motivation. But if you are saying BMA is unusable, and maybe unused, then there may not be much to take. This could also be an indication that Artyom is correct, and that the need for a container alone may be scarce.
* Whatever the motivation for using Boost.MultiArray is, it should also be a good motivation for Multi.
yes
I read in the Boost.MultiArray docs that it is a better replacement for vector<vector<T>>.
Yes, but this is a simplistic description. It is mainly used to explain the multi-index element access A[i][j], etc., but the analogy ends there. Of course it is a completely different data structure; vector<vector<T>> is more general (at a price), the "rows" can be staggered. In any case, it is, at worst, a replacement for vector<vector<vector<T>>> *recursively*.
So, maybe a good use case is the simple one, where you want to store and perform simple access to relatively small data sets, where performance is important but not super-critical, and changing from nested vectors to Multi is already a good and sufficient optimization.
I don't know how you conclude this is for small-sized data? Do you mean that vector<vector<T>> would be ok for large-size data in contrast? I don't think so.
No. Sorry, I brought up two separate points. One point is that if you need a multidimensional array, Multi is always superior to vector<vector<T>>.
As a side note, when comparing Multi with std::mdspan: the latter requires C++23, while the former requires only C++17. So, when someone has C++17, mdspan is not an option.
yes
BTW, why do you require C++17?
- The main reason was for ease of development:
- implementing allocator propagation without if-constexpr is a nightmare.
- C++17 provides polymorphic allocators, and I wanted to be compatible with them (at the time, I didn't know about __has_include, it was much simpler to just use C++17)
- If I remember correctly, I could use some constexpr algorithms and make the arrays fully constexpr.
These are not hard reasons to require C++17. I can port it to C++14 as soon as any user asks for it, but it hasn't happened yet. In fact, Vinnie Falco promised a gazillion dollars from Microsoft to port it back to C++14. ;P ( https://cpplang.slack.com/archives/C27KZLB0X/p1682786897070269)
This is not a request. Just an observation that it would make the library even more competitive.
Regards,
&rzej;
Thank you,
Alfredo
On Wed, Sep 25, 2024 at 9:37 PM Andrzej Krzemienski
Thu, 26 Sep 2024 at 04:25, Alfredo Correa wrote:
Hi Andrzej,
Thanks for the continued conversation.
From: Andrzej Krzemienski
Subject: [boost] [multi] A successor to Boost.MultiArray
Don't get me wrong, BMA was a tour-de-force at the time. It even spearheaded C++ concept checking in C++98, which I have delayed integrating into the library until C++20 Concepts.
C++98 compatibility may be a good reason. Other than that, the lack of responsiveness could be an indication that a library needs a new maintainer, and then you could replace the implementation. But it would surely not be C++98-compliant.
There are many precedents of Boost libraries that provide new implementations of libraries with similar functionality. Boost.Variant2 and Boost.Parser are recent examples; there might be others.
* The same questions, "is it useful at all", apply also to Boost.MultiArray.
BMA is useful in principle and was a great leap forward then. In practice, it is unfortunately almost unusable. You can tour Stackoverflow questions on #boost-multi-array and see all its pain points.
I am suggesting this still in the context of documenting a solid motivation for why Multi is useful. If the users of BMA have their story to tell, you could use it as well for your motivation. But if you are saying BMA is unusable, and maybe unused, then there may not be much to take. This could also be an indication that Artyom is correct, and that the need for a container alone may be scarce.
Yes, as I said, if a container is not enough I am okay with that. I claim that this is not just a container; it is also a generator of views (part of what range functions like drop, take, subrange, chunk_by, etc. provide), all seamlessly integrated in the same underlying data structure. And it is a whole suite of features that work together to 1) implement algorithms, and 2) interface efficiently and nicely with tens of existing C libraries that work with this data structure. I think this is valuable, but others might disagree strongly; I accept that.
My *dream* is that a well-designed interface for this classic data structure will initiate a golden age of multidimensional algorithms that are a generalization of the STL (which, BTW, is also criticized for being one-dimensional centric). Of course, this is a monumental task that I could eventually integrate into Multi. Putting in a few selected exemplars of such multidimensional algorithms is as much as I can hope for, since I am working by myself. (In place of that, I decided to provide the adaptors.) I also accept that generic programming might not be fashionable in this decade, in which case the product wouldn't be as valuable.
I don't know how you conclude this is for small-sized data? Do you mean that vector<vector<T>> would be ok for large-size data in contrast? I don't think so.
No. Sorry, I brought up two separate points. One point is that if you need a multidimensional array, Multi is always superior to vector<vector<T>>. So if you happen to use a nested vector in your program, you can optimize and improve your program cheaply by employing Multi.
Sure, that can be used for teaching of course.
The other, separate, point is in the context of a critique that a general-purpose multidimensional array container cannot compete with tailored frameworks for huge dataset processing that require domain-based optimizations.
yes, that is not the intention of Multi. I don't want to compete with frameworks. I feel that, together with STL algorithms and a few adaptors, Multi is already very powerful.
Even if true, Multi has its use for small-sized data. The user experience would be the following. I have an idea to solve a problem using a multidimensional array. First I prototype a solution, and for this I use Multi.
I won't discourage this kind of use. I am convinced that the library can do prototyping and beyond.
When the prototype demonstrates that my solution will work, I need to implement a production-ready, super-optimized version, and for this I use a framework, like OpenCV.
That can be an option; another option would be to add an ad-hoc OpenCV adaptor (following the examples in the adaptors directory) so that one doesn't lose the nice things of Multi.
Here Multi still plays an important role in enabling prototyping.
sure.
Boost.Spirit has such an advertising technique ("use this library for small and medium parsers").
I would agree if you mean “small” in the sense of simple, concise code. I don't think there is a penalty on using Multi for storing large data. It is not that arrays become “slower” when they are big (well, that depends on caches and access patterns, but that is a different story).
The data structure is equally simple for large and small arrays.
As a side note, when comparing Multi with std::mdspan: the latter requires C++23, while the former requires only C++17. So, when someone has C++17, mdspan is not an option.
yes
BTW, why do you require C++17?
- The main reason was for ease of development:
- implementing allocator propagation without if-constexpr is a nightmare.
- c++17 provides polymorphic allocators, and I wanted to be compatible
with them (at the time, I didn't know about __has_include, it was much
simpler to just use C++17)
- If I remember correctly, I could use some constexpr algorithms and make the arrays fully constexpr.
These are not hard reasons to require C++17. I can port it to C++14 as soon as any user asks for it, but it hasn't happened yet. In fact, Vinnie Falco promised a gazillion dollars from Microsoft to port it back to C++14. ;P (https://cpplang.slack.com/archives/C27KZLB0X/p1682786897070269)
This is not a request. Just an observation that it would make the library even more competitive.
yes, so far the question has been hypothetical; the library can be ported back to C++14 when someone (or Bill Gates) asks for it.
Thanks
Alfredo
Regards,
&rzej;
Thank you,
Alfredo
participants (2)
- Alfredo Correa
- Andrzej Krzemienski