1. You haven't mentioned OpenCV even once, which today is the de facto standard for numerical computing in the C++ world.
I do scientific computing with large arrays, and nobody uses OpenCV. It would be fair if you had said that it is the de facto standard for image processing, which I don't do.
Ok... the fact that you haven't used it does not mean that it isn't a very common computing library that many developers/designers go to by default.
Second, Multi is not a numerical library specifically; it is about the logic and semantics of multidimensional arrays and containers, regardless of the element type. ...
See, this is exactly the problem. Why would I need something like that if I have to go to all the 3rd-party libraries to actually use it efficiently? cv::Mat is a numpy-like NDArray with strides, windows, offsets (yes, it supports more than two dimensions). I have myself used/written several Tensor-like objects (the PyTorch C++ tensor, dlprimitives and others). It is nothing new; by all means it is the easiest part of any numerical library.
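The strides/windows/offsets model referred to above can be sketched with numpy itself (a minimal illustration of the layout idea; cv::Mat and similar containers use the same scheme):

```python
import numpy as np

# A 3-D array of 64-bit ints; strides are the byte step per dimension.
a = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
print(a.strides)              # (96, 32, 8): rows step 4*8 bytes, planes 3*4*8

# A "window" (sub-block) is just a view: same buffer, an offset, new strides.
w = a[1, 0:2, 1:3]
print(np.shares_memory(w, a))  # True: no data was copied
w[0, 0] = -1                   # writes through to the parent array
print(a[1, 0, 1])              # -1
```

The point being debated is exactly this: the offset/stride bookkeeping is simple, which is why every tensor-like container ends up providing it.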
2. While actual ndarray handling is nice, it is basically the tip of an iceberg. You need a decent set of high-performance algorithms to combine with it.
There is a clear separation of concerns here. The Multi library deals with the data structure; when it needs to fulfill its semantics, it uses the best *existing* algorithms it can, in awareness of the data structure. The Multi library doesn't provide algorithms; it uses the algorithms that are provided to it via different mechanisms.
But if you don't provide algorithms, maybe I'd better take a library/framework that does. There are plenty of numpy-like arrays around. Usually they are called tensors...
I agree, promising all of linear algebra is infinite work, like reimplementing MATLAB or Mathematica, but BLAS has a finite number of functions. The philosophy of the BLAS-adaptor in particular (again, an optional component) is to interface to what BLAS offers and no more. It is more of a model of how to interface a legacy library using the features of Multi.
But that is exactly what makes OpenCV useful, and multi-array look like a facade with emptiness behind it.
I looked over the code and didn't find an include of cblas... Did I miss something? Also, I see a request to link cblas/openblas or similar.
Yes, you missed that depending on cblas would tie the application to cblas; not all BLAS implementations can be used through cblas. BLAS is used through the ABI directly. (Basically, I have my own version of the cblas header as an implementation detail.)
I see, it is a questionable decision, but Ok.
I took a brief look into the implementation and it seems to be lacking vectorization (use of intrinsics/vector operators), or have I missed it?
You missed that this is a generic, not specifically numerical, library.
But you are making a numpy-like library... otherwise you wouldn't be interfacing with cblas. See, if you had been talking about multi-array as an advanced std::vector of generic objects, Ok. But you don't; you direct it at numeric computations.
The other thing to take into account is that vectorization/parallelization is still provided by the external algorithms the library uses internally. For example, when dealing with GPU or OpenMP arrays, the library uses Thrust algorithms if they are available, which are parallel.
Just for the record, there are two levels of parallelization at the CPU level: first, thread-based parallelization; second, SIMD-level parallelization (SSE/AVX2/Neon), where you load vectors of 16 or 32 bytes of data and process them together in a single instruction. These can increase performance significantly. Compilers aren't always helpful in this situation because of data dependencies that only the author can know about.
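To put rough numbers on that point, here is a minimal numpy sketch comparing an element-by-element loop with a vectorized kernel (numpy's dot product dispatches to SIMD- and thread-optimized BLAS code under the hood; the exact speedup is machine-dependent, so no figure is claimed here):

```python
import time
import numpy as np

n = 1_000_000
x = np.random.rand(n)
y = np.random.rand(n)

# One element at a time, as unvectorized scalar code would do it.
t0 = time.perf_counter()
s = 0.0
for i in range(n):
    s += x[i] * y[i]
t_scalar = time.perf_counter() - t0

# Vectorized kernel: the same dot product in one library call.
t0 = time.perf_counter()
v = float(x @ y)
t_vector = time.perf_counter() - t0

print(f"results agree: {abs(s - v) < 1e-6 * n}")
print(f"speedup: {t_scalar / t_vector:.0f}x")
```

The same gap exists in C++ between a naive loop and an intrinsics/BLAS-backed kernel, which is what the reviewer is getting at.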
3. Finally, I do want to see a comparison with the likes of OpenCV, Eigen and even uBLAS (I know it isn't good).
Multi is for multidimensional arrays, not specifically 2D numerical arrays (i.e. matrices).
OpenCV isn't 2D-only. It supports n-D tensors with views.
I consider that Eigen, OpenCV, Kokkos, PETSc are frameworks.
It reminds me of a comparison of Boost.Beast and a full-scale framework like CppCMS. When I reviewed Beast it was clear that it does not do 10% of what is expected from something that makes an actually useful web application. While it is nice to have an abstraction, if so, either keep it basic or go the full way; you are stuck somewhere in between a std::vector++ and something like OpenCV.
I don't feel confident adding an OpenCV column because I don't have experience with it, but feel free to help me add the library and answer the points of each row in the comparison table.
I suggest getting some experience with OpenCV. It is a very good library that already implements what you have (albeit in a different way). It ain't perfect by any means, but it works, is well debugged and understood, and is a widely available library that does the job.
What is not clear to me is why should I use one over some existing solution like OpenCV?
- Because sometimes your elements are not numerical types.
Yeahhh... don't buy it. Sorry :-)
- Because sometimes you want to specify your element types as template parameters not as OpenCV encoded types.
To make your code compile slower and be more horrible? Actually, OpenCV supports templated accessors.
- Because sometimes you want arbitrary dimensionality (e.g. 2D, 3D, 6D) to be compile-time.
And why isn't that possible with OpenCV?
- Because sometimes you want to apply generic algorithms to your arrays (STL, std::sort, std::rotate, std::ranges, boost::algorithms, serialization)
Yeah... good luck with that in a numerical context. But Ok. In OpenCV you can just get several pointers and work with them.
- Because sometimes you want to implement functions that are oblivious to the actual dimension of your array (e.g. simultaneously view a 3D array of elements as a 2D array of something, for abstraction).
- Because sometimes you want to control allocations, use fancy pointers, parameterized memory strategies, polymorphic allocators.
OpenCV supports custom allocators (actually something I use right now, precisely to monitor memory consumption). Once again, be careful what you wish for: templated allocators are a huge headache to work with in comparison to run-time ones (in the real, non-fancy-template world that Boost loves).
- Because sometimes you want precise value semantics
I'm not sure it is a great idea for huge arrays/matrices. Virtually ALL tensor libraries have reference semantics for a very good reason.
- Because sometimes you want to work with sub-blocks of an array in the same way you work with the whole array.
And this is exactly what a view is for... And this is common to any tensor library.
- Because sometimes you want to give guarantees in the presence of exceptions (the library doesn't throw exceptions, to be clear).
Ok, this is one of the points I want to discuss here in terms of design, exceptions and broadcasting. Let's take an example (numpy, but the same works for torch::Tensor):

    a = np.ones((5, 10))
    b = np.ones((2, 5, 1))
    c = a + b

It would perform broadcasting to shape (2, 5, 10) automatically. How can it be done? You broadcast the shape of a to (2, 5, 10) using strides (0, 10, 1) and broadcast b to the same shape using strides (5, 1, 0) (I hope I have not made a mistake in the calcs). Then you can easily run over shape (2, 5, 10) and do all the calculations, fetching, etc. Looking into your broadcasting docs left me with the impression that it does not support anything like this. I read this:
First, broadcasted arrays are infinite in the broadcasted dimension; iteration will never reach the end position, and calling `.size()` is undefined behavior.
Why? You certainly can iterate over a broadcasted dimension... Also, how do you handle incompatible broadcasting without an exception? For example, what if b in the example above were of shape (2, 5, 3) and not (2, 5, 1)? Exceptions are useful. And claiming that you don't throw means you don't use new...
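For reference, the stride arithmetic described above can be checked with numpy itself; the element strides match the ones computed above, and the incompatible-shape case is indeed reported via an exception (a small illustration, not Multi code):

```python
import numpy as np

a = np.ones((5, 10))
b = np.ones((2, 5, 1))

# numpy realizes the broadcast with zero strides in the repeated axes.
A = np.broadcast_to(a, (2, 5, 10))
B = np.broadcast_to(b, (2, 5, 10))

def elem_strides(ar):
    # ndarray.strides is in bytes; convert to element units.
    return tuple(s // ar.itemsize for s in ar.strides)

print(elem_strides(A))   # (0, 10, 1): a repeated along the new leading axis
print(elem_strides(B))   # (5, 1, 0): b repeated along the last axis
print((a + b).shape)     # (2, 5, 10)

# An incompatible trailing dimension raises rather than silently failing.
try:
    np.ones((5, 10)) + np.ones((2, 5, 3))
except ValueError as exc:
    print("broadcast error:", exc)
```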
Thanks, Alfredo
Regardless, I highly doubt the practicality of the library in its current form. While a generic multi-array can be nice to have, it is actually stuck at being a little bit more than MultiArray but still too far from even basic numpy. My $0.02, Artyom