[ublas] Supporting DNNs with Tensors/Multidimensional Arrays
GSoC 2018 https://summerofcode.withgoogle.com/organizations/4507228564881408/ just ended one week ago, and we had many successfully completed student projects https://github.com/BoostGSoC18. I was responsible for adding tensor support to Boost.uBLAS, primarily to support multilinear algebra operations in the field of numerics. The wiki description along with the implementation can be found here: https://github.com/BoostGSoC18/tensor.

Similar to Boost.multi_array https://www.boost.org/doc/libs/1_68_0/libs/multi_array/doc/index.html, the runtime-reshapable tensor data structure is parametrized in terms of the number of dimensions (rank/order), the dimension extents, the data type, the layout (first- and last-order) and the storage type. The first two are runtime-variable. I am also about to add subtensors (views/handles of a tensor) along with multidimensional iterators for convenient algorithm implementation. It is not yet as flexible as GSL's multi_span https://github.com/Microsoft/GSL/blob/master/include/gsl/multi_span: it does not yet support static rank and dimensions. However, basic generic tensor operations (contraction/transposition/reshaping/...) are provided, including a nice syntax for Einstein's summation convention with placeholders, using C++17 features. The operations are evaluated using expression templates (not smart yet).

Similar to the tensor https://eigen.tuxfamily.org/dox/unsupported/group__CXX11__Tensor__Module.htm... framework of Eigen, which is used by TensorFlow https://github.com/tensorflow/tensorflow, I think the tensor data structure in Boost.uBLAS could be used to implement deep neural networks or higher-order statistics. I am not sure whether the C++ community would appreciate Boost providing some form of basic operations for building *deep neural networks* (DNNs). I would like to ask:

1. Does it make sense for Boost to support basic operations for DNNs?
2. What are the obligatory, necessary basic operations for creating DNN building blocks?
3. Are there any additional data structure parameters that need to be added for (efficiently) supporting DNNs?

Best,
Cem
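P.S. To give a flavour of the current interface, here is a small usage sketch. It follows the BoostGSoC18/tensor headers; the exact names may still change and I have not compiled this exact snippet:

#include <boost/numeric/ublas/tensor.hpp>
#include <iostream>

int main()
{
    using namespace boost::numeric::ublas;

    // rank and extents are chosen at runtime; value type, layout and
    // storage type are template parameters
    tensor<float> A{3, 4, 2};   // order-3 tensor with extents 3x4x2
    tensor<float> B{4, 3, 5};

    // Einstein summation with index placeholders: the repeated indices
    // (_i and _j) are contracted, so C has extents 2x5
    using namespace boost::numeric::ublas::index;
    tensor<float> C = A(_i, _j, _k) * B(_j, _i, _l);

    std::cout << C.at(0, 0) << std::endl;
}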
On 08/29/18 15:47, Cem Bassoy via Boost wrote:
2. what are the obligatory, necessary basic operations for creating DNN building blocks?
You may want to investigate Automatic Differentiation, which is a building block that extends to many use cases besides DNN. A good starting point is this talk by Conal Elliott: https://www.youtube.com/watch?v=ne99laPUxN4
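To make the idea concrete, forward-mode AD can be sketched with dual numbers in a few lines of C++ (only an illustration, not uBLAS code):

#include <cmath>
#include <iostream>

struct dual
{
    double v;  // value
    double d;  // derivative w.r.t. the chosen input, propagated alongside the value
};

dual operator+(dual a, dual b) { return { a.v + b.v, a.d + b.d }; }
dual operator*(dual a, dual b) { return { a.v * b.v, a.d * b.v + a.v * b.d }; }
dual sin(dual a)               { return { std::sin(a.v), std::cos(a.v) * a.d }; }

int main()
{
    // f(x) = sin(x)*x + x, evaluated at x = 2 with dx/dx = 1
    dual x{2.0, 1.0};
    dual f = sin(x) * x + x;
    std::cout << "f(2)  = " << f.v << "\n"
              << "f'(2) = " << f.d << "\n";   // cos(2)*2 + sin(2) + 1
}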
On Thu, 30 Aug 2018 at 20:17, Bjorn Reese via Boost <boost@lists.boost.org> wrote:
On 08/29/18 15:47, Cem Bassoy via Boost wrote:
2. what are the obligatory, necessary basic operations for creating DNN building blocks?
You may want to investigate Automatic Differentiation, which is a building block that extends to many use cases besides DNN.
I am not really sure. I have skimmed his paper https://arxiv.org/pdf/1804.00746.pdf, but I guess I need more concrete things before reasoning about DNNs. I thought of implementing some Eigen::Tensor operations such as Broadcast(), Convolution(), etc. However, I need some input from the Boost community on what direction we want to go in and whether we want to enhance the support for multidimensional arrays, tensors, etc. We are now behind Eigen::Tensor and some other libraries!

Cheers
C
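P.S. Something along these lines is what I have in mind for a Convolution(), e.g. a naive valid 2d convolution on top of the new tensor type. This is an untested sketch; only the extents constructor and at() element access are assumed, and the extents are passed explicitly so the sketch does not rely on any extent-query API:

#include <boost/numeric/ublas/tensor.hpp>
#include <cstddef>

namespace ublas = boost::numeric::ublas;

// valid 2d convolution: out(i,j) = sum_{p,q} in(i+p, j+q) * ker(p,q)
ublas::tensor<float> conv2d_valid(ublas::tensor<float> const& in,  std::size_t ih, std::size_t iw,
                                  ublas::tensor<float> const& ker, std::size_t kh, std::size_t kw)
{
    ublas::tensor<float> out{ih - kh + 1, iw - kw + 1};
    for (std::size_t i = 0; i + kh <= ih; ++i)
        for (std::size_t j = 0; j + kw <= iw; ++j)
        {
            float s = 0.0f;
            for (std::size_t p = 0; p < kh; ++p)
                for (std::size_t q = 0; q < kw; ++q)
                    s += in.at(i + p, j + q) * ker.at(p, q);
            out.at(i, j) = s;
        }
    return out;
}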
1. Does it make sense for Boost to support basic operations for DNNs?
2. What are the obligatory, necessary basic operations for creating DNN building blocks?
3. Are there any additional data structure parameters that need to be added for (efficiently) supporting DNNs?
DNNs are certainly interesting models in machine learning, but they represent only a small part of what we can do and what really works in real life. Tensors can be applied to many more situations. On top of that, tensors can be applied to many other fields of science.

What would be interesting is to start thinking more generically and not focus too much on just DNNs, especially if we want to have machine learning in uBLAS (which was another successful GSoC project this year, by the way).

I'm glad to see we have tensors, and I'm happy to see we have basic stats and ML. Let's go to the next step and have some more numerical techniques related to linear algebra in uBLAS.

Open discussion now .....

David
I totally agree. DNNs are just one part. Let's put in more numerical algorithms using tensors, matrices and vectors. However, IMHO the community needs better documentation, better interfaces and possibly better implementations of the basic tensor, matrix and vector operations. I am now also changing the interface of the tensor data structure so that it can serve as an alias for matrix and vector (a rough sketch of what I mean follows below the quoted text).

On Sat, 1 Sep 2018 at 11:54, David Bellot <david.bellot@gmail.com> wrote:
1. Does it make sense for Boost to support basic operations for DNNs?
2. What are the obligatory, necessary basic operations for creating DNN building blocks?
3. Are there any additional data structure parameters that need to be added for (efficiently) supporting DNNs?
DNNs are certainly interesting models in machine learning, but they represent only a small part of what we can do and what really works in real life. Tensors can be applied to many more situations. On top of that, tensors can be applied to many other fields of science.
What would be interesting is to start thinking more generically and not focus too much on just DNNs, especially if we want to have machine learning in uBLAS (which was another successful GSoC project this year, by the way).
I'm glad to see we have tensors, and I'm happy to see we have basic stats and ML. Let's go to the next step and have some more numerical techniques related to linear algebra in uBLAS.
Open discussion now .....
David
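For the matrix/vector aliases mentioned above, the direction I have in mind is roughly the following. This is purely hypothetical: neither the static rank parameter nor these aliases exist yet.

#include <cstddef>
#include <vector>

// hypothetical future tensor with an additional compile-time rank parameter
template<class T, std::size_t Rank>
class tensor
{
    // extents, layout and storage omitted in this sketch
    std::vector<T> data_;
};

// matrix and vector would then become thin aliases instead of separate class templates
template<class T> using matrix = tensor<T, 2>;
template<class T> using vector = tensor<T, 1>;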
participants (3)
- Bjorn Reese
- Cem Bassoy
- David Bellot