Feedback desired for neural networks library
Hi everyone,

I have a template-based library with several common types of layers which can be assembled into various neural networks: https://github.com/svm-git/NeuralNet, and I would love to get community feedback on the overall design, any issues or missing features, and how interesting a library like that would be in general.

I've been watching the trends and research reports in the AI/ML space, and I feel that the recent announcements of successful models for image classification, computer vision or natural language processing push the focus towards very complex and computationally intensive networks. However, I think that the idea behind multi-layer networks is very powerful and applicable in many domains, where even a small and lightweight model can be used successfully. I also think that if developers have access to a library of building blocks that allows them to train and run a NN anywhere C++ code can run, it may encourage a lot of good applications.

In its current state, the library is fairly small and should be easy to review. It was built with two main goals in mind:

* Provide a collection of building blocks that share a common interface, which allows plug'n'play construction of more complex NNs.
* Compile-time verification of the internal consistency of the network, i.e. if a layer's output size does not match the next layer's input, it is caught at a very early stage.

Once there is some consensus on the core design and usefulness of such a library, I am willing to do the work necessary to make the library consistent with the Boost requirements for naming conventions, folder structure, unit tests etc. The library relies on C++11 language features and depends on just a few STL components, so I think it should be straightforward to merge into Boost.

Best regards, Sergei Marchenko.
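As a rough illustration of the compile-time consistency check mentioned in the second bullet above, here is a hypothetical C++11 sketch; the names fully_connected and sequence are invented for this example and are not the library's actual interface:

    #include <array>
    #include <cstddef>

    // Hypothetical fully connected layer: input and output sizes are template parameters.
    template <std::size_t In, std::size_t Out>
    struct fully_connected
    {
        static const std::size_t input_size  = In;
        static const std::size_t output_size = Out;

        std::array<float, Out> process(const std::array<float, In>& x) const
        {
            std::array<float, Out> y = {};  // weights and actual math omitted for brevity
            return y;
        }
    };

    // Chains two layers and rejects incompatible sizes at compile time.
    template <class First, class Second>
    struct sequence
    {
        static_assert(First::output_size == Second::input_size,
                      "output size of the first layer must match input size of the second layer");

        First  first;
        Second second;

        std::array<float, Second::output_size>
        process(const std::array<float, First::input_size>& x) const
        {
            return second.process(first.process(x));
        }
    };

    // Compiles: 784 -> 100 -> 10.
    sequence<fully_connected<784, 100>, fully_connected<100, 10>> ok;

    // Fails to compile with the static_assert message: 784 -> 100 followed by 50 -> 10.
    // sequence<fully_connected<784, 100>, fully_connected<50, 10>> mismatch;

The mismatched chain in the last line is rejected with a readable compiler error instead of a run-time failure.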
On 31. Dec 2020, at 01:45, Sergei Marchenko via Boost
wrote: I have a template-based library with several common types of layers which can be assembled into various neural networks: https://github.com/svm-git/NeuralNet, and I would love to get community feedback on the overall design, any issues or missing features, and how interesting a library like that would be in general.
I only had a quick glance at your Github page. The code examples do not look bad and you put a lot of examples up-front, which is good. A red flag is the use of variable names which start with _. That is discouraged. Some (not all) names starting with _ are reserved for implementers of the C++ stdlib, but there is no use going into the details. Just don't use variable names starting with _, to be on the safe side and to give a good example to other C++ programmers.

What would be the niche for this library? A NN C++ library would have to compete with the extensive amount of high-quality NN software that already exists in Python. You cannot compete in terms of features, obviously. I doubt that you have an advantage in terms of speed, because the Python libraries are already optimised for speed; many JIT-compile the hot code. There are large teams of excellent engineers working on making that possible. Does your library support GPU computation? Automatic differentiation? Probably not.

I think the niche could be embedded systems. For prototyping and training a NN, Python is certainly the better choice, but once you have the final network, you may want to put it on an embedded system to do its work there. An embedded system does not have a GPU, so not supporting GPU computations wouldn't be a disadvantage.

Best regards, Hans
Thanks Hans for your thoughts.
I only had a quick glance at your Github page. The code examples do not look bad and you put a lot of examples up-front, which is good. A red flag is the use of variable names which start with _. That is discouraged. Some (not all) names starting with _ are reserved for implementers of the C++ stdlib, but there is no use going into the details. Just don't use variables starting with _ to be on the safe side and to give a good example to other C++ programmers.
I absolutely agree with you on the importance of naming conventions, and if it ever comes to the point where the library is considered for integration into Boost, I fully expect that a lot of renames will be necessary to make it consistent with the other parts. I had not considered this code to be in a position where other C++ programmers would look at it as an example, so I just used the STL naming style as a reference when I was deciding on the names.
What would be the niche for this library? A NN C++ library would have to compete with the extensive amount of high-quality NN software that already exists in Python.
The niche for a NN C++ library is an excellent question. As you correctly point out, Python is the de-facto standard toolset for prototyping and experimenting with new types of neural layers and network configurations. Offering support for hardware acceleration, for example via GPU or FPGA, immediately raises the question of which hardware to support and which low-level library to use to interact with it. At this point I am not certain what the answers should be, and I am hoping to get suggestions from the community.
I think the niche could be embedded systems. For prototyping and training a NN, Python is certainly the better choice, but once you have the final network, you may want to put it on an embedded system to do its work there. An embedded system does not have a GPU, so not supporting GPU computations wouldn't be a disadvantage.
This is definitely a good suggestion. Another possibility that I thought about is using the library to extend an existing solution with a small/medium NN component in situations where cross-process or cross-environment interop is not desirable; or when the hardware configuration is not known upfront, or a solution targets a wide variety of hardware; or when the data and model size are so small that GPU acceleration would not result in a significant overall improvement. These are the niches which a NN C++ library can fill.

To be more specific, the example application in the GitHub repo for the MNIST digits dataset produces a model which can be trained to a 95% success rate in about 10-15 minutes on a single CPU core. While the example is somewhat synthetic, it is still representative of a wide variety of scenarios where an input from a sensor or a small image can be inspected by a NN component. Another application (not shown on GitHub) was a tiny model to estimate the response time of a web service API, given a small set of parameters such as the user identity, API method, and payload size; it was re-trained on every start of the web service and used to make predictions about the resource consumption of different callers for load balancing and throttling purposes.

These are just two examples, and as I said in my original post, I do believe that there is a lot of power in the ideas behind NNs, and there can be a wide variety of possible applications.

Best regards, Sergei Marchenko.
I absolutely agree with you on the importance of the naming conventions, and if it ever comes to the point where the library is considered for integration into Boost, I fully expect that a lot of renames will be necessary to make it consistent with the other parts. I have not considered this code to be in a position where other C++ programmers would look at it as an example, so I just used the STL naming style as a reference when I was deciding on the names.
Just a piece of advice here: Don't use the full STL naming style as a guide, even in hobby projects. The STL is part of the compiler "environment" and hence is allowed to do things which you, as a "user of C++", are not. E.g. the STL purposely uses "uglified" names such as `__iterator` because no user of that STL is allowed to use that name. And this isn't just about naming conventions: using reserved names (i.e. some names starting with underscore) is outright UB in C++ and hence mustn't be done. The easiest way to avoid that is to not use names starting with underscores at all. Also, for a library there is no need to "uglify" its names, which makes reading, understanding and using it easier. If you want to use the STL naming style (you can use the Boost one too, which is similar enough), then drop the leading underscores (so basically stick to snake_case).
Alexander Grund via Boost said: (by the date of Wed, 6 Jan 2021 09:47:03 +0100)
Using reserved names (i.e. some names starting with underscore) is outright UB in C++ and hence mustn't be used.
Hi,

(I'm sorry to hijack a little bit)

I am planning to implement units in YADE [1][2], and initially I wanted to use _m _km _s names, so that I could write: 1.0_km instead of 1.0km

Then I decided against it, because of UB. Then I saw in boost ublas tensor this example:

Cem Bassoy via Boost said: (by the date of Wed, 6 Jan 2021 10:32:15 +0100)
Please consider to use and contribute to *Boost.uBlas* https://github.com/boostorg/ublas which recently added *tensor* data types and operations with the convenient Einstein notation :
tensor_t C = C + A(_i,_j,_k)*B(_j,_l,_i,_m) + 5;
Which uses exactly the notation I need: 10_kPa is easier to read than 10kPa, or 10_Pa vs 10Pa.

So, where exactly do we have UB? Would simply putting it into a separate namespace yade::units solve the problem? Are there only certain letters forbidden after a starting underscore?

I plan to add user defined literals for all SI units, atomic units and later astrophysical units.

best regards
Janek Kozicki

[1] http://yade-dem.org/
[2] https://en.cppreference.com/w/cpp/language/user_literal

--
Janek Kozicki, PhD. DSc. Arch. Assoc. Prof.
Gdańsk University of Technology
Faculty of Applied Physics and Mathematics
Department of Theoretical Physics and Quantum Information
--
http://yade-dem.org/ http://pg.edu.pl/jkozicki (click English flag on top right)
(I'm sorry to hijack a little bit)

You can send a new mail with a new topic.

I am planning to implement units in YADE [1][2], and initially I wanted to use _m _km _s names, so that I could write: 1.0_km instead of 1.0km
Then I decided against it, because of UB. Then I saw in boost ublas tensor this example:
Which is using exactly the notation which I need: 10_kPa is easier to read than 10kPa, or 10_Pa vs 10Pa
So. Where exactly do we have UB ? Would simply putting it into separate namespace yade::units solve the problem?
Are there only certain letters forbidden after a starting underscore?

All names beginning with double-underscore or underscore+capital are forbidden.
https://en.cppreference.com/w/cpp/language/user_literal shows that `_Z` is fine:

double operator"" _Z(long double); // error: all names that begin with underscore
                                   // followed by uppercase letter are reserved
double operator""_Z(long double);  // OK: even though _Z is reserved, ""_Z is allowed

So yes, _Pa is fine as long as you write your UDL in the 2nd version.
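To make this concrete, here is a minimal sketch of pressure literals placed inside a dedicated namespace (yade::units, _Pa and _kPa are just the names mentioned in this thread; this is not an excerpt from any existing library):

    #include <iostream>

    namespace yade { namespace units {

    // A name that starts with a single underscore followed by a lowercase letter
    // is not reserved, so these literal operators are fine when written without
    // a space after "".
    constexpr long double operator""_Pa(long double v)  { return v; }
    constexpr long double operator""_kPa(long double v) { return v * 1000.0L; }

    }} // namespace yade::units

    int main()
    {
        using namespace yade::units;          // bring the literal operators into scope
        long double p = 10.0_kPa + 500.0_Pa;
        std::cout << p << " Pa\n";            // prints 10500
    }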
On 2021-01-07 at 15:50, Alexander Grund via Boost wrote:
(I'm sorry to hijack a little bit)

You can send a new mail with a new topic.

I am planning to implement units in YADE [1][2], and initially I wanted to use _m _km _s names, so that I could write: 1.0_km instead of 1.0km
Then I decided against it, because of UB. Then I saw in boost ublas tensor this example:
Which is using exactly the notation which I need: 10_kPa is easier to read than 10kPa, or 10_Pa vs 10Pa
So. Where exactly do we have UB ? Would simply putting it into separate namespace yade::units solve the problem?
Are there only certain letters forbidden after a starting underscore?

All names beginning with double-underscore or underscore+capital are forbidden.
https://en.cppreference.com/w/cpp/language/user_literal shows that `_Z` is fine:
double operator"" _Z(long double); // error: all names that begin with underscore
                                   // followed by uppercase letter are reserved
double operator""_Z(long double);  // OK: even though _Z is reserved, ""_Z is allowed
So yes _Pa is fine as long as you write your UDL in the 2nd version
Yes, literal operators somehow seem to have the opposite rule from other reserved names. Here, names with underscores are reserved for user programs, and the non-underscore names are used by the standard library. We even have a complex<float> operator""if(), even though if is a reserved keyword everywhere else. Odd, isn't it? :-)

Bo Persson
Yes, literal operators somehow seem to have the opposite rule of other reserved names. Here names with underscores are reserved for user programs, and the non-underscore names are used by the standard library.

Kinda, yes. But you still can't use the reserved names, i.e. only _+lowercase is fine. But see below for uppercase.
We even have a complex<float> operator""if(), even though if is a reserved keyword everywhere else. Odd, isn't it? :-)

I think the "trick" here is that the identifier is `""if`, not `if` (maybe even `operator""if`). Same for `""_W`, not `_W`. Who said C++ is easy?
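For reference, those standard complex literals (added in C++14's <complex>) look like this in use:

    #include <complex>

    int main()
    {
        using namespace std::literals::complex_literals;   // C++14

        std::complex<float>  zf = 2.0if;  // 0+2i as complex<float>; the suffix "if" has no underscore
        std::complex<double> zd = 3.0i;   // 0+3i as complex<double>
        (void)zf; (void)zd;
    }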
Alexander Grund via Boost said: (by the date of Thu, 7 Jan 2021 16:08:17 +0100)
We even have a complex<float> operator""if(), even though if is a reserved keyword everywhere else. Odd, isn't it? :-)

I think the "trick" here is that the identifier is `""if` not `if` (maybe even `operator""if`). Same for `""_W` not `_W`. Who said C++ is easy?
Indeed! The name is `""_W` or even `operator""_W`, so it doesn't start with underscore. Thank you, this is awesome information.

Alexander Grund via Boost said: (by the date of Thu, 7 Jan 2021 15:50:26 +0100)
(I'm sorry to hijack a little bit) You can send a new mail with a new topic
Right. Will do next time! :-)

best regards
--
# Janek Kozicki http://janek.kozicki.pl/
On 6. Jan 2021, at 02:07, Sergei Marchenko
wrote: Thanks Hans for your thoughts.
I only had a quick glance at your Github page. The code examples do not look bad and you put a lot of examples up-front, which is good. A red flag is the use of variable names which start with _. That is discouraged. Some (not all) names starting with _ are reserved for implementers of the C++ stdlib, but there is no use going into the details. Just don't use variables starting with _ to be on the safe side and to give a good example to other C++ programmers.
I absolutely agree with you on the importance of the naming conventions, and if it ever comes to the point where the library is considered for integration into Boost, I fully expect that a lot of renames will be necessary to make it consistent with the other parts. I have not considered this code to be in a position where other C++ programmers would look at it as an example, so I just used the STL naming style as a reference when I was deciding on the names.
Adding to Alexander's comments, the matter is correctly explained in the second answer to this SO question (unfortunately not the accepted answer): https://stackoverflow.com/questions/3136594/naming-convention-underscore-in-.... The advice to not use variables starting with _ is given in "C++ Coding Standards" by Herb Sutter and Andrei Alexandrescu, as mentioned in that answer. If you have not already done so, please also check https://www.boost.org/development/requirements.html, which has some guidelines for naming as well - although not on this issue specifically.
This is definitely a good suggestion. Another possibility that I thought about is the use of the library to extend an existing solution with a small/medium NN component in the situation where cross-process or cross-environment interop is not desirable. Or when a hardware configuration is not known upfront, or a solution is targeting a wide variety of the hardware. Or when data and model size are so small that GPU acceleration would not result in a significant overall improvement. These are the niches which a NN C++ library can fill.
I think Python also supports a wide variety of hardware. You are right, of course, that it would be rather awkward for an existing C++ application to call into Python to do its ML tasks; having a native C++ library to do the job is preferred. I am not sure about your argument regarding small data and/or model sizes. I think in most cases you want to train neural nets with large amounts of data. Can you add generic GPU support with Boost.Compute? https://www.boost.org/doc/libs/1_75_0/libs/compute/doc/html/index.html
To be more specific, the example application that I have in the GitHub repo for MNIST digits dataset, produces a model, which can be trained to offer a 95% success rate in about 10-15 minutes on a single CPU core. While the example is somewhat synthetic, it is still representative of a wide variety of scenarios where an input from a sensor or a small image can be inspected by a NN component. Another application (not shown on GitHub) was a tiny model to estimate the cost of a web service API response time, given a small set of parameters, such as the user identity, API method, and payload size, which was re-trained on every start of the web service, and used to make predictions about the resource consumption by different callers for load balancing and throttling purposes.
Those are good niche applications, I think.

Some more questions:

Are you building the network at compile-time or run-time? It looks from your examples like it is compile-time. I think your library should offer both. Building the network at compile-time may give some speed benefits as it can gain from compiler optimisations, but it would require re-compilation to change the network itself. Building the network at run-time means you can change the network without re-compiling. This is useful for example when you want to read the network configuration (not only its weights) at run-time from a configuration file.

It is possible to offer both implementations under a unified interface, as I am doing in Boost.Histogram. Other libraries which offer this are std::span and the Eigen library.

I would tentatively endorse this project, but it would be good to have a second opinion from senior Boost members.

Best regards, Hans
I think Python also supports a wide variety of hardware. You are right, of course, that it would be rather awkward for an existing C++ application to call into Python to do its ML tasks, having a native C++ library to do the job is preferred.

That is not required. Both leading ML frameworks (TensorFlow & PyTorch) offer a C++ API for most, if not all, operations. At least the simple ones (working with tensors and layers).

I am not sure about your argument regarding small data and or model sizes. I think in most cases you want to train Neural Nets with large amounts of data. Can you add generic GPU support with Boost.Compute? https://www.boost.org/doc/libs/1_75_0/libs/compute/doc/html/index.html
To be more specific, the example application that I have in the GitHub repo for MNIST digits dataset, produces a model, which can be trained to offer a 95% success rate in about 10-15 minutes on a single CPU core. While the example is somewhat synthetic, it is still representative of a wide variety of scenarios where an input from a sensor or a small image can be inspected by a NN component. Another application (not shown on GitHub) was a tiny model to estimate the cost of a web service API response time, given a small set of parameters, such as the user identity, API method, and payload size, which was re-trained on every start of the web service, and used to make predictions about the resource consumption by different callers for load balancing and throttling purposes. Those are good niche applications, I think.
Some more questions:
Are you building the network at compile-time or run-time? It looks from your examples like it is compile-time. I think your library should offer both. Building the network at compile-time may give some speed benefits as it can gain from compiler optimisations, but it would require re-compilation to change the network itself. Building the network at run-time means you can change the network without re-compiling. This is useful for example when you want to read the network configuration (not only its weights) at run-time from a configuration file.
Smallish networks are certainly a niche; if you want to do anything serious you won't be able to beat TF/PyTorch in performance. So keeping this focused on small, static (aka compile-time) models with only the basic layers, and maybe even with optional training (removing this avoids the need for auto-differentiation), could be the way. Using compile-time models makes this focused on the usage of ML instead of development, and allows the optimizations from the compiler to be used, which are very important for small models.

However I fear this is not a fit for Boost. ML evolves so fast, adding more and more layer types etc., that I fear this library would be outdated already during review. The only chance I see is if this is purposely for very basic networks, i.e. FullyConnected, Convolution, SoftMax and similar basic layers, maybe with an extension to provide an ElementWise and a BinaryOp layer templated by the operator (this may be problematic for auto-differentiation though). Reusing what we have (uBlas, Boost.Compute) might be a good idea too.
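For illustration, an operator-templated element-wise layer could look roughly like this (a hypothetical sketch, not code from the library under discussion; a BinaryOp variant would take two inputs the same way):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Applies a unary operation to every element; the operation is fixed at compile time,
    // so the compiler can inline it into the loop.
    template <class UnaryOp>
    struct elementwise_layer
    {
        std::vector<float> process(std::vector<float> x) const
        {
            std::transform(x.begin(), x.end(), x.begin(), UnaryOp());
            return x;
        }
    };

    struct relu_op
    {
        float operator()(float v) const { return v > 0.0f ? v : 0.0f; }
    };

    struct sigmoid_op
    {
        float operator()(float v) const { return 1.0f / (1.0f + std::exp(-v)); }
    };

    using relu_layer    = elementwise_layer<relu_op>;
    using sigmoid_layer = elementwise_layer<sigmoid_op>;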
Thank you Hans and Alexander for your interest and suggestions, really appreciate it! Let me try to answer questions from both of you in the same post, as I feel they touch on many common points.
Can you add generic GPU support with Boost.Compute? https://www.boost.org/doc/libs/1_75_0/libs/compute/doc/html/index.html
Reusing what we have (uBlas, Boost.Compute) might be a good idea too.
In a parallel branch of this thread Cem Bassoy also suggested uBlas, and looking closer at both libraries, it does look like both uBlas and Boost.Compute are good options for an optimized underlying compute implementation. At first glance, Boost.Compute is slightly more appealing, because it offers raw computation tools without the extra abstraction layer of uBlas tensors. Despite the use of "tensors" in the NN interface, the operations in many NN layers are limited to element-wise operations, so it may turn out that most of the benefits of uBlas will go unused. I will need to experiment more with both of these libraries to get a better sense of which one is the best fit.

The preliminary idea is to split responsibilities between NN and uBlas/Boost.Compute such that the NN library defines an interface and familiar abstractions in the NN domain, and uBlas/Boost.Compute is used as the core computation engine. If this idea works out as I hope it will, we can put aside the discussion about hardware support, because it will come with the underlying compute engine, and we can focus more on the convenience of the interface and abstractions that an NN library can provide for easier use of ML elements.
Are you building the network at compile-time or run-time? It looks from your examples like it is compile-time. I think your library should offer both. Building the network at compile-time may give some speed benefits as it can gain from compiler optimisations, but it would require re-compilation to change the network itself. Building the network at run-time means you can change the network without re-compiling. This is useful for example when you want to read the network configuration (not only its weights) at run-time from a configuration file. It is possible to offer both implementations under a unified interface, as I am doing in Boost.Histogram. Other libraries which offer this are std::span and the Eigen library.
Using compile-time models makes this focused on usage of ML instead of development and allows the optimizations from the compiler to be used which are very important for small models.
What is currently on GitHub is a compile-time model building framework. The compiler optimizations are one of the reasons. Another reason is that it catches any accidental compatibility problems between layers at a very early stage. The latter part turned out to be convenient right away, and caught a few arithmetic mistakes which I made while writing the sample MNIST model.

Support for building the network at run-time is an interesting point as well. You are correct, it may be useful to have when the network configuration evolves from one version to another. I can think of three types of changes that may happen to a network: only the layer weights change after re-training the network on a better data set; the configuration or size of inner layers changes but the input and output values remain the same; and finally, the entire network changes, including hidden layers and either input or output size, or both. Of these, a compile-time model is suitable for the first and third types: if either the input or the output values change, most likely the code around them will be recompiled anyway to adapt to the new values. So it is the second type of change that cannot be handled by a compile-time model and needs a run-time model reconstruction.

I do not have a good intuition about how frequent each type of change may be, and the answer may depend on how exactly an updated model is released and deployed. If the model upgrade is done as part of a new release, then the code is recompiled anyway, so there is no difference. And if the new model is released in the form of a configuration update, then the run-time reconstruction of the network will come in handy, as you correctly notice. The price for such convenience is that compile-time optimizations may be reduced, although the difference may be offset by using uBlas/Boost.Compute as the computation engine. I agree that it is probably a good idea to offer both options with the same interface, but I need to think more about how this can actually be achieved in code. I did not have a chance to look at the libraries that you mentioned, but I will definitely do so for inspiration.
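As a purely hypothetical sketch of how both options could sit behind a similar interface (invented names, roughly in the spirit of what was described for Boost.Histogram): a run-time network holds type-erased layers in a vector, while a compile-time network holds its layers in a tuple; both expose the same process() call.

    #include <cstddef>
    #include <memory>
    #include <tuple>
    #include <type_traits>
    #include <utility>
    #include <vector>

    using vec = std::vector<float>;  // stand-in for the library's tensor type

    // Run-time variant: layers are type-erased and can be assembled from a configuration file.
    struct layer
    {
        virtual ~layer() {}
        virtual vec process(const vec& in) const = 0;
    };

    struct runtime_network
    {
        std::vector<std::unique_ptr<layer>> layers;

        vec process(vec x) const
        {
            for (std::size_t i = 0; i < layers.size(); ++i)
                x = layers[i]->process(x);
            return x;
        }
    };

    // Compile-time variant: the layer types (and hence their sizes) are fixed when the program is built.
    template <class... Layers>
    struct static_network
    {
        std::tuple<Layers...> layers;

        vec process(vec x) const { return step<0>(std::move(x)); }

    private:
        // Recursion end: all layers have been applied.
        template <std::size_t I>
        typename std::enable_if<I == sizeof...(Layers), vec>::type
        step(vec x) const { return x; }

        // Apply layer I, then recurse to layer I+1.
        template <std::size_t I>
        typename std::enable_if<(I < sizeof...(Layers)), vec>::type
        step(vec x) const
        {
            return step<I + 1>(std::get<I>(layers).process(x));
        }
    };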
Smallish networks are certainly a niche, if you want to do anything serious you won't be able to beat TF/PyTorch in performance. So keeping this focused on small, static (aka compile-time) models with only the basic layers and maybe even with optional training (removing this avoids the auto-differentiation need) could be the way.
This is indeed the niche I am targeting. To my knowledge, large networks require not only the best performance to reduce the costs of running and training them, but they also come with the additional challenges of running the network training and predictions in a distributed way, because the model or even the input to the model may not fit into memory. These are all interesting problems to solve, but they would require solutions like Hadoop or Apache Spark, and this is a completely different topic.
However I fear this is not a fit for Boost. ML evolves so fast, adding more and more layer types etc., that I fear this library to be outdated already during review. The only chance I see is if this purposely is for very basic networks, i.e. FullyConnected, Convolution, SoftMax and similar basic layers, maybe with an extension to provide an ElementWise and BinaryOp layer templated by the operator (this may be problematic for auto-differentiation though).
You are correct, ML is evolving very quickly, and new configurations and layer types are proposed very often. If I were to choose between two approaches, trying to keep the library up to date with the new research or keeping it scoped to a subset of layers that are proven to be useful in a variety of networks, I would opt for the latter. The good part about ML layers is that most of the time they can be added incrementally over time, and which layers to add can be determined from popular demand. But I can only speak for myself, and I will trust the judgment of the Boost community members.

Best regards, Sergei Marchenko.
I will need to experiment more with both of these libraries to get a better sense which one is the best fit. The preliminary idea is to split responsibilities between NN and uBlas/Boost.Compute such that NN library defines an interface and familiar abstractions in the NN domain, and uBlas/Boost.Compute are used as the core computation engine. If this idea works out as I hope it will do, we can put aside the discussion about the hardware support, because it will come with the underlying compute engine, and we can focus more on the convenience of the interface and abstractions that an NN library can provide for easier use of ML elements.
Boost.Compute + OpenCL extensions to leverage hardware definitely look like the right path to go and would be a useful addition to this library. It would require a careful selection of OpenCL kernels for optimal speed, which was obvious from a simple test with different implementations of Matrix * Vector that I ran on the few OpenCL devices available on my computer. To my surprise, the plain C++ version was outperforming my GPU, and I got a nice increase from the OpenCL implementation on the CPU with a simplistic kernel. I must have a very old and slow GPU. These are the raw test results for a 4096 x 4096 matrix, in case anybody is interested.

Best regards, Sergei Marchenko

OpenCL Platform: 'ATI Stream' (vendor: Advanced Micro Devices, Inc.)
  Devices:
    Device: 'Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz' (version: 2.0) (type: CPU)
    Device: 'Toucan' (version: CAL 1.4.1848) (type: GPU)
  Extensions: cl_khr_icd cl_amd_event_callback cl_khr_d3d10_sharing

OpenCL Platform: 'AMD Accelerated Parallel Processing' (vendor: Advanced Micro Devices, Inc.)
  Devices:
    Device: 'Turks' (version: 1800.11 (VM)) (type: GPU)
    Device: 'Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz' (version: 1800.11 (sse2,avx)) (type: CPU)
  Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices

Test Device: 'Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz' (version: 2.0) (type: CPU)
  Testing matrix * vector (map+reduce kernels):
    Map Elapsed: 45999200 ns, Map BandWidth: 1.50486 GB/s
    Reduce Elapsed: 170500 ns, Reduce BandWidth: 12.3961 GB/s
    Elapsed: 54297900 ns, BandWidth: 2.47188 GB/s
  Testing matrix * vector (naive kernel):
    Elapsed: 5341900 ns, BandWidth: 12.5689 GB/s
  Testing matrix * vector (Boost.Compute algorithms):
    Elapsed: 724216500 ns, BandWidth: 0.185351 GB/s
  Testing matrix * vector (plain C++):
    Elapsed: 17725800 ns, BandWidth: 3.78779 GB/s

Test Device: 'Toucan' (version: CAL 1.4.1848) (type: GPU)
  Testing matrix * vector (map+reduce kernels):
    Map Elapsed: 490535376 ns, Map BandWidth: 0.141116 GB/s
    Reduce Elapsed: 2236373 ns, Reduce BandWidth: 0.945073 GB/s
    Elapsed: 602027100 ns, BandWidth: 0.222943 GB/s
  Testing matrix * vector (naive kernel):
    Elapsed: 170503700 ns, BandWidth: 0.393784 GB/s
  Testing matrix * vector (Boost.Compute algorithms):
    Elapsed: 6837179400 ns, BandWidth: 0.019633 GB/s
  Testing matrix * vector (plain C++):
    Elapsed: 17901600 ns, BandWidth: 3.75059 GB/s

Test Device: 'Turks' (version: 1800.11 (VM)) (type: GPU)
  Testing matrix * vector (map+reduce kernels):
    Map Elapsed: 222894000 ns, Map BandWidth: 0.310562 GB/s
    Reduce Elapsed: 5166778 ns, Reduce BandWidth: 0.409063 GB/s
    Elapsed: 248867100 ns, BandWidth: 0.539315 GB/s
  Testing matrix * vector (naive kernel):
    Elapsed: 156637700 ns, BandWidth: 0.428643 GB/s
  Testing matrix * vector (Boost.Compute algorithms):
    Elapsed: 2145102000 ns, BandWidth: 0.062577 GB/s
  Testing matrix * vector (plain C++):
    Elapsed: 17918300 ns, BandWidth: 3.7471 GB/s

Test Device: 'Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz' (version: 1800.11 (sse2,avx)) (type: CPU)
  Testing matrix * vector (map+reduce kernels):
    Map Elapsed: 37620700 ns, Map BandWidth: 1.84001 GB/s
    Reduce Elapsed: 245500 ns, Reduce BandWidth: 8.60911 GB/s
    Elapsed: 43919500 ns, BandWidth: 3.05599 GB/s
  Testing matrix * vector (naive kernel):
    Elapsed: 5410200 ns, BandWidth: 12.4102 GB/s
  Testing matrix * vector (Boost.Compute algorithms):
    Elapsed: 641987200 ns, BandWidth: 0.209092 GB/s
  Testing matrix * vector (plain C++):
    Elapsed: 17944000 ns, BandWidth: 3.74173 GB/s
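For anyone curious what the "naive kernel" variant refers to, a minimal sketch along those lines with Boost.Compute might look as follows (an illustration only, not the code used for the measurements above):

    #include <vector>
    #include <boost/compute/core.hpp>
    #include <boost/compute/algorithm/copy.hpp>
    #include <boost/compute/container/vector.hpp>
    #include <boost/compute/utility/source.hpp>

    namespace compute = boost::compute;

    // One work item per output row; each item walks one row of the matrix.
    static const char matvec_source[] = BOOST_COMPUTE_STRINGIZE_SOURCE(
        __kernel void matvec(__global const float* m,
                             __global const float* v,
                             __global float* out,
                             const uint n)
        {
            const uint row = get_global_id(0);
            float sum = 0.0f;
            for (uint col = 0; col < n; ++col)
                sum += m[row * n + col] * v[col];
            out[row] = sum;
        }
    );

    int main()
    {
        const std::size_t n = 4096;

        compute::device device = compute::system::default_device();
        compute::context context(device);
        compute::command_queue queue(context, device);

        std::vector<float> host_m(n * n, 1.0f), host_v(n, 2.0f), host_out(n);

        // Copy the input data to the device.
        compute::vector<float> dev_m(host_m.begin(), host_m.end(), queue);
        compute::vector<float> dev_v(host_v.begin(), host_v.end(), queue);
        compute::vector<float> dev_out(n, context);

        // Build the kernel and bind its arguments.
        compute::program program = compute::program::build_with_source(matvec_source, context);
        compute::kernel kernel = program.create_kernel("matvec");
        kernel.set_arg(0, dev_m.get_buffer());
        kernel.set_arg(1, dev_v.get_buffer());
        kernel.set_arg(2, dev_out.get_buffer());
        kernel.set_arg(3, static_cast<cl_uint>(n));

        // Launch one work item per row and read the result back.
        queue.enqueue_1d_range_kernel(kernel, 0, n, 0);
        compute::copy(dev_out.begin(), dev_out.end(), host_out.begin(), queue);

        return 0;
    }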
Please consider to use and contribute to *Boost.uBlas* https://github.com/boostorg/ublas which recently added *tensor* data types and operations with the convenient Einstein notation:

tensor_t C = C + A(_i,_j,_k)*B(_j,_l,_i,_m) + 5;

More information is available at https://github.com/boostorg/ublas/wiki/Tensor. We could add convolution and pooling functions into Boost.uBlas and provide examples how to use them. Feel free to contact me if you have any questions, or also use https://gitter.im/boostorg/ublas for detailed discussion.

Best CB

On Thu, 31 Dec 2020 at 01:45, Sergei Marchenko via Boost <boost@lists.boost.org> wrote:
Hi everyone,
I have a template-based library with several common types of layers which can be assembled into various neural networks: https://github.com/svm-git/NeuralNet, and I would love to get community feedback on the overall design, any issues or missing features, and how interesting a library like that would be in general.
I've been watching the trends and research reports in the AI/ML space, and I feel that the recent announcements of the successful models for image classification, computer vision or natural language processing, push the focus towards very complex and computationally intensive networks. However, I think that the idea behind multi-layer networks is very powerful, and is applicable in many domains, where even a small and lightweight model can be used successfully. I also think that if developers have an access to a library of building blocks that allows them to train and run NN anywhere a C++ code can run, it may encourage a lot of good applications.
In the current state, the library is fairly small and should be easy to review. It was built with two main goals in mind:
* Provide a collection of building blocks that share a common interface which allows plug'n'play construction of more complex NNs. * Compile-time verification of the internal consistency of the network. I.e. if a layer's output size does not match the next layer's input, it is caught at the very early stage.
Once it seems like there is some consensus on the core design and usefulness of such library, I am willing to do the work necessary to make the library consistent with Boost requirements for naming convention, folder structure, unit tests etc. The library relies on the C++ 11 language features and has a dependency on just a few STL components, so I think it should be straightforward to merge into Boost.
Best regards, Sergei Marchenko.
Please consider to use and contribute to Boost.uBlas https://github.com/boostorg/ublas which recently added tensor data types and operations with the convenient Einstein notation:
tensor_t C = C + A(_i,_j,_k)*B(_j,_l,_i,_m) + 5;
Thank you Cem for the suggestion! uBlas::opencl definitely looks interesting, since many basic NN layers can be implemented using various element-wise functions, and the hardware support that comes with it is very appealing. The Einstein tensor notation is convenient for multi-dimensional convolution and pooling layers, although I feel that the C++17 requirement for the tensor extension is probably too strong. I will need to experiment with the library a bit more to get a better sense of what it means to implement NN abstractions on top of it.

Best regards, Sergei Marchenko.
On Thu, 7 Jan 2021 at 04:06, Sergei Marchenko <serge_v_m@hotmail.com> wrote:
Please consider to use and contribute to Boost.uBlas https://github.com/boostorg/ublas which recently added tensor data types and operations with the convenient Einstein notation :
tensor_t C = C + A(_i,_j,_k)*B(_j,_l,_i,_m) + 5;
Thank you Cem for the suggestion! uBlas::opencl definitely looks interesting, since many basic NN layers can be implemented using various element-wise functions, and the hardware support that comes with it is very appealing. The Einstein tensor notation is convenient for multi-dimensional convolution and pooling layers, although I feel that C++ 17 requirement for tensor extension is probably too strong. I will need to experiment with the library a bit more to get a better sense of what it means to implement NN abstractions on top of it.
Sure. Just let me know if you need help. The contraction is not optimized. If you need optimized versions, please let me know - we are working on it right now. We are preparing faster implementations for Tensor-Times-Vector and Tensor-Times-Matrix. (E.g. https://github.com/bassoy/ttv).
Best regards, Sergei Marchenko.
Best, Cem
participants (6)
- Alexander Grund
- Bo Persson
- Cem Bassoy
- Hans Dembinski
- Janek Kozicki
- Sergei Marchenko