29 May
2013
10:40 a.m.
On 29/05/13 11:46, Aditya Avinash wrote:
It's the second option. Provide framework to use SIMD templates.
Ok, in that case, you need to first study how uBlas works. For example, if you write something along the lines of

    a = trans(b + c) * d;

AFAIK what uBlas does is something like

    for(size_t i=0; i!=sz.height; ++i)
        for(size_t j=0; j!=sz.width; ++j)
            a[i][j] = (b[j][i] + c[j][i]) * d[i][j];

What you need to do is change the loop structure and modify the evaluation of all nodes involved to support SIMD. Of course trans is going to be a problem. Thankfully uBlas doesn't have that many functions, so trans and herm are the only functions that exhibit that issue.
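To make the loop restructuring concrete, here is a rough sketch of the same expression evaluated kLanes elements at a time. None of this is uBlas code: Matrix, Pack, load, load_strided, store and the eval_* functions are hypothetical names invented for the illustration, the storage is assumed to be row-major, and Pack is a plain array standing in for a real SIMD register type. It mainly shows where trans hurts: d can be loaded contiguously, while the transposed operands need strided loads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix, standing in for a uBlas matrix.
struct Matrix {
    std::size_t height, width;
    std::vector<float> data;
    Matrix(std::size_t h, std::size_t w) : height(h), width(w), data(h * w, 0.0f) {}
    float& operator()(std::size_t i, std::size_t j) { return data[i * width + j]; }
    float operator()(std::size_t i, std::size_t j) const { return data[i * width + j]; }
};

constexpr std::size_t kLanes = 4;  // assumed SIMD width (e.g. 4 floats)

// Hypothetical pack type: a fixed group of lanes, a stand-in for a SIMD register.
struct Pack { float lane[kLanes]; };

inline Pack operator+(Pack x, const Pack& y) {
    for (std::size_t k = 0; k != kLanes; ++k) x.lane[k] += y.lane[k];
    return x;
}
inline Pack operator*(Pack x, const Pack& y) {
    for (std::size_t k = 0; k != kLanes; ++k) x.lane[k] *= y.lane[k];
    return x;
}
// Contiguous load: the cheap case.
inline Pack load(const float* p) {
    Pack r;
    for (std::size_t k = 0; k != kLanes; ++k) r.lane[k] = p[k];
    return r;
}
// Strided load: what a transposed operand forces on us.
inline Pack load_strided(const float* p, std::size_t stride) {
    Pack r;
    for (std::size_t k = 0; k != kLanes; ++k) r.lane[k] = p[k * stride];
    return r;
}
inline void store(float* p, const Pack& x) {
    for (std::size_t k = 0; k != kLanes; ++k) p[k] = x.lane[k];
}

// Scalar reference: the element-by-element loop from the message.
void eval_scalar(Matrix& a, const Matrix& b, const Matrix& c, const Matrix& d) {
    for (std::size_t i = 0; i != a.height; ++i)
        for (std::size_t j = 0; j != a.width; ++j)
            a(i, j) = (b(j, i) + c(j, i)) * d(i, j);
}

// Restructured loop: whole packs along j, then a scalar tail.
void eval_packed(Matrix& a, const Matrix& b, const Matrix& c, const Matrix& d) {
    assert(b.height == a.width && b.width == a.height);
    assert(c.height == b.height && c.width == b.width);
    assert(d.height == a.height && d.width == a.width);
    for (std::size_t i = 0; i != a.height; ++i) {
        std::size_t j = 0;
        for (; j + kLanes <= a.width; j += kLanes) {
            // Element (i, j+k) of trans(b) is b(j+k, i): lanes are b.width apart.
            Pack bt = load_strided(&b.data[j * b.width + i], b.width);
            Pack ct = load_strided(&c.data[j * c.width + i], c.width);
            Pack dd = load(&d.data[i * d.width + j]);
            store(&a.data[i * a.width + j], (bt + ct) * dd);
        }
        for (; j != a.width; ++j)  // leftover columns when width % kLanes != 0
            a(i, j) = (b(j, i) + c(j, i)) * d(i, j);
    }
}
```

In a real expression-template design, each node (the sum, trans, the product) would presumably expose its own pack-wise evaluation instead of one hand-written function, and trans and herm would be the nodes that have to fall back to strided loads or a blocked transpose.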