Dear all, i am simulating particle trajectories, and depending on the data i want to generate i may or may not have to track additional data [boost.ublas c_matrix] along those trajectories. Right now i enable/disable that via a template parameter (so, using std::tuple(vec, vec) or tuple(vec, vec, c_matrix) and fusion_algebra), which has a few drawbacks. The algorithm is already templated to support varying dimensions via a template parameter, so the additional data practically doubles the number of template instantiations for system and all possible observers. This actually got so bad that gcc (mingw gcc 4.8) refuses to compile with debugging enabled. Therefore, i would like to make the switch between tracking the matrix and ignoring it a runtime choice instead of a template parameter. However, the cost of tracking that matrix is quite high (that is why it is optional), but simply disableing the d matrix / dt calculation in the system does not speed up the calculation much (86 s -> 58 s vs 27 s without matrix). A lot of time is spent in adding and multiplying numbers that are not used anywhere, or actually even copying them (e.g std::swap(c_matrix, c_matrix) shows up quite high in gprof (even though i am not sure how reliable the -pg option is in conjunction with -O3)). So what i am asking here is for suggestions how to best let odeint treat the c_matrix as uninitialized memory and never perform any operations on that depending on a runtime parameter. Options i thought of were: * deriving a class from c_matrix and providing my own versions of operator overloads that check a flag: as far as i know, boost.ublas matrices use some template magic to let operators only return expression templates to optimize evaluation if operations are chained, and i would not be sure about how to actually retain that behaviour in case the matrix is tracked. * creating my own algebra class: from what i could tell from a few quick glances into the algebra files, this would probably be conceptually easier than the other option, but would require a lot of repetetive coding. Does anyone have suggestion for a more efficient implementation strategy? Greetings Erik