Any interest in a framework for memoization?
Hi,
I would like to submit a small framework for memoization I am working on
(http://projects.giacomodrago.com/c++memo/). At the moment, it is still
in a beta stage and does not comply with Boost guidelines. I would like
to know if you might be interested.
The idea is to have a tool for rapid prototyping of software components
implementing dynamic programming algorithms or, anyhow, requiring
memoization.
I paste here a simple "fibonacci" example:
----------------------
int fibonacci(int i, std::function<int(int)> prereqs) {
    if (i == 0) return 0;
    if (i <= 2) return 1;
    return prereqs(i-1) + prereqs(i-2);
}

cppmemo::CppMemo cppMemo;
std::cout << cppMemo.getValue(30, fibonacci) << std::endl;
----------------------
Hi Giacomo, thank you for your contribution.
I'm interested in this kind of library; in particular, I often implement dynamic programs. I've got some remarks/questions.
* In your implementation you use the std::function adaptor. You can remove it by making some of the member functions templates (e.g. getValue).
* I'm interested in the case when the Key type is integral. In this case one can make the following optimization. Let density be the "number of elements in the map" / "largest element in the map" ratio. If the density is large, it is much better to use a vector instead of a hash map. The idea is to measure the density and, when it exceeds a certain threshold, switch from the hash map to a vector. Did you consider this optimization? (I'm aware this is not as easy as I described.) This makes sense when you consider the unbounded knapsack problem. In UKP it is not clear (you have to decide at runtime) whether you should use the vector or the hash map strategy. IMO the "integral Key" case covers many standard dynamic programs.
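A single-threaded sketch of that density heuristic (hypothetical names, int values for brevity; the concurrent version is the hard part):

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Entries start in a hash map; once size() / (largest key + 1) exceeds a
// threshold, they migrate to a vector indexed directly by key.
class IntMemo {
public:
    void put(std::size_t key, int value) {
        if (dense_) {
            if (key >= vec_.size()) vec_.resize(key + 1);
            vec_[key] = value;
            return;
        }
        map_[key] = value;
        max_key_ = std::max(max_key_, key);
        // density = elements / (largest key + 1); switch when dense enough
        if (static_cast<double>(map_.size()) / (max_key_ + 1) > threshold_)
            switch_to_vector();
    }

    std::optional<int> get(std::size_t key) const {
        if (dense_) {
            if (key < vec_.size() && vec_[key]) return vec_[key];
            return std::nullopt;
        }
        auto it = map_.find(key);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }

    bool dense() const { return dense_; }

private:
    void switch_to_vector() {
        vec_.assign(max_key_ + 1, std::optional<int>{});
        for (const auto& kv : map_) vec_[kv.first] = kv.second;
        map_.clear();
        dense_ = true;
    }

    double threshold_ = 0.5;
    bool dense_ = false;
    std::size_t max_key_ = 0;
    std::unordered_map<std::size_t, int> map_;
    std::vector<std::optional<int>> vec_;
};
```

Making put/get safe under concurrent writers (as the library would need) is exactly the part this sketch leaves open.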
Regards,
Piotr
On 18 November 2013 09:30, Giacomo Drago wrote:
Hi,
I would like to submit a small framework for memoization I am working on (http://projects.giacomodrago.com/c++memo/). At the moment, it is still in a beta stage and does not comply with Boost guidelines. I would like to know if you might be interested.
The idea is to have a tool for rapid prototyping of software components implementing dynamic programming algorithms or, anyhow, requiring memoization.
I paste here a simple "fibonacci" example:
----------------------
int fibonacci(int i, std::function<int(int)> prereqs) {
    if (i == 0) return 0;
    if (i <= 2) return 1;
    return prereqs(i-1) + prereqs(i-2);
}

cppmemo::CppMemo cppMemo;
std::cout << cppMemo.getValue(30, fibonacci) << std::endl;
----------------------
The code will print 832040, without recursive calls to the "fibonacci" function. A stack is managed "under the hood" and appropriate calls are made to the "fibonacci" callback to gather pre-requirements and then call the function again when they are available.
This is just a toy example. There is more to know, including important caveats (if you are interested, please read the documentation carefully): some defaults for template and method parameters are useful for conciseness but are EVIL if you don't know about them. The framework also provides automatic parallelization (when possible).
It is important to remark that at this moment the framework relies on fcmm (fast concurrent memoization map), a custom hashmap that is optimized for this very purpose. This can be changed quite easily, as fcmm is not exposed to the client. A C++11 compliant compiler is needed.
Thanks for your attention.
Giacomo
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Hi Giacomo, thank you for your contribution.
I'm interested in this kind of library; in particular, I often implement dynamic programs. I've got some remarks/questions.
* In your implementation you use the std::function adaptor. You can remove it by making some of the member functions templates (e.g. getValue).
I see. Out of curiosity, what is the reason to do so (maybe performance)? I chose std::function over template parameters as I find it quite convenient; e.g. I can pass nullptr to the function and then check this condition inside the function body. Turning getValue into a template function will imply major refactoring: surely I will do it if it's required, and if you have some spare time, I'd appreciate insights based on the actual code.
* I'm interested in the case when the Key type is integral. In this case one can make the following optimization. Let density be the "number of elements in the map" / "largest element in the map" ratio. If the density is large, it is much better to use a vector instead of a hash map. The idea is to measure the density and, when it exceeds a certain threshold, switch from the hash map to a vector. Did you consider this optimization? (I'm aware this is not as easy as I described.) This makes sense when you consider the unbounded knapsack problem. In UKP it is not clear (you have to decide at runtime) whether you should use the vector or the hash map strategy. IMO the "integral Key" case covers many standard dynamic programs.
Interesting. How would you do it, with a template specialization for int/long/etc... and maybe std::pair keys? This would be relatively easy; what I find tricky is to dynamically switch from the hash map to the vector while preserving thread safety. Do you have any suggestions/examples?
On 22 November 2013 14:35, Giacomo Drago wrote:
Hi Giacomo, thank you for your contribution.
I'm interested in this kind of library; in particular, I often implement dynamic programs. I've got some remarks/questions.
* In your implementation you use the std::function adaptor. You can remove it by making some of the member functions templates (e.g. getValue).
I see. Out of curiosity, what is the reason to do so (maybe performance)? I chose std::function over template parameters as I find it quite convenient; e.g. I can pass nullptr to the function and then check this condition inside the function body.
std::function is not free: you gain type erasure but you lose some performance. You can read about it, for example, here: http://stackoverflow.com/questions/5057382/what-is-the-performance-overhead-of-stdfunction
Turning getValue into a template function will imply major refactoring: surely I will do it if it's required, and if you have some spare time, I'd appreciate insights based on the actual code.
I think in this case it doesn't need major refactoring: each time you use the underlying type, you add a template parameter to the function. Maybe I missed something.
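The shape of the change could look like this (a simplified sketch with a hypothetical Memo class, not the real CppMemo interface): the callable's concrete type is deduced at the call site instead of being erased behind std::function, so the compiler can inline it.

```cpp
#include <unordered_map>

template <typename Key, typename Value>
class Memo {
public:
    // Before: Value getValue(const Key&, std::function<Value(Key)>).
    // After: the callable's concrete type F is a template parameter,
    // deduced at the call site -- no type erasure, inlinable.
    template <typename F>
    Value getValue(const Key& key, F fn) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;  // memoized hit
        Value v = fn(key);
        cache_.emplace(key, v);
        return v;
    }

private:
    std::unordered_map<Key, Value> cache_;
};
```

The nullptr-check convenience Giacomo mentions can be kept with an extra overload taking std::nullptr_t, so it need not force std::function everywhere.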
* I'm interested in the case when the Key type is integral. In this case one can make the following optimization. Let density be the "number of elements in the map" / "largest element in the map" ratio. If the density is large, it is much better to use a vector instead of a hash map. The idea is to measure the density and, when it exceeds a certain threshold, switch from the hash map to a vector. Did you consider this optimization? (I'm aware this is not as easy as I described.) This makes sense when you consider the unbounded knapsack problem. In UKP it is not clear (you have to decide at runtime) whether you should use the vector or the hash map strategy. IMO the "integral Key" case covers many standard dynamic programs.
Interesting. How would you do it, with a template specialization for int/long/etc... and maybe std::pair keys? This would be relatively easy; what I find tricky is to dynamically switch from the hash map to the vector while preserving thread safety. Do you have any suggestions/examples?
My first thought was specialising the class for the case when std::is_integral<Key>::value is true. Unfortunately, I do not have any smart thoughts concerning thread safety.
I appreciate your feedback and I'll work on your suggestion, but do you think this library may eventually find its place in Boost?
I can only say that I'd like to see a library like that in Boost, but I'm afraid this is not my decision.
Regards,
Piotr
Thank you for your time.
Giacomo
I successfully applied your suggestion of avoiding std::function. I have also cleaned up and refactored the whole source code. I gained a remarkable ~30% performance boost. As for the specialization for integral keys, I put it in the "backlog" but made no progress so far.
I can only say that I'd like to see a library like that in Boost, but I'm afraid this is not my decision.
Thank you, Piotr. It appears there is no such interest at the moment. However, this conversation gave me useful insights. Have a look at the latest version of the library: I encourage you to contribute (with benchmarks, further ideas, or source code) if you like.
Regards,
Giacomo
----- Original Message -----
I can only say that I'd like to see a library like that in Boost, but I'm afraid this is not my decision.
Thank you, Piotr. It appears there is no such interest at the moment.
Specific to the here and now, I assume that the migration to GitHub is occupying a lot of spare cycles in the community.

In my opinion, memoization seems like a useful capability. I have no particular application for it on my to-do list, but having used memoization in Prolog before, it seems like the sort of thing I might someday find myself wishing were in Boost.

IMO, proposals like this fall into a certain category that might get less traction because, if you randomly selected 1000 programmers, only a handful (at most) would have an application needing the proposed library. I think my current proposal for edit_distance/edit_alignment is also in this category. So is the handful of 'tree' proposals. Most applications simply don't need it. Finding the people with the right combination of knowledge/interest in Boost, plus the domain knowledge to give good feedback, is tricky. And yet... when you want it, you want it :) I'm not sure what the answer to that quandary is. It's not like Boost has none of these; Quaternion/Octonion come to mind as an example.

I've seen other projects (e.g. Octopress) that maintain a page of "notable 3rd-party contributions" that aren't in the official project but can be used as a resource for people interested in capability X that isn't in the official distribution. Alternatively, the new GitHub library organization will allow people to maintain customized library forks, so that is a possible avenue. For example, once the dust clears, I could fork boost::algorithm and maintain my own version that includes edit_distance, and periodically merge in the official version, etc.
In my opinion, memoization seems like a useful capability.
I am glad to know! I would like to remark that this library doesn't just provide memoization: it can also turn some recursive DP algorithms into non-recursive (stack-based) ones. Moreover, it provides thread safety and automatic parallelization out of the box...
IMO, proposals like this fall into a certain category, that might get less traction because if you randomly selected 1000 programmers, only a handful (at most) would have an application needing the proposed library.
I totally see your point: I didn't expect much feedback, and in no way do I think it is due to me. ;)
Best
Giacomo
participants (3)
- Erik Erlandson
- Giacomo Drago
- Piotr Wygocki