On Mon, April 30, 2007 19:47, Manuel Jung wrote:
Hi,
I have to count a lot of words. Up to now I did it with MySQL, because it was easy, and the result is saved there anyway. Now I thought I could speed this up a little by using a Multi_index list internally to store the words, so that I only have to insert the distinct words into the database. The words are stored in a UnicodeString from the ICU library. My code is very close to the one from the example "sequenced.cpp". I'm using the following definition:
  typedef multi_index_container<
    UnicodeString,
    indexed_by<
      sequenced<>,
      ordered_non_unique<identity<UnicodeString> >
    >
  > text_container;

  typedef nth_index<text_container,1>::type ordered_text;

  text_container tc;

I'm inserting new words with "tc.push_back(UnicodeString(NewWord));" and counting them exactly like in the example. I thought this would be fast, but it isn't. It eats up all my CPU and is still a lot slower than my old solution. I still hope I can speed this up before I have to switch back to MySQL. The profile of a run says that "boost::multi_index::safe_mode::check_same_owner<..." eats most of the CPU time.
Any suggestions on how to speed it up with Multi_index? Or any ideas about which other approach would be faster than MySQL inserts?
Thanks Manu
Try using the hashed_non_unique index instead of the ordered_non_unique one. It looks keys up by their hash values rather than through a comparison function.

My personal opinion is that if your words are in the database anyway, you should not retrieve them from there and then store them. The SQL solution will always be faster, since databases know how to optimize statements and result sets as well.

With Kind Regards,
Ovanes Markarian
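To illustrate the hash-based lookup suggested above, here is a minimal sketch of counting words with a hash table. It is not the original poster's code: it uses std::unordered_map as a stand-in for a hashed index, std::string as a stand-in for ICU's UnicodeString, and count_words is a hypothetical helper name chosen for this example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs.
// std::string is an assumption here; the original code stores UnicodeString.
std::unordered_map<std::string, std::size_t>
count_words(const std::vector<std::string>& words)
{
    std::unordered_map<std::string, std::size_t> counts;
    for (const std::string& w : words) {
        // operator[] inserts a zero count for a word not seen before,
        // so each word costs one hash lookup on average.
        ++counts[w];
    }
    return counts;
}
```

Each distinct word ends up as one entry, so only the final (word, count) pairs would need to be written back to the database. Separately, the check_same_owner frames in the profile belong to Boost.MultiIndex's safe mode, which is enabled by defining BOOST_MULTI_INDEX_ENABLE_SAFE_MODE, so it may be worth checking whether that macro is defined in the build being profiled.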