[Multi_index] Performance like sequenced.cpp example

Hi,
I have to count a lot of words. Up to now I did it with MySQL, because it
was easy; the result is saved there anyway. Now I thought I could speed this
up a little by using a Multi_index list internally to store the words, so
that I only have to insert all the different words. The words are stored
in a UnicodeString from the ICU library.
My code is very close to the one from the example "sequenced.cpp".
I'm using the following definition:
typedef multi_index_container<
UnicodeString,
indexed_by<
sequenced<>,
ordered_non_unique
text_container;
typedef nth_index

On Mon, April 30, 2007 19:47, Manuel Jung wrote:
Try using the hashed_non_unique index instead of the ordered_non_unique one. It uses hash values to access keys rather than a comparison function.
My personal opinion is that if your words are in the database anyway, you should not retrieve them from there and then store them. An SQL solution will always be faster, since databases know how to optimize statements and result sets as well.
With Kind Regards,
Ovanes Markarian

Would this really work? If I use the hashed_non_unique index, I can't use std::distance and "upper_bound" to get the count of identical words, because the index isn't sorted anymore. Or am I wrong?
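To make the question concrete, here is a minimal sketch of the counting idiom that depends on sorted order. It uses std::multiset<std::string> as a stand-in (an assumption of this sketch; the thread uses UnicodeString and an ordered_non_unique index, whose lookup members mirror those of std::multiset). The names sorted_words and count_sorted are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>

// Stand-in for an ordered_non_unique index: equal words sit next to
// each other because the container is kept sorted.
typedef std::multiset<std::string> sorted_words;

// Hypothetical helper: number of occurrences of `word`.
// All equal words form one contiguous range [lower_bound, upper_bound).
std::size_t count_sorted(const sorted_words& words, const std::string& word)
{
    sorted_words::const_iterator first = words.lower_bound(word);
    sorted_words::const_iterator last  = words.upper_bound(word);
    std::size_t n = static_cast<std::size_t>(std::distance(first, last));
    assert(n == words.count(word)); // sanity check against the count member
    return n;
}
```

lower_bound and upper_bound only make sense on a sorted sequence, which is why this idiom does not carry over to a hashed index as-is.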
The original data does not come from the database; if it did, I would do the counting directly with a user-defined function or stored procedure. In my application the data is downloaded from the internet and written to the DB before or after the words are counted. (I'm counting in the database with "INSERT ... ON DUPLICATE KEY UPDATE" statements.)
Cheers
Manu

On Mon, April 30, 2007 20:26, Manuel Jung wrote:
Please take a look at:
http://www.boost.org/libs/multi_index/doc/reference/hash_indices.html#hash_i...
There is a member count (2 overloads), which counts all items with a given key, and another
member equal_range (2 overloads), which returns the pair
Ok, wanted to be sure. ;)
Cheers Manu
With Kind Regards, Ovanes Markarian
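
The count and equal_range members mentioned above can be tried out without Boost. This is a minimal sketch that uses std::unordered_multiset<std::string> as a stand-in for a hashed_non_unique index (an assumption of this sketch, not code from the thread); the names hashed_words, count_hashed and count_hashed_range are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_set>
#include <utility>

// Stand-in for a hashed_non_unique index: no ordering, lookup by hash.
typedef std::unordered_multiset<std::string> hashed_words;

// Occurrences of `word` via the count member.
std::size_t count_hashed(const hashed_words& words, const std::string& word)
{
    return words.count(word);
}

// Same result via equal_range: equal elements are stored adjacently
// even though the container as a whole is unsorted.
std::size_t count_hashed_range(const hashed_words& words, const std::string& word)
{
    std::pair<hashed_words::const_iterator, hashed_words::const_iterator> r =
        words.equal_range(word);
    std::size_t n = static_cast<std::size_t>(std::distance(r.first, r.second));
    assert(n == words.count(word)); // both members must agree
    return n;
}
```

So no sorted order is needed to count duplicates; equal_range is mainly useful when the matching elements themselves have to be visited.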

I took a look at it. Thank you very much. I have never used a hashed index, but I should sometime. For now, thanks to the quick solution a few posts earlier, I will optimize elsewhere. But I'll come back if needed!
Thank you for your help,
Bye
Manu

Hello Manuel,
----- Original message -----
From: Manuel Jung
This trace indicates that you've set Boost.MultiIndex safe mode on; this and its companion invariant-checking mode are huge CPU eaters, intended only for catching programming errors in debug builds. Please turn them off and time again: is the performance adequate now?
Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
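
For reference, safe mode and invariant checking are switched on by defining the macros BOOST_MULTI_INDEX_ENABLE_SAFE_MODE and BOOST_MULTI_INDEX_ENABLE_INVARIANT_CHECKING before the library headers are included. A minimal sketch of keeping them out of optimized builds, assuming NDEBUG is what distinguishes release from debug builds in your setup:

```cpp
// Enable the expensive checking modes only when NDEBUG is not defined,
// i.e. in debug builds (assumption: release builds define NDEBUG).
#if !defined(NDEBUG)
#define BOOST_MULTI_INDEX_ENABLE_INVARIANT_CHECKING
#define BOOST_MULTI_INDEX_ENABLE_SAFE_MODE
#endif

// The Boost.MultiIndex headers must be included after this point,
// so that they see the macros (or their absence).
```

If the macros are instead passed on the compiler command line, they need to be removed from the flags of the build being timed.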
participants (3)
- Joaquín M López Muñoz
- Manuel Jung
- Ovanes Markarian