Try using a hashed_non_unique index instead of an ordered_non_unique one. It looks keys up by their hash value rather than through a comparison function.
Would this really work? If I use the hashed_non_unique index, I can't use std::distance and "upper_bound" to get the count of identical words, because the index isn't sorted anymore. Or am I wrong?
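On the counting question: a hashed container has no lower_bound/upper_bound, but equal keys are still stored next to each other, so equal_range (or count) gives the same number. Boost.MultiIndex hashed indices provide count and equal_range as well. Below is a minimal sketch using the standard-library containers std::multiset and std::unordered_multiset as stand-ins for the two index types; the helper names count_sorted and count_hashed are made up for this illustration and are not part of any library.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <unordered_set>

// Sorted container: the equal words form the range [lower_bound, upper_bound).
std::size_t count_sorted(const std::multiset<std::string>& words,
                         const std::string& w) {
    return static_cast<std::size_t>(
        std::distance(words.lower_bound(w), words.upper_bound(w)));
}

// Hashed container: there is no ordering, but equal_range still returns
// the group of elements whose key compares equal to w.
std::size_t count_hashed(const std::unordered_multiset<std::string>& words,
                         const std::string& w) {
    const auto range = words.equal_range(w);
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```

Both helpers return the same number for the same input, and words.count(w) is a shorter way to get it from either container. What a hashed index cannot give you is ordered traversal or range queries over different keys.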
My personal opinion is that if your words are in the database anyway, you should not retrieve them from there and then store them again. An SQL solution will always be faster, since databases know how to optimize statements and result sets.
The original data does not come from the database; if it did, I would do the counting directly with a user-defined function or stored procedure. In my application the data is downloaded from the internet and written to the DB either before or after counting the words. (I'm counting in the database with "INSERT ... ON DUPLICATE KEY UPDATE" statements.) Cheers, Manu