On 14 March 2015 at 10:18, Antony Polukhin
2015-03-13 23:13 GMT+03:00 Cosmin Boaca
:
OK. Not a problem, we're not in a rush.
By the way, think of making `parent` a private variable and adding (get|set)_parent() functions. We can later try to make node even smaller by removing `parent` variable and getting parent node by calling intrusive::set::container_from_iterator(iterator);
I have thought a little about the container used for children. I think that a container with less insertion / find overhead would be better. I think that maintaining a sorted intrusive list / vector of pointers would perform better if the number of children for a node is small (something like < 50 , maybe < 100). We may tune this further on anyway, maybe using some kind of policy for the container.
I'm thinking of a slightly different approach: we can make a specific "leaf" node, that contains the remaining part of the value. For example, if we have two values in trie "hello word" and "hell's bells", then the trie would look like:
node['h']->node['e']->node['l']->node['l']->(leaf["o world"] | leaf['''s bells']);
But this must be done only after the existing code would be polished enough.
Related to this, I was thinking at changing the data structure to radix tree [0]. It would be more compact. You're idea it's something like a mix between trie and radix tree. In a radix tree the two strings would be represented like : node["hell"] -> (node["o world"] | node["'s bells"]). I think there are specific cases in which it performs worse than a trie, for instance when you add all the strings that may be formed by an alphabet, but this is a quite rare case, but in the general case it's more compact and I think it performs better. Cosmin