Chris Russell wrote:
[lots of valuable tips about caching, paging etc]
Thanks for all the input - I guess the problem is much, much harder than I expected (as always... :-) My initial hope was to develop the app in BGL (since I'm somewhat familiar with it) and try it out on smaller examples just to get everything up and running. Then I could optimize the graph data structure for huge datasets without touching the actual code that uses the graph. Now I'm starting to think that this is a stupid idea and that I'm better off coding the thing from scratch with the huge dataset in mind.
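For reference, the separation described above (algorithm code that never touches the storage directly) can be sketched roughly as follows. This is plain C++ and not the actual BGL interface: `AdjListGraph`, `CsrGraph`, `num_vertices`, `for_each_neighbor` and `count_reachable` are all names made up for illustration, and a real out-of-core structure would of course be far more involved than the compact layout shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical small-scale graph: one adjacency vector per vertex.
struct AdjListGraph {
    std::vector<std::vector<std::size_t>> adj;
};

std::size_t num_vertices(const AdjListGraph& g) { return g.adj.size(); }

template <typename F>
void for_each_neighbor(const AdjListGraph& g, std::size_t v, F f) {
    for (std::size_t w : g.adj[v]) f(w);
}

// Hypothetical compact graph (CSR layout): all edge targets in one flat
// array; offsets[v] .. offsets[v + 1] delimit the neighbors of v.
struct CsrGraph {
    std::vector<std::size_t> offsets;  // size is num_vertices + 1
    std::vector<std::size_t> targets;
};

std::size_t num_vertices(const CsrGraph& g) {
    return g.offsets.empty() ? 0 : g.offsets.size() - 1;
}

template <typename F>
void for_each_neighbor(const CsrGraph& g, std::size_t v, F f) {
    for (std::size_t i = g.offsets[v]; i < g.offsets[v + 1]; ++i) f(g.targets[i]);
}

// Algorithm code: breadth-first count of the vertices reachable from start.
// It relies only on num_vertices() and for_each_neighbor(), so it compiles
// unchanged against either storage type.
template <typename Graph>
std::size_t count_reachable(const Graph& g, std::size_t start) {
    assert(start < num_vertices(g));
    std::vector<bool> seen(num_vertices(g), false);
    std::queue<std::size_t> pending;
    seen[start] = true;
    pending.push(start);
    std::size_t count = 0;
    while (!pending.empty()) {
        std::size_t v = pending.front();
        pending.pop();
        ++count;
        for_each_neighbor(g, v, [&](std::size_t w) {
            if (!seen[w]) {
                seen[w] = true;
                pending.push(w);
            }
        });
    }
    return count;
}
```

Swapping `AdjListGraph` for `CsrGraph` needs no change to `count_reachable`. What the sketch does not show is whether the access pattern of the algorithm still suits the new storage, which is the hard part with huge datasets.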
And when you're satisfied you can do no better, and it's still not fast enough (and it never is), then you can decide if shelling out the $$$ for things like striped fiber channel drive arrays controlled by 64-bit PCI controllers is worth it. They certainly make a difference - we used these monsters a lot back when I was working on video editing stuff.
Unfortunately it's not up to me - I'd have to convince my boss which could be... ahem... somewhat tricky :-)
Hope this helps and good luck! What are you crunching out of curiosity (if you're at liberty to say?)
It's a tool for shotgun sequence assembly, a technique for puzzling together large quantities of short DNA sequences into longer contiguous sequences. Google "shotgun sequencing" for more info. Again, thanks for the valuable comments! /Erik