-----Original Message-----
From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Andy Thomason
Sent: 21 July 2015 13:11
To: boost@lists.boost.org
Subject: [boost] Genetics library: Volunteers needed
Hi All,
I am recruiting users for the proposed genetics library:
https://github.com/andy-thomason/genetics
We have a few simple examples of gene searching and I am working on a more complete aligner example and some performance improvements to the index data structure.
For data, you can obtain the human genome from:
ftp://ftp.ensembl.org/pub/release-81/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
Interesting problems we would like to solve:
Given a 20-character sequence with up to six errors, what is the fastest way to list all possibilities, other than a brute-force search (CRISPR)?
Can we use JNI to connect the library to Hadoop and other distributed search systems?
Can we construct a database of all known viral genomes including recombination?
Can we detect variations in MHC VDJ regions within a single sample?
Many other interesting puzzles are there to be found...
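To make the first puzzle concrete, here is a rough sketch of one standard alternative to brute force, the pigeonhole seed filter. It assumes that "errors" means substitutions only (no insertions or deletions) and that "possibilities" means positions in a reference sequence. If the pattern is cut into k+1 blocks, any match with at most k mismatches must contain at least one block unchanged, so only positions where some block occurs exactly need to be verified. The function names are hypothetical and are not taken from the genetics library. std::string::find stands in for what would really be a seed lookup in the index.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Count mismatches between pattern and text[pos, pos + pattern.size()),
// giving up early once the count exceeds max_errors.
inline std::size_t mismatches(const std::string& text, std::size_t pos,
                              const std::string& pattern,
                              std::size_t max_errors) {
    std::size_t errors = 0;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (text[pos + i] != pattern[i] && ++errors > max_errors) break;
    }
    return errors;
}

// Hypothetical helper: every start position in text where pattern occurs
// with at most max_errors substitutions, in ascending order.
inline std::vector<std::size_t> find_with_mismatches(
        const std::string& text, const std::string& pattern,
        std::size_t max_errors) {
    const std::size_t m = pattern.size();
    const std::size_t blocks = max_errors + 1;
    assert(m >= blocks);  // every block needs at least one character
    if (text.size() < m) return {};

    std::set<std::size_t> hits;  // ordered and de-duplicated
    for (std::size_t b = 0; b < blocks; ++b) {
        // Block b covers pattern[begin, end); sizes differ by at most one.
        const std::size_t begin = b * m / blocks;
        const std::size_t end = (b + 1) * m / blocks;
        const std::string seed = pattern.substr(begin, end - begin);

        // Each exact occurrence of the seed proposes one candidate alignment.
        for (std::size_t at = text.find(seed); at != std::string::npos;
             at = text.find(seed, at + 1)) {
            if (at < begin) continue;  // pattern would start before the text
            const std::size_t start = at - begin;
            if (start + m > text.size()) continue;  // would run past the end
            if (mismatches(text, start, pattern, max_errors) <= max_errors)
                hits.insert(start);
        }
    }
    return std::vector<std::size_t>(hits.begin(), hits.end());
}
```

With 20 characters and six errors the seven blocks are only two or three bases long, so nearly every position becomes a candidate and the filter saves little on its own. That is why this case is an interesting puzzle rather than a solved one.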
Andy.
Potential users may find the draft docs useful at https://rawgit.com/andy-thomason/genetics/master/doc/html/index.html (though they lack some of the icons and style sheets needed to see them in all their glory ;-)