[seqfan] Re: project: looking for connections between sequences

Thu Aug 12 20:19:44 CEST 2010

On Wed, Aug 11, 2010 at 11:25 AM, Marc LeBrun <mlb at well.com> wrote:
>>="N. J. A. Sloane" <njas at research.att.com>
>> Several people have suggested that it would be interesting
>> to run the whole 200,000 sequences through Superseeker,
>> to see if there are interesting connections to be discovered.
>
> This is great, a longstanding dream!  Some ideas and suggestions:
>[...]
>  *  Create a distributed superseeker

This is not trivial, and as Plouffe pointed out (I think), only
appropriate (beneficial) for a subset of the search space.  I am still
unsure how best to describe the subspace for which it is useful.  I
am, however, glad that my comments may have sparked your fruitful
discussions.

>  *  Ensure superseeker is run on all submissions.

This makes a lot of sense to me (even as a layman), as did the
comments about being careful about the offset parameters and other
metadata.  The metadata/intersequence-relations should be validated
IMHO.

> Any seqfans interested in taking any of this on?

If a search space appropriate for distributed computation could be
identified, I would be interested in it.  However, as mentioned above,
it isn't trivial.  Also, I've looked into the hosting costs, and I can
tell you from talking to Centralized Distributed Project Admins that
it is not cheap.  On the other hand, an even more ambitious (and
cheaper) idea would be to write a P2P protocol... something like
"braintorrent" =).

Thanks for listening.  Most of your conversations are over my head.

Cheers,
Don