[seqfan] Re: 3-sequences relations using Robert Gerbicz's seeker.c

Charles Greathouse charles.greathouse at case.edu
Tue Aug 24 16:50:29 CEST 2010


I'm fascinated with your results, Georgi.  I *do* think that extending
it to 100 or more terms with the b-files is of value, even if this
increases the computational cost, to reduce the amount of human time
needed.  Of course it may be more efficient to generate relations with
30 terms and then prune those that fail to hold out to the length of
the shortest b-file involved, rather than directly including the extra
terms, since you're not memory-local in any case.
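
A minimal sketch of that two-stage idea in Python, under stated
assumptions: a b-file is plain text with one "n a(n)" pair per line
(blank lines and '#' comments skipped), and a candidate relation is
modelled as a predicate on three terms at the same index.  The exact
form of the relations seeker.c reports is not given in this message, so
the a(n) + b(n) = c(n) shape, the names parse_bfile, survives and prune,
and the toy data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Terms = Dict[int, int]                      # index n -> a(n)
Relation = Callable[[int, int, int], bool]  # assumed shape: predicate on three terms


def parse_bfile(text: str) -> Terms:
    """Parse b-file text: one 'n a(n)' pair per line; skip blanks and '#' comments."""
    terms: Terms = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        n, value = line.split()[:2]
        terms[int(n)] = int(value)
    return terms


def survives(relation: Relation, seqs: Sequence[Terms]) -> bool:
    """Check the relation at every index present in all sequences,
    i.e. out to the length of the shortest b-file involved."""
    common = set(seqs[0]).intersection(*seqs[1:])
    return all(relation(*(s[n] for s in seqs)) for n in sorted(common))


def prune(candidates: List[Tuple[Tuple[str, str, str], Relation]],
          bfiles: Dict[str, Terms]) -> List[Tuple[str, str, str]]:
    """Keep only the candidate relations that still hold on the longer data."""
    return [names for names, relation in candidates
            if survives(relation, [bfiles[name] for name in names])]


def make_bfile(values) -> str:
    """Build toy b-file text from a list of terms (illustration only)."""
    return "# toy data\n" + "\n".join(f"{n} {v}" for n, v in enumerate(values))


# Toy data: W agrees with Z on the first 30 terms and then diverges.
bfiles = {
    "X": parse_bfile(make_bfile([n * n for n in range(100)])),
    "Y": parse_bfile(make_bfile([n for n in range(60)])),
    "Z": parse_bfile(make_bfile([n * n + n for n in range(100)])),
    "W": parse_bfile(make_bfile([n * n + n if n < 30 else 0 for n in range(100)])),
}


def is_sum(a: int, b: int, c: int) -> bool:
    return a + b == c


candidates = [(("X", "Y", "Z"), is_sum), (("X", "Y", "W"), is_sum)]
kept = prune(candidates, bfiles)
print(kept)
```

Here the relation involving W passes a 30-term search but is dropped
once it is checked against the 60 indices the three b-files share.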

I think that restricting to an 'interesting subset' of sequences has
promise, too -- for starters, perhaps the core and nice sequences.
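
One way such a restriction might be sketched in Python, assuming the
keywords are available as internal-format lines of the form
"%K A-number keyword,keyword,...".  The A-numbers and keyword lists
below are made up for illustration, and treating a sequence as
interesting when it carries either keyword (rather than both) is an
assumption, as are the names parse_keywords, interesting and restrict.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def parse_keywords(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect keywords from lines like '%K A999991 core,nonn,nice'."""
    keywords: Dict[str, Set[str]] = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "%K":
            keywords[parts[1]] = set(parts[2].split(","))
    return keywords


def interesting(keywords: Dict[str, Set[str]],
                wanted: FrozenSet[str] = frozenset({"core", "nice"})) -> Set[str]:
    """Sequences carrying at least one of the wanted keywords."""
    return {name for name, kws in keywords.items() if kws & wanted}


def restrict(relations: List[Tuple[str, str, str]],
             subset: Set[str]) -> List[Tuple[str, str, str]]:
    """Keep relations whose three sequences all lie in the subset."""
    return [rel for rel in relations if all(name in subset for name in rel)]


# Made-up entries, for illustration only.
sample = [
    "%K A999991 core,nonn,nice",
    "%K A999992 nonn,easy",
    "%K A999993 nonn,nice",
    "%K A999994 core,nonn",
]
subset = interesting(parse_keywords(sample))
relations = [
    ("A999991", "A999993", "A999994"),
    ("A999991", "A999992", "A999993"),
]
print(restrict(relations, subset))
```

Only the first relation is kept, since the second involves a sequence
with neither keyword.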

> 3. a new computer program may prune the large number of relations e.g. by verifying them to the max available terms or deleting trivially related (by definition) sequences.

I'll be honest -- I think this is probably harder than anything that
has been done so far in such analyses.  If you could do this it would
be great -- but don't kid yourself, it's a hard task.  In particular,
determining which relationships are 'trivial' and which are
interesting is difficult.

Charles Greathouse
Analyst/Programmer
Case Western Reserve University

On Tue, Aug 24, 2010 at 5:42 AM, Georgi Guninski <guninski at guninski.com> wrote:
> On Mon, Aug 23, 2010 at 10:01:29PM -0400, Charles Greathouse wrote:
>> That does sound interesting.  But I think it's infeasible for other
>> reasons: the human time needed to sift through the resulting
>> sequences.
>>
>> If (extrapolating) there are a million 3-sequence relationships
>> between 200,000 sequences, then we'd expect on the order of a hundred
>> million 4-sequence relationships between 200,000 sequences.
>>
>> In fact, even just the 3-sequence relations seem hard to analyze. If
>> there are 500 active SeqFans members (surely an overestimate) then
>> each would need to check 2000 sequence relations.  Perhaps half would
>> be trivial and another quarter could be dismissed without much work,
>> but that's still 500 difficult relations per person.  50 I could
>> imagine; 500 would be too many to ask.  50,000 seems entirely more
>> than a person could reasonably check.
>>
>> Charles Greathouse
>> Analyst/Programmer
>> Case Western Reserve University
>>
>
> Charles, i agree with you.
>
> the post was just a computational experiment, i don't claim it makes any sense or is interesting/worth.
>
> some marginal benefits of the test may be:
>
> 1. the experiment was just a toy experiment based on too few terms. if instead of 30 terms 100 terms were used the # of relations would be considerably smaller imo. try searching for the integers 1 .. 30 and 1 .. 45 in oeis. (126 vs 67). btw, the choice of just 30 terms may be a big mistake of mine.
>
> to play devil's advocate, the oeis may genuinely contain an enormous number of inter-relations that is infeasible for the humans on seqfan, so what ;)
>
> 2. i suppose most people are not interested in *all* sequences, they are *particularly* interested in a small subset of them. so if someone is doing a web search for a sequence or a pair of sequences he may find the list of potential relations and ideally find a nontrivial relation by examining them (that's the main reason i posted a large list. it happened to me when doing web search ironically to end up on my site - the search results were < 10)
>
> 3. a new computer program may prune the large number of relations e.g. by verifying them to the max available terms or deleting trivially related (by definition) sequences.
>
>
> _______________________________________________
>
> Seqfan Mailing list - http://list.seqfan.eu/
>



