Search question
cino hilliard
hillcino368 at hotmail.com
Sun Jan 15 07:04:55 CET 2006
Hi Frank et al,
>From: Russ Cox <rsc at swtch.com>
>To: "franktaw at netscape.net" <franktaw at netscape.net>
>CC: seqfan at ext.jussieu.fr
>Subject: Re: Search question
>Date: Sat, 14 Jan 2006 14:30:09 -0500
>
> > I am currently trying to identify sequences that need more entries, but do
> > not have the "more" (or "full") keyword. I can search for "-keyword:more
> > -keyword:full", but I would like to add to that a search for the absence of
> > a "%U" line. Is there any way I can do this, or that it could easily be
> > enabled?
>
>There isn't a way to do this; the machinery treats all
>sequence data lines the same.
>
>Russ
>
This can be done by downloading the 114 parts of the database and mining what
you want from it with software. I have written Bcx (C) programs that do this.
Extracting the parts takes 1 minute with a cable connection.
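As a rough illustration of this kind of mining (not the Bcx programs mentioned above), here is a minimal Python sketch. It assumes the downloaded parts use the internal line format, where each line is a tag, an A-number and a payload, e.g. "%K A000001 nonn,core", with sequence data on %S, %T and %U lines. The function names and the sample A-numbers are made up for the example.

```python
from collections import defaultdict

def parse_records(lines):
    """Group lines by A-number, keeping the payload of each tag.

    Assumes lines of the form '<tag> <A-number> <payload>'.
    A repeated tag overwrites the earlier payload, which is enough
    here because only the presence of %S/%T/%U and the %K payload matter.
    """
    records = defaultdict(dict)
    for line in lines:
        parts = line.rstrip("\n").split(" ", 2)
        if len(parts) < 2 or not parts[0].startswith("%"):
            continue  # skip anything that is not a tagged line
        tag, anum = parts[0], parts[1]
        records[anum][tag] = parts[2] if len(parts) > 2 else ""
    return records

def one_liners(records, keyword):
    """A-numbers that carry `keyword` and have %S data but no %T or %U line."""
    hits = []
    for anum, fields in records.items():
        keywords = fields.get("%K", "").split(",")
        if (keyword in keywords and "%S" in fields
                and "%T" not in fields and "%U" not in fields):
            hits.append(anum)
    return sorted(hits)

# Hypothetical sample data, not real database entries.
sample = [
    "%S A900001 1,2,3,4,5",
    "%K A900001 nonn,hard",
    "%S A900002 1,1,2,3,5,",
    "%T A900002 8,13,21,34,",
    "%U A900002 55,89",
    "%K A900002 nonn,hard",
]
print(one_liners(parse_records(sample), "hard"))
```

The same records could be filtered for Frank's original question by testing for the absence of "more" and "full" among the keywords together with the absence of a %U line.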
Here are some examples of sequences that have only one line of data, grouped
by keyword.
Keyword: dumb
A004740,A019440,A026081,A048659,A055200,A058230,A059916,A059969,A082390,A084912,
A085808,A101944,A102701,A102705,A103132,A107081,A108159,A111070,A111157,A111198,
A112733,A112747,A112748,A112749,A112750,A112766,A112767,A112782,A112783,A112784,
A112785,A112786,A113172,
Total 33 one liners out of 63 dumb sequences
Keyword: hard
A000066,A000162,A000236,A000341,A000348,A000372,A000375,A000376,A000403,A000438,
A000474,A000509,A000510,A000512,A000528,A000530,A000532,A000609,A000637,A000638,
A000679,A000769,A000789,A000791,A000882,A000937,A000952,A000983,A001071,A001072,
....
....
A112245,A112284,A112535,A112548,A112723,A112724,A112741,A112853,A112855,A112874,
A112879,A112880,A113276,A113457,A113459,A113461,A114601,A114628,A114629,A114630,
A114631,A114632,A114648,A114649,A114665,A114670,A114676,A114714,A114716,A087306,
Total 1690 one liners out of 2439 hard sequences
Some more statistics:
Keyword   Sequences   With only one line of terms
nice           6049                           939
uned            956                           251
easy          28794                          1210
nonn         106689                         13717
sign           6909                           234
Etc.
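Counts like these could be produced with a simple tally. The sketch below is only an assumption about how one might do it in Python: it takes, for each sequence, its set of keywords and its number of data lines (both presumed to have been extracted already) and counts totals and one-liners per keyword. The record layout and sample entries are hypothetical.

```python
from collections import Counter

def keyword_stats(records):
    """records maps A-number -> (set of keywords, number of data lines).

    Returns two Counters keyed by keyword: total sequences, and
    sequences with exactly one line of terms.
    """
    total, one_line = Counter(), Counter()
    for keywords, data_lines in records.values():
        for kw in keywords:
            total[kw] += 1
            if data_lines == 1:
                one_line[kw] += 1
    return total, one_line

# Hypothetical sample data, not real database entries.
sample = {
    "A900001": ({"nonn", "hard"}, 1),
    "A900002": ({"nonn", "easy"}, 3),
    "A900003": ({"sign", "easy"}, 1),
}
total, one_line = keyword_stats(sample)
for kw in sorted(total):
    print(f"{kw:6} {total[kw]:7} sequences, "
          f"{one_line[kw]:7} with only one line of terms")
```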
Tell me which files you want, or whatever else you want, and I will try to
extract it and send it to you directly. I've got a lot of time! However, I
would like to just send sequence numbers and maybe the definition.
Here is a great opportunity for some dumb sequences. :-)
I like your idea of checking short sequences for possible extension. We could
of course do a similar run for sequences with only n terms in the first row,
3 < n < k, to narrow it down more.
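A filter of that kind might look like the Python sketch below. It reads the bound in the message as 3 < n < k, where n is the number of terms in the first data row; that reading, the function names and the sample rows are all assumptions made for the example.

```python
def first_row_term_count(row):
    """Count comma-separated terms in a first-row payload such as '1,2,3,'."""
    return len([t for t in row.split(",") if t.strip()])

def short_first_rows(first_rows, k):
    """first_rows maps A-number -> first data row.

    Returns the A-numbers whose first row has n terms with 3 < n < k.
    """
    return sorted(a for a, row in first_rows.items()
                  if 3 < first_row_term_count(row) < k)

# Hypothetical sample data, not real database entries.
sample = {
    "A900001": "1,2,3",
    "A900002": "1,2,3,4,5",
    "A900003": "1,1,2,3,5,8,13,21,34,55,89,",
}
print(short_first_rows(sample, 10))
```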
The sequence list is a "living" entity. These are just some of the things we
can do to maintain its vitality.
Have fun,
Cino Hilliard