Another new lookup issue

Russ Cox russcox at gmail.com
Tue Jan 10 17:02:42 CET 2006


David Wilson:
> When I look up
>
> 1 1 2 3 5 9 16 28 49 86 151 264
>
> I get A097757
>
> Many of the queried numbers do not match sequence elements, but rather numbers
> that appear in the example text, which is a table of numbers.  I would think
> that if a user queries a list of numbers, they are expecting a match within the
> sequence proper, not in the comments, formulae, examples, etc.

What you say is true, but it's hard to implement that heuristic and
still keep the explanation of what a search means consistent.
Sometimes you might well be looking for numbers elsewhere in the entry
(for example, a few terms and a year).

To search only the sequence data, put commas between the numbers
(1,1,2,3,5,9,...), as the hint suggests.

Our general solution to these issues has been not to change the
definition of the search but to rank sequences well.  If there had been
a sequence in the database with those terms in that order,
it would have been returned before A097757 in the results.

> Also, when visually scanning a sequence output, my mind tends to ignore the
> sequence name, because it is in the purple header bar (to my embarrassment in a
> recent post).  Maybe just the sequence number should be in the bar, and the name
> moved down below the offset with the rest of the textual data.

I think you'll get used to this.  I definitely have.

Alexandre Wajnberg:
> What is the meaning of the little numbers at the right of the blue bar?
> Ex: +30 and 1086 for A000045.

The +30 is the sequence's query score -- how well it matches the
query.  You get O(100) points for matching ordering (for example,
having 1 3 5 rather than 1 2 3 4 5 when the query is 1 3 5).  You get
O(10) points for matching terms in certain lines (for example, sequence
data counts more, and sequence number counts a lot more).  The details
are not too interesting and are subject to change if the ranking isn't
working well.
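
As a rough illustration of the scoring idea above (not the actual
search code), here is a minimal Python sketch.  The weights, the line
names ("data", "number", "comment"), and the function names are all
hypothetical; the post only gives orders of magnitude, and the real
details may differ.

```python
# Hypothetical weights: only the orders of magnitude come from the post.
ORDER_BONUS = 100                      # query terms appear consecutively, in order
LINE_WEIGHTS = {"data": 10, "number": 50, "comment": 1}  # assumed line kinds


def has_run(tokens, query):
    """Return True if `query` occurs as a consecutive run inside `tokens`."""
    n = len(query)
    return any(tokens[i:i + n] == query for i in range(len(tokens) - n + 1))


def query_score(entry, query):
    """Score an entry (dict: line kind -> list of tokens) against a query."""
    score = 0
    for kind, tokens in entry.items():
        weight = LINE_WEIGHTS.get(kind, 1)
        # Points for each query term found on this line, weighted by line kind.
        score += weight * sum(1 for term in query if term in tokens)
    # One-time bonus if some line contains the terms in the queried order.
    if any(has_run(tokens, query) for tokens in entry.values()):
        score += ORDER_BONUS
    return score


odd = {"data": ["1", "3", "5", "7", "9"]}
naturals = {"data": ["1", "2", "3", "4", "5"]}
query = ["1", "3", "5"]
print(query_score(odd, query), query_score(naturals, query))
```

With these made-up weights, both entries contain all three terms, but
only the first has them in consecutive order, so it scores higher.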

The 1086 is the number of sequences in the database that reference
the given sequence.

The "relevance" sort is by query score, with ties broken by reference count.

That's how 1 3 5 manages to bring up the odd numbers and 2 3 5 the primes.
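
The "relevance" ordering described above can be sketched in Python as
a sort on two keys.  This is only an illustration under assumptions:
the Hit type and its field names are hypothetical, the entries other
than A000045 (+30, 1086) are placeholders, and it assumes that a
higher reference count wins a tie.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One search result (hypothetical structure)."""
    name: str
    score: int   # query score, e.g. the +30 shown in the bar
    refs: int    # number of sequences that reference this one


def relevance_sort(hits):
    """Sort by query score, breaking ties by reference count (both descending)."""
    return sorted(hits, key=lambda h: (h.score, h.refs), reverse=True)


hits = [
    Hit("placeholder-1", 30, 5),
    Hit("A000045", 30, 1086),
    Hit("placeholder-2", 130, 2),
]
for hit in relevance_sort(hits):
    print(hit.name, hit.score, hit.refs)
```

The placeholder with the highest score comes first; the two entries
tied at 30 are then ordered by how often they are referenced.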

Russ