[seqfan] A metric and UI for Pandora-like OEIS exploration

Mon Oct 26 11:03:39 CET 2009

I noticed that the scores were slightly biased by the A-numbers:
A000111 is more likely to match A000110 simply because of the
resemblance in the A-numbers. So I eliminated that by removing the
A-number from the beginning of each line in the database during the
initial format-conversion step.
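
A minimal sketch of that stripping step, in Python. This is not the actual conversion code: the function name `strip_a_number` is made up for illustration, and the line layout is an assumption (either a bare leading A-number, or an OEIS-style field tag such as `%N` followed by the A-number).

```python
import re

# Assumed line shapes (not confirmed by the original post):
#   "A000111 rest of line"      bare A-number first
#   "%N A000111 rest of line"   OEIS-style field tag, then A-number
_A_NUMBER_PREFIX = re.compile(r"^(%\w\s+)?A\d{6,}\s*")


def strip_a_number(line: str) -> str:
    """Remove a leading A-number, keeping any %X field tag in front of it."""
    # Keep the tag (group 1) if present; drop the A-number and the
    # whitespace that follows it. Lines without a leading A-number
    # are returned unchanged.
    return _A_NUMBER_PREFIX.sub(lambda m: m.group(1) or "", line, count=1)


if __name__ == "__main__":
    for sample in ("%N A000111 Euler numbers", "A000110 Bell numbers"):
        print(strip_a_number(sample))
```

Only the A-number at the start of a line is removed. A-numbers mentioned later in a line (cross-references in comments, for example) are left alone, on the assumption that those are a legitimate signal of relatedness rather than a bias.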

I also found a significant bug that made it overestimate the coverage
of A over B, which skewed all the scores.

Then I re-ran each of the tests from my previous message; all but one
result changed. Here's what it does now:

Tribonacci numbers: A000213 <-> A001648 (score 0.270) (Tetranacci
numbers without leading 4)

Prime numbers: A000040 <-> A019590 (score 0.225) (Fermat's Last Theorem)

Powers of 2: A000079 <-> A005408 (score 0.284) (The odd numbers)

Pennies sequence: A005577 <-> A005576 (score 0.447) (A different
pennies sequence)

Composite numbers: A002808 <-> A018252 (score 0.357) (The nonprime numbers)

The "How four dogs meet in a field" sequence gave the same match, just
with a different score:
  A006451 <-> A006454 (score 0.324)

Kaprekar triples: A006887 <-> A060768 (score 0.268) (Pseudo-Kaprekar triples)

Greedy Egyptian fractions: A100140 <-> A117116 (score 0.181)
(Denominators of an Egyptian Fraction for phi = (1+sqrt(5))/2)

I am confident it is working, at least for sequences that have useful
stuff in the %C and %H fields. I am also quite happy with the
responsiveness on the 16-thread Nehalem system. The roughly 15-second
wait on the Core 2 Duo machine would get rather tiring after a while.

I'll start rating multiple sequences tomorrow. Once it lets you do
each of the basic functions I described in my first message, I'll
publish the source code.

--
 Robert Munafo  --  mrob.com