[seqfan] A metric and UI for Pandora-like OEIS exploration
mrob27 at gmail.com
Mon Oct 26 11:03:39 CET 2009
I noticed that the scores were slightly biased by the A-numbers:
A000111 is more likely to match A000110 simply because of the
resemblance in the A-numbers. So I eliminated that by removing the
A-number from the beginning of each line in the database during the
initial format-conversion step.
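
A minimal sketch of that stripping step, not the actual conversion code. It assumes each database line has the layout "%<tag> A<six digits> <text>"; the name strip_a_number and the sample lines are made up for illustration. Only the leading A-number field is dropped, so cross-references inside the text are kept.

```python
import re

# Assumed line layout: "%<tag> A<6 digits> <text>" (an assumption, not
# taken from the actual conversion script).
_ANUM_PREFIX = re.compile(r"^(%\S)\s+A\d{6}\s?")

def strip_a_number(line: str) -> str:
    """Drop the leading A-number field, keeping the tag and the text.

    Lines that do not match the assumed layout are returned unchanged.
    """
    return _ANUM_PREFIX.sub(r"\1 ", line, count=1)

# Hypothetical sample lines, for illustration only.
sample = [
    "%N A000110 Bell or exponential numbers.",
    "%C A000110 See also A000111.",
]
stripped = [strip_a_number(line) for line in sample]
```

With the prefix gone, two entries no longer look alike merely because their own A-numbers share leading digits.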
I also found a significant bug that made it overestimate the coverage
of A over B, which skewed all the scores.
Then I re-did each of the tests in my previous message; all but one
result was affected. Here's what it does now:
Tribonacci numbers: A000213 <-> A001648 (score 0.270) (Tetranacci
numbers without leading 4)
Prime numbers: A000040 <-> A019590 (score 0.225) (Fermat's Last Theorem)
Powers of 2: A000079 <-> A005408 (score 0.284) (The odd numbers)
Pennies sequence: A005577 <-> A005576 (score 0.447) (A different
Composite numbers: A002808 <-> A018252 (score 0.357) (The nonprime numbers)
The "How four dogs meet in a field" sequence gave the same match, just
with a different score:
A006451 <-> A006454 (score 0.324)
Kaprekar triples: A006887 <-> A060768 (score 0.268) (Pseudo-Kaprekar triples)
Greedy Egyptian fractions: A100140 <-> A117116 (score 0.181)
(Denominators of an Egyptian Fraction for phi = (1+sqrt(5))/2)
I am confident it is working, at least for sequences that have useful
stuff in the %C and %H fields. I am also quite happy with the
responsiveness on the 16-thread Nehalem system. The roughly 15-second
wait on the Core 2 Duo machine would get rather tiring after a while.
I'll start rating multiple sequences tomorrow. Once it lets you do
each of the basic functions I described in my first message, I'll
publish the source code.
Robert Munafo -- mrob.com