Preventing Duplicate Sequences

Sat May 10 01:53:37 CEST 2003

I can think of two reasons why two or more sequences that are almost 
alike (i.e., related by some simple transform) should NOT *necessarily* 
be consolidated into one. 

1) The sequences might actually be unrelated, agreeing (under some 
simple transform) only for the terms in the database. 
In this case, I would at least suggest noting in the comment line that 
the seemingly related sequences are *conjectured* to be related.
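
To make point (1) concrete, here is a small sketch using a standard 
textbook pair rather than anything taken from the database: the powers 
of two, 2^(n-1), and the maximal number of regions cut out of a circle 
by the chords joining n points on it, C(n,4) + C(n,2) + 1. The function 
names and the choice of five "stored" terms are only assumptions for 
illustration. It needs Python 3.8 or later for math.comb.

```python
from math import comb

def powers_of_two(n):
    """2^(n-1) for n >= 1."""
    return 2 ** (n - 1)

def circle_regions(n):
    """Maximal number of regions formed by joining n points on a circle."""
    return comb(n, 4) + comb(n, 2) + 1

def first_mismatch(f, g, start=1, limit=20):
    """Return the first index in [start, limit] where f and g differ, else None."""
    for n in range(start, limit + 1):
        if f(n) != g(n):
            return n
    return None

# Pretend the database stores only the first five terms of each sequence.
stored_terms = 5
a = [powers_of_two(n) for n in range(1, stored_terms + 1)]
b = [circle_regions(n) for n in range(1, stored_terms + 1)]

print(a == b)                                          # True
print(first_mismatch(powers_of_two, circle_regions))   # 6
```

With only five terms on file the two would look like duplicates, yet 
they part ways at n = 6 (32 versus 31), which is why a cautious 
"conjectured to be the same" note seems safer than a merge.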

2) Many related sequences (similar under simple transformations) may 
each have a basic definition that EIS users are likely to search for. (I 
would suppose that most users of the database never attempt to plug their 
sequences into superseeker.) If superseeker becomes very fast and 
efficient, then someday applying it automatically to all submitted 
sequences (which do not initially match anything) would be a good idea.
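
As a rough sketch of what such an automatic check could look like, the 
following tries a handful of simple transforms on a submitted sequence 
and compares each result against stored sequences. This is NOT how 
superseeker actually works; the transform list, the toy database, and 
every name below (transforms, is_prefix_match, find_candidates, toy_db) 
are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

def transforms(seq):
    """Yield (label, transformed sequence) pairs for a few simple transforms."""
    yield "identity", list(seq)
    yield "drop first term", list(seq[1:])
    yield "first differences", [b - a for a, b in zip(seq, seq[1:])]
    yield "partial sums", list(accumulate(seq))
    yield "add 1", [x + 1 for x in seq]
    yield "subtract 1", [x - 1 for x in seq]

def is_prefix_match(a, b, min_terms=4):
    """True if a and b agree on their common prefix of at least min_terms terms."""
    n = min(len(a), len(b))
    return n >= min_terms and a[:n] == b[:n]

def find_candidates(new_seq, database, min_terms=4):
    """Return (stored name, transform label) pairs worth a human look."""
    hits = []
    for name, stored in database.items():
        for label, transformed in transforms(new_seq):
            if is_prefix_match(transformed, stored, min_terms):
                hits.append((name, label))
    return hits

# Hypothetical two-entry database standing in for the real one.
toy_db = {
    "squares": [0, 1, 4, 9, 16, 25, 36, 49],
    "odd numbers": [1, 3, 5, 7, 9, 11, 13, 15],
}

submitted = [1, 2, 5, 10, 17, 26]   # n^2 + 1; matches nothing directly
print(find_candidates(submitted, toy_db))
# [('squares', 'subtract 1'), ('odd numbers', 'first differences')]
```

A hit here would only be grounds for a warning or a cross-reference, 
not for merging: n^2 + 1 has its own basic definition even though it is 
one step away from two other entries.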

Thanks,
Leroy Quet

PS: With regard to (1), I would assume that there are many (technically, 
there ARE infinitely many) sequences that match other, mathematically 
unrelated sequences EXACTLY for the terms in the database.
In this case, I would strongly encourage *conjecturing*, in the 
comment line or name line, that any "matching" sequences are indeed the 
same sequence, if the two sequences MIGHT be the same. And I would guess 
that there are also many examples of two sequences that are KNOWN to 
be unrelated yet match for all of the terms in the database.

The same goes for any sequences that "match" under a simple transformation.

Thanks again.

>Hello.  The subject is how to best *prevent* duplicate sequences -- or just
>to increase awareness and suggest that this topic may belong in the OEIS
>FAQ, perhaps as an extended answer to "What should I do before submitting
>a new sequence?".
>
>During about a week (while adding 49 other, related sequences)  I've
>encountered 19 duplicate sequences in the OEIS -- 10 I've already reported
>and 9 more I'm about to report.  (The latter set includes 3 sequences which
>are identical!).  NJAS has already stated that he'll merge the first 10
>pairs.
>
>There are many common-sense reasons why duplicates should be avoided -- not
>the least of which is that they cost lots of people extra work (and/or
>perhaps missed references if they find the "wrong" one first).
>
>The big question is:  Should there be an automated check at the time we
>submit new sequences to issue a warning (to the submitter and would-be
>updater) if an identical sequence (or nearly so) is already in the database?
>In spite of best efforts to check manually beforehand there could always be
>cases of two people sending in the same sequence at about the same time....
>
>[Of course, it is reassuring when both sequences really are identical
>(rather than accidentally differing in, say, one or two terms... :^j)  ).]
>
>Regards,
>Rick Shepherd