[seqfan] Re: Policy on near-duplicates

Peter Luschny peter.luschny at googlemail.com
Mon May 4 21:30:59 CEST 2009


ADA> Another common situation is when a(1) differs but all the other terms
ADA> are the same, due to different approaches in handling 1's unique
ADA> qualities. Some people know enough to start their entry in the search
ADA> box with a(2) or even a(3), but there will certainly be others who
ADA> start with a(1) and it would certainly benefit them to get a result
ADA> with a cross-reference to the sequence with the different a(1).

In a recent blog post I advocated the following: assign
an ID not to a single sequence but to an equivalence
class of sequences. In your example this could be:
two sequences get the same ID if they differ only
in the first (or first and second) term.
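
To make the idea concrete, here is a minimal sketch of such a grouping.
It is only an illustration, not a description of how the database works:
the names class_key, group_by_class and the sample data are hypothetical,
and it assumes all sequences are stored with the same offset and the same
number of terms.

```python
from collections import defaultdict


def class_key(terms, skip=1):
    """Return the class key: every term after the first `skip` terms.

    Two sequences of equal length get the same key exactly when they
    differ at most in their first `skip` terms.
    """
    return tuple(terms[skip:])


def group_by_class(sequences, skip=1):
    """Map each class key to the names of the sequences sharing it."""
    classes = defaultdict(list)
    for name, terms in sequences.items():
        classes[class_key(terms, skip)].append(name)
    return dict(classes)


# Hypothetical sample data: two variants that disagree only on a(1),
# plus one unrelated sequence.
sequences = {
    "variant_a": [1, 2, 3, 5, 7, 11, 13],
    "variant_b": [0, 2, 3, 5, 7, 11, 13],
    "squares": [1, 4, 9, 16, 25, 36, 49],
}

classes = group_by_class(sequences, skip=1)
for key, names in classes.items():
    print(names, "share the tail", key)
```

With skip=2 the same code covers the "first and second term" case. A
search that starts at a(2) then finds the whole class, whichever a(1)
the searcher had in mind.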

The differences between the actual sequences (or "schools
of thought" about the sequence) are to be explained
in the comment section.

I believe that this would dramatically reduce the
problems with references, cross-references and
searching, and cut down much of the noise.

Cheers, Peter



