[seqfan] Re: Submission suggestion

David Wilson davidwwilson at comcast.net
Sun Nov 6 00:19:13 CET 2011


Just an IMHO, but:

It is a bad thing® to mix form and content. Knuth designed TeX under the
premise that the author should write the story and the publisher should
typeset the book. Likewise, web developers introduced style sheets to make
the look of HTML pages independent of their content.

Unfortunately, in the OEIS, there is a place where display and content are
interdependent, specifically, the %STU lines in the database entries.

As content, the %STU lines represent the initial terms of a sequence. This
means that the %STU lines must fulfill content requirements. For example,
if a sequence is hard to compute, we want those hard-bought values in the
%STU lines. Likewise, if the initial elements of two sequences are similar,
we might like the %STU lines to include enough terms to distinguish them.

As display, the %STU lines determine which elements are shown in various
sequence views (the sequence page, the internal format page, the list page,
the pin plot, etc.). This adds conflicting display requirements. For
example, we don't want to display too many elements, so we must keep the
%STU lines down to 260 characters, and we don't want enormous values in the
%STU lines. This creates issues for the submitter when entering sequences:
they cannot include too many terms, and so on.

Originally, the %STU lines represented the entirety of sequence terms
stored in the database, which was fine when the database consisted of a
pile of punch cards. But when the database went online, it was soon found
that users often wanted access to more term data than was included in the
%STU lines (for example, users wanted to see all 88 Armstrong numbers, not
just the few shown in the A005188 %STU lines). However, because of their
display role, the %STU lines could not be greatly extended, so it was
decided to augment the database with b-files. This creates additional
issues for the submitter: if they are entering a long sequence, they must
enter the data twice, once for the %STU lines and once for the b-file, and
they must make sure the two sources of data don't conflict.
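
To make the "two sources of data" problem concrete, here is a minimal
sketch of the kind of check a submitter currently has to do by hand. It
assumes the usual b-file layout of one "n a(n)" pair per line with "#"
comment lines; the function names and the sample data are made up for
illustration and are not part of any OEIS tool.

```python
def parse_bfile(text):
    """Parse b-file text into {index: value}, skipping blank and '#' lines."""
    terms = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        n, value = line.split()[:2]
        terms[int(n)] = int(value)
    return terms


def find_conflicts(stu_terms, offset, bfile_terms):
    """Return (index, stu_value, bfile_value) for every disagreement.

    bfile_value is None when the b-file has no entry for that index.
    """
    conflicts = []
    for i, value in enumerate(stu_terms):
        n = offset + i
        other = bfile_terms.get(n)
        if other != value:
            conflicts.append((n, value, other))
    return conflicts


# Hypothetical data: the last %STU term disagrees with the b-file.
bfile_text = """# made-up b-file excerpt
1 1
2 2
3 3
4 5
"""
stu = [1, 2, 3, 4]
print(find_conflicts(stu, 1, parse_bfile(bfile_text)))
```

If the terms lived only in the b-file, there would be nothing to compare.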

A lot of the complication in the submission process stems from the dual
roles of the %STU lines as content and display artifacts, and the
redundancy of the %STU lines and b-files. I would suggest that the process
would be simpler both from the user and the developer standpoint if we
separated content and display.

On the content side, I would move all the sequence data into the b-files,
and eliminate the %STU lines. When creating a sequence, the user can still
enter explicit terms or point to a b-file; either way, the terms are stored
in the b-file. In any view of the data (sequence page, list view, graphs,
etc.), data is read from the b-file. This eliminates content redundancy.
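
As a rough sketch of the "either way the terms are stored in the b-file"
step, explicitly entered terms could be turned into b-file text as follows.
This assumes the usual "n a(n)" line format; parse_terms and terms_to_bfile
are hypothetical names, and the default offset of 1 is only an assumption.

```python
def parse_terms(entered):
    """Turn a comma-separated string of terms into a list of ints."""
    return [int(piece) for piece in entered.split(",") if piece.strip()]


def terms_to_bfile(terms, offset=1):
    """Render terms as b-file text: one 'n a(n)' pair per line."""
    return "".join(f"{offset + i} {value}\n" for i, value in enumerate(terms))


# The first few Armstrong numbers, entered as explicit terms.
entered = "1, 2, 3, 4, 5, 6, 7, 8, 9, 153, 370, 371, 407"
print(terms_to_bfile(parse_terms(entered)), end="")
```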

On the display side, most of the views are pretty simple. In the sequence
page, we normally show 4 lines of terms. In the internal format, the %STU
lines are gone, so we don't need to show any terms. In the list view and
scatter plot, we show all the terms from the b-file; in the pin plot, we
show initial terms up to some fixed number of terms (say 100).

The only sticky point is that on some sequence pages we want to show more
than the usual number of elements to distinguish the sequence from similar
sequences or to guarantee that some important element appears on the page.
To accomplish this, we would add a new number to the %O line. If this
number is missing or zero, we use the default formatting. If it is nonzero
n, we extend the formatted sequence to include a(n). For example, if we
wanted to show all 88 Armstrong numbers on the sequence page, we would
include

%O A005188 1,2,88
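
A minimal sketch of how a page formatter might honor that extra number. The
default "4 lines of terms" is approximated here by a 260-character budget,
which is an assumption borrowed from the current %STU limit; parse_o_line
and terms_for_page are hypothetical names, and only the first and third %O
fields are interpreted.

```python
def parse_o_line(line):
    """Split '%O Axxxxxx a,b[,n]' into (a_number, offset, show_through).

    show_through is the proposed third field; it is 0 when absent.
    """
    tag, a_number, numbers = line.split()
    if tag != "%O":
        raise ValueError("not a %O line")
    fields = [int(x) for x in numbers.split(",")]
    show_through = fields[2] if len(fields) > 2 else 0
    return a_number, fields[0], show_through


def terms_for_page(bfile_terms, offset, show_through=0, budget=260):
    """Pick the initial terms to show on the sequence page.

    Terms are taken while they fit in `budget` characters (default
    formatting). If show_through is nonzero, the selection is extended
    so that a(show_through) is included even past the budget.
    """
    shown, used = [], 0
    n = offset
    while n in bfile_terms:
        cost = len(str(bfile_terms[n])) + 2  # the term plus ", "
        forced = show_through != 0 and n <= show_through
        if used + cost > budget and not forced:
            break
        shown.append(bfile_terms[n])
        used += cost
        n += 1
    return shown


# Made-up b-file contents (squares), just to exercise the selection.
bfile = {n: n * n for n in range(1, 201)}
_, offset, show_through = parse_o_line("%O A005188 1,2,88")
default_view = terms_for_page(bfile, offset)
extended_view = terms_for_page(bfile, offset, show_through)
print(len(default_view), len(extended_view))
```

A missing or zero third field leaves the default view unchanged, so
existing %O lines would not need to be edited.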

If we made this change, it would

- Eliminate having to submit sequence values twice as %STU values and
  b-file values.
- Eliminate arbitrary restrictions (e.g., the 260-character limit) on
  submitting explicit sequence values.
- Eliminate any possible inconsistency between %STU and b-file values,
  since %STU values no longer exist.

On 11/5/2011 3:21 PM, N. J. A. Sloane wrote:
> David Wilson suggested:
>
>   The submission page
> should automatically format the terms to database standards. This would
> include dropping terms if necessary to stay within formatting limits,
>
> Me: No, no, no. There are many reasons why we sometimes
> want more terms than usual (sequences that are hard to compute,
> sequences where we need to show a few more terms to distinguish it from
> its nbrs, also we may change the so-called limit at some time)
>
> This is a bad idea!
>
> Neil
>
> _______________________________________________
>
> Seqfan Mailing list - http://list.seqfan.eu/
>



