TOPIC: OEIS XML

Fri Apr 22 11:46:25 CEST 2005

From now on, I will use TDB to stand for the current text EIS database
and XDB for the proposed XML EIS database.

I have looked over your tentative requirements from an earlier message.
Many of your requirements are stated in terms of possible XML
solutions; I have tried to strip out these implementation details
and summarize the underlying requirements:

1.    The XDB should be easier for NJAS to maintain than the TDB.
2.    The transition from TDB to XDB should be minimally painful.
3.    The XDB should be normalized.
4.    The XDB should support something akin to source control,
    tracking contributors and modification times per datum.
5.    The XDB should distinguish between definitions, identities, and
    conjectures.
5a.   The XDB should track contributors and sources per datum.
6.    The XDB should support a simple language for defining sequences
    in terms of other sequences, common sequence classes, and common
    operations on sequences.
7.    The XDB should support equations such as those provided by
    MathML or TeX.
8.    The XDB data organization should improve on that of the TDB.
9.    The XDB should support data in formats transitional between the
    TDB format and the XDB format.

I would categorize these requirements as follows, and add some of my
own requirements:

I.  Implementation requirements:
    A.  Migration from TDB to XDB should be as seamless as possible.

II.  Functional requirements:
    A.  The XDB must support all current functions of the TDB.
    B.  The XDB should support some form of source control, with
        automatic tracking of modifications, including submitter,
        date/time, and source documents associated with the modification.
    C.  The XDB should support a simple, broad sequence definition
        language.
    D.  The XDB should support equations similar to MathML or
        TeX equations.

III.  Format requirements:
    A.  The XDB should support data formats that are transitional
        between the TDB and XDB.
    B.  The XDB data model should reflect as closely as is practical
        the perceived organization of the data.
    C.  The XDB should be normalized.

This requirements list is a prototype.  However, even with this much
data, I can safely say that the project must be implemented in stages,
with each stage implementing a chewable number of features.  As a
software engineer, my approach to this project would be approximately
as follows:

1.  Define an API for the OEIS database.  This involves surveying the
    current functionality of the TDB.  We need to know everything that
    NJAS, editors, the site, and users do with the database, since we
    need to maintain all those functions in the XDB.  For instance, once
    we have transitioned to the XDB, NJAS will still want to create and
    tweak sequences, the site will still want to look up sequences by the
    existing criteria, and users will still want to be able to download
    sections of the database.  Loss of any significant function will
    conflict with the requirement of seamless migration, and will be
    disruptive to the operation of the OEIS.

2.  Once the API is defined, design a transitional XDB whose internal
    organization is faithful to the current TDB.  I would suggest that
    TDB-style tags be in their own namespace so that they will not
    conflict with XDB-style tags later on.

3.  Implement the API on the transitional XDB containing only
    TDB-style tags.  This will prove that the XDB is complete datawise
    and provide a test bed for future XDB development.

4.  Design the XDB-style tags and elements.  Survey the TDB data and
    proposed enhancements.  XDB-style tags and elements should be in
    their own namespace.

5.  Implement the XDB-style tags.  As each tag or group of related
    XDB-style tags is implemented, the API should be modified to deal
    with the XDB-style tags.  It may be expedient at this point to convert
    TDB-style tags to XDB-style tags, or else teach the API to deal with
    mixed tags and postpone the conversion.  When the API is fully
    functional, we have effectively replaced the TDB, and now have
    a database with XML functionality, to which more-or-less seamless
    enhancements can be made.

6.  Create source control tools for the XDB.  These tools should
    let users create new sequences or check out existing sequences
    for modification and submit them for review.  They should also
    assign each submission to a reviewer, guide the reviewer through
    the review process, and accept the approved sequence into the
    database (or at any point reject the sequence, which will then
    not enter the database).  When the sequence is accepted into the
    database, the contributor and date/time are added to the modified
    lines.

7.  Think about enhancements, such as MathML equations or
    simple sequence descriptions.
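
As an illustration of steps 2 and 3, the following Python sketch converts
a few TDB lines into a transitional XML element whose tags all live in a
separate namespace.  It assumes the familiar %-prefixed TDB line format
and handles only three line codes; the namespace URI, the element names,
and the function name tdb_to_xml are made up for this example, and the
sample entry is simplified.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for TDB-style tags; not an official OEIS identifier.
TDB_NS = "http://example.org/oeis/tdb"
ET.register_namespace("tdb", TDB_NS)

# Illustrative subset of TDB line codes, mapped to element names invented here.
TAG_NAMES = {"%I": "id", "%S": "terms", "%N": "name"}


def tdb_to_xml(lines):
    """Convert the TDB lines of one sequence into a namespaced sequence element."""
    seq = ET.Element(f"{{{TDB_NS}}}sequence")
    for line in lines:
        # Each line is "<code> <A-number> [payload]"; the payload may be absent.
        code, anum, *rest = line.split(None, 2)
        seq.set("anum", anum)
        child = ET.SubElement(seq, f"{{{TDB_NS}}}{TAG_NAMES[code]}")
        child.text = rest[0] if rest else ""
    return seq


# Simplified sample entry, for illustration only.
entry = [
    "%I A000045",
    "%S A000045 0,1,1,2,3,5,8,13",
    "%N A000045 Fibonacci numbers.",
]
print(ET.tostring(tdb_to_xml(entry), encoding="unicode"))
```

Because every element carries the TDB namespace, XDB-style elements added
in steps 4 and 5 could sit in the same document under a different
namespace without name clashes.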

----- Original Message ----- 
From: "Antti Karttunen" <Antti.Karttunen at iki.fi>
To: <ham>; "David Wilson" <davidwwilson at comcast.net>; "Hugo van der Sanden" 
<hv at crypt.org>; "Gerald McGarvey" <Gerald.McGarvey at comcast.net>; "Ralf 
Stephan" <ralf at ark.in-berlin.de>; "Thomas Baruchel" <baruchel at laposte.net>; 
"Marc LeBrun" <mlb at fxpt.com>; "Antti Karttunen" <Antti.Karttunen at iki.fi>
Sent: Thursday, April 21, 2005 4:21 PM
Subject: Re: TOPIC: OEIS XML

> David Wilson wrote:
>
>> To begin, I think the conversion of the OEIS into XML is a great
>> idea.  I would, however, like to understand the goal of the project.
>> Are we merely trying to emulate the current sequence schema in
>> XML, or are we going to develop a new schema for the XML
>> database, that is, the way the sequences SHOULD be structured?
>>
>> If the former, the whole project seems pretty cut and dried, and
>> you should need no participation beyond the technical people.  If
>> the latter, I suggest that you start putting together requirements
>> and/or design documents that can be reviewed and criticized by
>> us XML illiterates.
>
>
> Here are some of my thoughts.
>
> First of all, I'm here thinking about the full-scale replacement
> of the current implementation of the OEIS-system, based
> on the data stored in RDBMS or other database which
> can give an XML-view to its data. This would involve
> also all the operations Neil has to do for the maintenance
> of OEIS, currently invisible to us.
>
> [list of requirements omitted]