TOPIC: OEIS XML
David Wilson
davidwwilson at comcast.net
Fri Apr 22 11:46:25 CEST 2005
From now on, I will use TDB to stand for the current text EIS database
and XDB for the proposed XML EIS database.
I have looked over your tentative requirements from an earlier message.
Many of your requirements are stated in terms of possible XML
solutions. I have tried to strip out these implementation details
and summarize the underlying requirements:
1. The XDB should be easier for NJAS to maintain than the TDB.
2. The transition from TDB to XDB should be minimally painful.
3. The XDB should be normalized.
4. The XDB should support something akin to source control,
tracking contributors and modification times per datum.
5. The XDB should distinguish between definitions, identities, and
conjectures.
5a. The XDB should track contributors and sources per datum.
6. The XDB should support a simple language for defining sequences
in terms of other sequences, common sequence classes, and common
operations on sequences.
7. The XDB should support equations such as those provided for
MathML or TeX.
8. The XDB data organization should improve on that of the TDB.
9. The XDB should support data in formats transitional between the
TDB format and the XDB format.
I would categorize these requirements as follows, and add some of my
own requirements:
I. Implementation requirements:
A. Migration from TDB to XDB should be as seamless as possible.
II. Functional requirements:
A. The XDB must support all current functions of the TDB.
B. The XDB should support some form of source control, with
automatic tracking of modifications, including submitter,
date/time, and source documents associated with the modification.
C. The XDB should support a simple, broad sequence definition
language.
D. The XDB should support equations similar to MathML or
TeX equations.
III. Format requirements:
A. The XDB should support data formats that are transitional
between the TDB and XDB.
B. The XDB data model should reflect as closely as is practical
the perceived organization of the data.
C. The XDB should be normalized.
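To make the modification-tracking requirement a little more concrete, here is a minimal Python sketch of what "tracking per datum" could mean. Everything in it is an assumption for illustration: the class names `Datum` and `Modification`, their fields, and the sample contributor and source strings are hypothetical and not part of any existing OEIS tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Modification:
    """One recorded change to a single datum (all field names are hypothetical)."""
    submitter: str
    timestamp: datetime
    source: str       # e.g. a citation or a reference to a submission
    new_value: str


@dataclass
class Datum:
    """One line-level piece of a sequence entry, together with its change history."""
    value: str
    history: list = field(default_factory=list)

    def modify(self, new_value, submitter, source, when=None):
        # Record who changed the datum, when, and on what authority, then apply it.
        when = when or datetime.now(timezone.utc)
        self.history.append(Modification(submitter, when, source, new_value))
        self.value = new_value


# Illustrative usage with placeholder contributor and source strings.
name = Datum("Fibonacci numbers")
name.modify("Fibonacci numbers: F(n) = F(n-1) + F(n-2)",
            submitter="example contributor",
            source="example source document")
print(name.value)
print(len(name.history), name.history[0].submitter)
```

The point of the sketch is only that the history lives with the datum rather than in free-form comment lines, so the tracking can be automatic.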
This requirements list is a prototype. However, even with this much
data, I can safely say that the project must be implemented in stages,
with each stage implementing a chewable number of features. As a
software engineer, my approach to this project would be approximately
as follows:
1. Define an API for the OEIS database. This involves surveying the
current functionality of the TDB. We need to know everything that
NJAS, editors, the site, and users do with the database, since we
need to maintain all those functions in the XDB. For instance, once
we have transitioned to the XDB, NJAS will still want to create and
tweak sequences, the site will still want to look up sequences by the
existing criteria, and users will still want to be able to download
sections of the database. Loss of any significant function will
conflict with the requirement of seamless migration, and will be
disruptive to the operation of the OEIS.
2. Once the API is defined, design a transitional XDB whose internal
organization is faithful to the current TDB. I would suggest that
TDB-style tags be in their own namespace so that they will not
conflict with XDB-style tags later on.
3. Implement the API on the transitional XDB containing only
TDB-style tags. This will prove that the XDB is complete datawise
and provide a test bed for future XDB development.
4. Design the XDB-style tags and elements. Survey the TDB data and
proposed enhancements. XDB-style tags and elements should be in
their own namespace.
5. Implement the XDB-style tags. As each tag or group of related
XDB-style tags is implemented, the API should be modified to deal
with the XDB-style tags. It may be expedient at this point to convert
TDB-style tags to XDB-style tags, or else teach the API to deal with
mixed tags and postpone the conversion. When the API is fully
functional, we have effectively replaced the TDB, and now have
a database with XML functionality, to which more-or-less seamless
enhancements can be made.
6. Create source control tools for the XDB. These tools should
allow users to create new sequences or check out existing sequences
for modification, submit them for review, assign them to a reviewer,
guide the reviewer through the review process and approve the
sequence, and accept the sequence into the database (or at any
point reject the sequence, which will then not enter the database).
When the sequence is accepted into the database, the contributor
and date/time are added to the modified lines.
7. Think about enhancements, such as MathML equations or
simple sequence descriptions.
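As a rough illustration of steps 2 and 3, the following Python sketch wraps TDB-style lines in elements that live in their own XML namespace, and then regenerates the original lines to check that nothing was lost. The two namespace URIs, the `sequence` element, the `id` attribute, and both function names are made up for this sketch, and the sample lines are only illustrative of the TDB line style (a tag, an A-number, then the text).

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are hypothetical placeholders.
TDB_NS = "http://example.org/oeis/tdb"
XDB_NS = "http://example.org/oeis/xdb"
ET.register_namespace("tdb", TDB_NS)
ET.register_namespace("xdb", XDB_NS)


def tdb_to_transitional(lines):
    """Wrap each TDB line in a tdb-namespaced element named after its tag letter."""
    entry = ET.Element(f"{{{XDB_NS}}}sequence")
    for line in lines:
        # A TDB line looks like: "%N A000045 Fibonacci numbers ..."
        tag, anum, text = line.split(" ", 2)
        entry.set("id", anum)
        child = ET.SubElement(entry, f"{{{TDB_NS}}}{tag.lstrip('%')}")
        child.text = text
    return entry


def transitional_to_tdb(entry):
    """Inverse of tdb_to_transitional: regenerate the TDB lines."""
    anum = entry.get("id")
    prefix = f"{{{TDB_NS}}}"
    return [f"%{child.tag[len(prefix):]} {anum} {child.text}"
            for child in entry if child.tag.startswith(prefix)]


# Illustrative sample lines in the TDB style.
tdb_lines = [
    "%S A000045 0,1,1,2,3,5,8,13,21,34,55,89",
    "%N A000045 Fibonacci numbers: F(n) = F(n-1) + F(n-2).",
    "%K A000045 nonn,core",
]
entry = tdb_to_transitional(tdb_lines)
print(ET.tostring(entry, encoding="unicode"))

# The round trip should reproduce the input exactly.
assert transitional_to_tdb(entry) == tdb_lines
```

Because the TDB-style elements carry their own namespace prefix, XDB-style elements can later be added to the same entry without name clashes, and a round-trip check like the one above is one way to show that the transitional XDB is complete datawise.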
----- Original Message -----
From: "Antti Karttunen" <Antti.Karttunen at iki.fi>
To: <ham>; "David Wilson" <davidwwilson at comcast.net>; "Hugo van der Sanden"
<hv at crypt.org>; "Gerald McGarvey" <Gerald.McGarvey at comcast.net>; "Ralf
Stephan" <ralf at ark.in-berlin.de>; "Thomas Baruchel" <baruchel at laposte.net>;
"Marc LeBrun" <mlb at fxpt.com>; "Antti Karttunen" <Antti.Karttunen at iki.fi>
Sent: Thursday, April 21, 2005 4:21 PM
Subject: Re: TOPIC: OEIS XML
> David Wilson wrote:
>
>> To begin, I think the conversion of the OEIS into XML is a great
>> idea. I would, however, like to understand the goal of the project.
>> Are we merely trying to emulate the current sequence schema in
>> XML, or are we going to develop a new schema for the XML
>> database, that is, the way the sequences SHOULD be structured?
>>
>> If the former, the whole project seems pretty cut and dried, and
>> you should need no participation beyond the technical people. If
>> the latter, I suggest that you start putting together requirements
>> and/or design documents that can be reviewed and criticized by
>> us XML illiterates.
>
>
> Here are some of my thoughts.
>
> First of all, I'm here thinking about the full-scale replacement
> of the current implementation of the OEIS-system, based
> on the data stored in RDBMS or other database which
> can give an XML-view to its data. This would involve
> also all the operations Neil has to do for the maintenance
> of OEIS, currently invisible to us.
>
> [list of requirements omitted]