[seqfan] Re: Trying to use the author field of the oeis for FindStat

israel at math.ubc.ca
Mon Jun 1 03:05:07 CEST 2015


Basically the problem is that the entries have been written by many
different people over many years, and we are not always diligent about
following formatting rules.  It might be best if you could make your
software more flexible in parsing names and dates.
 
Anything between a pair of "_" characters is an author. Month names (in all
their possible variants) are pretty easy to recognize; a four-digit integer
from 1990 to the current year is a year; a one- or two-digit integer before
or after a month name is a day. Anything else can probably be ignored, as a
first approximation.
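
To make the heuristic above concrete, here is a rough Python sketch of it.
This is only an illustration, not code from the OEIS or FindStat: the
function name parse_author_field, the MONTHS table, and the shape of the
returned dict are all made up for the example. It also assumes that "before
or after" means the token directly adjacent to the month name, and it takes
the first month and the first plausible year it finds.

```python
import re
from datetime import date

# Hypothetical lookup table: full month names, three-letter abbreviations,
# and the common variant "sept", all lowercase.
_NAMES = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]
MONTHS = {}
for number, name in enumerate(_NAMES, start=1):
    MONTHS[name] = number
    MONTHS[name[:3]] = number
MONTHS["sept"] = 9

AUTHOR_RE = re.compile(r"_([^_]+)_")      # anything between two underscores
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+")   # runs of letters or runs of digits


def parse_author_field(field, current_year=None):
    """Extract authors and a (possibly partial) date from an author field."""
    if current_year is None:
        current_year = date.today().year
    authors = [a.strip() for a in AUTHOR_RE.findall(field)]
    # Remove the authors so that names are never mistaken for months.
    tokens = TOKEN_RE.findall(AUTHOR_RE.sub(" ", field))
    result = {"authors": authors, "year": None, "month": None, "day": None}
    for i, tok in enumerate(tokens):
        if tok.isdigit():
            # A four-digit integer from 1990 to the current year is a year.
            if (len(tok) == 4 and result["year"] is None
                    and 1990 <= int(tok) <= current_year):
                result["year"] = int(tok)
        elif result["month"] is None and tok.lower() in MONTHS:
            result["month"] = MONTHS[tok.lower()]
            # A one- or two-digit integer next to the month name is a day;
            # the token after the month is tried first, then the one before.
            for j in (i + 1, i - 1):
                if (0 <= j < len(tokens) and tokens[j].isdigit()
                        and len(tokens[j]) <= 2):
                    result["day"] = int(tokens[j])
                    break
    return result


print(parse_author_field("_Eric W. Weisstein_, Mar 05, 2006", 2015))
# {'authors': ['Eric W. Weisstein'], 'year': 2006, 'month': 3, 'day': 5}
```

Everything else (punctuation, "revised", parenthetical remarks) is simply
ignored. Known weak spots of this sketch: a remark containing the word "may"
or an unmarked first name such as "Jan" would be read as a month, and
authors not wrapped in underscores are not found at all.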

Cheers,
Robert

On May 31 2015, Rubey Martin wrote:

>Hi Alonso!
>
>> Some of those latter eight need to be corrected. I'll get back to you 
>> later today. Al
>
>>> These entries look OK (authors are separated by " and " or "," possibly 
>>> followed by a date). However, the following are harder to parse:
>
>('A032537', '_Patrick De Geest_, april 1998.'),
>
>I'd appreciate "april 1998." -> "Apr 1998"
>
>('A005985', '_Colin Mallows_; revised Jun 13 2005'),
>
>Not that important but ";" -> "," would be good.
>
>('A117239', '_Eric W. Weisstein_, Mar 05, 2006')
>
>I'd appreciate removing the comma after 05
>
> ('A184184', '_Emeric Deutsch_, Feb 16 2011 (based on communication from 
> _Vladeta Jovovic_)'),
>
>I'm not sure what's best here.  Either
>
>'_Emeric Deutsch_, _Vladeta Jovovic_, Feb 16 2011'
>
>or leave it as is.
>
> ('A135533', '_N. J. A. Sloane_, based on a message from Guy Steele and D. 
> E. Knuth, Mar 01 2008'),
>
>Same problem, but worse.  Either
>
>'_N. J. A. Sloane_, Guy Steele, D. E. Knuth, Mar 01 2008'
>
>or
>
> '_N. J. A. Sloane_, Mar 01 2008 (based on a message from Guy Steele and 
> D. E. Knuth)')
>
>FindStat would then remove parenthetical remarks to obtain the author.
>
>('A199352', '_R. H. Hardin_ Nov 05 2011'),
>
>I'd appreciate a comma after the second underscore.
>
>('A018178', '_N. J. A. Sloane_.'),
>
>Please remove the period.
>
> ('A016437', 'rwgk(AT)cci.lbl.gov (R.W. Grosse-Kunstleve)'),
>
>I'd appreciate
>
>'R. W. Grosse-Kunstleve (rwgk(AT)cci.lbl.gov)'
>
>Many many many thanks!
>
>Martin