[seqfan] Report on b-files

Neil Sloane njasloane at gmail.com
Tue Jan 8 05:23:03 CET 2019


Dear Sequence Fans,

Georg Fischer, Martin Pedersen, and several other people have been looking
over the b-files. There were a lot of errors.


Tonight I did the following:

1) On the server I made a list of all the b-files as of 19:45 New York time
today, Jan 07 2019. The list is in four parts:

staff  2463694 Jan  7 20:09 bfilelist0.txt  40386 b-files
staff   450119 Jan  7 20:05 bfilelist3.txt  35861 b-files
staff  3865082 Jan  7 20:05 bfilelist2.txt  63362 b-files
staff  2187521 Jan  7 20:05 bfilelist1.txt   7379 b-files

for a total of 146988 b-files


2) At the same time I downloaded a new copy of the whole database - a new
cat25 file, which contains 318886 sequences, a stack of 5799337 (virtual)
punched cards. This contained 146573 distinct links to b-files. This list
is called bcat25u.txt
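
For anyone who wants to reproduce step 2, here is a minimal sketch of how the distinct b-file links could be pulled out of a cat25-style dump. It rests on an assumption not stated above: that every link contains a file name of the form "b", six digits, ".txt". The function name and the sample lines are made up for illustration and are not taken from the real cat25 file.

```python
import re

# Assumption: b-file names look like b000045.txt (letter b, six digits, .txt).
BFILE_NAME = re.compile(r"\bb(\d{6})\.txt\b")

def distinct_bfile_names(lines):
    """Return the sorted list of distinct b-file names mentioned in the lines."""
    names = set()
    for line in lines:
        for match in BFILE_NAME.finditer(line):
            names.add("b" + match.group(1) + ".txt")
    return sorted(names)

# Hypothetical sample lines; the second one repeats a link on purpose.
sample = [
    '%H A000045 <a href="/A000045/b000045.txt">Table of n, a(n)</a>',
    '%H A000045 See also /A000045/b000045.txt',
    '%H A000108 <a href="/A000108/b000108.txt">Table of n, a(n)</a>',
]
names = distinct_bfile_names(sample)
```

Collecting the names in a set is what makes the count "distinct": a b-file linked from several lines is only counted once.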


3) After I did some cleaning up, the situation at present is:

- every b-file mentioned in the database can be found on the server (146570
b-files, listed in bcat25u.txt)


- there are 414 b-files on the server that are not mentioned in the
database. These are listed in bextra.txt. I think these should be deleted
from the server (does anyone disagree?)
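
The two checks in step 3 amount to set differences between the server list and the database list. The following is only a sketch of that comparison, not the procedure actually used. The function name is hypothetical, and it takes plain collections of file names because the layout of the bfilelist*.txt and bcat25u.txt files is not described here.

```python
def compare_bfile_lists(server_names, database_names):
    """Compare the b-files on the server with those linked from the database.

    Returns (missing, extra):
      missing = names linked from the database but absent from the server
      extra   = names on the server that the database never mentions
    """
    server = set(server_names)
    database = set(database_names)
    missing = sorted(database - server)
    extra = sorted(server - database)
    return missing, extra

# Toy data for illustration only.
on_server = ["b000001.txt", "b000002.txt", "b000045.txt", "b999999.txt"]
in_database = ["b000001.txt", "b000002.txt", "b000045.txt"]

missing, extra = compare_bfile_lists(on_server, in_database)
```

In the situation described above, "missing" would be empty and "extra" would be the 414 names in bextra.txt.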


I am going to send the lists I just mentioned to Georg and Daniel.  If
anyone else would like copies, let me know.


One thing I have not done is to check if the 146573 b-files that are used
actually match the sequence data. But I believe Georg and Martin are
working on that.
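
As a rough illustration of what such a check could look like (this is not the method Georg and Martin are using), a b-file consists of lines "n a(n)", possibly with blank lines and comment lines starting with "#". The sketch below parses such a file and compares it with the terms of the sequence entry. The function names are hypothetical, and the rule that only the overlapping initial terms are compared is an assumption made for this example.

```python
def parse_bfile(text):
    """Parse b-file text into a list of (n, a(n)) pairs.

    Blank lines and lines starting with '#' are skipped.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        n_str, value_str = line.split()[:2]
        pairs.append((int(n_str), int(value_str)))
    return pairs

def bfile_matches_data(bfile_text, offset, data_terms):
    """Check a b-file against the terms listed in the sequence entry.

    Assumptions of this sketch: the indices must run consecutively from
    `offset`, and only the overlapping initial terms are compared.
    """
    pairs = parse_bfile(bfile_text)
    if not pairs:
        return False
    indices = [n for n, _ in pairs]
    values = [v for _, v in pairs]
    if indices != list(range(offset, offset + len(indices))):
        return False
    k = min(len(values), len(data_terms))
    return values[:k] == list(data_terms[:k])

# Toy example: the first few Fibonacci numbers with offset 0.
good = "# comment line\n0 0\n1 1\n2 1\n3 2\n"
bad = "0 0\n1 1\n2 2\n"
data = [0, 1, 1, 2, 3, 5]
```

Here bfile_matches_data(good, 0, data) is True, while bfile_matches_data(bad, 0, data) is False because the third term differs.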


