[seqfan] Report on b-files
Neil Sloane
njasloane at gmail.com
Tue Jan 8 05:23:03 CET 2019
Dear Sequence Fans,
Georg Fischer, Martin Pedersen, and several other people have been looking
over the b-files. There were a lot of errors.
Tonight I did the following:
1) On the server I made a list of all the b-files as of 19:45 New York time
today Jan 07 2019. The list is in 4 parts:
  staff 2463694 Jan 7 20:09 bfilelist0.txt  40386 b-files
  staff  450119 Jan 7 20:05 bfilelist3.txt  35861 b-files
  staff 3865082 Jan 7 20:05 bfilelist2.txt  63362 b-files
  staff 2187521 Jan 7 20:05 bfilelist1.txt   7379 b-files
for a total of 146988 b-files.
2) At the same time I downloaded a new copy of the whole database - a new
cat25 file, which contains 318886 sequences, a stack of 5799337 (virtual)
punched cards. This contained 146573 distinct links to b-files. This list
is called bcat25u.txt.
3) After I did some cleaning up, the situation at present is:
- every b-file mentioned in the database can be found on the server (146570
b-files, listed in bcat25u.txt)
- there are 414 b-files on the server that are not mentioned in the
database. These are listed in bextra.txt. I think these should be deleted
from the server (does anyone disagree?)
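For anyone who wants to repeat this kind of comparison on their own copies of the lists, here is a minimal sketch in Python. It is not the procedure that was actually run on the server. It assumes each list has been reduced to one b-file name per line, and the function names and toy file names are made up for illustration.

```python
def read_names(lines):
    """Return the set of non-empty, stripped entries from an iterable of lines."""
    return {line.strip() for line in lines if line.strip()}


def compare_lists(server_lines, database_lines):
    """Compare the b-files present on the server with those linked from the database.

    Assumes (hypothetically) one b-file name per line in each input.
    """
    server = read_names(server_lines)
    database = read_names(database_lines)
    missing = sorted(database - server)  # linked from the database, not on the server
    extra = sorted(server - database)    # on the server, not linked from the database
    return missing, extra


# Toy data, not taken from the real lists.
server = ["b000001.txt", "b000002.txt", "b000004.txt"]
database = ["b000001.txt", "b000002.txt"]

missing, extra = compare_lists(server, database)
print(missing)  # []
print(extra)    # ['b000004.txt']
```

An empty "missing" list corresponds to the first point above, and the "extra" list plays the role of bextra.txt.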
I am going to send the lists I just mentioned to Georg and Daniel. If
anyone else would like copies, let me know.
One thing I have not done is to check if the 146573 b-files that are used
actually match the sequence data. But I believe Georg and Martin are
working on that.
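As a rough illustration of what such a check might involve (this is only a sketch, not what Georg and Martin are running), the Python below parses a b-file, whose lines have the form "n a(n)" with "#" starting a comment line, and compares it with the terms in the sequence entry. The function names, the "offset" argument, and the toy data are assumptions made for the example.

```python
def parse_bfile(lines):
    """Parse b-file lines of the form 'n a(n)' into a list of (n, a(n)) pairs.

    Blank lines and lines starting with '#' are skipped.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        n_str, value_str = line.split()[:2]
        pairs.append((int(n_str), int(value_str)))
    return pairs


def bfile_matches(bfile_lines, data_terms, offset):
    """Return True if the b-file starts at `offset`, has consecutive indices,
    and agrees with `data_terms` wherever the two overlap.

    A b-file shorter than `data_terms` still passes; only the overlap is compared.
    """
    pairs = parse_bfile(bfile_lines)
    if not pairs or pairs[0][0] != offset:
        return False
    for k, (n, value) in enumerate(pairs):
        if n != offset + k:
            return False
        if k < len(data_terms) and value != data_terms[k]:
            return False
    return True


# Toy data, not taken from any real entry.
good = ["# comment line", "0 0", "1 1", "2 1", "3 2", "4 3"]
bad = ["0 0", "1 1", "2 2"]
print(bfile_matches(good, [0, 1, 1, 2], 0))  # True
print(bfile_matches(bad, [0, 1, 1, 2], 0))   # False
```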