finding if first number

cino hilliard hillcino368 at hotmail.com
Mon May 29 04:13:47 CEST 2006


Hi Frank,
Thanks for your input and work in seqfan.

I have some thoughts and procedures below.

>From: franktaw at netscape.net
>To: seqfan at ext.jussieu.fr
>Subject: Re: finding if first number
>Date: Sun, 28 May 2006 14:06:53 -0400
>
>As I have pointed out before, searching for something at the beginning of a 
>sequence is problematic.  It is often rather arbitrary whether an initial 0 
>(or 1) is present at the beginning of a sequence.  So you will find or not 
>find a sequence based on this arbitrary decision.
Yes.

So, if we think outside the box, a dedicated program such as the one I
presented in my prior post will always find a given first term.

If you do not want to program a find routine, you can save Part xxx with
"target as" into a folder, say, c:\download. This creates a file
eisBTfry00xxx.txt, where xxx is the part number.

Then use the DOS command findstr, as in the following example.
(Type help findstr at the DOS command prompt for instructions on using
findstr.)

c:\download>findstr %S.A.......911, eisBTfry00114.txt
%S A093634 911,911111,911111111111111111111,

The "." is a wildcard that matches any single character. In this case, the
dot after %S matches the space before the A-number, and the seven dots after
the A match the 6 digits of the sequence number plus the space that follows
them.
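
For anyone not working at a DOS prompt, the same pattern can be tried as a
regular expression. This is only a rough Python sketch, not the program from
my prior post; the function name starts_with_911 is made up for illustration.

```python
import re

# Same pattern as the findstr example: "%S", one character (the space),
# "A", seven characters (six digits plus the space), then "911,".
PATTERN = re.compile(r"%S.A.......911,")

def starts_with_911(line):
    """Return True if the %S line has 911 as its first term."""
    return PATTERN.search(line) is not None

sample = "%S A093634 911,911111,911111111111111111111,"
print(starts_with_911(sample))
```

The dots behave the same way here as in findstr: each one stands for exactly
one character.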

If you save all the Parts with "target as" (takes about 5 min with cable),
then you can invoke the findstr command as follows:

c:\download>findstr %S.A.......911, *.txt

This will search all .txt files in the folder for the string.

This is useful if you are doing several searches.

Don't worry if you have unrelated .txt files in the folder. Findstr will
simply find nothing in them and continue.
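
The folder-wide search can also be sketched in Python, for those who prefer
a script to the command line. This is an illustration only: the function
name search_folder is made up, and the latin-1 decoding is an assumption
chosen so that no byte in an unrelated file can stop the scan.

```python
import glob
import os
import re

def search_folder(folder, pattern):
    """Yield (file name, line) for every matching line in folder/*.txt."""
    regex = re.compile(pattern)
    for path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        # latin-1 accepts every byte value, so odd files are read, not fatal
        with open(path, encoding="latin-1") as handle:
            for line in handle:
                if regex.search(line):
                    yield os.path.basename(path), line.rstrip("\n")

if __name__ == "__main__":
    # Prints in the same "file:line" layout that findstr uses.
    for name, line in search_folder(r"c:\download", r"%S.A.......911,"):
        print(f"{name}:{line}")
```

As with findstr, files that contain no match contribute nothing to the
output.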

In my installation I have a few other .txt files; one of them, 052706.txt,
is the whole database as of 5/27.

The findstr output is as follows.

c:\download>findstr %S.A.......911, *.txt
052706.txt:%S A093634 911,911111,911111111111111111111,
911,.txt:%S A093634 911,911111,911111111111111111111,
911.txt:%S A093634 911,911111,911111111111111111111,
eisBTfry00114.txt:%S A093634 911,911111,911111111111111111111,
recent.txt:%S A119520 911,1974,2326,6236,8346,8403,9301,15317,17412,17601,20512,21914,22211,
test.txt:%S A093634 911,911111,911111111111111111111,

Findstr went through each .txt file and found the lines whose first term
is 911.

This should always find the sequences that start with the given term.

Again, if you are looking for any first term that begins with 911, then use

c:\download>findstr %S.A.......911 *.txt

Without the comma, all first occurrences will be tabulated, including those
where 911 is a prefix of a larger integer.
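
The effect of the trailing comma can be checked with a small Python sketch.
The second line below is made up purely to show the prefix case; it is not
an entry in the database.

```python
import re

with_comma = re.compile(r"%S.A.......911,")    # first term is exactly 911
without_comma = re.compile(r"%S.A.......911")  # first term begins with 911

lines = [
    "%S A093634 911,911111,911111111111111111111,",  # real line from the output
    "%S A999999 9113,5,7,",                          # made-up line, illustration only
]

exact = [ln for ln in lines if with_comma.search(ln)]
prefix = [ln for ln in lines if without_comma.search(ln)]

print(len(exact), len(prefix))
```

Only the first line matches with the comma, while both match without it.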

Moreover, you can pipe your "finds" to a file with

c:\download>findstr %S.A.......911 *.txt > eureka.txt

Finally, you can extend this procedure to other searches in the database,
such as keyword, name, or reference.
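
As a rough sketch of such an extension, the Python below searches lines by
their leading tag. It assumes that the other fields use the same layout as
the %S lines (tag, space, A-number, space, text) and that %N marks names and
%K marks keywords; check those tags against your own downloaded files. The
function name find_by_tag and the sample records are made up.

```python
import re

def find_by_tag(lines, tag, text):
    """Return the lines that start with the given %-tag and contain text."""
    # Assumed layout: tag, space, A, six digits, space, then the field.
    regex = re.compile(r"^%" + re.escape(tag) + r" A\d{6} .*" + re.escape(text))
    return [ln for ln in lines if regex.search(ln)]

# Made-up records, for illustration only.
records = [
    "%S A999999 911,5,7,",
    "%N A999999 A made-up name mentioning 911.",
    "%K A999999 nonn,base",
]

print(find_by_tag(records, "K", "base"))
```

Here the search text may appear anywhere in the field, since a keyword or a
word in a name need not come first.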

And Dos the way it is,
May 28, 2006

Have fun,
Cino Hilliard






