tables of p(n) and/or pi(n)

Graeme McRae g_m at mcraefamily.com
Thu Sep 22 22:49:41 CEST 2005


As numbers increase, the density of primes in those numbers decreases.  Beyond a certain point, you would do better to represent the differences between successive primes p and q based on the possible values of p mod 2310=2*3*5*7*11.  Once you get past 2, 3, 5, 7, and 11, there are 480 possible values of that a prime, p, mod 2310 can take on (because 480 is phi(2310)).  The distance of q from p among these values can be represented by a 4-bit code in which 0000 means that q is the next possible number that could be prime; 0001 means q is the second possible number that could be prime, etc.  1111 could be reserved to mean the distance among possible values is larger than 15, and successive 1111's could be used to indicate gaps larger than 30, 45, etc.  I've noted that 15 hops along the possible values of prime numbers, i.e. those relatively prime to 2310, take you an average of 72 numbers.  So until you get high enough that the density of primes approaches 1/72 (which doesn't happen before 10^31), the average number of bits per prime is only a tiny bit larger than 4.

Using the bit-map approach, the average number of bits per prime starts to exceed 4 when the density of primes falls below 1/15, which happens pretty early in the game -- before 4 million.  So the break-even point (the point where the "distance mod-2310" scheme gets better than the bit-map) occurs about when the primes exceed 4 million.

I recognize that theory and programming are two different animals, and that programming a "distance among relative primes to 2310" scheme has its own set of challenges, such as the ones that have been mentioned: speed of lookup, speed of manipulating large arrays of primes, etc. so this scheme may not be at all practical.  I'm just throwing it out as an idea that people might want to consider.

--Graeme


----- Original Message ----- 
  From: Hans Havermann 
  To: joshua.zucker at gmail.com 
  Cc: seqfan at ext.jussieu.fr 
  Sent: Monday, September 19, 2005 11:16 AM
  Subject: Re: tables of p(n) and/or pi(n)


  Joshua Zucker wrote:


    The easy compression fits 30 numbers per byte, by using one bit for

    prime/not prime for each of the number equal to 1, 7, 11, 13, 17, 19,

    23, 29 mod 30.




    So it's not 30 primes per byte, but rather 30 numbers.




    With that scheme, since pi(3*10^13) is roughly 10^12, it would take

    only (3*10^13)/30 bytes to store the primality of all of them, so

    indeed, 1TB is about right (not 10TB as in your corrected post).



  To be fair, I'm not just storing primality, but also relatively fast recovery of pi(prime) values.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.seqfan.eu/pipermail/seqfan/attachments/20050922/3067b4a9/attachment-0001.htm>


More information about the SeqFan mailing list