Most "compact" sequence such that there is at least one prime between a(n) and a(n+1)

Thu Dec 13 17:24:50 CET 2007

> >Maybe someone could verify my findings.
> 
> The decisive spot occurs at the prime 1151, which determines the least
> possible exponent. A description of a potential new sequence would be
> floor(n^(log(1151)/log(95))) together with a remark like "there is at
> least one prime p in each interval fulfilling a(n) < p <= a(n+1)"

This remark of course labeled as conjectural.

Over the past week, I've been writing functions for the various sequence
transforms in PARI/GP, similar to the Maple and Mathematica ones hosted on the
EIS.

http://www.research.att.com/~njas/sequences/transforms.html

You can find mine at:

http://www.geocities.com/seeker_168/transforms_pari.txt

I would like to see this one hosted too, but before that happens, I'd like to
discuss conventions with the list. Once an infrastructure is decided, it is
easy to add new transforms, but it is difficult to change the infrastructure
after it's in general use. It is worth taking some time to get it right.

I've decided to use vectors ("t_VEC") to implement sequences. One annoyance
with doing this is that most sequences start their index at 0, while vectors
index at 1. To accomodate this, for a vector A representing sequence a, I have
a_{n} = A[n+1]. I know this is confusing, but consistent usage can avoid most
of the confusion.

The other choice would be to use a series ("t_SER") which solves the indexing
problem, but is not as nice to print out or input and creates some confusion
about whether to interpret as o.g.f. or e.g.f.

The next issue is the naming convention. I've used a prefix for all the
functions, so that they don't clash with existing functions or those defined
in some other user's package. All functions operation on a vector have the
prefix "trv_". I have used "trm_" for operations on a matrix, "trvm_" for an
operation on a vector returning a matrix and "trsv_" for an operation on a
consideration. Since many transforms are invertible, a convention for the
inverse is needed. Neil has used an "i" postfix in the transform name (EULERi,
...), I have used "i_" prefix. I can be convinced otherwise, but I chose the
prefix notation because I have so many other postfix notations and there is
the natural tendency to say "inverse euler transform." In some cases, both a
transform and its inverse have common names hence we have trv_exp and trv_log
as inverses, also trv_psum and trv_diff.

? \r scripts/trans.gp
? v1=vector(15,n,1)
%36 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
? trv_euler(v1)
%37 = [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135]
? trv_i_euler(%)
%38 = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

The names themselves deserve some discussion. When I consider the transforms
named EULER and WEIGH which organize objects into sets and the transform
INVERT which organizes them into lists, I think those may not have been the
best names, but they are referenced so much in the EIS, it is impossible to
change now. Newer transforms can get a fresh start here as their old names are
not so common. I've named the transform I called "CIK" in:

http://www.research.att.com/~njas/sequences/transforms2.html

as trv_cycle. Also "CHK" trv_lyndon, "BIK" trv_chain, "BHK" trv_chain_j, "DIK"
trv_polygon, "DHK" trv_polygon_j.

For most transforms, the output is the same length as the input, but for some
this is not a sensible assumption. In those cases I have an input parameter
for the output length.

? trv_euler(v1)
%39 = [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135]
? trv_char(%,35)
%40 = [0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

Some choose an output size based on the contents of the input.

? v1=vector(100,n,1)
%35 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
? trv_euler_p(%)
%36 = [0, 1, 1, 1, 2, 1, 2, 1, 3, 2, 2, 1, 4, 1, 2, 2, 5, 1, 4, 1, 4, 2, 2, 1,
7, 2, 2, 3, 4, 1, 5, 1, 7, 2, 2, 2, 9, 1, 2, 2, 7, 1, 5, 1, 4, 4, 2, 1, 12, 2,
4, 2, 4, 1, 7, 2, 7, 2, 2, 1, 11, 1, 2, 4, 11, 2, 5, 1, 4, 2, 5, 1, 16, 1, 2,
4, 4, 2, 5, 1, 12, 5, 2, 1, 11, 2, 2, 2, 7, 1, 11, 2, 4, 2, 2, 2, 19, 1, 4,
4]
? trv_record(%)
%37 = [0, 1, 2, 3, 4, 5, 7, 9, 12, 16, 19]
? trv_record_ix(%36)
%38 = [0, 1, 4, 8, 12, 16, 24, 36, 48, 72, 96]

Of course I could use a parameter that defaults to the size of the input
output is longer that the input) will complicate the project.

I've put a GNU General Public License on the code. I would hate for someone to
take it from the web as "public domain" and then claim their own copyright on
it, denying hobbyists like myself from using it. I welcome opinions about it
or better solutions.

In some cases I've implemented sequence transforms on matrices (with the same
issue about interpreting the "1" row and column of the matrix as the "0" row
and column of the 2d sequence).

? m1 = matrix(10,12,x,y,y==2&&x!=1)
%42 = 
[0 0 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

? trm_euler(m1)
%43 = 
[1 0 0 0 0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0 0 0 0 0]

[0 1 1 0 0 0 0 0 0 0 0 0]

[0 1 1 1 0 0 0 0 0 0 0 0]

[0 1 2 1 1 0 0 0 0 0 0 0]

[0 1 2 2 1 1 0 0 0 0 0 0]

[0 1 3 3 2 1 1 0 0 0 0 0]

[0 1 3 4 3 2 1 1 0 0 0 0]

[0 1 4 5 5 3 2 1 1 0 0 0]

[0 1 4 7 6 5 3 2 1 1 0 0]

This suggests the use of functions that can take different types of input as:

{ tr_euler(A) =
}

which will take either a vector or a matrix as input and will output the same
type. On the other hand many, perhaps most sequence transforms will never have
a matrix form. Would it be necessary to have a "trv_" and a "tr_" function for
every one of them? Also, labeled matrix transforms have additional
complications. I have

? m2=matrix(8,8,x,y,y==2&&x!=1)
%44 = 
[0 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

? trm_exp_h(m2)
%45 = 
[1 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 1 0 0 0 0 0]

[0 1 3 1 0 0 0 0]

[0 1 7 6 1 0 0 0]

[0 1 15 25 10 1 0 0]

[0 1 31 90 65 15 1 0]

[0 1 63 301 350 140 21 1]

which treats the "x" as labeled and the "y" as unlabeled and:

? trm_exp_e(m2)
%46 = 
[1 0 0 0 0 0 0 0]

[0 1 0 0 0 0 0 0]

[0 1 2 0 0 0 0 0]

[0 1 6 6 0 0 0 0]

[0 1 14 36 24 0 0 0]

[0 1 30 150 240 120 0 0]

[0 1 62 540 1560 1800 720 0]

[0 1 126 1806 8400 16800 15120 5040]

which treats both as labeled, each having their own labeling set, hence there
is no clear choice about which is the matrix version of trv_exp.

Another question is about which transforms to include. This initial version
includes the following:

trv_euler, trm_euler, trv_i_euler, trv_euler_2, trv_euler_even, trv_euler_odd,
trv_euler_p, trv_exp, trm_exp_h, trm_exp_e, trv_log, trv_weigh, trv_weigh_x,
trv_i_weigh, trv_i_weigh_x, trv_weigh_2

all of which belong to the euler transform family. There are still other
members of that family waiting to be implemented. Every "combinatorial
assembly" transform has a family just as large (including trv_invert and
trv_cycle). Thus the number of transforms can grow very quickly and we could
created their own tools that could be hosted in this file. Certainly I would
never wish to see over 100K transforms as there are sequences in the EIS. No
one would use them, and they can't be searched the way the EIS can. Even a few
hundred would be overwhelming to a new user, and may make them skeptical about
downloading the file. For this reason, I'm thinking it might be nice to have a
file of ~50 "core" transforms and another file of other transforms.

Another issue with the combinatorial assembly sequences is how to handle the 0
index term. Most of those transforms follow the convention that if we have a =
T(b) for sequences a, b, then a_0 = 0  and b_0 = 1. The inverses naturally
following the opposite convention. In theory, the transform should reject an
input that does not follow the convention, but in practice, I've implemented
them to ignore the 0 index on input and follow the convention on output. The
Maple version handles it by treating those sequences as starting at index 1. I
don't like that solution because I want all transforms to treat sequences the
legitimate use for the 0 index term that has been obsured by convention. It
makes perfect sense to give the transform trv_weigh an input the a_0 == A[1]
!= 0. (Can you count partitions into distinct parts when 0 is allowed as a
partition?) To deal with this issue, I've implemented trv_weigh_x as an
"extended" version of the transform.

? v1
%47 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
? trv_weigh(v1)
%48 = [1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 10, 12, 15, 18, 22]
? trv_weigh_x(v1)
%49 = [2, 2, 2, 4, 4, 6, 8, 10, 12, 16, 20, 24, 30, 36, 44]
? concat(0,v1)
%50 = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
? trv_weigh_x(%)
%51 = [1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 10, 12, 15, 18, 22, 27]
? concat(-1,v1)
%52 = [-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
? trv_weigh_x(%)
%53 = [1/2, 1/2, 1/2, 1, 1, 3/2, 2, 5/2, 3, 4, 5, 6, 15/2, 9, 11, 27/2]

So far I have not implemented any transforms taking 2 or more sequences as
input, but that will change. Thus a convention will be needed about the size
of the output when the inputs are not the same size. In the PARI built in
routines dirmul and dirdiv, the convention is the shorter of the 2.

? v2 = vector(5,n,n)
%54 = [1, 2, 3, 4, 5]
? v3 = vector(10,n,2*n)
%55 = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
? dirmul(v2,v3)
%56 = [2, 8, 12, 24, 20]

I would prefer the longer of the 2, but this is consistent with PARI's
philosophy to give exact answers where possible.

Also I want to consider how to discuss this. Initially I'd like to solicit the
opinions of a wide group, even non PARI users should take some interest as it
might influence other projects. (Maybe someone wants to do this with Perl or
Python.) Since this is seqfan and not a PARI list, programming issues should
probably be discussed off list, but the other issues can be discussed here for
a while at least.

I'd hate to make more work for Neil with a project like this, so I think I
agrees to host it.)

Christian

"Max Alekseyev" <maxale at gmail.com> wrote:
:The correct value of N is: N = p*q (which is easier to factor in general)
:and we need to look for divisors that are congruent to -q modulo p-q.

Using this approach, I calculate a(7) = 13005235 with the code below
(which takes 860 minutes on my computer - for comparison, a(6) takes
just under 12s).

Given my track record so far on this, I would welcome confirmation. :)

Further optimisations are possible: it is certainly worth replacing
the recursive function with a loop that keeps a stack for <p, q, prev>;
it may also be worth doing a binary chop to find the first divisor >= min,
a practical way to avoid constructing the full list of divisors, since
we only need a restricted subset of them.

Even though PARI gets to do the heavy lifting here, there is still also
a lot of time spent in the perl code: converting the whole thing to
a single PARI function (or Maple, Mma etc) would also save lots of time.

Note that the time spent is heavily "front-loaded": { 2, 4, 16, 256, ... }
took 80 minutes to calculate, but { 2, 4, 16, 257, ... } was already much
faster; { 2, ... } took 825 minutes of the 860 minutes total.

As a rough feeling, I would expect calculation of a(8) with this approach
to take between 10^6 and 10^9 times as long as a(7), but I don't know
right now how to get a firmer estimate than my hunch.

Hugo
---
#!/usr/bin/perl -w
use strict;
use Math::Pari qw/ PARI divisors gcd /;
my($c0, $c1, $c2) = map PARI($_), (0, 1, 2);

my $length = shift(@ARGV) || 7;
printf "A118085(%s) = %s\n", $length, A118085($length);
exit 0;

}

#
# A118085_r(p, q, prev, len)
# Returns the number of ways a multiset of len elements a_i >= prev
# yields \prod{1+1/a_i} = p/q
#
}

__END__