[seqfan] Re: Nice nontrivial integer sequence

Mon Mar 6 20:39:35 CET 2023

On Sun, Mar 05, 2023 at 05:27:32PM -0500, Arthur O'Dwyer wrote:

> I confirm these results, using a C++ program that I wrote without looking
> at Christian's C program. (Now I've looked, and I think we converged to the
> same basic data structures... except that Christian doesn't even need to
> store the list of `i` values? Ah, yes, now I see; that's clever. Updated
> mine.)
> https://quuxplusone.github.io/blog/2023/03/05/oeis-a360447/
> has my C++ program and results up to a(96), at which point I, also, run out
> of memory.

Awesome! I made some more observations, and I think I have one more
term.

My sum array is too big, 2N is the biggest sum we could get, but we'll
never need them anyway. So sum[N+1] is enough. 
(You only use half the size? That's really neat!
I realized we only need the intervall [i..2*i], but didn't see how to
use that.)
With careful programming we don't need 2N to fit into the ints we use,
so I could go to 4 billion terms with some swapping (3.7 billions
without swapping).

This is my new update_sum function:

---------------------------
void update_sum(T n, T o){
  T s, d, od, p, q;
  if (N-n<o)
    return;
  s = n+o;
  d = (n>o) ? n-o : o-n;
  p=sum[s];
  if(!p){
    sum[s]=n;
    return;
  }
  q=s-p;
  od = (p>q) ? p-q : q-p;
  if (d<od)
    sum[s]=n;
}
--------------------------

Unsurprisingly, this also speeds up the computation, and I got another
substantial performance increase from changing
  q=next[p];
in the main function to
  q=i-p;
(even though it still writes to that same location in the next step).
Now I'm at 30 seconds for 2 billion terms (using an i5 12500 processor).

We almost don't need the list!
If we ignore the position of the 2 (or handle it differently), it is
only used when a stable term is found and we need to see what the next
potential stable term is. So just storing the first 200 elements of the
list is plenty enough! But that needs a different data structure. If
we have a bigger int type, we might want to spend a high bit to indicate
whether the number is in the short list.

But there is also another option: we only need one array of size N!
When we handle i, the list so far contains the numbers smaller than i,
and all sums we get or care about are bigger than i. So we can put
both in one array.
(I haven't implemented any of this so I may be missing something...)

Another thing I noticed is that we can detect stable terms earlier. If n
is the last stable term and m is the next term in the list, we wait
until we process n+m. But we can declare m stable as soon as we find a
pair p,q with p+q=n+m and smaller difference.
I didn't change my stable detection code, but searched through the list
after all terms were added.

After 3.7 or 4 billion terms, the number in the list after the 96th
stable term, 1'345'208'697, is 3'672'819'155. Their sum is
5'018'027'852. The list contains the pair 3.112.447.655 and 1.905.580.197
which have the same sum and a smaller difference.
So 3'672'819'155 is the 97th stable term.
The next term in the list is 2'327'610'458, the sum 6'000'429'613 does
not appear so far as sum of other pairs.

Just above 6 billion should be possible with 40 bit integers and 32G
memory with the one array method...

All the best
Christian