[seqfan] Re: Worry about old sequence, A030077, paths in K_n
Tim Peters
tim.peters at gmail.com
Tue Apr 5 03:29:23 CEST 2022
Thank you for the comments, Brendan! They were very clear and helpful.
[Brendan McKay]
> ...
> I guess that the most expensive multisets to test are those which
> are not realizable, but you can eliminate probably all of those by
> testing for admissibility.
Yes, the ones that aren't realizable required exhaustive search to
prove they can't be materialized.
My code yesterday required about 90 minutes to go all the way through a(17).
After adding code to reject non-admissible multisets as soon as
they're generated, the total time dropped to about an hour.
So the relative handful of non-admissible multisets consumed over half
an hour of CPU time to weed out via exhaustion. Very much worth
avoiding.
> Depending on how you are doing the bookkeeping, you can eliminate
> most admissible multisets quickly by making a large number of
> random paths. Then you only need to test the admissible multisets
> that have not been spotted.
That is, pIck a permutation of the index set at random. Compute its
chord multiset and add it to a set of all seen so far. Repeat until
... "a large number" ;-) Then when generating distinct multisets
directly, skip over those already in that set.
Devil's in the details. I stopped when 1000 * n tries were made in a
row without finding a new multiset.
Why? Why not ;-). That dropped the total runtime (through a(17)) down
to 16 minutes. More valuable overall: less benefit per admissible
multiset, but there are so very many more admissible than
non-admissible multisets.
Next step: I don't care about CPU time so much as wall-clock time
here. So I spin off another process to do the random generation
"forever", and communicate results back to the main program,
concurrently with generating multisets directly. There's potentially a
lot of redundant work this way, But it cut total wall-clock time to
about 11 minutes. About a factor of 9 faster than yesterday's code
overall;. I'll take it :-)
FYI, the code just before multiprocessing was added completed n=18
while I was at supper. As a sanity check, the results:
n 18
number weak compositions 1081575
number of those not admissible 8845
number admissible 1072730
number of unique sums 445507
I'm not surprised at the high number of duplicate sums. PSLQ finds 3
simple identities involving integer linear combinations of sin(j * pi
/ 18) for j in 1 .. 9.
0.0
1 0.17364817766693 1
1 0.766044443118978 5
-1 0.939692620785908 7
-1.11022302462516e-16
1 0.342020143325669 2
1 0.642787609686539 4
-1 0.984807753012208 8
-1.11022302462516e-16
2 0.5 3
-1 1.0 9
I won't translate to English, except to sketch the first: let S(j) =
sin(j * pi / 18). Then the first block says that, within a small
number of machikne ulps, S(1) + S(5) - S(7) = 0.
You can use fancier software to verify that it is an identity (e.g.,
wxMaxima's trigrat() simplifies the symbolic expression to exactly 0).
In a different message you wondered about the lack of duplicates at
n=16 and n=22. PSLQ finds nothing close to an identity at those values
of n. I believe constructing distinct multisets with the same sum
requires exploiting at least 2 different identities.
More information about the SeqFan
mailing list