[seqfan] Re: D(n)

David Sycamore djsycamore at yahoo.co.uk
Wed Jan 1 12:22:46 CET 2020


Congratulations on a brilliant result; well done David Seal ! 

I especially enjoyed reading how your machine managed to compress  a few billion years of potential computation time into just 10 minutes; how nice :-)

If this sequence belongs in oeis then  perhaps the Data could go up to a(11) and maybe a screenshot of a(12) could be added as a link ? 

Happy New Year 

David. 



> On 30 Dec 2019, at 02:26, David Seal <david.j.seal at gwynmop.com> wrote:
> 
> I wrote:
> 
>> Since any integer that requires 11 iterations to reach an all-digits-
>> different stable integer therefore has at least 26 digits, and the
>> arguments above about the number of digits input to D() say that at
>> least 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+10+11+13+15+18+22+30 =
>> 152 digits are needed to produce 26 output digits, one can conclude
>> that a(12) has at least 152 digits. This goes well beyond the range
>> of my program's search, so I stopped it before it completed - no point
>> in giving it some days to weeks to work its way up to 54 digits when
>> it's clear that no further results will be found!
> 
> I've since managed to determine a(12) using another search based on the same general ideas - i.e. using knowledge of how many digits a(11) has to say that D(a(12)) must have at least that many digits, and using that to explore the possible values of a(12).
> 
> Specifically, my previous search said that a(11) = {1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2} in multiset terms, using the notation d*i to indicate i copies of the digit d, so that a(11) = 11112222333445666777778899 in integer terms, i.e. the smallest integer whose digits are that multiset. So a(11) has 26 digits, from which it follows that any multiset (or integer) m that requires 11 iterations of m --> D(m) to become stable (i.e. to become an all-digits-different value which won't be changed by further iterations) must have at least 26 digits. D(a(12)) is such a multiset (or integer) and so D(a(12)) has at least 26 digits.
> 
> In general, as a multiset D(m) is the union of D0(m), D1(m), D2(m), ... D9(m), where Dd(m) is the empty multiset if the digit d does not appear in m and the multiset consisting of the digits of the product of d and the number of copies of d that appear in m. For example, if m = a(11), D0(m) is {} since 0 does not appear in m and D7(m) = {3,5} since there are 5 7s in m and 7*5 = 35, and overall D(m) = the multiset union of {}, {4}, {8}, {9}, {8}, {5}, {1,8}, {3,5}, {1,6} and {1,8} = {1*3,3,4,5*2,6,8*4,9}.
> 
> The size of a multiset union is just the sum of the sizes of the multisets it is formed from, so the number of digits of D(m), which I will denote by N(m), is equal to N0(m)+N1(m)+N2(m)+...+N9(m), where for each digit d Nd(m) is 0 if d does not appear in m, and otherwise the number of digits in the product of d and the number of times d appears in m.
> 
> If m contains n copies of d, then:
> 
> * If d is 0, D0(m) = {} if 0 does not appear in m and {0] if it does, so N0(m) = 0 or 1 respectively.
> 
> * If d is not 0, Nd(m) is 0 if d does not appear in m, and is otherwise equal to the largest value of i such that 10^(i-1) <= the product of d and the number of times d appears in m. This means that there is a sequence of thresholds RoundUp(1/d), RoundUp(10/d), RoundUp(100/d), RoundUp(1000/d), RoundUp(10000/d), ... on the number of times d appears in m for Nd(m) to reach 1, 2, 3, 4, 5, ... For instance, for d=7, that sequence of thresholds is 1, 2, 15, 143, 1429, ..., indicating that N7(d) is 1 if d appears between 1 and 2-1=1 times, 2 if it appears between 2 and 15-1=14 times, 3 if it appears between 15 and 143-1=142 times, 4 if it appears between 143 and 1429-1=1428 times, ...
> 
> These thresholds and those implied by the special cases can be listed in a table:
> 
> Digit  Thresholds for Nd(m) =
> (d)     0      1      2      3      4      5  ...
> --------------------------------------------------
> 0      0      1      x      x      x      x  ...
> 1      0      1     10    100   1000  10000  ...
> 2      0      1      5     50    500   5000  ...
> 3      0      1      4     34    334   3334  ...
> 4      0      1      3     25    250   2500  ...
> 5      0      1      2     20    200   2000  ...
> 6      0      1      2     17    167   1667  ...
> 7      0      1      2     15    143   1429  ...
> 8      0      1      2     13    125   1250  ...
> 9      0      1      2     12    112   1112  ...
> 
> Basically, given a multiset (or integer), one can count up the number of occurrences of each digit d in it, look across each row of this table to find the largest threshold that one has achieved, then look at which column that threshold is in to determine Nd(m), and add the results to find N(m), the number of digits in D(m). For example, m = a(11) has:
> 
> 0 0s, so the largest threshold exceeded in the d=0 row is 0, making N0(m) = 0
> 4 1s, so the largest threshold exceeded in the d=1 row is 1, making N1(m) = 1
> 4 2s, so the largest threshold exceeded in the d=2 row is 1, making N2(m) = 1
> 3 3s, so the largest threshold exceeded in the d=3 row is 1, making N3(m) = 1
> 2 4s, so the largest threshold exceeded in the d=4 row is 1, making N4(m) = 1
> 1 5s, so the largest threshold exceeded in the d=5 row is 1, making N5(m) = 1
> 3 6s, so the largest threshold exceeded in the d=6 row is 2, making N6(m) = 2
> 5 7s, so the largest threshold exceeded in the d=7 row is 2, making N7(m) = 2
> 2 8s, so the largest threshold exceeded in the d=8 row is 2, making N8(m) = 2
> 2 9s, so the largest threshold exceeded in the d=9 row is 2, making N9(m) = 2
> -----------------------------------------------------------------------------
> Total:                                                              N(m) = 13
> 
> The following table is the first-differences-across-rows of that table (with asterisks explained below the table):
> 
> Digit  First differences of Nd(m) thresholds 
> (d)     1      2      3      4      5  ...
> -------------------------------------------
> 0      1      x      x      x      x  ...
> 1      1*     9     90    900   9000  ...
> 2      1*     4     45    450   4500  ...
> 3      1*     3     30    300   3000  ...
> 4      1*     2     22    225   2250  ...
> 5      1*     1     18    180   1800  ...
> 6      1*     1*    15    150   1500  ...
> 7      1*     1*    13    128   1286  ...
> 8      1*     1*    11    112   1125  ...
> 9      1*     1*    10    100   1000  ...
> 
> The way this table can be used for calculating N(m) is again to count each digit in m, then work across the corresponding row successively deducting each entry from the count for as long as it doesn't make the count negative. Asterisk each successfully deducted entry and count the number of asterisks: the result is N(m), the number of digits in D(m). The asterisks in the table are for the example of m = a(11) - for instance, the count of 5 7s is successfully reduced by the first 1 in the d=7 row (5-1 = 4), then by the second 1 (4-1 = 3), but then not by the 13 (which is larger than 3). So just the two 1s in that row are asterisked.
> 
> That may seem a bit of a roundabout way to calculate N(m) - which it is. But its real value is in the opposite direction: if we know N(m), the number of digits in m must be at least the sum of the N(m) smallest entries in that second table. This was the key fact that my earlier email was built upon, and my deduction that a(12) must have at least 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+10+11+13+15+18+22+30 = 152 digits was simply a matter of using that fact plus the knowledge that D(a(12)) has at least 26 digits, i.e. N(a(12)) >= 26.
> 
> Another observation is that if m contains a 0, D(m) must contain at least one 0. So if D(m) does not contain a 0, m does not contain a 0, and therefore the number of digits of m must be at least the sum of the N(m) smallest entries in that second table, excluding the entry 1 in the d=0 row. So in particular, if D(a(12)) = a(11), a(12) must contain at least 1+1+1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+10+11+13+15+18+22+30+45 = 196 digits. Quite a substantial increase - but of course D(a(12)) might be some other 26-digit multiset/integer that does include a 0, if one exists...
> 
> So I modified my previous search to make it output multisets that equalled the previous record as well as ones that bettered it, so that it would output all 26-digit multisets that required eleven m --> D(m) iterations to become stable and not just the smallest one. It turned out that there are three of them:
> 
> {1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2}
> {1*4,2*3,3*3,4*2,5*3,6*3,7*5,8,9*2}
> {1,2*3,3*3,4*2,5,6*3,7*5,8*6,9*2}
> 
> None of them contain a 0, and so if N(a(12)) = 26, the number of digits of a(12) must be at least 196, while if N(a(12)) >= 27, it must be at least the sum of the 27 smallest entries of the second table, i.e. 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+10+11+13+15+18+22+30+45 = 197. So a(12) must contain at least 196 digits.
> 
> At this point, I started attacking the problem from the other end - could I produce a multiset m such that D(m) = a(11)? If so, the size of m would be an upper bound on the size of a(12) - and actually, I clearly could produce one - the multiset m consisting of a(11) 1s, i.e. m = {1*11112222333445666777778899}, clearly has D(m) = a(11), and so a(12) has no more than 11112222333445666777778899 digits. But I wanted to produce a better upper bound on the size of a(12) than that!
> 
> At this point, I started playing around with the numbers. It looked likely that a good way of getting the 26 digits of a(11) for D(m) would be for m to have 1*(number of 1s) have two digits, while 2*(number of 2s), 3*(number of 3s), 4*(number of 4s), 5*(number of 5s), 6*(number of 6s), 7*(number of 7s), 8*(number of 8s) and 9*(number of 9s) each have 3 digits. Furthermore, I wanted the top digits of those eight 3-digit numbers to be as small as possible, especially for the smaller digits, so I aimed for 2*(number of 2s), 3*(number of 3s), 4*(number of 4s) and 5*(number of 5s) to be 3-digit numbers starting with 1 and 6*(number of 6s), 7*(number of 7s), 8*(number of 8s) and 9*(number of 9s) to be 3-digit numbers starting with 2, using up all the 1s and 2s in D(m). I then used a 'greedy' technique to pick first the number of 9s, then the number of 8s, etc, trying to account for the higher digits (and especially the large number of 7s) early. It was fairly easy to come up with:
> 
> 31 9s, producing {2,7,9}, the digits of 9*31 = 279
> 37 8s, producing (2,6,9}, the digits of 8*37 = 296
> 41 7s, producing {2,7,8}, the digits of 7*41 = 287
> 46 6s, producing {2,6,7], the digits of 6*46 = 276
> 37 5s, producing {1,5,8}, the digits of 5*37 = 185
> 44 4s, producing {1,6,7}, the digits of 4*44 = 176
> 49 3s, producing {1,4,7}, the digits of 3*49 = 147
> 67 2s, producing {1,3,4}, the digits of 2*67 = 134
> 33 1s, producing {3,3}, the digits of 1*33 = 33
> --------------------------------------------------
> 385 digits of m, producing {1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2} = a(11)
> 
> So a(12) has at most 385 digits, i.e. at this point I knew that a(12) has between 196 and 385 digits. Still quite a wide range...
> 
> However, look at D(a(12)). If it has 28 or more digits that don't include a 0, then a(12) has at least
> 1+1+1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+10+11+13+15+18+22+30+45+90+100 = 386 digits, and if it has 29 or more digits that do include a 0, then a(12) has at least 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+10+11+13+15+18+22+30+45+90+100 = 387 digits. Since we know that a(12) has no more than 385 digits, we can rule those possibilities out, and it follows that D(a(12)) has 26 digits (which we know cannot include a 0), 27 digits (which might or might not include a 0) or 28 digits that do include a 0. At this point, I went back to my enhanced earlier program and found that there were the following 31 possibilities for D(a(12)):
> 
> {1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2}
> {1*4,2*3,3*3,4*2,5*3,6*3,7*5,8,9*2}
> {1,2*3,3*3,4*2,5,6*3,7*5,8*6,9*2}
> {0,1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2}
> {0,1*4,2*3,3*3,4*2,5*3,6*3,7*5,8,9*2}
> {0,1,2*3,3*3,4*2,5,6*3,7*5,8*6,9*2}
> {1*6,2*2,3*3,4*2,5*3,6*3,7*5,8,9*2}
> {1*5,2*4,3*3,4*2,5*3,6*3,7*2,8,9*4}
> {1*5,2*4,3,4*2,5,6*3,7*7,8*2,9*2}
> {1*5,2*3,3,4*2,5*3,6*3,7*7,8,9*2}
> {1*4,2*4,3*3,4*4,5,6*3,7*5,8,9*2}
> {1*3,2*4,3*3,4*2,5,6*3,7*2,8*7,9*2}
> {1*3,2*3,3*5,4*2,5,6*3,7*7,8,9*2}
> {1*3,2*2,3*3,4*2,5*3,6*3,7*8,8,9*2}
> {1,2*4,3*3,4*2,5*3,6,7*5,8*6,9*2}
> {1,2*4,3*2,4*2,5*3,6*3,7*5,8*6,9}
> {2*3,3*3,4*2,5*3,6*3,7*5,8*6,9*2}
> {0*2,1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2}
> {0*2,1*4,2*3,3*3,4*2,5*3,6*3,7*5,8,9*2}
> {0*2,1,2*3,3*3,4*2,5,6*3,7*5,8*6,9*2}
> {0,1*6,2*2,3*3,4*2,5*3,6*3,7*5,8,9*2}
> {0,1*5,2*4,3*3,4*2,5*3,6*3,7*2,8,9*4}
> {0,1*5,2*4,3,4*2,5,6*3,7*7,8*2,9*2}
> {0,1*5,2*3,3,4*2,5*3,6*3,7*7,8,9*2}
> {0,1*4,2*4,3*3,4*4,5,6*3,7*5,8,9*2}
> {0,1*3,2*4,3*3,4*2,5,6*3,7*2,8*7,9*2}
> {0,1*3,2*3,3*5,4*2,5,6*3,7*7,8,9*2}
> {0,1*3,2*2,3*3,4*2,5*3,6*3,7*8,8,9*2}
> {0,1,2*4,3*3,4*2,5*3,6,7*5,8*6,9*2}
> {0,1,2*4,3*2,4*2,5*3,6*3,7*5,8*6,9}
> {0,2*3,3*3,4*2,5*3,6*3,7*5,8*6,9*2}
> 
> I then wrote a program to try to find values of m such that D(m) is each of those multisets in turn, with a limit on the number of digits in m that was initially set to 385, but reduced as and when smaller solutions were found, to the number of digits in the smaller solution. The program basically uses 10 nested subroutines:
> 
> The outermost subroutine looks for solutions for m with no more than the limiting number of digits, all of which could in principle be any of 0 to 9. However, in practice it knows that a solution for m with more than one 0 can always be reduced to a solution with a smaller number of digits by removing all but one of the 0s, so it only looks for solution with no 0s and with one 0. Furthermore, if D(m) doesn't include a 0, it knows that m cannot include a 0, so in that case it only looks for solutions with no 0s. It therefore calls the second outermost subroutine either once, specifying no 0s, or twice, specifying no 0s or one 0.
> 
> The second outermost subroutine looks for solutions for m with no more than the limiting number of digits, of which a specified number are 0s. There are (limit) - (number of 0s) other digits, each of which can be any of 1 to 9, and it loops through the possible numbers of 9s. It rejects any for which D9(m) would cause there to be too many of a digit in D(m). As some examples, on the call to it for D(m) = {1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2}, it rejects ten or twelve 9s because they would cause D9(m) to be {0,9} or {0,1,8} respectively, which isn't compatible with D(m) containing no 0s, and it rejects 62 and 65 9s because they would cause D9(m) to be {5,5,8}, which isn't compatible with D(m) containing only one 5. It also rejects numbers of 9s that would mean that the remaining digits of m (which must all be in the range 1 to 8) cannot generate enough digits for D(m). In the same call for D(m) = {1*4,2*4,3*3,4*2,5,6*3,7*5,8*2,9*2}, for example, it knows for numbers of 9s from 247 up to 385, m contains at most 385-247 = 138 digits that are in the range 1 to 8, and D9(m) = 4 (since 9*(number of 9s) has 4 digits). So D1(m)+D2(m)+D3(m)+D4(m)+D5(m)+D6(m)+D7(m)+D8(m) = 26-4 = 22, which means that m must contain a number of digits in the range 1 to 8 which is at least the sum of the 22 smallest entries on the d=1 to d=8 rows of the table. That sum is 1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+11+13+15+18+22+30 = 139, so m must contain at least 139 digits in the range 1 to 8, contradicting the fact that it contains at most 138 such digits. It can similarly reject the possibility that m contains no 9s, since the sum of the 26-0 = 26 smallest entries on the d=1 to d=8 rows is 1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+11+13+15+18+22+30+45+90+112+128 = 514 > 385, or one 9, since the sum of the 26-1 = 25 smallest entries on the d=1 to d=8 rows is 1+1+1+1+1+1+1+1+1+1+1+1+2+3+4+9+11+13+15+18+22+30+45+90+112 = 386 > 385. It therefore calls the third outermost subroutine a number of times, specifying either no 0s or one 0, and a number of 9s in the range 0 to 385 but with various possible values rejected.
> 
> The third outermost subroutine looks for solutions for m with no more than the limiting number of digits, of which a specified number are 0s and another specified number are 9s. It similarly loops through the possible numbers of 8s, rejecting numbers of 8s which cause D(m) to contain too many of any particular digit, or that cause the remaining digits of m (which are all in the range 1 to 7) to be too few in number to generate the required remaining number of digits of D(m) after accounting for those generated by the 0s, 8s and 9s.
> 
> And so it goes on - the fourth outermost subroutine loops through numbers of 7s, the fifth outermost through numbers of 6s, etc, until the innermost subroutine loops through numbers of 1s. If we've managed to generate all the digits of D(m) at the end of this, we've found a solution for m; if it's got no more than the smallest number of digits of any solution so far, we output the solution, and if it's got fewer digits than that smallest number, we update the smallest number.
> 
> Anyway, I was rather uncertain just how long the program would take - a very crude program that had ten nested loops, each running through numbers of a particular digit that could range from 0 to 385, would involve 396^10 = ~7.343*10^25 iterations of the innermost loop, which would take far too long (of the very rough order of billions of years). That would be cut down by all the rejections of paricular numbers of digits the program was doing, but would it be cut down enough? I didn't have much of an idea about exactly how much it would be cut down, just a feeling that it was at least worth trying - and it turned out to take about 10 minutes on my desktop machine. The results are that a(12) has 373 digits, and there are two 373-digit multiset solutions for it, which are:
> 
> {1*37,2*68,3*49,4*37,5*27,6*46,7*41,8*37,9*31}
> {1*37,2*69,3*49,4*34,5*29,6*46,7*41,8*37,9*31}
> 
> The second of those produces the smaller integer solution, due to their first 105 digits being identical and its 106th digit being a 2 compared with a 3 for the first. The answer is that the integer value of a(12) is (with formatting to make the numbers of digit repetitions reasonably checkable):
> 
>                            111 1111111111 1111111111
> 1111111111 1111222222 2222222222 2222222222 2222222222
> 2222222222 2222222222 2222222222 2223333333 3333333333
> 3333333333 3333333333 3333333333 3344444444 4444444444
> 4444444444 4444445555 5555555555 5555555555 5555566666
> 6666666666 6666666666 6666666666 6666666666 6777777777
> 7777777777 7777777777 7777777777 7788888888 8888888888
> 8888888888 8888888889 9999999999 9999999999 9999999999
> 
> and its iterations are:
> 
> --> 37138147136145276287296279
> 
> --> 9354168185818
> 
> --> 931043632
> 
> --> 9910462
> 
> --> 1810462
> 
> --> 280462
> 
> --> 48046
> 
> --> 8806
> 
> --> 1606
> 
> --> 1120
> 
> --> 220
> 
> --> 40
> 
> Note that a similar argument to the one that said that a(12) has at least 196 digits now says that the number of digits of a(13) is at least the sum of the 373 smallest elements of the table, excluding the 1 on the d=0 row. This implies that a(13) has at least ~7.74 * 10^40 digits, putting it hugely beyond the scope of the OEIS!
> 
> All assuming my search programs are correct, of course...
> 
> David
> 
> --
> Seqfan Mailing list - http://list.seqfan.eu/




More information about the SeqFan mailing list