# [seqfan] Re: Probability of identical sequences

Thu Jan 16 04:19:48 CET 2020

```Hi Everyone,

Thank you again for your responses.
The two options of 1 and 0 happen only if we have a definitive answer. The probability is 1 if we can prove the connection, and 0 if we can find a counter example. But that's not the case all the time.

My question was not about randomness.  I don’t think that randomness is an element in the OEIS sequences. The OEIS has less then 400,000 sequences carefully selected based on specific mathematical algorithms. The sequences belong to different categories, but they are not random.

I would respectfully disagree with Frank’s comment. I think looking at the “odds” is an important factor if we want to decide whether to study potential connections between two sequences or not.

The deciding factors include, but not limited to:

1)    The number of similar terms. Obviously we would be more inclined to find connection between two sequences that share 50 terms rather than sequences that share 10 terms.

2)    The values of the identical terms. Even if the two sequences share only five terms, for example, we would be more interested if those term were 14356, 78098, 342789, 679838, 329019. We won’t be that interested if the five terms were 2, 4, 7, 9 ,11.

3)    The type of algorithms or concepts that produced the two sequences. If we were studying partitions and we somehow get these terms 2, 4, 6, 10, 14, 22, 30 it wouldn’t be big surprise. However, if we get the same terms while we were working on prime numbers, then we would definitely pay more attention to connection.

Best,

Ali

On Wednesday, January 15, 2020, 9:12:04 PM EST, Frank Adams-watters via SeqFan <seqfan at list.seqfan.eu> wrote:

If we find 50 or 100 identical terms at the start of two sequences with different definitions, this is an invitation for further study, The object is to either:

1) Find a difference, or

2) Prove that the sequences are the same.

(2) is the more interesting possibility. But until one or the other of these can be shown, it is incomplete. (There is always the possibility that it will stay undecided, of course.)

In other words, who cares what the odds are? This is mathematics. We care about proofs, not probabilities.*

* Of course, there are exceptions, the only one I can think of beingprobable primes.

-----Original Message-----
From: Ali Sada via SeqFan <seqfan at list.seqfan.eu>
To: Sequence Fanatics Discussion list <seqfan at list.seqfan.eu>
Cc: Ali Sada <pemd70 at yahoo.com>
Sent: Tue, Jan 14, 2020 6:57 pm
Subject: [seqfan] Re: Probability of identical sequences

Thank you very much Sean, Frank, Robert, and David for your responses. I really appreciate the knowledge I got from them.

When I mentioned A073675, it was just to say why I thought of the question.

Let’s take these different cases:

Case 1
Suppose that the two sequences defined based on different concepts. Let’s say that A1 is “the number of steps to reach x when iterating…..”, and B1 is “number of partitions of n into distinct ….”
The two definitions come from totally different concepts, and they require different algorithm to calculate. Assume that we calculated k1 terms and we found those terms to be identical.
Statement 1: A1=B1.

Cases 2 and 3:
Define the following “original” sequence: a(1)=1; a(n)=S/p, where S=a(n-1)+n, and p is the least prime factor of S.
This is the sequence we get:
1, 1, 2, 2, 1, 1, 4, 4, 1, 1, 4, 8, 3, 1, 8, 8, 5, 1, 4, 8, 1, 1, 8, 16, 1, 9, 12, 8, 1, 1, 16, 16, 7, 1, 12, 16, 1

The observations here is that the first appearances of powers of 2 and powers of 3 are in order
(assume that we cannot prove these observations.)

Now, define A2 as “powers of 2 in the original sequence in order of their first appearances.”
We get 1, 2, 4, 8, 16,….
B2 is the powers of 2, A000079.
Statement 2: A2=B2.

A3 is “powers of 3 in in the original sequence in order of their first appearances.”
We get 1,3,9,27,.
B3 is powers of 3, A000244.
Statement 3: A3=B3.

If we calculate k terms of the original sequence, we will get k2 terms that are in line with statement 2, and k3 terms that are in line with statement 3. k2 is larger than k3.

It seems to me that statement 2 and statement 3 have the same probability of being true, even if k2>k3.
While, statement 1 seems less probable, even if k1>k2.

I am sorry for the long email. The reason for this is that I am not familiar with the right technical terms.

Best,

Ali

On Tuesday, January 14, 2020, 7:30:12 AM EST, David Seal <david.j.seal at gwynmop.com> wrote:

> If we have two sequences, A and B, with different definitions. However,
> when we calculate k terms for each sequence, all of these terms are
> identical. If we can’t prove that definition A equals to definition B,
> what’s the probability that A and B are identical?

It's not defined - you need to specify the distribution of the sequence definitions for the probability to exist.

An analogy: what is the probability that two numbers in the range 0-5 are the same? The answer is that it's undefined, because the question doesn't specify the distribution of the numbers. One could specify it in the question, and then the answer would be defined - two possible forms of that question are:

* "What is the probability that two numbers in the range 0-5 generated by throwing a fair die and subtracting 1 from the result are the same?" The answer to that question is (1/6)^2 + (1/6)^2 + (1/6)^2 + (1/6)^2 + (1/6)^2 + (1/6)^2 = 0.16666...

* "What is the probability that two numbers in the range 0-5 generated by tossing five fair coins and counting the number of heads are the same?" The answer to that question is (1/32)^2 + (5/32)^2 + (10/32)^2 + (10/32)^2 + (5/32)^2 + (1/32)^2 = 0.24609375.

Clearly with the answer depending on which of those two distributions of the numbers (or many other possibilities) is being asked about, it's going to be undefined if the question doesn't specify the distribution. And unfortunately, assuming one wants the distribution to have the property that for any reasonable outcome, it has a nonzero chance of generating that outcome, it's a lot harder to specify a well-defined distribution of sequence definitions than a well-defined distribution of numbers in the range 0-5.

David

> On 13 January 2020 at 19:06 Ali Sada via SeqFan <seqfan at list.seqfan.eu> wrote:
>
>
> Hi Everyone,
>
> If we have two sequences, A and B, with different definitions. However, when we calculate k terms for each sequence, all of these terms are identical. If we can’t prove that definition A equals to definition B, what’s the probability that A and B are identical? Is it zero, since we are dealing with infinite terms? Or is it a function of k?
>
> This question came to me when I was trying this algorithm: Exchange n and 2n. Each term gets changed only once.
> a(1)=2 and a(2)=1.
> a(3)=6 and a(6)=3
> etc.
> The sequence I got was
> 2, 1, 6, 8, 10, 3, 14, 4, 18, 5, 22, 24, 26, 7, 30, 32, 34, 9, 38, 40, 42, 11, 46, 12, 50, 13
>
> This sequence seems identical to A073675 (Rearrangement of natural numbers such that a(n) is the smallest proper divisor of n not included earlier but if no such divisor exists then a(n) is the smallest proper multiple of n not included earlier, subject always to the condition that a(n) is not equal to n.)
>
> This example might not reflect my question above exactly. I am just trying to show why I asked the question.
>
> Best,
>
> Ali
>
>
>
> --
> Seqfan Mailing list - http://list.seqfan.eu/

--
Seqfan Mailing list - http://list.seqfan.eu/

--
Seqfan Mailing list - http://list.seqfan.eu/

--
Seqfan Mailing list - http://list.seqfan.eu/

```