[seqfan] Re: Can we create a workable taxonomy for classifying all the various kinds of sequences in the OEIS?

Sat Apr 7 15:02:44 CEST 2018

Hi,
I think it is a matter of time (months?) before Google Deepmind would apply an updated AlphaGo Zeroalgorithm to mathematics.  In essence given a set of rules, the algorithm would develop expert knowledge, and withinthat would be the meta-rules you refer to, then at that point an automatic categorization of OEIS could be done too.
https://deepmind.com/blog/alphago-zero-learning-scratch/cheers,Jamie
----- Original Message -----
From: Wayne VanWeerthuizen <waynemv at gmail.com>
To: Sequence Fanatics Discussion list <seqfan at list.seqfan.eu>
Sent: Sat, 07 Apr 2018 00:35:02 -0600 (MDT)
Subject: [seqfan] Re: Can we create a workable taxonomy for classifying all the various kinds of sequences in the OEIS?

Neil wrote, "The Index to the OEIS does a pretty good job."

It does of good job helping people find sequences relevant to very specific topics. But when I looked at the index, I didn't see how it addressed the sort of distinctions between different kinds of sequences on the level of abstraction that I was asking about. Below, I will attempt to clarify what I was trying to say and ask in my previous post. I might not have explained my idea clearly enough earlier, and some of my thinking on the matter has further evolved.

But first I want to ask Neil: Who updates the index? I've contributed sequences involving power towers (e.g. A246491 thru A246497), but I did not find anything regarding them or power towers in the index. Is updating the index the duty of whomever contributes new sequences? 

Now back to what I was trying to propose earlier:

Basically, my idea builds upon five conjectures/observations, (which realize might be wrong), so I wish to further discuss these conjectures. They are as follows:

1. (Setting aside physically based sequences), every mathematically grounded sequence in the OEIS fundamentally involves one or both of the following two types of primitives. One kind of primitive, is to test values against given conditions. The second kind of primitive is to explicitly generate new values via application of given mathematical functions. (The only difference between these is one returns a Boolean value, while the other returns a numeric value. And both functions may take multiple arguments.) Both types of primitives are infinite in their variety and I don't expect to be able to create any nice system for classifying them. But they can be considered in the abstract as encapsulated primitives to which a sequence generation meta-rule gets applied to.

2. Two different sequences might use the exact same primitives but differ from each other merely with respect to certain additional fixed parameters. For example, initial values in the sequence might be specified as parameters, before subsequent values are generated via recursively applying a function. 

3. Every sequence involves following a procedure for creating a specific list of numeric values from the given primitives and parameters. This is the meta-rule I mentioned above. 

4. Some sequences can be generated more than one way, possibly using fundamentally different procedures, with different primitives. That is not a problem. This just means the sequence should be noted as falling into multiple categories. Part of the astounding beauty of mathematics is how things generated in radically different ways can seemingly "coincidentally" turn out to be equivalent.

5. Sequences are only good, and thus worthy of inclusion in the OEIS, if the procedure for generating the sequence is, under common sense judgement, somehow natural and not overly complicated, arbitrary, or ad hoc. I suspect that, at an appropriate level of abstraction, there exists only a limited number of simple, non-arbitrary fundamental ways to generate good sequences. Thus, I suspect there may be only a very limited number of general procedures (maybe fewer than a hundred?) currently in use among all the great multitude of sequences in the OEIS. My proposed project is simply to encourage a collective volunteer effort to identify and compile a list of what those procedures are. 

Now, are each of these five conjectures agreeable or plausible? 

I really want to stress and make absolutely clear that conjecture five only considers ways of generating sequences from primitives. It need not concern the extensive variety of primitives themselves. Categorizing all the primitives lies far beyond the scope of what I'm suggesting here. (I think in this regard some here may have misunderstood my previous post.) Rather, I'm merely suggesting, that we relegate the primitives to a lower level abstraction while we look at the bigger picture. Yes, there are infinitely many conditions ones could test, and infinitely many functions one could invoke; and all those would be difficult to fit into any useful taxonomy. But I suspect there are only a limited number of good, overall, common-sense, generalized procedures for generating interesting or noteworthy sequences from given primitives. For each sequence, we just need to find where we might be able to identify a natural distinction between what (at the lower level of abstraction) are the specific tests or functions involved, and what (at the higher level) is the generic generating procedure the sequence uses. 

***** MY MAIN POINT: It is the generalized forms of sequence generation, that is, overall generally stated procedures that abstract away the specific details of a given sequence, that I suggest we should attempt to catalog and assign formalized nomenclature to. 

In order to illustrate what I meant, I included some categories in my previous post. I'd like to repeat and expand on that list here, since I believe it helps clarify what I mean. (I removed "Geometric sequences" since that category was probably too ambiguous and subjective.)

"Derived Sequences": A category of sequences wherein the simplest or most natural method of generating new terms essentially involves using the terms of one or more other simpler sequences as input to be reprocessed. This will have subcategories. The reprocessed sequences I'll denote as base sequences.

"Explicit Sequences" apply a closed form function, F, to the sequence index, thus simply, a(n)=F(n). This is the category most likely to overlap with over categories. There might be value to limiting its use to only where the function is relatively simple, and where the "spirit" of what the sequence does isn't better described by another category. (But that could be too subjective.)

"Recursive sequences" are functions of previous terms in the sequence.

"Filtered Sequences" (What I previously termed "Discriminatory sequences"): These take candidate values, k, and test the one at a time against a given criteria. If and only if k passes the criteria is k added to the sequence as the next term. Candidate values are often simply the non-negative integers, but candidates can be terms of another base sequence. Thus, filtering can be multiple levels deep. The prime numbers are a filtering of the positive integers. The Mersenne primes are, in turn, a filtering of the primes, based on their form. (Is it thus true that filtered sequences must have strictly increasing values? Maybe not, if the base sequence being filtered isn't strictly increasing. But I am not sure if any example of filtering a non-increasing base sequence exists anywhere in the OEIS.)

"Least-k sequences": This procedure is more involved. For each n, find the smallest k, such that conditional test Q(n,k) is true. The sequence is of a(n)=k.

"Racing sequences": A type of derived sequence. For each positive integer k, compare two existing sequences for which has the most terms a(n) < k. Whenever the second sequence overtakes the first in its quantity of terms, record the value k as a term in the new sequence. There may be other ways of setting up the races, so perhaps this category would have subcategories of its own, listing all the most reasonable ways a sequence race can be implemented or the results recorded. Again, I suspect common sense limits on what is reasonable and not too arbitrary, would keep the list short.

"Union" and "Intersection" sequences: Derived sequences that combine terms from two or more base sequences or list the terms that two or more base sequences have in common.

"Exclusion" sequences: Derived sequences that take one base sequence, A, and remove all terms in common with another base sequence, B. That is, create a new sequence of all terms that are in A but not in B.

"Array sequences" are sequences containing rows, columns, or diagonals of multidimensional arrays of interest to the OEIS. We might also compile a short list of the most reasonable generalized methods for contructing arrays that could be of interest to the OEIS. Any arrays generated via too complicated or arbitrary of methods probably don't belong in the OEIS in the first place. All the arrays appropriate to the OEIS should fall under only a few different methods of construction.

"Digit sequences" are sequences is built from taking a specified irrational number and making each digit of it a term in the sequence.

"Physical sequences" are just lists of measured values, which might not have any mathematical basis.

Specifically note that "Base" sequences do NOT form a category in the restricted sense I am meaning here. That some sequences are base dependent is a property of the primitives used, but not descriptive of the overall sequence generation procedure itself. 

I realize these categories I listed are by themselves likely insufficient to describe many of the sequences in the OEIS. But my working hypothesis is that probably no more than fifty or a hundred total categories of this sort would be required to cover 90% of the OEIS. (Is this too optimistic?) I suspect that most sequences in the database are built in the same general manner as other sequences, just applying different primitives. But maybe I am wrong about that. There are a great many sequences in the OEIS beyond my level of understanding, of which I don't comprehend how they are generated. This makes it hard me to test my conjectures against random sequences in the OEIS on my own.

I strongly feel this kind of sequence classification would be a valuable project and increase the usability of the OEIS. And I consider it feasible. But I am open to the possibility that I might be wrong about either its feasibility or its value. I'm a nowhere even close to as familiar with the OEIS as many others here. I would like to read further discussion on this topic.

--
Wayne

On 4/6/2018 7:54 AM, Neil Sloane wrote:
> The Index to the OEIS does a pretty good job
> 
> Best regards
> Neil
> 
> Neil J. A. Sloane, President, OEIS Foundation.
> 11 South Adelaide Avenue, Highland Park, NJ 08904, USA.
> Also Visiting Scientist, Math. Dept., Rutgers University, Piscataway, NJ.
> Phone: 732 828 6098; home page: http://NeilSloane.com
> Email: njasloane at gmail.com
> 
> 
> On Fri, Apr 6, 2018 at 9:06 AM, Veikko Pohjola <veikko at nordem.fi> wrote:
> 
>> Just a short comment,
>>
>> Taxonomies are typically domain specific ad hoc systems which prove
>> unsatisfactory when the domain evolves or is viewed in a larger
>> perspective. I would suggest an adequate ontology based approach where the
>> sequences forming a hierarchical whole are treated as objects with a
>> generic set of attributes. I have myself a lot of experience of applying
>> the PSSP ontology with success on various diffuse problem areas of the real
>> world like knowledge management (even ignorance management) and why not to
>> mention waste management, the project which started by defining the waste
>> itself. Some of my students went crazy in searching most strange targets to
>> test the approach like formulating zero and God as a PSSP object. So, I do
>> not doubt its power in attacking the integer sequences, but who would have
>> the necessary interest and time.
>>
>> Veikko Pohjola
>>
>>
>>> Wayne VanWeerthuizen <waynemv at gmail.com> kirjoitti 6.4.2018 kello 2.39:
>>>
>>>
>>> I've long wondered whether it might be possible to develop a good
>> taxonomy for classifying sequences. I'm not sure how well this can be
>> achieved. A single sequence may fall into multiple categories. Getting all
>> sequences into at least one appropriate category may mean we need a lot
>> more categories than just the one I suggest below. And some of the
>> categories, such as geometric, might be subjective, with no precise formal
>> way to rigorously define what should be included. But anyway, if we can
>> find a feasible approach, I'd appreciate seeing a classification of all or
>> most sequences in the OEIS into categories somewhat like these:
>>>
>>> Geometric Sequences:
>>> Sequences whose primary or simplest interpretation is geometric.
>>>
>>> Examples: triangular numbers (A000217), river crossings (A005316)
>>>
>>>
>>> Combinatoric Sequences:
>>> Sequences a(N) answering how many ways to create a construction of a
>> certain size or order N such that it meets a given criteria
>>>
>>> Example: Number of ways to have N red, white, or blue beads in a
>> string such no red and blue beads are adjacent. (A078057)
>>>
>>>
>>> Discriminatory Sequences:
>>> Sequence lists all which pass a given test for compliance with some
>> criteria.
>>> That is, for each possible k, if boolean test f(k) is true, then k is
>> part of the sequence.
>>> Such sequences should be strictly increasing: if j and k are both in
>> the sequence, j should be listed before k if j<k.
>>>
>>> Examples: Prime numbers (A000040), squarefree numbers (A005117),
>> palindromes (A002113), The Positive Integers (A000027)
>>>
>>>
>>> Least-qualifying-value sequences:
>>> Sequences a(n)=k, such that for each n, and a given boolean function
>> f(n,k), k is the lowest positive value such that f(n,k) is true.
>>> Examples: A246491, A215537
>>>
>>>
>>> Physically-based Sequences:
>>> Sequences based on experiments in the actual world.
>>>
>>> Example: Decimal expansion of electron mass in kg. (A081801)
>>>
>>>
>>> Recursive Sequences:
>>> Sequences whose most natural definition is based on a recursively
>> applying a given formula.
>>> a(n)=f( a(n-1), a(n-2), ... )
>>>
>>> Example: Fibonacci numbers (A000045)
>>>
>>>
>>> Polynomial Sequences:
>>> Sequences whose terms are defined by a single polynomial P, such that
>> a(n)=P(n).
>>>
>>> Example: square numbers, triangular numbers
>>>
>>>
>>> Off the top of my head, I don't know any interesting Polynomial
>> sequences that are not also Geometric sequences. Recursive Sequences might
>> be a poorly chosen category, given many sequences in the database can be
>> generated using recursive functions, not all of which look recursive at
>> first glance. But without such a category, I am not sure how otherwise to
>> classify sequences such as the Fibonacci and Lucas numbers. This example
>> gets to my core question of whether a classification scheme like this is
>> even viable. I don't expect these few categories to cover all existing
>> sequences in the database, I think many more categories would be needed for
>> that. And again, a single sequence could fall into multiple categories. I'm
>> also not yet suggesting any hierarchy of subcategories of categories, but
>> that could also come into play later if we developed a more refined and
>> detailed taxonomy. Also, I think even if the big goal of finding sufficient
>> categories to cover all the sequences in the database proves too difficult,
>> it might still be valuable to have a few categories and to tag just those
>> sequences which can be easily classified, accepting that not all sequences
>> would be tagged.
>>>
>>> Anyway, has anyone else here thought much about how to classify
>> sequences in this sort of way? How might the categories I've suggested be
>> more clearly and rigorously defined? Can they be given better names? What
>> additional categories might one suggest? What percentage of the sequences
>> in the OEIS do you estimate can be classified into one or more of the
>> categories I've suggested above?
>>>
>>>
>>> --
>>> Seqfan Mailing list - http://list.seqfan.eu/
>>
>>
>> --
>> Seqfan Mailing list - http://list.seqfan.eu/
>>
> 
> --
> Seqfan Mailing list - http://list.seqfan.eu/
> 

--
Seqfan Mailing list - http://list.seqfan.eu/