[seqfan] Guided browsing of the OEIS based upon personal preferences?

Sun Oct 25 17:01:04 CET 2009

An interesting link about Pandora, thanks!

Now, for any such browsing algorithm (whether "personalized" or not)
to work in OEIS, it would need much more software-digestible information
about each entry than is currently available.
Currently, most of the searchable information is "extra-mathematical"
("contingent"), e.g. the submitter's name, the number of references
to academic papers, keywords concerning the quality or importance, such as
"nice", "dumb", "less" or "core", etc. These can already be
used in searches, e.g. entering "ref:" in the search field
returns every sequence with at least one reference.

However, to get closer to the idea behind Pandora, namely that such
human judgements should not matter, there should be many
more keywords, or tags, or "categories" (as they are called in the new Wiki-OEIS)
made easily available to the OEIS software.
Many of them could be automatically collected by bots, e.g.
given enough terms of the sequence, it's easy to conjecture
that it is, for example:

a) multiplicative (keyword "mult")
b) additive
c) monotone
d) injective
e) surjective (on N, not so easy for a program to detect in many cases)
f) consists of nonnegative values only (keyword "nonn")
g) consists of a certain subset of integers only, say {0, 1} or primes
h) is "continuous", i.e. the first differences consist of {-1, 0, 1} only.
i) grows with a certain rate, e.g. linear, quadratic, exponential, etc.
j) etc. etc.
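
A minimal sketch of how such a bot might guess a few of the properties
listed above from a finite list of terms. The function name
conjecture_tags and all tag names other than "mult" and "nonn" are my own
assumptions, not existing OEIS keywords, and every result is only a
conjecture about the full sequence:

```python
from math import gcd


def conjecture_tags(terms, offset=1):
    """Return tentative tags suggested by a finite list of terms.

    terms[i] is taken to be a(offset + i). A finite prefix can never
    prove a property of the infinite sequence, so these are guesses.
    """
    if len(terms) < 2:
        return set()  # too little data to conjecture anything

    a = {offset + i: t for i, t in enumerate(terms)}
    last = offset + len(terms) - 1
    diffs = [y - x for x, y in zip(terms, terms[1:])]
    tags = set()

    if all(t >= 0 for t in terms):
        tags.add("nonn")
    if all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs):
        tags.add("monotone")  # hypothetical tag name
    if len(set(terms)) == len(terms):
        tags.add("injective")  # hypothetical tag name
    if all(d in (-1, 0, 1) for d in diffs):
        tags.add("continuous")  # hypothetical tag name

    # Multiplicative / additive tests only make sense for offset 1.
    if offset == 1:
        # Coprime pairs m < n whose product is still among the known terms.
        pairs = [(m, n)
                 for m in range(2, last + 1)
                 for n in range(m + 1, last // m + 1)
                 if gcd(m, n) == 1]
        if pairs and a[1] == 1 and all(a[m * n] == a[m] * a[n] for m, n in pairs):
            tags.add("mult")
        if pairs and a[1] == 0 and all(a[m * n] == a[m] + a[n] for m, n in pairs):
            tags.add("additive")  # hypothetical tag name
    return tags
```

For instance, the first twelve values of Euler's phi function
(1, 1, 2, 2, 4, 2, 6, 4, 6, 4, 10, 4) would be tagged "nonn" and "mult"
by this sketch. Surjectivity and growth rate are left out on purpose,
since they need more than such simple checks.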

Of course, all these are categories on which no program can make a
conclusive decision based only on a finite subset of terms from an
infinite sequence.
So, this "classification bot" would tag the entries only tentatively
with these classes, and it would then be the task of the editors to either
affirm these, or mark the entry with the corresponding "opposite tag"
(e.g. "non-nonn" is the opposite keyword of "nonn", currently called "sign"),
to tell the bot that no, this sequence is not in category X, although
it might look like that. These cases are probably quite rare, as this
implies that the first counter-example is so far away that it is not
practical to include so many terms in the database/b-file.
Another possibility is that the property remains conjectural,
and in that case it should be flagged as such. (E.g. "conj-prime"
if the sequence seems to take only prime values, but nobody has proved
it for sure.)
(Should we have a different tag/keyword-prefix for the conjectures
made by the bot-program and the conjectures acknowledged by
human beings?)
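
The interplay between bot guesses and editor decisions could be sketched
as below. This is only an illustration under my own assumptions: the
"bot-conj-" prefix, the reconcile function and the default "non-X" naming
of opposite tags are hypothetical; only "nonn"/"sign" and the "conj-"
prefix come from the discussion above.

```python
# Known opposite keywords; anything missing falls back to "non-" + tag.
OPPOSITE = {"nonn": "sign"}
BOT_PREFIX = "bot-conj-"   # hypothetical prefix for unreviewed bot guesses
HUMAN_PREFIX = "conj-"     # conjecture acknowledged by a human editor


def opposite_of(tag):
    """Return the keyword that says 'this sequence is NOT in category tag'."""
    return OPPOSITE.get(tag, "non-" + tag)


def reconcile(bot_tags, editor_keywords):
    """Merge the bot's tentative tags into the editor-approved keywords.

    A bot tag is dropped if an editor has already affirmed it, vetoed it
    with the opposite tag, or recorded it as a human-level conjecture.
    Everything else is added with the bot prefix, awaiting review.
    """
    editor_keywords = set(editor_keywords)
    result = set(editor_keywords)
    for tag in bot_tags:
        decided = {tag, opposite_of(tag), HUMAN_PREFIX + tag}
        if decided & editor_keywords:
            continue  # an editor has already ruled on this category
        result.add(BOT_PREFIX + tag)
    return result
```

With this, an entry carrying "sign" never gets re-tagged as "nonn" by the
bot, however nonnegative its listed terms happen to look.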

Now, one could use time-stamping to lighten the task
of the bot (i.e. don't run the analyzer for already analyzed entries),
but care must be taken to rerun it on those entries where
terms have been added or corrected.
In any case, such a semi-automated bot would be a major improvement
over the current situation, where such analysis has been done
only sporadically by some brave individuals.
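
The time-stamping idea could look roughly like the sketch below. The
Entry record and its field names are assumptions of mine for
illustration; the real database stores its data differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Entry:
    """Hypothetical minimal record for one sequence entry."""
    name: str
    terms_modified: datetime                 # last time terms were added/corrected
    analyzed_at: Optional[datetime] = None   # last bot run, None if never analyzed


def needs_analysis(entry):
    """True if the bot has never seen the entry or its terms changed since."""
    return entry.analyzed_at is None or entry.terms_modified > entry.analyzed_at


def pending(entries):
    """Names of the entries the bot should (re)analyze on its next run."""
    return [e.name for e in entries if needs_analysis(e)]
```

The comparison has to be made against the modification time of the terms
themselves, not of the whole entry, or every edited comment would trigger
a pointless rerun.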

Unfortunately, most of these easily detectable categories are
not very interesting or distinguishing.
- "Ah, this sequence was monotone! Now please give me more monotone
sequences from this same author!"

So, there are still two ways to proceed.

A) A lot of manual categorization, as in Pandora's case. Make it
a task of the associate editors not just to edit and correct, but also
to find as many meaningful categories (nowadays still called "Index
entries") for the new entries as they can. Prepare some kind of
"Category FAQ" which lists the most useful categories to watch out for,
into which a sequence could fall. Certainly, one cannot rely on
many of the submitters knowing or caring about them themselves.

B) However, what I think people really want is analogues:

 - Is there a similar sequence, but based on partitions instead of
   combinations?

 - Some other automorphism applied to this same combinatorial structure?

 - Similar recurrence, but with slightly different parameters.
   (BTW: I see from the index entries that some people have already done
    a lot of work regarding this idea.)

 - What if instead of the primes we used here the irreducible elements
   of some other factorization domain? Or the leftovers from
   some other kind of sieve process (e.g. Lucky or Ludic numbers)?

 - A factorial-base analogue for this decimal-based sequence?

 - Base-3 or Zeckendorf-expansion analogue for this base-2 sequence?

 - What about using some other 2-ary function than just ordinary
   multiplication in this convolution formula?

 - etc. etc.
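
To make one of these concrete: if a formula were stored in a
machine-readable form, an "analogue" would amount to swapping one
parameter. Below is a small sketch for the convolution case, where the
2-ary function is an argument. The function name convolve and its
signature are my own, not anything the OEIS software provides.

```python
import operator


def convolve(a, b, op=operator.mul):
    """Return c(n) = sum over k = 0..n of op(a(k), b(n-k)).

    With op = multiplication this is the ordinary convolution; any other
    2-ary function gives an analogue of the same formula.
    """
    n_terms = min(len(a), len(b))
    return [sum(op(a[k], b[n - k]) for k in range(n + 1))
            for n in range(n_terms)]
```

For example, convolve([1, 1, 2, 5], [1, 1, 2, 5]) gives [1, 2, 5, 14]
(the Catalan numbers, shifted by one), while passing op=math.gcd or
op=operator.add to the same call produces the kind of variant the
question above asks about.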

However, this would require that the OEIS software have transparent
access to the formulas, generating functions & other information on the
%F, %C and %Y lines, which is currently obfuscated in dozens of different
ad hoc notations and stray pieces of code in various, often proprietary
programming languages. So, I guess this is a long-term project, although
I have no doubt that it will eventually be implemented as well.
Here one could start by integrating all the "combstruct"-information
from the Encyclopedia of Combinatorial Structures (if not already
in OEIS), and then making the search program so sophisticated
that it could operate with those structures based on the input
given by the user. At least the combinatorialists would like it.

Just my two cents,

Yours,

Antti Karttunen

On Mon, Oct 19, 2009 at 11:06 PM, <seqfan-request at list.seqfan.eu> wrote:

>
>
> Message: 9
> Date: Mon, 19 Oct 2009 11:51:05 -0400
> From: Rick Shepherd <rlshepherd2 at gmail.com>
> Subject: [seqfan] Guided browsing of the OEIS based upon personal
> preferences?
> To: seqfan at list.seqfan.eu
> Message-ID:
>        <b949fe1a0910190851h3aefa599yc0318422d5401b72 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hello, SeqFans,
>
> Bear with me, I'm not really being off-topic (although this may be
> interesting to some people strictly for music-related reasons).
> This note will soon come back to the OEIS.  I'll also emphasize now that
> the OEIS already has a Browse feature:
> http://www.research.att.com/~njas/sequences/Sbrowse.html (although I'm
> having difficulty accessing the database at the moment)
>
> Today I ran across an article describing a system called Pandora for
> suggesting songs one may like based upon one's previous statements of
> preferred songs.  Actually, my son had already told me about Pandora at
> least twice -- but this is the first time I've seen a bit about the
> nuts-and-bolts of how these suggestions are made.  Most of us have probably
> experienced some apparently-similar system in some sphere where a product
> is suggested because "other buyers of this product also bought these other
> products" (e.g., Amazon and books/etc) or where the software (using
> cookies, etc.) is clearly attempting to learn what we like (i.e., this
> isn't new).
> Pandora, in contrast to some of these others, attempts to be more objective
> and makes suggestions based upon straightforward characterizations of
> technical elements of songs rather than "collaborative filtering"
> (popularity or what everyone else is doing) -- but, of course, matters of
> (other people's) taste and subjectivity cannot be completely eliminated
> (yet).
>
> If one replaced "Pandora's music collection" with "the OEIS" and "song"
> with "sequence" (and drew several similar parallels), this article could give
> some food for thought about future directions for the OEIS.  This article
> also touches upon what to add to the collection and how to decide that --
> topics that have been recently discussed on this list for the OEIS.
>
> If the OEIS currently "Contains 164537 sequences", *one day* it might
> actually be so large that it's difficult to find that which you're really
> seeking.  :^J)   Algorithms based upon sophisticated, dynamic saved
> searches (and more) could help direct those who aren't looking to "commit
> complete serendipity" on a given day (The latter I admit is often the mode
> I enjoy but certainly not always.).
>
> Here's the link, "The Song Decoders":
> http://www.nytimes.com/2009/10/18/magazine/18Pandora-t.html?pagewanted=1&em
> (published Oct. 14th, 2009)
> I've found (in the USA) that sometimes it's necessary to be logged-in to
> one's NY Times account (free registration) to access their articles -- and
> sometimes not -- even for the same articles (it seems partly to be based
> upon time of day).
>
> Regards,
> Rick
>
>
> -