[seqfan] Re: Linked Open Data (LOD) and the OEIS

Sat Mar 17 21:26:56 CET 2012

>="Antti Karttunen" <antti.karttunen at gmail.com>
> Just a couple of questions:

Interesting questions.  I'll try to make some cogent comments, but with the
disclaimer that "I'm not a professional ontologist, I just play one on TV":

> Axxxxxx --is_a_composition_of--> Ayyyyyy, Azzzzzz
> (this latter meaning something like Axxxxxx(n) = Ayyyyyy(Azzzzzz(n)) )?

A key design discipline that makes a lot of this semantic stuff "work" is to
represent ALL the "knowledge" uniformly using nothing but triples of URIs.

These simple assertions can get tied together in various ways to form more
complex relationships, either statically or dynamically:

One simple way is to have multiple triples asserting things "about" Axxxxxx
coordinate together by convention.  For example, if Ax = Ay(Az):
  Axxxxxx  --isMappingFrom-->  Azzzzzz
  Axxxxxx   --isMappingVia-->  Ayyyyyy
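
Here is a rough sketch of that two-triple convention, assuming plain Python
tuples of strings stand in for triples of URIs.  The helper names `objects`
and `composition_of` are made up for illustration; they are not part of any
OEIS or RDF tooling.

```python
# Each triple is (subject, relation, object); plain strings stand in for URIs.
triples = {
    ("Axxxxxx", "isMappingFrom", "Azzzzzz"),
    ("Axxxxxx", "isMappingVia", "Ayyyyyy"),
}

def objects(store, subject, relation):
    """Return every o such that (subject, relation, o) is in the store."""
    return {o for (s, r, o) in store if s == subject and r == relation}

def composition_of(store, subject):
    """Read Ax = Ay(Az) back out of the two coordinated triples.

    Returns a set of (outer, inner) pairs, pairing every isMappingVia
    object with every isMappingFrom object of the same subject.
    """
    return {(via, frm)
            for via in objects(store, subject, "isMappingVia")
            for frm in objects(store, subject, "isMappingFrom")}

result = composition_of(triples, "Axxxxxx")
```

With only one composition recorded, `result` holds the single pair
(Ayyyyyy, Azzzzzz).  Nothing ties a particular "Via" to a particular "From",
so the pairing works only as long as each subject has one of each.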

However, this is too simple if we can also have Ax = Au(Av).  One solution
could be to introduce new entities, whose URIs are used as "subjects" in
some triples and "relations" in others:
  Au(.)    --isMappingVia-->  Auuuuuu
  Axxxxxx  --Au(.)-->         Avvvvvv

Note that even this simple kind of "knowledge schema" can support all sorts
of interesting reasoning patterns.  For example, if we've also asserted that
  Aw(.)  --isInverseMappingOf-->  Au(.)
then inferences like
  Avvvvvv  --Aw(.)-->  Axxxxxx
might be made.  These can just be ephemeral, generated during a query and
then discarded, or they can be "materialized" and inserted explicitly into
the knowledge base as well.
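
A minimal sketch of that inverse-mapping inference, again assuming tuples of
strings in place of URI triples.  The function name `infer_inverses` is
hypothetical, and the rule is applied in one direction only (it does not
assume that isInverseMappingOf is symmetric).

```python
triples = {
    ("Au(.)", "isMappingVia", "Auuuuuu"),
    ("Axxxxxx", "Au(.)", "Avvvvvv"),
    ("Aw(.)", "isInverseMappingOf", "Au(.)"),
}

def infer_inverses(store):
    """For each (inv isInverseMappingOf rel) and each (s rel o), derive (o inv s)."""
    inverse_pairs = {(inv, rel) for (inv, r, rel) in store
                     if r == "isInverseMappingOf"}
    return {(o, inv, s)
            for (inv, rel) in inverse_pairs
            for (s, r, o) in store
            if r == rel}

# Ephemeral: computed for one query and then thrown away.
inferred = infer_inverses(triples)

# Materialized: the inferred triples are stored alongside the asserted ones.
materialized = triples | inferred
```

Here `inferred` contains just the triple (Avvvvvv, Aw(.), Axxxxxx); the
original store is untouched unless one chooses to keep `materialized`.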

Another approach, "reification", is to have component URIs that themselves
refer to triples:
  Ayyyyyy  --isMappingAppliedIn-->
    ( Axxxxxx  --isMappingFrom-->  Azzzzzz )

> Another questions: How to record conjectural information?

Reification can also support "metaknowledge", with assertions like
  Chun-XianJiang  --conjecturesThat-->
    ( Axxxxxx  --finiKeywordHasValue-->  False )

Etc.  The knowledge needn't be limited to "first order" assertions.
There are all sorts of fun possibilities.
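
To make reification concrete, here is a small sketch that assumes a triple
can be modeled as a Python tuple and nested as the object of another tuple.
Real RDF stores handle reification differently; the helper
`statements_about` is invented for this example.

```python
# A reified triple is simply a triple used as the object of another triple.
# Tuples are hashable in Python, so they can be nested and stored in a set.
composition = ("Axxxxxx", "isMappingFrom", "Azzzzzz")
conjecture = ("Axxxxxx", "finiKeywordHasValue", "False")

store = {
    ("Ayyyyyy", "isMappingAppliedIn", composition),
    ("Chun-XianJiang", "conjecturesThat", conjecture),
}

def statements_about(store, inner):
    """Return the (subject, relation) pairs whose object is the triple `inner`."""
    return {(s, r) for (s, r, o) in store if o == inner}

about = statements_about(store, conjecture)

# The conjectured triple is only mentioned here, not asserted as a fact.
conjecture_is_asserted = conjecture in store
```

The point of the last line is that a conjecture can be recorded without the
store ever claiming the conjectured statement itself is true.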

> how is this piece of information to be replaced, when the conjecture
> is either refuted or proved?

Again, there are many options.  One could indeed just delete the old
assertion and insert a new one.

Another approach, again using reification, would be to add yet more
meta-assertions about the conjecture, its refutation, etc.:
  <XYZConjecture>   --wasRefutedIn-->    <DocumentQRS>

And of course this suggests all sorts of other interesting data:
  <AnttiKarttunen>  --isAuthorOf-->      <DocumentQRS>
  <DocumentQRS>     --wasPublishedOn-->  <2012.03.17>
  <DocumentQRS>     --fullText-->        <http://oesif.org/papers/...>
Etc.
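
As a sketch of how such meta-assertions could replace "delete and
re-insert", the status of a conjecture can be derived from whatever has been
said about it.  The relation name `wasProvedIn` and the function `status`
are assumptions made for this example; only `wasRefutedIn` appears above.

```python
store = {
    ("XYZConjecture", "wasRefutedIn", "DocumentQRS"),
    ("AnttiKarttunen", "isAuthorOf", "DocumentQRS"),
    ("DocumentQRS", "wasPublishedOn", "2012.03.17"),
}

def status(store, conjecture):
    """Derive a conjecture's current status from the meta-assertions about it."""
    relations = {r for (s, r, o) in store if s == conjecture}
    if "wasRefutedIn" in relations:
        return "refuted"
    if "wasProvedIn" in relations:  # hypothetical counterpart of wasRefutedIn
        return "proved"
    return "open"

xyz_status = status(store, "XYZConjecture")
```

Nothing is ever deleted in this scheme: the original conjecture stays on
record, and a later refutation or proof just adds one more triple.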

Many semantic applications also document the provenance of assertions, again
using similar modeling techniques.  For instance, scientific domains cite
publications, experiment reports, and raw data; legal domains include case
law, testimony, patent filings, etc.

A cool thing about the semantic web is that knowledge can be "federated"
from multiple sources.  For example all that stuff about <DocumentQRS> could
just as well be in some citation database run by, say, the AMS.  The OEIS
itself doesn't have to maintain it, just point at it.
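
A toy illustration of federation, assuming two independent in-memory sets
stand in for the OEIS and an external citation database.  The names
`oeis_source`, `citation_source`, `federated` and `refuters` are all
hypothetical; a real deployment would query remote endpoints instead.

```python
from itertools import chain

# Held by the OEIS in this sketch.
oeis_source = {
    ("XYZConjecture", "wasRefutedIn", "DocumentQRS"),
}

# Held by some other party, e.g. a citation database.
citation_source = {
    ("AnttiKarttunen", "isAuthorOf", "DocumentQRS"),
    ("DocumentQRS", "wasPublishedOn", "2012.03.17"),
}

def federated(*sources):
    """Iterate over the triples of several sources without merging them."""
    return chain.from_iterable(sources)

def refuters(*sources):
    """Find authors of documents in which some conjecture was refuted."""
    documents = {o for (s, r, o) in federated(*sources) if r == "wasRefutedIn"}
    return {s for (s, r, o) in federated(*sources)
            if r == "isAuthorOf" and o in documents}

who = refuters(oeis_source, citation_source)
```

Neither source alone can answer the question; the answer only appears when
the query ranges over both.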

I have even seen demos of "zero-data services" that generate useful new
medical information using only federation and reasoning.

> That is, how the system is specified/expected to manage knowledge
> whose veracity status might change in the future?

As you might infer from the above examples, managing knowledge is a rapidly
evolving art as well as a science.  Ontology and epistemology are becoming
practical trades nowadays.

> But, if done well, then there could be a possibility for
> automatic generation of code, starting from simple definitions
> with machine-readable formulae (e.g. some non-proprietary
> mathematical or declarative programming language/notation),
> together with an extensive web of such relations.

Exactly.  Moreover, backtracking reasoning systems typically generate
"provisional local theories" in response to queries.  These might or might
not pan out, and it may even make sense to materialize some of them and
assert them persistently as interesting new "discoveries".

With this technology it is relatively congenial to ask and answer new types
of questions, such as "What known sequences are related by chains of
compositions of mappings starting from the Fibonacci numbers?"  Or
explorations like "What are the biggest disjoint islands ('continents'?) of
related sequences that exist in the OEIS?"
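
Both kinds of question reduce to graph traversal over the triples.  The
sketch below assumes the `isMappingFrom` relation from earlier and uses
made-up placeholder entries (Ab, Ac, Ad, Ae) next to A000045, the Fibonacci
numbers; the function names are invented for illustration.

```python
from collections import defaultdict, deque

# Hypothetical data: only A000045 is a real A-number here.
triples = {
    ("Ab", "isMappingFrom", "A000045"),
    ("Ac", "isMappingFrom", "Ab"),
    ("Ae", "isMappingFrom", "Ad"),
}

def reachable_from(store, start, relation="isMappingFrom"):
    """Sequences linked to `start` by a chain of `relation` triples (BFS)."""
    derived = defaultdict(set)
    for s, r, o in store:
        if r == relation:
            derived[o].add(s)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in derived[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def islands(store):
    """Connected components of the triple graph, largest first."""
    neighbours = defaultdict(set)
    for s, r, o in store:
        neighbours[s].add(o)
        neighbours[o].add(s)
    unvisited, components = set(neighbours), []
    while unvisited:
        root = unvisited.pop()
        component, stack = {root}, [root]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        unvisited -= component
        components.append(component)
    return sorted(components, key=len, reverse=True)

from_fibonacci = reachable_from(triples, "A000045")
continents = islands(triples)
```

In a real triple store the same questions would more likely be posed as
path queries in the store's own query language rather than hand-written
traversals like these.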