[seqfan] Re: A new sequence counting the number of distinct languages recognized by DFAs

brad klee bradklee at proton.me
Thu Jul 25 01:05:46 CEST 2024


> Here is some material on converting DFAs to minimal regexes, but I haven't read it all yet:
> https://stackoverflow.com/questions/35112630/minimize-specific-regular-expression
> https://cs.stackexchange.com/questions/2016/how-to-convert-finite-automata-to-regular-expressions

Thanks, vertex removal is what I was using, but I'm not surprised to see it isn't
the best method for extracting regex.

To get things started, here's another idea we might not have tried yet:

a(n) counts valid binary regex of string length n.

a(1) = 2 {a, b}
a(2) = 6 {aa, ab, ba, bb, a*, b*}
a(3) = 22 { x+y (4) , xyz (8), xy* (4), x*y (4), (x) (2) }

And per my assumed conventions, the sequence continues:

a(n): 2, 6, 22, 80, 308, 1210, 4872, 19946 . . .

The next interesting question is how many of these are irreducible to smaller size?
For example, quite obviously:

((x)) → x

or

x*x*x → xx*

or

x+x → x

but also:

(a*b*)* → (a+b)*

or

(a*ba*)*+a* → (a+b)*

This sequence would make a nice partner with the one that Thomas has proposed, and
It should have some precedent in OEIS, and is vaguely related to Catalan #'s, I think.

Although for Thomas's proposal, we might want to consider entering the first differences.
That way we aren't double counting when we always find all languages from n-state DFAs
again in the languages of m-state DFAs when m>n.

Cheers,

--Brad


More information about the SeqFan mailing list