patternMinor

Algorithms for minimizing Moore automata

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

automata cs finite-automata stackoverflow optimization reference-request transducers

minimizingautomataalgorithmsformoore

Problem

Brzozowski's algorithm can be extended to Moore automata but its time complexity is exponential in general. Is there any other algorithm for minimization of Moore automata? What are the running times of these algorithms if any?

Solution

The original DFA minimization algorithm was actually designed for Moore Machines, guided by their apparently more observable behavior.
But the algorithm presented here is a reconstruction from the DFA minimization,
since I discovered the historical evidence after the fact.

After Wikipedia (with some notational changes):

A Moore machine can be defined as a 6-tuple $(Q, q_0, \Sigma, \Pi, \delta, \gamma)$ consisting of the following:

a finite set of states $Q$

a start state (also called initial state) $q_0$ which is an element of $Q$

a finite set called the input alphabet $\Sigma$

a finite set called the output alphabet $\Lambda$

a transition function $\delta : Q \times \Sigma \rightarrow Q$ mapping a state and the input alphabet to the next state

an output function $\gamma : Q \rightarrow \Pi$ mapping each state to the output alphabet

From this definition, a Moore machine is a deterministic finite state
transducer.

I have no reference for minimization of Moore automata. However it
seems not too hard to imagine an algorithm, derived from the algorithm used for
deterministic finite state automata.

The idea in DFA minimization is based on the Myhill-Nerode
characterization of regular languages.

Given a language $L$, and a pair of strings $x$ and $y$, define a
distinguishing extension to be a string $z$ such that exactly one of
the two strings $xz$ and $yz$ belongs to $L$. Define a relation
$R_L$ on strings by the rule that $x R_L y$ iff there is no
distinguishing extension for $x$ and $y$. It is easy to show that
$R_L$ is an equivalence relation on strings, and thus it divides the
set of all strings into equivalence classes.

The Myhill-Nerode theorem states that $L$ is regular if and only if
$R_L$ has a finite number of equivalence classes, and moreover that
the number of states in the smallest deterministic finite automaton
(DFA) recognizing $L$ is equal to the number of equivalence classes
in $R_L$.

Indeed each state $q$ of the smallest DFA is such that $W_q$ as
defined above is one of the equivalence classes for the relation $R_L$.

For a non-minimal DFA for the regular language $L$,
it is easy to show that each set $W_q$ contains
strings that all belong to a same equivalent class with respect to
$R_L$.

Hence, the minimization of the DFA actually consists of merging states
(considered as sets of equivalent strings), whenever it is shown that
two distinct states contain equivalent strings.

Two reasonably fast algorithms exists for that purpose, Moore
algorithm (1956) which is in time $O(n^2)$ and Hopcroft's algorithm
(1971) in time $O(n\log n)$.

The extension to Moore automata is best understood in redefining the
equivalence relation as $R_T$ for a transducer $T$. The relation $R_L$
was concerned with whether future input would lead equivalently to an
accepting state. The $R_T$ equivalence relation of Moore automata is
concerned with whether future input will produce the same output.

Hence, given a transducer $T$, and two strings $x$ and $y$, we define a
distinguishing extension to be a string $z$ such that $T(xz)=T(x)u$
and $T(yz)=T(y)v$ with $u\neq v$, i.e. such that the output behaviour
of the transducer differs for $z$ depending on whther it is following
$x$ or $y$.

Again, $R_T$ is an equivalence relation, dividing all strings in
$\Sigma^*$ into equivalence classes. In the case of a Moore machine,
these classes will again correspond to state of the minimal
transducer.

The following algorithm mimics the Moore algorithm for DFA minimisation.

We define an initial partition $\mathcal P$ of $Q$ into classes of states $S_e$ as follow:

$\forall e\in\Pi:\; S_e=\{q\in Q\mid \gamma(q)=e\}$

Then we split the classes in $\mathcal P$ as follows:

repeat successively for each class of states $S$, until none
changes

$\ \ $ repeat $\forall a\in\Sigma,\;$

$\ \ \ \ $ If $ \exists S'\in \mathcal P,\; \forall q\in S,\;
\delta(q,a)\in S'$ then do nothing

$\ \ \ \ $ else split $S$ into subsets $S_i$ such that

$\ \ \ \ \ \ $ for each
subset $S_i$, there is a different class $S'\in \mathcal P$ such that $ \forall q\in S_i,\;
\delta(q,a)\in S'$

$\ \ \ \ \ \ $ (the subsets $S_i$ replace $S$ in $\mathcal P$)

When there is no class left that needs to be split, the remaining
classes of states will form the states of the minimal Moore
machine.

By construction, all states in a class have the same output which is
the output for the class.

Similarly, for any input $a\in\Sigma$, all states in a class go to some
state in the same other class, which defines the transition function
for the minimal Moore machine.

Complexity analysis:
Let $n=|Q|$ be the number of states, and $s=|\Sigma|$ the size of the input alphabet.

The main loop is executed at most $n$ times, since each iteration must
split at least one class of states, and each class contains at least
one state. Each iteration of the loop examines ea

Context

StackExchange Computer Science Q#44012, answer score: 6

Revisions (0)

No revisions yet.