You are here
Homestar height
Primary tabs
star height
The star height of a regular expression $p$ over an alphabet $\Sigma$, denoted by $\operatorname{ht}(p)$, is defined recursively as follows:
1. $\operatorname{ht}(\varnothing)=\operatorname{ht}(a)=0$ for any $a\in\Sigma$;
2. $\operatorname{ht}(p\cup q)=\operatorname{ht}(pq)=\max(\operatorname{ht}(p),% \operatorname{ht}(q))$;
3. $\operatorname{ht}(p^{*})=\operatorname{ht}(p)+1$.
Let $\Sigma=\{a,b,c\}$. The following expressions have star height $0,1,2,3$:
$(a\cup b)c\qquad a^{*}b^{*}\qquad(a^{*}b)^{*}c\qquad((ab^{*}a\cup c)^{*}b)^{*}$ 
Since any regular expression $p$ describes a regular language $L(p)$, we may extend the definition of star height to regular languages. However, since a regular language may be described by more than one regular expressions, we need to be a little careful:
The star height of a regular language $L$ is the least integer $i$ such that $L$ may be described by a regular expression of star height $i$. In other words:
$\operatorname{ht}(L)=\min\{h(p)\mid L=L(p)\mbox{, }p\in R(\Sigma)\},$ 
where $R(\Sigma)$ is the set of all regular expressions over $\Sigma$.
Remarks.

Any regular language over a singleton alphabet has star height at most 1, which can be proved by the pumping lemma for regular languages.

If the alphabet $\Sigma$ consists of at least two letters, then for every positive integer $n$, there is a regular language whose star height is $n$. In fact, it can be shown that if $\Sigma=2$, then for every $n$, there is a regular language $L$ over $\Sigma$ such that $\operatorname{ht}(L)=n$.

It was an open question whether an algorithm exists for determining $\operatorname{ht}(L)$ for an arbitrary regular language $L$. In 2005, the question was resolved by D. Kirsten in the positive, and the algorithm is that of a nondeterministic finite automaton.

We may also define star height on generalized regular expressions. For a regular language $L$, define $\operatorname{ht}_{g}(L)=\min\{h(p)\mid L=L(p)\mbox{, }p\in R_{g}(\Sigma)\}$, where $R_{g}(\Sigma)$ is the set of all generalized regular expressions. It is clear that $\operatorname{ht}_{g}(L)\leq\operatorname{ht}(L)$. However, it is still an open question whether, for every integer $n$, there is a regular language $L$ with $\operatorname{ht}_{g}(L)=n$.
A starfree language has star height $0$ with respect to representations by generalized regular expressions, but may be positive with respect to representations by regular expressions, for example $L=\{ab\}^{*}$.
References
 1 A. Salomaa, Formal Languages, Academic Press, New York (1973).
 2 A. Salomaa, Jewels of Formal Language Theory, Computer Science Press (1981).
Mathematics Subject Classification
20M35 no label found68Q70 no label found Forums
 Planetary Bugs
 HS/Secondary
 University/Tertiary
 Graduate/Advanced
 Industry/Practice
 Research Topics
 LaTeX help
 Math Comptetitions
 Math History
 Math Humor
 PlanetMath Comments
 PlanetMath System Updates and News
 PlanetMath help
 PlanetMath.ORG
 Strategic Communications Development
 The Math Pub
 Testing messages (ignore)
 Other useful stuff
 Corrections