You are here
Homederivation language
Primary tabs
derivation language
Background
Let $G=(\Sigma,N,P,\sigma)$ be a formal grammar. A pair $(W_{1},W_{2})$ of words over $\Sigma$ is said to correspond to the production $p\to q$ if
$W_{1}=u_{1}pv_{1}\qquad\mbox{and}\qquad W_{2}=u_{2}qv_{2}$ 
for some words $u_{i},v_{j}$ over $\Sigma$. We also say that $(W_{1},W_{2})$ is a derivation step, and write $W_{1}\to W_{2}$.
Recall that a derivation in $G$ is a finite sequence of words
$W_{1},W_{2},\ldots,W_{n}$ 
over $\Sigma$ such that $W_{i}\Rightarrow W_{{i+1}}$, for $i=1,\ldots,n1$. The derivation is also written
$W_{1}\Rightarrow W_{2}\Rightarrow\cdots\Rightarrow W_{n}.$ 
The derivation above has $n1$ steps. Zerostep derivations are also permitted. These are just words over $\Sigma$.
The reflexive transitive closure of $\Rightarrow$ is $\Rightarrow^{*}$. Thus, $V\Rightarrow^{*}W$ means that there is a derivation starting with $V$ and ending with $W$. There may be more than one derivation from $V$ to $W$.
If we consider each production as a “letter” in the alphabet $P$, then the above derivation can be represented by a “word” over $P$ in the following manner: the “word” is formed by taking concatenations of the “letters”, where concatenations correspond to successive applications of productions in $P$:
$[p_{1}\to q_{1}][p_{2}\to q_{2}]\cdots[p_{{n1}}\to q_{{n1}}].$ 
Derivation language is thus a certain collection of derivation words, formally defined below.
Definitions
Treating productions as “letters” is really nothing more than labeling each production with some symbol. Formally, call a labeling of an alphabet $P$ a surjection $f:F\to P$, where $F$ is some alphabet. For any $p\in P$, a label for $p$ is an element $x\in F$ such that $f(x)=p$. We will only be interested in injective labeling (hence a bijection) from now on.
Definition. Suppose we are given a labeling $f$ of the set $P$ of productions in the grammar $G$. Given a derivation $D:W_{1}\Rightarrow W_{2}\Rightarrow\ldots\Rightarrow W_{n}$, a derivation word $U$ for $D$ is defined inductively as follows:
1. if $n=1$, then the empty word $U:=\lambda$,
2. if $n=2$, then $U:=x\in F$ is a label for a production that $W_{1}\Rightarrow W_{2}$ corresponds to,
3. if $n>2$, then $U:=U_{1}U_{2}$, where

$U_{1}$ is a derivation word for the derivation $W_{1}\Rightarrow\cdots\Rightarrow W_{i}$,

$U_{2}$ is a derivation word for the derivation $W_{i}\Rightarrow\cdots\Rightarrow W_{n}$.

If $U$ is a derivation word for derivation $D$, let us abbreviate this by writing $f[U]=D$. Note that we are not applying the labeling $f$ to $U$, it is merely a notational convenience.
A derivation word is sometimes called a control word, for it defines or controls whether and how a word may be derived from another word. Note that any $W_{1}\Rightarrow^{*}W_{2}$ may correspond to several distinct derivations, and hence several distinct derivation words. Also, the same derivation word may correspond to distinct derivations.
For example, let $G$ be a grammar over two symbols $($ and $)$ with productions $\sigma\to\lambda$, $\sigma\to(\sigma)$, and $\sigma\to\sigma\sigma$ ($G$ generates the parenthesis language $\boldsymbol{\operatorname{Paren}_{1}}$) Label the productions as $a,b,c$ respectively. Then the derivation $\sigma\Rightarrow^{*}(()())$ correspond to derivation words $bcbbaa$ and $bcbaba$. Notice that $\sigma\sigma\Rightarrow(\sigma)\sigma$ and $\sigma\sigma\Rightarrow\sigma(\sigma)$ both correspond to the derivation word $b$.
Definition. The derivation language of a grammar $G=(\Sigma,N,P,\sigma)$ is the set of all derivation words for derivations on words generated by $G$. In other words, consider the labeling $f:F\to P$. The derivation language of $G$ is the set
$\{U\in F^{*}\mid f[U]\mbox{ is a derivation of the form }\sigma\Rightarrow^{*}% u\mbox{ for some }u\in N^{*}\}.$ 
The derivation language of $G$ is also called the Szilard language of $G$, named after its inventor, and is denoted by $\operatorname{Sz}(G)$.
For example, let $G$ be the grammar over a the letter $a$, with productions given by $\sigma\to\sigma$, $\sigma\to a$. If the productions are labeled $b,c$, then $\operatorname{Sz}(G)=\{b^{n}c\mid n\geq 0\}$. Note that $L(G)=\{a\}$. Likewise, if $G^{{\prime}}$ is the grammar over $a$, with productions $\sigma\to A\sigma$, $A\to\lambda$, and $\sigma\to a$, labeled $p,q,r$ respectively, then $L(G^{{\prime}})=\{a\}$. However, $\operatorname{Sz}(G^{{\prime}})$ is quite different from $\operatorname{Sz}(G)$:
$\displaystyle\operatorname{Sz}(G^{{\prime}})$  $\displaystyle=\{u\in F^{*}\mid$  $\displaystyle u=vrw\mbox{, where }v\in\{p,q\}^{*},w\in\{q\}^{*},$  
$\displaystyle\#_{u}(p)=\#_{u}(q)\mbox{ and }\#_{x}(p)\geq\#_{x}(q)\mbox{ for % all }x\leq u\}$ 
where

$F=\{p,q,r\}$,

$\#_{u}(r)$ means the number of occurrences of $r$ in word $u$,

$v\leq u$ means that $v$ is a prefix of $u$.
Remarks. Let $G$ be a formal grammar.

Some properties of $\operatorname{Sz}(G)$:
(a) $\operatorname{Sz}(G)$ is always contextsensitive.
(b) If $G$ is regular, so is $\operatorname{Sz}(G)$.
(c) if $G$ is contextfree, $\operatorname{Sz}(G)$ may not be; in fact, for any contextfree language $L$, there is a contextfree grammar $G$ such that $L=L(G)$ but $\operatorname{Sz}(G)$ is not contextfree.
(d) There exists a contextfree language $L$ such that $\operatorname{Sz}(G)$ is not contextfree for any grammar $G$ generating $L$.

However, if one considers the subset $\operatorname{Sz}_{L}(G)$ of $\operatorname{Sz}(G)$ consisting of all derivation words corresponding to leftmost derivations, then $\operatorname{Sz}_{L}(G)$ is contextfree.

It is an open question that, given any (contextsensitive) language $L$, whether there is a grammar $G$ such that $L=\operatorname{Sz}(G)$.
References
 1 A. Salomaa, Formal Languages, Academic Press, New York (1973).
 2 G. E. Révész, Introduction to Formal Languages, Dover Publications (1991).
Mathematics Subject Classification
68Q42 no label found68Q45 no label found Forums
 Planetary Bugs
 HS/Secondary
 University/Tertiary
 Graduate/Advanced
 Industry/Practice
 Research Topics
 LaTeX help
 Math Comptetitions
 Math History
 Math Humor
 PlanetMath Comments
 PlanetMath System Updates and News
 PlanetMath help
 PlanetMath.ORG
 Strategic Communications Development
 The Math Pub
 Testing messages (ignore)
 Other useful stuff
 Corrections