You are here
HomeLR(k)
Primary tabs
LR(k)
One way is to look for any subword $v$ of $u$ such that there is a production $A\to v$. If this is successful, we may replace $v$ with $A$ in $u$ to obtain a word $w$, so that $w\Rightarrow u$. We may then repeat the process on $w$ to obtain another word $x$ such that $x\Rightarrow w$ (if successful). In the end, if everything works successfully, we arrive at the starting nonterminal symbol $\sigma$, and get a derivation $\sigma\Rightarrow^{*}u$ as a result, so that $u\in L(G)$. This procedure is known as the bottomup parsing of the word $u$.
In general, unless one is very lucky, successfully finding a derivation $\sigma\Rightarrow^{*}u$ requires many trials and errors, since at each stage, for a given word $u$, there may be several words $w$ such that $w\Rightarrow u$.
Nevertheless, there is a particular family of contextfree grammars, called the $LR(k)$ grammars, which make the bottomup parsing described above straightforward in the sense that, given a word $u$, once a word $w$ is found such that $w\Rightarrow_{R}u$, any other word $w^{{\prime}}$ such that $w^{{\prime}}\Rightarrow_{R}u$ forces $w^{{\prime}}=w$. Here, $\Rightarrow_{R}$ is known as the rightmost derivation (meaning that $u$ is obtained from $w$ by replacing the rightmost nonterminal in $w$). The $L$ in $LR(k)$ means scanning the symbols of $u$ from left to right, $R$ stands for finding a rightmost derivation for $u$, and $k$ means having the allowance to look at up to $k$ symbols ahead while scanning.
The details are as follows:
Definition. Let $G=(\Sigma,N,P,\sigma)$ be a contextfree grammar such that $\sigma\to\sigma$ is not a production of $G$, and $k\geq 0$ an integer. Suppose $U$ is any sentential form over $\Sigma$ with the following setup: $U=U_{1}U_{2}U_{3}$ where

$U_{3}$ is a terminal word,

$X\to U_{2}$ a production, and

$\sigma\Rightarrow_{R}^{*}U_{1}XU_{3}\Rightarrow_{R}U$.
Let $n=U_{1}U_{2}+k$, and $Z$ the prefix of $U$ of length $n$ (if $U<n$, then set $Z=U$).
Then $G$ is said to be $LR(k)$ if $W$ is another sentential form having $Z$ as a prefix, with the following setup: $W=W_{1}W_{2}W_{3}$, where

$W_{3}$ is a terminal,

$Y\to W_{2}$ is a production, and

$\sigma\Rightarrow_{R}^{*}W_{1}YW_{3}\Rightarrow_{R}W$,
implies that
$W_{1}=U_{1},\qquad Y=X,\qquad\mbox{and}\qquad W_{2}=U_{2}.$ 
Simply put, if $D_{U}$ and $D_{W}$ are the rightmost derivations of $U$ and $W$ respectively, and if the prefix of $U$ obtained by including $k$ symbols beyond the last replacement in $D_{U}$ is also a prefix of $W$, then the prefix of $U^{{\prime}}$ obtained by including $k$ symbols beyond the last replacement in $D_{U}$ is also a prefix of $W^{{\prime}}$, where $U^{{\prime}}$ and $W^{{\prime}}$ are words at the next to the last step in $D_{U}$ and $D_{W}$ respectively. In particular, if $U=W$, then $U^{{\prime}}=W^{{\prime}}$. This implies that any derivable in an $LR(k)$ grammar has a unique rightmost derivation, hence
Proposition 1.
Any $LR(k)$ grammar is unambiguous.
Examples.

Let $G$ be the grammar consisting of one nonterminal symbol $\sigma$ (which is also the final nonterminal symbol), two terminal symbols $a,b$, with productions
$\sigma\to a\sigma b,\qquad\sigma\to\sigma b\qquad\mbox{and}\qquad\sigma\to b.$ Then $G$ is not $LR(k)$ for any $k\geq 0$. For instance, look at the following two derivations of $U=a^{2}\sigma b^{3}$:
$\sigma\Rightarrow^{*}a\sigma b^{2}\Rightarrow a^{2}\sigma b^{3}\qquad\mbox{and% }\qquad\sigma\Rightarrow^{*}a^{2}\sigma b^{2}\Rightarrow a^{2}\sigma b^{3}$ Here, $U_{1}=a$, $U_{2}=\sigma b$. Let $k=1$. Then the criteria in the definition are satisfied. Yet, $W_{1}=a^{2}\neq U_{1}$. Therefore, $G$ is not $LR(1)$.

Note that the grammar $G$ above generates the language $L=\{a^{m}b^{n}\mid n>m\}$, which can also be generated by the grammar with three nonterminal symbols $\sigma,X,Y$, with $\sigma$ the final nonterminal symbol, where the productions are given by
$\sigma\to XY,\quad X\to aXb,\quad X\to\lambda,\quad Y\to Yb,\quad\mbox{and}% \quad Y\to b.$ However, this grammar is $LR(1)$.
Determining whether a contextfree grammar is $LR(k)$ is a nontrivial problem. Nevertheless, an algorithm exists for determining, given a contextfree grammar $G$ and a nonnegative integer $k$, whether $G$ is $LR(k)$. On the other hand, without specifying $k$ in advance, no algorithms exist that determine if $G$ is $LR(k)$ for some $k$.
Definition. A language is said to be $LR(k)$ if it can be generated by an $LR(k)$ grammar.
Theorem 1.
Every $LR(k)$ language is deterministic contextfree. Every deterministic contextfree language is $LR(1)$.
Hence, deterministic contextfree languages are the same as $LR(1)$ languages.
References
 1 A. Salomaa, Formal Languages, Academic Press, New York (1973).
 2 J.E. Hopcroft, J.D. Ullman, Formal Languages and Their Relation to Automata, AddisonWesley, (1969).
Mathematics Subject Classification
03D10 no label found68Q42 no label found68Q05 no label found Forums
 Planetary Bugs
 HS/Secondary
 University/Tertiary
 Graduate/Advanced
 Industry/Practice
 Research Topics
 LaTeX help
 Math Comptetitions
 Math History
 Math Humor
 PlanetMath Comments
 PlanetMath System Updates and News
 PlanetMath help
 PlanetMath.ORG
 Strategic Communications Development
 The Math Pub
 Testing messages (ignore)
 Other useful stuff
 Corrections