LL(1) Parsing
Explanation of the name
The L’s in LL stand for
- Left to right scanning
- Leftmost derivation
LL(1) is a type of predictive parsing meaning it never backtracks. How does it accomplish this?
- Restricted grammar
- Lookahead
These parsers only allow grammars that are LL(k) and they theoretically have k tokens of lookahead
$A \rightarrow \alpha| \beta$, we want to be able to choose between $\alpha$ and $\beta$ by just looking at the next input character
LL(1) is a top-down parser meaning it starts with the start symbol and tries to expand to the given string
It reads the input from Left to right and produces the left most derivation
So it builds the tree in the same order as the input i.e the node that gets expanded to become a leaf will correspond to the first token and the next node will correspond to the next token
LL(1) uses a table from $N \times T$
T[A, t]
means that we are currently at a A
node in the parse tree and the next input is a t
. Suppose this value is $\alpha$. That means that either
- There is a derivation from $\alpha$ of the form $t\beta$ where $\beta \in (N \cup T)^*$ i.e the derivation from $\alpha$ as $t$ as it first symbol
- Or if $\alpha \Rightarrow^* \varepsilon$ and $S \Rightarrow^* \beta A t \delta$
Or in layman’s terms We choose to expand $A$ to $\alpha$ on seeing that the next input is $t$ if
- $\alpha$ expands to something which has $t$ as its first symbol
- $\alpha$ expands to $\varepsilon$ and there is a derivation from $S$ such that $t$ comes after $A$. So when $A$ is expanded to $\alpha$ and $\alpha$ becomes $\varepsilon$ we still have $t$ at the start. This in effect is equivalent to $\alpha$ expanding to $t$.
We need a way to formalize these two things
Let $FIRST(\alpha)$ be defined as follows: $x \in FIRST(\alpha) \iff \alpha \Rightarrow^* t\beta$ where $\beta \in (N \cup T)^$ $FIRST(X) = {t | X \Rightarrow^ t\alpha} \cup {\varepsilon | X \rightarrow^* \varepsilon}; \alpha \in (N \cup T)^*$
[!note] We don’t usually include $\varepsilon$ as terminal. But we need to store if a NT goes to $\varepsilon$
This takes care of the first point - Can this NT expand to the input at hand at some point?
Let $FOLLOW(\alpha)$ be defined as follows: $x \in FOLLOW(\alpha) \iff \varepsilon \in FIRST(\alpha) S \Rightarrow^* \beta \alpha t \delta$ or $t \in FIRST(\beta)$ for any $\beta$ that appears after $\alpha$ in any production. There exists a production $A \rightarrow B\alpha\beta\gamma$ and $t \in FIRST(\beta)$; $B, \gamma \in (N \cup T)^*$