CMSC330

Grammars

Compilers/Interpreters

Grammars

ASTs

Compilers/Interpreters

Compiler: convert from language to machine code

Practically: convert from language to assembly

Abstracted: convert from one language to another

Three things we need for a translator

Lexer

Typically uses regular expressions (REs) to convert a text file into a list of tokens (word types)

Parser

Converts Tokens to an Abstract Syntax Tree using a Context Free Grammar (CFG)

Evaluator/Generator

Converts the AST to the target language, or evaluates it to a value
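
The three stages can be sketched end to end for a tiny language of integer additions such as `1 + 2 + 3`. This is a minimal illustration under stated assumptions, not the course's implementation: the token names, the nested-tuple AST, the grammar `E -> INT + E | INT`, and the functions `lex`, `parse`, and `evaluate` are all invented for the example.

```python
import re

# Hypothetical token specification: an integer or a plus sign,
# each optionally preceded by whitespace.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\+))")

def lex(text):
    """Lexer: use a regular expression to turn text into a list of tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        if m.group(1) is not None:
            tokens.append(("INT", int(m.group(1))))
        else:
            tokens.append(("PLUS", "+"))
        pos = m.end()
    return tokens

def parse(tokens):
    """Parser: build an AST following the CFG  E -> INT + E | INT."""
    if not tokens or tokens[0][0] != "INT":
        raise ValueError("expected an integer")
    left = ("Int", tokens[0][1])
    if len(tokens) == 1:
        return left
    if tokens[1][0] != "PLUS":
        raise ValueError("expected '+'")
    return ("Add", left, parse(tokens[2:]))

def evaluate(ast):
    """Evaluator: walk the AST and compute a value."""
    if ast[0] == "Int":
        return ast[1]
    return evaluate(ast[1]) + evaluate(ast[2])

tokens = lex("1 + 2 + 3")
ast = parse(tokens)
print(tokens)
print(ast)
print(evaluate(ast))  # 6
```

A generator would replace `evaluate` with a function that walks the same AST and emits target-language text instead of computing a value.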

Grammars

RE: describes a set of strings

CFGs: also describe a set of strings, but can describe sets that REs cannot (every regular language is context-free, but not the reverse)

Grammar: structure of a language

We already saw a CFG: the grammar for regular expressions

\(\begin{array}{rl} R \rightarrow & \emptyset\\ & \vert \epsilon\\ & \vert \sigma \\ & \vert R_1R_2\\ & \vert R_1\vert R_2\\ & \vert R_1^*\\ \end{array} \)

Things we need for a CFG

\(\Sigma\): Alphabet (known as terminals)
Non-Terminals: Denotes groups of terminals
Productions: Says what symbols can replace others
Start Symbol: where to start
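
One way to make the four components concrete is to write a grammar down as plain data. The sketch below encodes the binary-string grammar \(S \rightarrow 0S \vert 1S \vert \epsilon\) used in these slides; the dictionary layout, the key names, and the helper `is_well_formed` are assumptions made for illustration, not a standard representation.

```python
# A CFG as plain Python data (illustrative layout, not a standard format).
grammar = {
    "terminals": {"0", "1"},       # the alphabet, Sigma
    "nonterminals": {"S"},
    # Each non-terminal maps to its alternatives; [] stands for epsilon.
    "productions": {"S": [["0", "S"], ["1", "S"], []]},
    "start": "S",
}

def is_well_formed(g):
    """Check that the four components are consistent with each other."""
    symbols = g["terminals"] | g["nonterminals"]
    if g["terminals"] & g["nonterminals"]:
        return False               # a symbol cannot be both kinds
    if g["start"] not in g["nonterminals"]:
        return False               # the start symbol must be a non-terminal
    for lhs, alternatives in g["productions"].items():
        if lhs not in g["nonterminals"]:
            return False           # only non-terminals can be replaced
        for rhs in alternatives:
            if any(sym not in symbols for sym in rhs):
                return False       # every right-hand symbol must be known
    return True

print(is_well_formed(grammar))  # True
```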

Consider the simple grammar for /[01]*/

\(S \rightarrow 0S \vert 1S \vert \epsilon\)

If I want to generate the string "001"

\(S \Rightarrow 0S\)

\(0S \Rightarrow 00S\)

\(00S \Rightarrow 001S\)

\(001S \Rightarrow 001\)

This process is called deriving
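
The derivation steps can be reproduced mechanically. The following sketch, which assumes the grammar \(S \rightarrow 0S \vert 1S \vert \epsilon\) and uses a hypothetical helper named `derive`, lists each intermediate form on the way to a target string.

```python
def derive(target):
    """Return the forms of a derivation of target from S,
    using the productions S -> 0S | 1S | epsilon."""
    if any(ch not in "01" for ch in target):
        raise ValueError("target must contain only 0 and 1")
    forms = ["S"]
    prefix = ""
    for ch in target:
        prefix += ch                 # apply S -> 0S or S -> 1S
        forms.append(prefix + "S")
    forms.append(prefix)             # finish with S -> epsilon
    return forms

print(" => ".join(derive("001")))
# S => 0S => 00S => 001S => 001
```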

Designing Grammars

Recursion allows for repetition

\(S \rightarrow 0S\vert \epsilon\)
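
A recognizer for this grammar can mirror the production directly: each recursive call corresponds to one use of the rule that puts a 0 in front of S, and the base case corresponds to \(\epsilon\). The function name `matches_zeros` is an assumption for this sketch.

```python
def matches_zeros(s):
    """Recognize the language of S -> 0S | epsilon (any number of 0s)."""
    if s == "":
        return True                   # S -> epsilon
    if s[0] == "0":
        return matches_zeros(s[1:])   # S -> 0S, recurse on the rest
    return False

print(matches_zeros("000"))  # True
print(matches_zeros("010"))  # False
```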

Designing Grammars

Separate Non-terminals can be used for concatenation

\(S \rightarrow AB\)
\(A \rightarrow hello\)
\(B \rightarrow world\)
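
As a rough sketch of how concatenation plays out in code, each non-terminal can become its own function, and S simply runs A and then B from where A stopped. The function names and the convention of returning the next index (or `None` on failure) are assumptions for the example.

```python
def match_A(s, i):
    """A -> hello: return the index just past 'hello' at position i, else None."""
    return i + len("hello") if s.startswith("hello", i) else None

def match_B(s, i):
    """B -> world: return the index just past 'world' at position i, else None."""
    return i + len("world") if s.startswith("world", i) else None

def match_S(s):
    """S -> AB: match A, then B, and require the whole string to be consumed."""
    i = match_A(s, 0)
    if i is None:
        return False
    j = match_B(s, i)
    return j == len(s)

print(match_S("helloworld"))   # True
print(match_S("hello world"))  # False
```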

Designing Grammars

Unlike regex, we can refer backwards

Start at middle of string and build outward

Notation: \(a^nb^n\)

\(S \rightarrow aSb\vert \epsilon\)
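
The "build outward from the middle" idea shows up clearly in a recursive recognizer: strip one `a` from the front and one `b` from the back, then check what remains. This is a minimal sketch with an assumed function name, `matches_anbn`.

```python
def matches_anbn(s):
    """Recognize the language of S -> aSb | epsilon, i.e. n a's then n b's."""
    if s == "":
        return True                       # S -> epsilon (the middle)
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b":
        return matches_anbn(s[1:-1])      # S -> aSb, peel one matching pair
    return False

print(matches_anbn("aabb"))  # True
print(matches_anbn("aab"))   # False
```

Each `a` is paired with exactly one `b`, which is the counting that a regular expression cannot do for arbitrary n.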

Designing Grammars

Each path in a union can be a Non-terminal

(c|k)l(i|y)(ff|ph)

\(S \rightarrow AlBC\)
\(A \rightarrow c\vert k\)
\(B \rightarrow i\vert y\)
\(C \rightarrow ff\vert ph\)
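
Because this grammar has no recursion, its language is finite and can be enumerated and checked against the original regular expression. The sketch below assumes a dictionary encoding of the productions and a helper named `expand`; both are illustrative, and `expand` would not terminate on a recursive grammar.

```python
import re
from itertools import product

# Productions for S -> AlBC, A -> c|k, B -> i|y, C -> ff|ph.
# Any symbol that is not a key is treated as a terminal.
productions = {
    "S": [["A", "l", "B", "C"]],
    "A": [["c"], ["k"]],
    "B": [["i"], ["y"]],
    "C": [["ff"], ["ph"]],
}

def expand(symbol):
    """Return every terminal string derivable from symbol
    (only safe for finite, non-recursive grammars)."""
    if symbol not in productions:
        return [symbol]
    results = []
    for rhs in productions[symbol]:
        parts = [expand(sym) for sym in rhs]
        # Concatenate one choice from each position on the right-hand side.
        results.extend("".join(combo) for combo in product(*parts))
    return results

language = sorted(expand("S"))
print(language)

# Every generated string should also match the regular expression.
pattern = re.compile(r"(c|k)l(i|y)(ff|ph)")
print(all(pattern.fullmatch(w) for w in language))  # True
```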