The aim of this module is to introduce the students to formal languages and compilers.
This module is taught during the Autumn Term. There are two hours of lectures and two hours of laboratory sessions. The lab sessions start in the second week of term (Week 3). The schedule of topics given here is provisional.
Two one-hour lectures per week. Monday 1pm (5A.118) and 3pm (5A.330)
One two-hour lab per week\newline Tuesday 2pm–4pm (CSEE Lab 3) starting in Week 3.
| Week | Provisional Topic |
|---|---|
| 2 | Background. Tokenisation. Regular Expressions |
| 3 | Finite State Machines (NFA and DFA). Translations. |
| 4 | Language Theory. Context Free Grammars. BNF. |
| 5 | Parsing (top-down). Ambiguity. Parsing (bottom-up). |
| 6 | Concrete v Abstract Syntax. Static Semantic Analysis. Types. Symbol Tables. |
| 7 | Scoping. Hierarchical symbol tables. Small Compiler assignment. |
| 8 | Types. Type checking. Inference with types. |
| 9 | Code generation.Intermediate languages. Stack machines. Three-address code. |
| 10 | Local and global optimisations. Liveness analysis. |
| 11 | Runtime organisation. Activation Records. Calling conventions. Heap. Recursion. |
| Week | Provisional Topic |
|---|---|
| 2 | — |
| 3 | Start work on Expression Analyser and Forth familiarisation |
| 4 | Produce regex and CFG rules. Create token streams. Think about error handling. |
| 5 | Representation of grammar rules and implementation of parsing algorithms |
| 6 | Demonstrate prototype expression analyser |
| 7 | Demonstrate final analyser prototypes. Work towards Small Compiler. |
| 8 | Continue work on small compiler, and understanding of target language. |
| 9 | You should aim to have assignemnt statements working in the assignment. |
| 10 | Demonstrate prototype small compiler |
| 11 | Demonstrate final small compiler |
You should start on building the Expression Analyser for Assignment One. You may wish to refamiliarise yourself with the second CE204 assignment and the initial program code you were given in CE204 and as well as your submitted solution. You should also have a go at running the gForth interpreter, which is the target of both the expression analyser and the compiler project. (The CE305 notes on the gForth interpreter contain some simple exercises to try).
Things to try:
Note the potential ambiquity for some tokens, such as - and +.
Both of these can occur as binary operators (23+32, 32-23), and
as (unary) signs, where they are normally considered to be part of
the number (e.g. -23 and +32). They may also prefix an
expression (as in -(2+3), which could be intepreted as an
abbreviation of 0-(2+3)).
There are further ambiguities to consider with floating point
numbers. Compare 1e1 + -23e-32 with 1e1 + -23e -32, and with
1e1 + - 23e-32, for example. Note that some of these are
syntactically or semantically ill-formed, but it is not up to the
tokeniser to deal with such matters.
Note: You may wish to try to simplify the development of your assignment using a "compiler-compiler" tool, such as ANTLR. Now might be a good time to start investigating such tools.
Continue work on the first assignment.
You should continue work on the Small Compiler for the second assignment. You might want to experiment with variables, conditionals and loops in Forth, as discussed in the lectures for week 21.
Note that gForth requires that control statements only occur within "compiled" Forth, which means they must occur within the context of a definition.
<Forth global variable declarations go here> : program <Body of Forth program code goes here> ; program
If you are stuck, try to make sure your tokeniser and parser are working with assignment statements. If you cannot get declarations working, then you can test your programs by declaring some default variables automatically in the compiler's output, as shown below.
variable a variable b variable c : program <output of compiler goes here> ; program
Once these are working, you can start on if-then
conditionals. While-loops should then be fairly easy. If you are
stuck on if-then-else then look at the notes on the "dangling
else" problem. Then you can go back and sort out variable
declaration and "type-checking".
These should be read in conjunction with your own notes of the lectures.
Split: 40 percent Coursework Mark and 60 percent Exam Mark.
Forty percent of the mark for the module is based on coursework.
Details to be confirmed
| Item | Title | Set | Due | Weight | Deadline | Lab Demonstration |
|---|---|---|---|---|---|---|
| 1 | Expression Analyser | w4 | w6 | 15% | Wednesday, 9th November 2011 | Prototype: w6, Final: w7 |
| 2 | Small Compiler | w8 | w11 | 25% | Wednesday, 14th December 2011 | Prototype: w11 |
Sixty percent of the mark for the module is awarded through a two-hour written examination which is sat during Summer Examination period.
The following is provisional!
Date: Autumn Term 2011/12
HTML generated by org-mode 6.21b in emacs 23