High-level programming languages like Python make programming a breeze, but how do they work? There's a big gap between Python and machine instructions for modern computers. Learn how to translate Python programs all the way to Intel x86 assembly language with a pit stop at C along the way.
Most compiler courses teach one phase of the compiler at a time, such as parsing, semantic analysis, and register allocation. The problem with that approach is that it is difficult to see how the whole compiler fits together and why each phase is designed the way it is. Instead, each week we implement a compiler for a successively larger subset of the Python language. The very first subset is a tiny language of expressions, and by the time we are done the language includes objects, inheritance, and first-class functions.
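To give a feel for what compiling a tiny language of expressions can look like, here is a minimal sketch. It assumes, purely for illustration, that the first subset contains only integer literals, unary minus, and addition; the actual subset is defined in the first assignment. It uses the standard `ast` module from Python 3.8 or later, which may differ from the tools used in class, and the names `to_c` and `compile_expr` are hypothetical.

```python
import ast


def to_c(node):
    """Translate a tiny Python expression AST into a C expression string."""
    if isinstance(node, ast.Expression):
        # Top-level wrapper produced by ast.parse(..., mode="eval").
        return to_c(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        # Integer literal (bools are excluded on purpose).
        return str(node.value)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return "(-" + to_c(node.operand) + ")"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return "(" + to_c(node.left) + " + " + to_c(node.right) + ")"
    # Anything else is outside the assumed subset.
    raise NotImplementedError(type(node).__name__)


def compile_expr(source):
    """Parse one Python expression and return the equivalent C expression."""
    tree = ast.parse(source, mode="eval")
    return to_c(tree)


if __name__ == "__main__":
    print(compile_expr("1 + -2 + 3"))
```

Each later subset would add more cases to a translator of this shape, which is one way to picture the incremental approach described above.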
Distance Learning: This course is also offered through the CAETE program.
Time: MWF 11:00am-11:50am
Location: ECCS 1B14
Office hours: TBD
Textbook: None.
Recommended books:
- Modern Compiler Implementation in Java 2nd edition by Andrew W. Appel
- Python in a Nutshell by Alex Martelli
- The C Programming Language 2nd edition by Kernighan and Ritchie
Grading: homework 10%, weekly quizzes 30%, midterm exam 20%, final exam 40%.
Workload: up to 9 hours of out-of-class work per week.
Policies regarding honor code, religious observances, disabilities, and classroom behavior can be found here.
Homework and Exams for distance students (CAETE): Because there is a delay in when you can watch the videos of the class, all homeworks and exams are due one week after the posted date. The questions for each quiz will be emailed to you one week after the posted date of the quiz, and you must email the answers back to Jeremy by the next day. The quizzes are closed book and should be completed alone. Just use your brain, and only use the computer to type in your answers. For example, do not use the Python interpreter on your computer to help you answer the quiz.
Assignments:
(Due dates are not yet finalized and the assignments are subject to change before the Fall of 2008.)
- Integer expressions and ASTs [pdf], due 9/5.
- Parsing with PLY (Python Lex-Yacc) [pdf], due 9/12.
- Floats, bools, and polymorphism [pdf], due 9/19.
- Type analysis and specialization [pdf], due 9/24.
- Variables and control flow [pdf], both parts due 10/1.
- Lists, dictionaries, and heap allocation [pdf], due 10/8.
- Functions and closure conversion [pdf], due 10/17.
- Objects and classes [pdf], due 11/5.
- First steps toward x86 [pdf], due 11/12.
- Removing structured control flow and 3-operand statements [pdf], due 11/26.
- Function calling conventions and the stack [pdf], due 12/3.
- Register allocation [pdf], due 12/10.
Instructions for turning in assignments: Assignments are due by 6am on the due date and should be sent via email to Jeremy. Make sure that the subject line of your email starts with [ECEN4553]. Attach the assignment to the email as a tar or zip file. The tar/zip file must contain a directory whose name is the first and last name of one of the group members, separated by an underscore, followed by an underscore and the assignment number. For example, if one of the group members is Larry Johnson, then the directory name for the first assignment could be Larry_Johnson_1. Inside this directory, there should be:
- a Python file named 'compile.py' that implements your compiler,
- a subdirectory named 'test' that contains all of the Python test programs that you created to test your compiler,
- a file named 'group.txt' that lists the group members, one name per line,
- any other support files needed by your compiler, such as the code for PLY.
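The directory layout described above can be checked and packaged with a short script. The following is only a sketch, not an official course tool: the function names `submission_dirname`, `missing_entries`, and `pack` are made up for illustration, it produces a tar file rather than a zip file, and it only checks for the three required entries.

```python
import os
import tarfile


def submission_dirname(first, last, assignment_number):
    """Build a directory name such as Larry_Johnson_1."""
    return "%s_%s_%d" % (first, last, assignment_number)


def missing_entries(directory):
    """Return the required entries that are absent from a submission directory."""
    missing = []
    if not os.path.isfile(os.path.join(directory, "compile.py")):
        missing.append("compile.py")
    if not os.path.isfile(os.path.join(directory, "group.txt")):
        missing.append("group.txt")
    if not os.path.isdir(os.path.join(directory, "test")):
        missing.append("test/")
    return missing


def pack(directory):
    """Write <directory>.tar next to the directory and return its path."""
    directory = directory.rstrip(os.sep)
    problems = missing_entries(directory)
    if problems:
        raise ValueError("missing: " + ", ".join(problems))
    archive = directory + ".tar"
    with tarfile.open(archive, "w") as tar:
        # Store paths relative to the submission directory's own name.
        tar.add(directory, arcname=os.path.basename(directory))
    return archive


if __name__ == "__main__":
    # Hypothetical usage: package the first assignment for Larry Johnson.
    print(pack(submission_dirname("Larry", "Johnson", 1)))
```

Run from the parent of the submission directory, this would write Larry_Johnson_1.tar, which can then be attached to the email.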
Quizzes (given in class):
- 9/10: on material from assignment 1 (Integer expressions and ASTs).
- 9/17: on material from assignment 2 (Parsing).
- 9/24: on material from assignment 3 (Floats, bools, and polymorphism).
- 10/1: on material from assignment 4 (Type analysis).
- 10/8: on material from assignment 5 (Variables, SSA form, and iterative type analysis).
- 10/15: on material from assignment 6 (Lists, dictionaries, and heap allocation).
- Practice quiz for assignment 7 with solution. (Midterm will include material up through assignment 7.)
- 11/9: on material from assignment 8.
- 11/16: on material from assignment 9.
- 11/29: on material from assignment 10.
- 12/7: on material from assignment 11.
Exams:
- Midterm: 10/26 in class
- Final: 12/15, 1:30pm-4:00pm
Grades:
- Please send Jeremy a secret code word so that you can identify your grades.
- Assignments are graded on a 4 point scale (0-4).
- Quizzes are graded on a 10 point scale.
- Exams are graded on a 100 point scale.
Resources:
- Script to automate regression testing: run_tests.py
- The CAD lab for group work: ECEE 282.
- Class email list: intro-to-compilers-at-cu@googlegroups.com, Google group web page
- Examples from Python crash course.
- Test programs for P0: test_p0.tar.
- Test programs for P1: test_p1.tar.
- Test programs for P2: test_p2.tar.
- Test programs for P3: test_p3.tar.
- Test programs for P4: test_p4.tar.
- Test programs for P5: test_p5.tar.
- IA32 Assembly for Compiler Writers
- IA-32 Opcode Dictionary
- Intel Manuals
- Nice slides about x86 architecture and assembly language
- Introduction to Linux Intel Assembly Language
- Linux Assembly Hello World Tutorial
- GNU Assembler Manual
- Insight Debugger
- Debugging with DDD
- Hash table implementation in C.
- Paper: Garbage Collection in an Uncooperative Environment by Hans-J. Boehm and Mark Weiser [pdf]
- The GNU statement expression extension is helpful when you need to embed a sequence of statements in an expression.
- The Python Bytecode Interpreter: ceval.c, opcode.h
- Python AST definitions: ast.py
- interp.py