Questions and CORRECT Answers
lex - Lexical analysis - CORRECT ANSWER - converts a character stream into a token stream,
removing whitespace & comments. Write regular expression descriptions of the tokens, create a DFA
and turn it into code. Generally needs a one-character lookahead buffer to find the end of a token,
and may also need a priority rule where more than one pattern matches eg keywords vs. variable
names. RE/FA can't match nested brackets; context-free grammars are required for this
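The lexing steps above can be sketched with Python's `re` module. This is a minimal illustration, not a hand-built DFA: the token classes, the keyword set and the `lex` function are all assumptions made for the example. Python's `re` tries alternatives in order rather than picking the longest match, so the sketch relies on pattern order and resolves the keyword vs. identifier clash by reclassifying identifiers found in `KEYWORDS`.

```python
import re

# Hypothetical keyword set and token classes for a toy language.
KEYWORDS = {"if", "else", "while"}

TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),      # listed before OP so "//" is not read as two "/"
    ("WS",      r"\s+"),
    ("NUM",     r"\d+"),
    ("IDENT",   r"[A-Za-z_]\w*"),  # greedy, so "iffy" is one identifier
    ("OP",      r"[-+*/=()<>]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text):
    """Convert a character stream into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("WS", "COMMENT"):
            continue               # whitespace and comments are dropped
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"       # priority rule: keywords win over identifiers
        tokens.append((kind, lexeme))
    return tokens
```

For example, `lex("if x1 = 42 // note")` yields a keyword, an identifier, an operator and a number, with the comment removed.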
syn - Syntax analysis - CORRECT ANSWER - converts a token stream into a parse tree eg
(abstract) syntax tree
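A recursive-descent parser is one common way to turn a token stream into a tree. The sketch below is illustrative only: the grammar, the `(kind, lexeme)` token shape and the nested-tuple tree format are assumptions, not something fixed by these notes. It also shows nested brackets being handled by recursion, which a regular expression cannot do.

```python
def parse(tokens):
    """Parse (kind, lexeme) tokens into a nested-tuple syntax tree.

    Assumed grammar:
        expr   -> term ('+' term)*
        term   -> factor ('*' factor)*
        factor -> NUM | IDENT | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, lexeme = advance()
        if kind == "NUM":
            return ("num", int(lexeme))
        if kind == "IDENT":
            return ("var", lexeme)
        if lexeme == "(":
            node = expr()              # recursion handles nested brackets
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")

    def term():
        node = factor()
        while peek()[1] == "*":
            advance()
            node = ("mul", node, factor())
        return node

    def expr():
        node = term()
        while peek()[1] == "+":
            advance()
            node = ("add", node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("trailing input after expression")
    return tree
```

Because `term` is called from `expr`, multiplication binds tighter than addition in the resulting tree.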
trans - Translation (linearization) - CORRECT ANSWER - converts a tree into simple
(linear) intermediate code eg JVM bytecode; deals with scope & allocation of variables,
determining the types of expressions, and selection of overloaded operators
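Linearization can be pictured as a post-order walk of the tree that emits stack-machine instructions. The following is a rough sketch under stated assumptions: the nested-tuple tree format and the instruction names (`push`, `load`, `add`, `mul`) are hypothetical and only loosely resemble real JVM bytecode. Scoping, type checking and overload selection are left out.

```python
def linearize(node, code=None):
    """Flatten a nested-tuple expression tree into a list of stack instructions."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "num":
        code.append(("push", node[1]))     # push a constant
    elif kind == "var":
        code.append(("load", node[1]))     # push the value of a variable
    elif kind in ("add", "mul"):
        linearize(node[1], code)           # operands first (post-order) ...
        linearize(node[2], code)
        code.append((kind,))               # ... then the operator
    else:
        raise ValueError(f"unknown node kind {kind!r}")
    return code


tree = ("add", ("num", 1), ("mul", ("var", "x"), ("num", 3)))
for instruction in linearize(tree):
    print(*instruction)
```

The operator is emitted after both operands, so a stack machine finds its arguments already on the stack when it executes the instruction.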
cg - Target Code Generation - CORRECT ANSWER - translates intermediate code into
target machine code eg assembly
Assembler - CORRECT ANSWER - converts text instructions into binary instructions eg .s
to .o on Linux or .asm to .obj on Windows, substitutes addresses for labels
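The "substitutes addresses for labels" step is often done in two passes, which lets an instruction refer to a label defined later in the file. This sketch makes several assumptions: the mnemonics are invented, every instruction is taken to occupy 4 bytes, and the output stays textual rather than being encoded into binary.

```python
def assemble(lines):
    """Two-pass label resolution: collect label addresses, then substitute them."""
    labels = {}
    instructions = []

    # Pass 1: record the address of each label and collect the instructions.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions) * 4   # assumed 4-byte instructions
        else:
            instructions.append(line.split())

    # Pass 2: replace every operand that names a label with its address.
    return [[str(labels[part]) if part in labels else part for part in parts]
            for parts in instructions]


program = ["start:", "addi r1 r1 1", "jump end", "jump start", "end:", "halt"]
for parts in assemble(program):
    print(" ".join(parts))
```

The forward reference `jump end` resolves correctly because all labels are known before any substitution happens.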
Disassembler - CORRECT ANSWER - converts an object file back into assembler-level form
Multi-pass compiler - CORRECT ANSWER - n front-ends (lex/syn) & m back-ends sharing a
common intermediate code give n*m compilers, from n languages into m architectures
Static/Global Variables - CORRECT ANSWER - allocated to fixed location in memory
Local Variables - CORRECT ANSWER - need multiple copies for recursion etc, so each is
allocated at a fixed offset from $fp (4*n) within the current stack frame
Stack - CORRECT ANSWER - memory block in which stack frames are allocated, function
call allocates a new stack frame, return de-allocates
$fp - CORRECT ANSWER - MIPS register that points to the stack frame of the currently
active function
$sp - CORRECT ANSWER - MIPS register that points to the lowest used location on the stack
Stack Frame - CORRECT ANSWER - has pointer to previous stack frame (FP') and return
address (RA) - stack linkage information
Stack Parameter Passing - CORRECT ANSWER - caller pushes args onto the stack (moving $sp
down), parameters are then accessed at $fp+8, $fp+12 etc
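The frame layout described in the cards above can be simulated in a few lines. This is a simplified model, not real MIPS behaviour: it assumes the stack grows downward, that FP' sits at $fp+0 and RA at $fp+4, that locals live at negative offsets from $fp, and that arguments are removed on return. The `Machine` class and its method names are hypothetical.

```python
WORD = 4  # bytes per stack slot


class Machine:
    """Toy model of a downward-growing stack addressed by sp and fp."""

    def __init__(self, stack_top=1000):
        self.mem = {}
        self.sp = stack_top   # lowest used location
        self.fp = stack_top   # frame of the currently active function

    def push(self, value):
        self.sp -= WORD
        self.mem[self.sp] = value

    def call(self, args, return_address, n_locals):
        """Allocate a new frame: arguments, then RA and FP', then locals."""
        for arg in reversed(args):     # last arg first, so arg 0 lands at fp+8
            self.push(arg)
        self.push(return_address)      # RA  at fp+4
        self.push(self.fp)             # FP' at fp+0
        self.fp = self.sp
        self.sp -= WORD * n_locals     # locals at fp-4, fp-8, ...

    def ret(self, n_args):
        """De-allocate the frame and return the saved return address."""
        return_address = self.mem[self.fp + WORD]
        new_sp = self.fp + 2 * WORD + n_args * WORD   # drop locals, linkage, args
        self.fp = self.mem[self.fp]                   # restore caller's frame
        self.sp = new_sp
        return return_address


m = Machine()
m.call([10, 20], return_address=0x400, n_locals=1)
print(m.mem[m.fp + 8], m.mem[m.fp + 12])   # the two parameters
```

Each call creates a fresh frame, so a recursive function gets a separate copy of its locals and parameters per activation.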
Address Space Map - CORRECT ANSWER - code, static data, stack and heap segments
Just in Time Compilation - CORRECT ANSWER - distribute intermediate code and
perform cg once host architecture is known (eg JVM)
Character-stream form - CORRECT ANSWER - used by early interpreters, which would re-lex &
re-parse each statement every time it was encountered
Token-stream form - CORRECT ANSWER - lexing can be done once when the program is read eg
BBC Basic stored programs in token form and re-parsed them on execution
Syntax-tree form - CORRECT ANSWER - natural & simple, commonly used for PHP or
Python
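An interpreter working on syntax-tree form simply walks the tree recursively, which is why the form is described as natural and simple. A minimal sketch, assuming a nested-tuple tree and a dictionary `env` mapping variable names to values (both are illustrative choices, not how any particular interpreter stores its trees):

```python
def evaluate(node, env):
    """Evaluate a nested-tuple expression tree directly, without generating code."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]
    if kind == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "mul":
        return evaluate(node[1], env) * evaluate(node[2], env)
    raise ValueError(f"unknown node kind {kind!r}")


tree = ("add", ("num", 1), ("mul", ("var", "x"), ("num", 3)))
print(evaluate(tree, {"x": 4}))
```

The tree is built once and can then be evaluated many times, unlike the character-stream form, which repeats lexing and parsing on every execution.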