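The purpose of scanning is to read the source code to be compiled into the compiler character by character, and to convert every field that satisfies the lexical syntax (the rules for how words may be formed) into a token and store it.
In other words, it reads along and checks whether each word is spelled correctly. Correctly spelled words become tokens; misspelled ones produce an error.
Corresponding assignment: A6 P4
(In some languages the compiler also performs Preprocessing after Scanning. This is outside the scope of 241, so it is not explained here.)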
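As a minimal sketch of the scanning step described above, the following Python function turns a string into tokens and reports an error on any character that fits no rule. The token kinds (`ID`, `MINUS`, `LPAREN`, `RPAREN`) and the function name `scan` are assumptions made for illustration; they are not the token set used in the 241 assignments.

```python
import re

# Hypothetical lexical rules for a toy language: identifiers, '-', '(' and ')'.
TOKEN_SPEC = [
    ("ID", r"[A-Za-z][A-Za-z0-9]*"),
    ("MINUS", r"-"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not stored
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def scan(source):
    """Return a list of (kind, lexeme) pairs; raise ValueError on a lexical error."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No lexical rule matches here: the "word" is misspelled.
            raise ValueError(f"lexical error at position {pos}: {source[pos]!r}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `scan("a - (b)")` yields five tokens, while `scan("a $ b")` raises a `ValueError` because `$` matches none of the assumed rules.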
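Parsing is also called Syntactic Analysis.
Its purpose is to relate the tokens to one another and convert them into an "intermediate format" that follows a defined specification, usually some kind of tree structure, such as the WLI format defined in 241.
During Parsing, if the input violates the grammar of the language, an error is output. If the syntax is correct, the corresponding format is output.
Put simply, it checks whether what you wrote is a well-formed sentence: no missing punctuation, no missing parenthesis.
If it is not, that means it is time to go back and relearn the grammar.
Corresponding assignment: A8 P4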
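The purpose of Semantic Analysis is to check whether the program contains semantic conflicts, or in other words, whether each use is consistent with its context.
For example, in C, if the variable has not been declared with int a;, then a = 3; is illegal.
Similarly, if a is declared as int, a type checker that allows no implicit conversions would reject a = 'b';, because a cannot hold a char. (Real C accepts this assignment, because a character constant has type int; the example only illustrates the kind of check.)
Corresponding assignments: A9, A10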
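The two checks above (use before declaration, and a type mismatch on assignment) can be sketched with a symbol table. This is only an illustration under stated assumptions: the statement encoding (`("declare", type, name)` and `("assign", name, value_type)`), the function name `check`, and the strict no-conversion type rule are all hypothetical, not the format used in the assignments.

```python
def check(statements):
    """Return a list of semantic error messages for a list of statements.

    Each statement is either ("declare", type, name) or
    ("assign", name, value_type). Both shapes are assumed for this sketch.
    """
    symbols = {}  # symbol table: variable name -> declared type
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, var_type, name = stmt
            symbols[name] = var_type
        elif stmt[0] == "assign":
            _, name, value_type = stmt
            if name not in symbols:
                errors.append(f"{name} used before declaration")
            elif symbols[name] != value_type:
                # Strict rule assumed here: no implicit conversions at all.
                errors.append(f"cannot assign {value_type} to {symbols[name]} {name}")
    return errors
```

For instance, `check([("assign", "a", "int")])` reports that `a` is used before declaration, and declaring `a` as `int` and then assigning a `char` reports a type error.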
(Optimization means code optimization. 241 does not cover it; it is enough to know that it exists.)
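This stage (code generation) converts the Intermediate Format into another format, such as MIPS or a binary file. It may be carried out together with the previous step.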
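Linking joins multiple compiled files together to produce one executable binary file (or only produces a single merged file, which may not itself be executable).
Corresponding assignment: A5 P1, P2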
Appendix: Fall 2009 (09FALL), answer to Question 2
Terminals: BOF, EOF, id, -, (, )
Nonterminals: S, expr, term
Start symbol: S
Production rules:
0  S → BOF expr EOF
1  expr → term
2  expr → expr - term
3  term → id
4  term → ( expr )
Total states: 11 (0 to 10)
Total transitions: 28
| State | Symbol | Action |
|---|---|---|
| 0 | BOF | shift to state 6 |
| 1 | ( | shift to state 3 |
| 1 | id | shift to state 2 |
| 1 | term | shift to state 8 |
| 3 | ( | shift to state 3 |
| 3 | expr | shift to state 7 |
| 3 | id | shift to state 2 |
| 3 | term | shift to state 4 |
| 6 | ( | shift to state 3 |
| 6 | expr | shift to state 10 |
| 6 | id | shift to state 2 |
| 6 | term | shift to state 4 |
| 7 | ) | shift to state 9 |
| 7 | - | shift to state 1 |
| 10 | - | shift to state 1 |
| 10 | EOF | shift to state 5 |
| 2 | ) | reduce by rule 3 |
| 2 | - | reduce by rule 3 |
| 2 | EOF | reduce by rule 3 |
| 4 | ) | reduce by rule 1 |
| 4 | - | reduce by rule 1 |
| 4 | EOF | reduce by rule 1 |
| 8 | ) | reduce by rule 2 |
| 8 | - | reduce by rule 2 |
| 8 | EOF | reduce by rule 2 |
| 9 | ) | reduce by rule 4 |
| 9 | - | reduce by rule 4 |
| 9 | EOF | reduce by rule 4 |
Input string: BOF id - ( id ) - id EOF
| Symbol (lookahead for reductions) | Action | State Stack (at end of line) | Description |
|---|---|---|---|
| | S0 | 0 | Start |
| BOF | S6 | 0 6 | S0, BOF → S6 |
| id | S2 | 0 6 2 | S6, id → S2 |
| - | R3 | 0 6 | S2, -: reduce by rule #3 (replace id with term) |
| term | S4 | 0 6 4 | S6, term → S4 |
| - | R1 | 0 6 | S4, -: reduce by rule #1 (replace term with expr) |
| expr | S10 | 0 6 10 | S6, expr → S10 |
| - | S1 | 0 6 10 1 | S10, - → S1 |
| ( | S3 | 0 6 10 1 3 | S1, ( → S3 |
| id | S2 | 0 6 10 1 3 2 | S3, id → S2 |
| ) | R3 | 0 6 10 1 3 | S2, ): reduce by rule #3 (replace id with term) |
| term | S4 | 0 6 10 1 3 4 | S3, term → S4 |
| ) | R1 | 0 6 10 1 3 | S4, ): reduce by rule #1 (replace term with expr) |
| expr | S7 | 0 6 10 1 3 7 | S3, expr → S7 |
| ) | S9 | 0 6 10 1 3 7 9 | S7, ) → S9 |
| - | R4 | 0 6 10 1 | S9, -: reduce by rule #4 (replace "( expr )" with term; pops three states) |
| term | S8 | 0 6 10 1 8 | S1, term → S8 |
| - | R2 | 0 6 | S8, -: reduce by rule #2 (replace "expr - term" with expr; pops three states) |
| expr | S10 | 0 6 10 | S6, expr → S10 |
| - | S1 | 0 6 10 1 | S10, - → S1 |
| id | S2 | 0 6 10 1 2 | S1, id → S2 |
| EOF | R3 | 0 6 10 1 | S2, EOF: reduce by rule #3 (replace id with term) |
| term | S8 | 0 6 10 1 8 | S1, term → S8 |
| EOF | R2 | 0 6 | S8, EOF: reduce by rule #2 (replace "expr - term" with expr; pops three states) |
| expr | S10 | 0 6 10 | S6, expr → S10 |
| EOF | S5 | 0 6 10 5 (Final S5) | S10, EOF → S5 |
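The trace above can be reproduced mechanically. The sketch below encodes the five production rules and the 28 transitions from this appendix and runs a table-driven shift-reduce loop over the input. The names `RULES`, `SHIFT`, `REDUCE`, and `parse` are chosen here for illustration, and the loop is a simplified driver rather than the code expected in the assignments.

```python
# Production rules 0-4 from the appendix: (left-hand side, right-hand side).
RULES = [
    ("S", ["BOF", "expr", "EOF"]),
    ("expr", ["term"]),
    ("expr", ["expr", "-", "term"]),
    ("term", ["id"]),
    ("term", ["(", "expr", ")"]),
]

# Shift transitions: (state, symbol) -> next state.
SHIFT = {
    (0, "BOF"): 6,
    (1, "("): 3, (1, "id"): 2, (1, "term"): 8,
    (3, "("): 3, (3, "expr"): 7, (3, "id"): 2, (3, "term"): 4,
    (6, "("): 3, (6, "expr"): 10, (6, "id"): 2, (6, "term"): 4,
    (7, ")"): 9, (7, "-"): 1,
    (10, "-"): 1, (10, "EOF"): 5,
}

# Reduce transitions: states 2, 4, 8, 9 reduce on each of ')', '-', 'EOF'.
REDUCE = {
    (state, lookahead): rule
    for state, rule in [(2, 3), (4, 1), (8, 2), (9, 4)]
    for lookahead in [")", "-", "EOF"]
}


def parse(tokens):
    """Return (rules applied in order, final state stack); raise ValueError on a syntax error."""
    stack = [0]
    applied = []

    def goto(symbol):
        key = (stack[-1], symbol)
        if key not in SHIFT:
            raise ValueError(f"syntax error at {symbol!r} in state {stack[-1]}")
        stack.append(SHIFT[key])

    for token in tokens:
        # Keep reducing while the table asks for it on this lookahead.
        while (stack[-1], token) in REDUCE:
            rule = REDUCE[(stack[-1], token)]
            lhs, rhs = RULES[rule]
            del stack[-len(rhs):]  # pop one state per right-hand-side symbol
            applied.append(rule)
            goto(lhs)              # then move on the left-hand-side nonterminal
        goto(token)                # finally shift the terminal itself
    return applied, stack
```

Running `parse("BOF id - ( id ) - id EOF".split())` applies rules 3, 1, 3, 1, 4, 2, 3, 2 in that order and ends with the stack 0 6 10 5, matching the table row by row.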