## Compilerconstructie (Compiler Construction)

Fall 2012

http://www.liacs.nl/home/rvvliet/coco/

Rudy van Vliet, room 124 Snellius, tel. 071-527 5777, rvvliet(at)liacs.nl

Lecture 7, Tuesday 6 November 2012

Code Generation

# Code Generator Position in a Compiler

source program → Front End → intermediate code → Code Optimizer → intermediate code → Code Generator → target program

- Output code must
  - be correct
  - use resources of target machine effectively
- Code generator must run efficiently

Generating optimal code is an undecidable problem; heuristics are available

8.1 Issues in Design of Code Generator

- Input to the code generator
- The target program
- Instruction selection
- Register allocation and assignment
- Evaluation order

## Input to the Code Generator

- Intermediate representation of source program
  - Three-address representations (e.g., quadruples)
  - Virtual machine representations (e.g., bytecodes)
  - Postfix notation
  - Graphical representations (e.g., syntax trees and DAGs)
- Information from symbol table to determine run-time addresses
- Input is free of errors
  - Type checking and conversions have been done

## The Target Program

- Common target-machine architectures
  - RISC: reduced instruction set computer
  - CISC: complex instruction set computer
  - Stack-based
- Possible output
  - Absolute machine code (executable code)
  - Relocatable machine code (object files for linker)
  - Assembly-language

## **Instruction Selection**

- Given IR program can be implemented by many different code sequences
- Different machine instruction speeds
- Naive approach: statement-by-statement translation, with code template for each IR statement

Example: x = y + z

LD R0, y
ADD R0, R0, z
ST x, R0

Now, a = b + c
d = a + e

LD R0, b
ADD R0, R0, c
ST a, R0
LD R0, a
ADD R0, R0, e
ST d, R0
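The statement-by-statement approach can be sketched as follows. This is a minimal illustration, not the course's code generator: the helper `translate` and the table `OPCODES` are hypothetical names, only statements of the form `x = y op z` with `+`, `-`, `*` are handled, and register R0 is always used.

```python
import re

# Hypothetical template table: IR operator -> target opcode.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}

def translate(stmt):
    """Translate one statement 'x = y op z' using a fixed code template."""
    m = re.fullmatch(r"\s*(\w+)\s*=\s*(\w+)\s*([-+*])\s*(\w+)\s*", stmt)
    if m is None:
        raise ValueError(f"unsupported statement: {stmt!r}")
    x, y, op, z = m.groups()
    # Template: load first operand, apply operator, store the result.
    return [f"LD R0, {y}", f"{OPCODES[op]} R0, R0, {z}", f"ST {x}, R0"]

# Translate the two statements independently, one after the other.
code = [instr for stmt in ("a = b + c", "d = a + e") for instr in translate(stmt)]
print("\n".join(code))
```

For the two statements, the output contains `ST a, R0` immediately followed by `LD R0, a`. That load is redundant, which illustrates why plain template expansion tends to produce poor code.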

### Target Machine

- Designing code generator requires understanding of target machine and its instruction set
- Our machine model
  - has n general purpose registers $\mathtt{R0}, \mathtt{R1}, \ldots, \mathtt{R}n-1$
  - assumes operands are integers

# **Instructions of Target Machine**

- Load operations: LD dst, addr, e.g., LD r, x or LD $r_1, r_2$
- Store operations: ST x, r
- Computation operations: OP $dst, src_1, src_2$, e.g., SUB $r_1, r_2, r_3$
- Unconditional jumps: BR L
- Conditional jumps: Bcond r, L, e.g., BLTZ r, L

# Addressing Modes of Target Machine

| Form  | Address                   | Example         |
|-------|---------------------------|-----------------|
| r     | r                         | LD R1, R2       |
| x     | x                         | LD R1, x        |
| a(r)  | a + contents(r)           | LD R1, a(R2)    |
| c(r)  | c + contents(r)           | LD R1, 100(R2)  |
| *r    | contents(r)               | LD R1, *R2      |
| *c(r) | contents(c + contents(r)) | LD R1, *100(R2) |
| #c    |                           | LD R1, #100     |

## Addressing Modes (Examples)

b = a[i]

LD R1, i
MUL R1, R1, #8
LD R2, a(R1)
ST b, R2

a[j] = c

LD R1, c
LD R2, j
MUL R2, R2, #8
ST a(R2), R1

x = \*p

LD R1, p
LD R2, 0(R1)
ST x, R2

if x < y goto L

LD R1, x
LD R2, y
SUB R1, R1, R2
BLTZ R1, M

### Instruction Costs

- Costs associated with compiling / running a program
  - Compilation time
  - Size, running time, power consumption of target program
- Finding optimal target program: undecidable
- (Simple) cost per target-language instruction: 1 + cost for addressing modes of operands ≈ length (in words) of instruction

Examples:

| Instruction      | Cost |
|------------------|------|
| LD R0, R1        | 1    |
| LD R0, x         | 2    |
| LD R1, *100(R2)  | 2    |
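The simple cost measure can be written down as a small function. This is a sketch under stated assumptions: the function names are hypothetical, operands are parsed from the textual form used on these slides, register-only modes (r, *r) are taken to add nothing, and any mode that needs a memory address or constant (x, a(r), c(r), *c(r), #c) is taken to add one word.

```python
import re

REGISTER = re.compile(r"R\d+")

def operand_cost(operand):
    """Extra words for one operand: 0 for register-only modes, 1 otherwise."""
    operand = operand.strip()
    if operand.startswith("*"):
        operand = operand[1:]          # indirection itself adds no extra word
    if REGISTER.fullmatch(operand):
        return 0                       # forms r and *r
    return 1                           # forms x, a(r), c(r), *c(r), #c

def instruction_cost(instr):
    """Cost = 1 (the instruction word) + cost of the operands' addressing modes."""
    _opcode, _, rest = instr.strip().partition(" ")
    operands = [o for o in rest.split(",") if o.strip()]
    return 1 + sum(operand_cost(o) for o in operands)

for instr in ("LD R0, R1", "LD R0, x", "LD R1, *100(R2)"):
    print(instr, instruction_cost(instr))
```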

### 8.4 **Basic Blocks and Flow Graphs**

- Basic block: maximal sequence of consecutive three-address instructions, such that
  - (a) Flow of control can only enter through first instruction of block
  - (b) Control leaves block without halting or branching, except possibly at last instruction of block
- Flow graph: graph with
  - nodes: basic blocks
  - edges: indicate flow between blocks

### **Determining Basic Blocks**

- Determine leaders
  1. First three-address instruction is leader
  2. Any instruction that is target of (conditional or unconditional) goto is leader
  3. Any instruction that immediately follows (conditional or unconditional) goto is leader
- For each leader, its basic block consists of leader and all instructions up to next leader (or end of program)

### Determining Basic Blocks (Example)

Pseudo code:

for i = 1 to 10 do
for j = 1 to 10 do
a[i,j] = 0.0;
for i = 1 to 10 do
a[i,i] = 1.0;

Three-address code:

(1) i = 1
(2) j = 1
(3) t1 = 10 \* i
(4) t2 = t1 + j
(5) t3 = 8 \* t2
(6) t4 = t3 - 88
(7) a[t4] = 0.0
(8) j = j + 1
(9) if j <= 10 goto (3)
(10) i = i + 1
(11) if i <= 10 goto (2)
(12) i = 1
(13) t5 = i - 1
(14) t6 = 88 \* t5
(15) a[t6] = 1.0
(16) i = i + 1
(17) if i <= 10 goto (13)

Determine leaders: (1), (2), (3), (10), (12), (13)
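The three leader rules and the partition into basic blocks can be sketched in a few lines. This is an illustrative sketch: the names `CODE`, `find_leaders` and `basic_blocks` are hypothetical, instructions are numbered from 1, and jump targets are assumed to be written as `goto (n)`.

```python
import re

# Three-address code of the example, instruction k stored at index k - 1.
CODE = [
    "i = 1", "j = 1", "t1 = 10 * i", "t2 = t1 + j", "t3 = 8 * t2",
    "t4 = t3 - 88", "a[t4] = 0.0", "j = j + 1", "if j <= 10 goto (3)",
    "i = i + 1", "if i <= 10 goto (2)", "i = 1", "t5 = i - 1",
    "t6 = 88 * t5", "a[t6] = 1.0", "i = i + 1", "if i <= 10 goto (13)",
]

def find_leaders(code):
    """Return the sorted instruction numbers (1-based) of all leaders."""
    if not code:
        return []
    leaders = {1}                               # rule 1: first instruction
    for n, instr in enumerate(code, start=1):
        m = re.search(r"goto \((\d+)\)", instr)
        if m:
            leaders.add(int(m.group(1)))        # rule 2: target of a jump
            if n < len(code):
                leaders.add(n + 1)              # rule 3: instruction after a jump
    return sorted(leaders)

def basic_blocks(code):
    """Return (first, last) instruction numbers of each basic block."""
    leaders = find_leaders(code)
    ends = leaders[1:] + [len(code) + 1]
    return [(start, end - 1) for start, end in zip(leaders, ends)]

print(find_leaders(CODE))
print(basic_blocks(CODE))
```

On this input the sketch finds six leaders and hence six basic blocks, the longest being instructions (3) to (9), the body of the inner loop.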

### Flow Graph

Edge from block $B$ to block $C$

- if there is (un)conditional jump from end of $B$ to beginning of $C$
- if $C$ immediately follows $B$ in original order, and $B$ does not end in unconditional jump
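The two edge rules can be sketched as follows, assuming the basic blocks are already known as (first, last) instruction numbers. The function name `flow_edges` and the tiny example program are hypothetical; jumps are assumed to be written as `goto (n)` or `if ... goto (n)`, always targeting a leader.

```python
import re

def flow_edges(code, blocks):
    """code: list of instructions (numbered from 1); blocks: list of (first, last).
    Returns edges as pairs of block indices (0-based)."""
    block_of = {first: b for b, (first, _last) in enumerate(blocks)}
    edges = set()
    for b, (_first, last) in enumerate(blocks):
        instr = code[last - 1]
        m = re.search(r"goto \((\d+)\)", instr)
        if m:
            # Rule 1: jump from end of this block to beginning of the target block.
            edges.add((b, block_of[int(m.group(1))]))
        ends_in_unconditional_jump = instr.startswith("goto")
        if not ends_in_unconditional_jump and b + 1 < len(blocks):
            # Rule 2: fall through to the block that follows in original order.
            edges.add((b, b + 1))
    return sorted(edges)

code = ["i = 1", "t = i * 2", "i = i + 1", "if i <= 10 goto (2)", "x = t"]
blocks = [(1, 1), (2, 4), (5, 5)]
print(flow_edges(code, blocks))
```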

## Flow Graph (Example)





## Loops in Flow Graph

Loop is set of nodes $L$

- With unique loop entry $e$
- Every node in $L$ has nonempty path in $L$ to $e$

Example

- $\{B_3\}$, with loop entry $B_3$
- $\{B_2, B_3, B_4\}$, with loop entry $B_2$
- $\{B_6\}$, with loop entry $B_6$

(Figure: flow graph of the example, starting at ENTRY, with blocks $B_1$ to $B_6$; $B_3$ ends with "if $j \le 10$ goto $B_3$" and $B_4$ ends with "if $i \le 10$ goto $B_2$".)

## **Next-Use Information**

- Next-use information is needed for dead-code elimination and register assignment

(i) x = a \* b

...

(j) z = c + x

Instruction j uses value of x computed at i: x is live at i, i.e., we need value of x later

For each three-address statement x = y op z in block, record next-uses of x, y, z

# **Determining Next-Use Information**

For single basic block

- Assume all non-temporary variables are live on exit
- Make backward scan of instructions in block
- For each instruction i: x = y op z
  1. Attach to i current next-use and liveness information of x, y, z
  2. Set x to 'not live' and 'no next use'
  3. Set y and z to 'live'; set 'next uses' of y and z to i
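The backward scan can be sketched as follows. This is a minimal sketch with hypothetical names: a block is a list of `(x, y, z)` triples for `x = y op z`, instructions are indexed from 0, and the table maps each variable to a pair `(live, next_use)`.

```python
def next_use_info(block, live_on_exit):
    """Return, per instruction, the (live, next_use) info of x, y, z
    as it holds immediately after that instruction."""
    variables = {v for instr in block for v in instr}
    # Initially: only variables live on exit are live; nobody has a next use.
    table = {v: (v in live_on_exit, None) for v in variables}
    annotations = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):          # backward scan
        x, y, z = block[i]
        annotations[i] = {v: table[v] for v in (x, y, z)}   # step 1
        table[x] = (False, None)                            # step 2
        table[y] = (True, i)                                # step 3
        table[z] = (True, i)
    return annotations

# Hypothetical block: t = a - b; u = a - c; v = t + u; d = v + u
block = [("t", "a", "b"), ("u", "a", "c"), ("v", "t", "u"), ("d", "v", "u")]
info = next_use_info(block, live_on_exit={"a", "b", "c", "d"})
for i, ann in enumerate(info):
    print(i, ann)
```

Step 2 is applied before step 3 on purpose: for an instruction such as `x = x + y`, the variable x must end up live, because its old value is used.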

# Passing Liveness Information over Blocks

Example of loop




### 8.6 A Simple Code Generator

Use of registers

- Operands of operation must be in registers
- To hold values of temporary variables
- To hold (global) values that are used in several blocks
- To manage run-time stack

Assumption: subset of registers available for block

Machine instructions of form

- LD reg, mem
- ST mem, reg
- OP reg, reg, reg

# Register and Address Descriptors

- Register descriptor keeps track of what is currently in register
  - Example: LD R, x $\rightarrow$ register R contains x
  - Initially, all registers are empty
- Address descriptor keeps track of locations where current value of a variable can be found
  - Example: LD R, x $\rightarrow$ x is (also) in R
  - Information stored in symbol table

# The Code-Generation Algorithm

For each three-address instruction x = y op z

1. Use getReg(x = y op z) to select registers $R_x, R_y, R_z$
2. If y is not in $R_y$, then issue instruction LD $R_y, y'$, where y' is a memory location for y (according to address descriptor)
3. If z is not in $R_z$, then issue instruction LD $R_z, z'$, analogously
4. Issue instruction OP $R_x, R_y, R_z$

At end of block: store all variables that are live-on-exit and not in their memory locations (according to address descriptor)

# Managing Register / Address Descriptors

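Description in book

Example: d = (a - b) + (a - c) + (a - c)

| Three-address statement | Generated code                      |
|-------------------------|-------------------------------------|
| t = a - b               | LD R1, a; LD R2, b; SUB R2, R1, R2  |
| u = a - c               | LD R3, c; SUB R1, R1, R3            |
| v = t + u               | ADD R3, R2, R1                      |
| a = d                   | LD R2, d                            |
| d = v + u               | ADD R1, R3, R1                      |
| exit                    | ST a, R2; ST d, R1                  |

Note: a = d assigns the old value of d to a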

### Function getReg

For each instruction x = y op z

- To compute $R_y$
  1. If y is in register, $\rightarrow R_y$
  2. Else, if empty register available, $\rightarrow R_y$
  3. Else, select occupied register. For each register R and variable v in R
     - (a) If v is also somewhere else, then OK
     - (b) If v is x, and x is not z, then OK
     - (c) Else, if v is not used later, then OK
     - (d) Else, ST v, R is required

     Take R with smallest number of stores
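Case 3 (no register holds y and none is empty) can be sketched as a scoring function. This is only an illustration of rules (a) to (d); the names `stores_needed` and `pick_register`, and the representation of the descriptors as dictionaries of sets, are assumptions.

```python
def stores_needed(reg, reg_desc, addr_desc, x, z, used_later):
    """Count the ST instructions needed before register `reg` can be reused.
    reg_desc: register -> set of variables it holds
    addr_desc: variable -> set of locations (register or memory names)"""
    stores = 0
    for v in reg_desc[reg]:
        if addr_desc[v] - {reg}:
            continue            # (a) v is also available somewhere else
        if v == x and x != z:
            continue            # (b) v is x, which is about to be overwritten
        if v not in used_later:
            continue            # (c) v is not used later
        stores += 1             # (d) ST v, R is required
    return stores

def pick_register(reg_desc, addr_desc, x, z, used_later):
    """Take the register with the smallest number of required stores."""
    return min(reg_desc,
               key=lambda r: stores_needed(r, reg_desc, addr_desc, x, z, used_later))

# Hypothetical state: R1 holds only copy of t, R2 holds a, which is also in memory.
reg_desc = {"R1": {"t"}, "R2": {"a"}}
addr_desc = {"t": {"R1"}, "a": {"R2", "a"}}
print(pick_register(reg_desc, addr_desc, x="u", z="c", used_later={"t", "a"}))
```

In this state R2 is chosen, since reusing R1 would first require storing t.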

### Function getReg

For each instruction  $x=y\ op\ z$ 

To compute  $R_x$ , similar with few differences

For each instruction x=y, choose  $R_x=R_y$ 


# 8.8 Register Allocation and Assignment

So far, live variables in registers are stored at end of block

Use of registers

- Operands of operation must be in registers
- To hold values of temporary variables
- To hold (global) values that are used in several blocks
- To manage run-time stack


### Usage counts

With x in register during loop L

- Save 1 for each use of x that is not preceded by assignment in same block
- Save 2 for each block where x is assigned a value and x is live on exit
- Total savings $\approx \sum_{\text{blocks } B \in L} \big( use(x,B) + 2 * live(x,B) \big)$, where $use(x,B)$ counts the uses from the first item and $live(x,B)$ is 1 if $B$ satisfies the second item, and 0 otherwise

Choose variables x with largest savings
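The savings formula can be evaluated mechanically. The sketch below assumes a hypothetical representation: each block is a dictionary with a list of `(dst, [srcs])` instructions and a set of variables live on exit. The three blocks are made up for illustration and are not the blocks of the figure.

```python
def savings(x, loop_blocks):
    """Approximate savings of keeping x in a register during the loop."""
    total = 0
    for block in loop_blocks:
        assigned = False
        for dst, srcs in block["instrs"]:
            if not assigned:
                total += srcs.count(x)     # use(x, B): uses before any assignment
            if dst == x:
                assigned = True
        if assigned and x in block["live_out"]:
            total += 2                     # 2 * live(x, B)
    return total

# Hypothetical loop: a is assigned in the first block and used in the other two.
loop = [
    {"instrs": [("a", ["b", "c"])], "live_out": {"a", "b", "c"}},
    {"instrs": [("d", ["a", "b"])], "live_out": {"a", "c", "d"}},
    {"instrs": [("e", ["a", "c"])], "live_out": {"e"}},
]
print(savings("a", loop))
```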

## Usage counts (Example)



Savings for a are 1+1+1\*2=4

### 8.5 Optimization of Basic **Blocks**

To improve running time of code

- Local optimization: within block
- Global optimization: across blocks

Local optimization benefits from DAG representation of basic block

# **DAG** Representation of Basic Blocks

1. A node for initial value of each variable appearing in block
2. A node N for each statement s in block. Children of N are nodes corresponding to last definitions of operands used by s
3. Node N is labeled by operator applied at s. N has list of variables for which s is last definition in block

### Example:

a = b + c
b = a - d
c = b + c
d = a - d


# Local Common Subexpression Elimination

- Use value-number method to detect common subexpressions
- Remove redundant computations

### Example:

| Before    | After     |
|-----------|-----------|
| a = b + c | a = b + c |
| b = a - d | b = a - d |
| c = b + c | c = b + c |
| d = a - d | d = b     |
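The value-number method can be sketched as follows. This is a simplified sketch with hypothetical names: each statement is a tuple `(dst, left, op, right)`, every distinct value gets a number, and a recomputation is replaced by a copy when some variable still holds the same value number. Commutativity is not exploited here.

```python
import itertools

def local_cse(block):
    """Rewrite a basic block, replacing recomputed expressions by copies."""
    fresh = itertools.count(1)
    value_of = {}       # variable -> value number it currently holds
    expr_table = {}     # (op, vn_left, vn_right) -> value number of the result

    def number(var):
        if var not in value_of:
            value_of[var] = next(fresh)     # initial value of the variable
        return value_of[var]

    out = []
    for dst, left, op, right in block:
        key = (op, number(left), number(right))
        holder = None
        if key in expr_table:
            n = expr_table[key]
            # Look for a variable that still holds this value.
            holder = next((v for v, k in value_of.items() if k == n), None)
        else:
            n = expr_table[key] = next(fresh)
        if holder is not None:
            out.append(f"{dst} = {holder}")
        else:
            out.append(f"{dst} = {left} {op} {right}")
        value_of[dst] = n
    return out

block = [("a", "b", "+", "c"), ("b", "a", "-", "d"),
         ("c", "b", "+", "c"), ("d", "a", "-", "d")]
print(local_cse(block))
```

The third statement is not replaced: b has been redefined in between, so `b + c` there has a different value number than in the first statement.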

## Dead Code Elimination

- Remove roots with no live variables attached
- If possible, repeat

### Example:

a = b + c
b = b - d
c = c + d
e = b + c

No common subexpression

If c and e are not live...
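Repeated removal of dead roots can be sketched on an explicit DAG. The representation is an assumption: a dictionary from node name to its children and attached variables, where leaves (initial values) have no children and are never removed. The DAG below is written out by hand for the block a = b + c; b = b - d; c = c + d; e = b + c, assuming a and b are live on exit.

```python
def eliminate_dead_roots(nodes, live):
    """Repeatedly remove interior roots that have no live variable attached."""
    nodes = dict(nodes)                     # work on a copy
    changed = True
    while changed:
        changed = False
        has_parent = {c for d in nodes.values() for c in d["children"]}
        for name, d in list(nodes.items()):
            is_root = name not in has_parent
            if is_root and d["children"] and not (d["vars"] & live):
                del nodes[name]             # dead root: no live variable attached
                changed = True
                break                       # recompute roots, then repeat
    return nodes

dag = {
    "b0": {"children": [], "vars": set()},           # initial value of b
    "c0": {"children": [], "vars": set()},           # initial value of c
    "d0": {"children": [], "vars": set()},           # initial value of d
    "n1": {"children": ["b0", "c0"], "vars": {"a"}}, # a = b + c
    "n2": {"children": ["b0", "d0"], "vars": {"b"}}, # b = b - d
    "n3": {"children": ["c0", "d0"], "vars": {"c"}}, # c = c + d
    "n4": {"children": ["n2", "n3"], "vars": {"e"}}, # e = b + c
}
print(sorted(eliminate_dead_roots(dag, live={"a", "b"})))
```

The node for e is removed first; this turns the node for c into a root, which is then removed as well.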

## Algebraic Transformations

(see assignment 3)

Algebraic identities:

$$x + 0 = 0 + x = x$$

$$x * 1 = 1 * x = x$$

Reduction in strength:

$$x^2 = x * x \quad \text{(cheaper)}$$

$$2 * x = x + x \quad \text{(cheaper)}$$

$$x / 2 = x * 0.5 \quad \text{(cheaper)}$$

Constant folding:

$$2 * 3.14 = 6.28$$
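These transformations can be sketched as a rewrite function on a single operation. The function `simplify` and its tuple representation are hypothetical; operands are variable names (strings) or numeric constants, and the rule for `x / 2` assumes floating-point division.

```python
def simplify(op, left, right):
    """Return a constant, a single operand, or a (possibly cheaper) operation."""
    def is_const(v):
        return isinstance(v, (int, float))

    # Constant folding: both operands known at compile time.
    if is_const(left) and is_const(right):
        if op == "+":
            return left + right
        if op == "*":
            return left * right
    # Algebraic identities.
    if op == "+":
        if left == 0:
            return right
        if right == 0:
            return left
    if op == "*":
        if left == 1:
            return right
        if right == 1:
            return left
        # Reduction in strength: 2 * x = x + x.
        if left == 2:
            return ("+", right, right)
        if right == 2:
            return ("+", left, left)
    # Reduction in strength: x^2 = x * x and x / 2 = x * 0.5.
    if op == "**" and right == 2:
        return ("*", left, left)
    if op == "/" and right == 2:
        return ("*", left, 0.5)
    return (op, left, right)

print(simplify("+", "x", 0), simplify("*", 2, "x"), simplify("*", 2, 3.14))
```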

## Algebraic Transformations

Common subexpressions resulting from commutativity / associativity of operators:

$$x * y = y * x$$

$$c + d + b = (b + c) + d$$

Common subexpressions generated by relational operators:

$$x > y \Leftrightarrow x - y > 0$$

## 8.7 Peephole Optimization

- Examines short sequence of instructions in a window (peephole) and replaces it by faster/shorter sequence
- Applied to intermediate code or target code
- Typical optimizations
  - Redundant instruction elimination
  - Eliminating unreachable code
  - Flow-of-control optimization
  - Algebraic simplification
  - Use of machine idioms

# **Redundant Instruction Elimination**

### Example:

ST a, R0
LD R0, a

(LD can be deleted, provided it is not the target of a jump)
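A peephole pass for this pattern can be sketched with a window of two instructions. The function name and the representation of the code as `(label, instruction)` pairs are assumptions; a load is only dropped when it carries no label, since a jump to it could bypass the store.

```python
def remove_redundant_loads(code):
    """Drop 'LD r, m' when it directly follows 'ST m, r' and has no label."""
    out = []
    for label, instr in code:
        prev = out[-1][1] if out else ""
        if label is None and prev.startswith("ST ") and instr.startswith("LD "):
            mem, reg = (s.strip() for s in prev[3:].split(",", 1))
            reg2, mem2 = (s.strip() for s in instr[3:].split(",", 1))
            if (reg, mem) == (reg2, mem2):
                continue                    # the register already holds the value
        out.append((label, instr))
    return out

code = [(None, "ST a, R0"), (None, "LD R0, a"), (None, "ADD R0, R0, R1")]
print(remove_redundant_loads(code))
```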

# **Eliminating Unreachable Code**

### Example:

if debug == 1 goto L1
goto L2
L1: print debugging information
L2:

$\Downarrow$

if debug != 1 goto L2
L1: print debugging information
L2:

If debug is set to 0 at beginning of program,  $\ldots$

# Flow-of-Control Optimizations

### Example:

goto L1

...

L1: goto L2

## Compilerconstructie

Lecture 7: Code Generation

Chapters for reading: 8.intro, 8.1, 8.2, 8.4, 8.5–8.5.4, 8.6–8.8