To make it easier to navigate this post, here are the contents:

--------------------------------------
A. Data Hazards
   1. Needing to forward
   2. Needing to stall
   3. Not needing anything
B. Control hazards
   1. Flushing the pipeline - jump
   2. Flushing the pipeline - branch
--------------------------------------

====================
  A. DATA HAZARDS
====================

In our 5-stage pipelined architecture, all instructions take 5 cycles.
What's important to remember is *when* certain pieces of data are ready.
In particular,

IF   calculate pc+4
ID   read registers / jump target / jump taken
EX   know ALU result / branch condition+target / branch taken
MEM  access memory
WB   write registers

When instructions use different pieces of information at different
times, we run into no problems. The pipeline runs smoothly and we can
get a maximum of 5x speedup (assuming all stages are full, all the time).

For example,

    addi $1, $0, 4
    addi $2, $2, 4
    add  $0, $0, $0

the pipeline diagram of the stages is

     1   2   3   4   5   6   7
addi IF  ID  EX  MEM WB
addi     IF  ID  EX  MEM WB
add          IF  ID  EX  MEM WB

-----------------------
 1. Needing to forward
-----------------------

The problem arises when an instruction needs data that an earlier
instruction, still in the pipeline, has not yet written back. This is
called a data dependency, and it is a potential hazard: are we going to
get the data in time?

For example,

    addi $1, $1, 4
    lw   $2, 0($1)

has a dependency between $1 in the addi and in the lw. This will need to
be resolved by forwarding, as we will see.

Note that the addi has the result of the addition at the end of EXecute
(cycle 3), but does not write the result back into the register bank
until WB (cycle 5). The lw instruction would normally pick up the data in
ID (cycle 3) when it reads the registers... or, at the latest, in EX
(cycle 4) when it calculates the address.

Solution: we can forward the result of the addi operation in EX (cycle 3)
into the EX stage of the lw instruction (cycle 4).

     1   2   3   4   5   6
addi IF  ID  EX\ MEM WB
lw       IF  ID `EX  MEM WB

where the \` signifies an arrow.
:)

---------------------
 2. Needing to stall
---------------------

You may also need to fix things with stalling in conjunction with
forwarding. For example,

    lw   $1, 0($2)
    addi $4, $1, 4

will be a problem (the value of $1 that the addi uses may not be back
from memory yet in the lw).

     1   2   3   4   5   6   7
lw   IF  ID  EX  MEM\WB
addi     IF  ID     `EX  MEM WB

Because the data isn't back from MEM until the end of cycle 4, but it
would be needed in EX at the beginning of cycle 4 (which is impossible),
we need to stall the pipeline at cycle 4 and forward from MEM (4) into
EX (5).

Note that the pipeline stalls all the way down... meaning that NO
instruction behind the lw advances a stage in cycle 4 (the lw itself
still completes its MEM). If there were more instructions, the flow would
look like this,

     1   2   3   4   5   6   7   8   9   10
lw   IF  ID  EX  MEM\WB
addi     IF  ID     `EX  MEM WB
nop          IF      ID  EX  MEM WB
nop                  IF  ID  EX  MEM WB
nop                      IF  ID  EX  MEM WB

with a big hole or "bubble" in cycle 4.

-------------------------
 3. Not needing anything
-------------------------

Sometimes data dependencies work themselves out. For example,

    add  $1, $1, $0
    lw   $2, 0($3)
    sub  $5, $0, $4
    xor  $3, $4, $0
    addi $3, $1, 4

there is a dependency between the add and the addi ($1), but if you look
at the timing for it,

     1   2   3   4   5   6   7   8   9
add  IF  ID  EX  MEM WB
lw       IF  ID  EX  MEM WB
sub          IF  ID  EX  MEM WB
xor              IF  ID  EX  MEM WB
addi                 IF  ID  EX  MEM WB

the data is already in the register bank in cycle 5, and it's read out
of the register bank in cycle 6! So there is no need to forward or stall.

======================
  B. CONTROL HAZARDS
======================

A control hazard happens when you have a branch or jump, i.e., you are
modifying the control flow of the program. For example, in a
predict-not-taken (PNT) scheme where jumps are calculated and executed in
ID, you don't know it's a jump until the end of ID... which means you've
already fetched the next instruction. In a branch, you don't know if it's
taken until EX, so you have fetched two instructions (and decoded one).
If you're wrong, you need to flush the pipeline.
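
The counting in the paragraph above can be written down as a small
sketch. This is only an illustration, not part of the course material:
the names STAGES, wrong_path_instructions and flush_penalty are made up
here, and the sketch assumes (as this post does) that one instruction is
fetched per cycle and that a correct prediction costs nothing.

```python
# Stage order of the 5-stage pipeline described in this post.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def wrong_path_instructions(resolve_stage):
    """Instructions fetched behind a jump/branch before it resolves.

    One new instruction enters IF each cycle, so by the end of the
    resolving stage one instruction has been fetched for every stage
    the jump/branch has moved past IF.
    """
    return STAGES.index(resolve_stage)


def flush_penalty(resolve_stage, predicted_taken, actually_taken):
    """Instructions flushed for one jump or branch (hypothetical helper).

    Assumption: a correct prediction costs no cycles at all.
    """
    if predicted_taken == actually_taken:
        return 0
    return wrong_path_instructions(resolve_stage)


print(flush_penalty("ID", False, True))   # jump, resolved in ID, PNT -> 1
print(flush_penalty("EX", False, True))   # taken branch, PNT         -> 2
print(flush_penalty("EX", False, False))  # not-taken branch, PNT     -> 0
```

The jump loses one instruction because it resolves one stage after IF;
the taken branch loses two because it resolves two stages after IF.
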
---------------------------------
 1. Flushing the pipeline: jump
---------------------------------

A jump is like an unconditional branch. You know the condition and the
target in ID; you can take the jump in ID.

If you have a jump followed by another instruction (such as in a loop),
you will begin executing the wrong instruction.

         j address
         add $0, $0, $0
         .
         .
address: sub $0, $0, $0

The jump starts its IF and ID. During ID, the next instruction (which is
at PC+4 since the PC is updated in IF), add, is being fetched. However,
at the end of ID we realize we don't want the add instruction; instead,
we want to jump to address. So we must flush the pipeline of the pesky
add and fetch the sub.

j    IF  ID  EX  MEM WB
add      >>>IF<<<< (flushed)
sub          IF  ID  EX ...

-----------------------------------
 2. Flushing the pipeline: branch
-----------------------------------

For a branch, the situation is similar, except that it takes longer to
figure out whether the branch is taken. In a predict-not-taken (PNT)
scheme, we keep fetching the instructions that follow the branch,
starting at PC+4 (as opposed to the branch target address). If we're
right, and the branch is not taken (BNT), we don't lose any time.
However, if we're wrong and the branch is taken (BT), we need to flush
the pipeline of 2 wrong instructions.

For a not-taken branch in a PNT scheme,

BNT         IF  ID  EX  MEM WB
nop(right)      IF  ID  EX  MEM WB
nop(right)          IF  ID  EX  MEM WB

there is no problem. For a taken branch in a PNT scheme,

BT          IF  ID  EX  MEM WB
nop(wrong)      IF  ID<<<< (flushed)
nop(wrong)          IF<<<< (flushed)
add(right)              IF  ID  EX...

we have to flush two instructions out of the pipeline.

These two cases are analogous to a BT in a PT (predict-taken) scheme and
a BNT in a PT scheme, respectively.

https://classes.soe.ucsc.edu/cmpe110/Spring04/pipelining.txt