In brief
A single instruction enters the CPU at the Fetch stage and the PC is incremented in one clock cycle. In the next clock cycle, the instruction moves to the Decode stage. In the third clock cycle, the instruction moves to the Access stage and the operands are loaded. In the last two stages, the instruction is executed and the result is stored.
here's the multi-cycle datapath:
the first thing you should notice when looking at the datapath is that it has fewer functional units than the single-cycle cpu. we have only one memory unit and only one alu. on the other hand, we have lots of registers that we didn't have before: ir ("instruction register"), mdr ("memory data register"), a, b, and aluout.
so, the obvious first question is: why can we get away with fewer functional units, and why do we need all these registers? we don't need as many functional units because we can re-use the same functional unit for a different purpose on a different clock cycle. for example, during the first cycle of execution, we use the alu to compute pc+4. on the second cycle, we use the alu to precompute the target address of a branch.
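to make the re-use idea concrete, here is a rough python sketch, not the actual hardware. one `alu` function stands in for the single alu, and it gets called with different operands on the first two cycles. the names `alu`, `sign_extend16`, and `first_two_cycles` are made up for this illustration, and the branch target formula (updated pc plus the sign-extended 16-bit offset shifted left by 2) is an assumption based on a mips-style encoding.

```python
MASK32 = 0xFFFFFFFF  # keep every result within 32 bits


def alu(op, x, y):
    """the one shared alu; this sketch only models add and sub."""
    if op == "add":
        return (x + y) & MASK32
    if op == "sub":
        return (x - y) & MASK32
    raise ValueError(f"unsupported alu op: {op}")


def sign_extend16(value):
    """interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def first_two_cycles(pc, branch_offset):
    """use the same alu for two different jobs on two different cycles."""
    # cycle 1: the alu computes pc + 4
    pc = alu("add", pc, 4)
    # cycle 2: the same alu precomputes the branch target
    # (assumed mips-style: updated pc + (sign-extended offset << 2))
    offset_bytes = (sign_extend16(branch_offset) << 2) & MASK32
    target = alu("add", pc, offset_bytes)
    return pc, target


pc, target = first_two_cycles(0x1000, 3)
print(hex(pc), hex(target))  # 0x1004 0x1010
```

in a single-cycle design these two additions would need two separate adders, because both happen within the same clock cycle. here they happen on different cycles, so one unit is enough.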
we need the extra registers because data produced in earlier cycles is needed in later cycles. for example, we read the register file in the second cycle of execution, but we don't use the values we read until the third cycle. the extra registers allow us to remember values across clock cycles.
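the same point as a small python sketch: the a and b registers capture what the register file returned on cycle 2, so cycle 3 can work from the saved copies. the `Datapath` class and its method names are hypothetical, and the add operation on cycle 3 is just one example of what an instruction might do there.

```python
class Datapath:
    """toy model of the a, b, and aluout registers (illustrative only)."""

    def __init__(self, regfile):
        self.regfile = list(regfile)
        self.a = 0       # holds the first register operand
        self.b = 0       # holds the second register operand
        self.aluout = 0  # holds the alu result for a later cycle

    def cycle2_read_registers(self, rs, rt):
        # cycle 2: read the register file and latch the values into a and b
        self.a = self.regfile[rs]
        self.b = self.regfile[rt]

    def cycle3_execute_add(self):
        # cycle 3: the alu works from a and b, not from the register file
        self.aluout = (self.a + self.b) & 0xFFFFFFFF


dp = Datapath([0, 5, 7])
dp.cycle2_read_registers(1, 2)
dp.regfile[1] = 99        # the register file output is no longer relied on
dp.cycle3_execute_add()
print(dp.aluout)          # 12
```

without a and b, the value read on cycle 2 would have nowhere to live once the cycle ended.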
if i point to any component on the multi-cycle datapath, you should be able to tell me what it is and why we need it.