实现一个简易的五级流水的CPU,解决Hazard,并实现板极验证。
1 设计总览
2 实现原理
根据Top view 将整个CPU分为3个模块:
1 PCPU 主要模块:用从指令内存得到的指令进行运算处理,并从数据内存中写入或读取数据。这是最核心的部分,其他模块都很好设计。
2 Instruction_Mem 指令模块: 为PCPU模块提供指令,使得PCPU中通过指令寄存器中的值,找到指令的内存地址,从而读取指令。
3 Data_Mem 数据模块: PCPU从中读取数据或写入数据
板极验证:
用4个开关表示选择哪一个数据显示,用4个7段数码管显示数字
3 整个PCPU的工作流程:
五级流水:
注释:
IF : 从指令模块中,通过指令寄存器PC寻址,获得指令。
ID: 解码指令,根据操作码提取要用到的数据输入通用寄存器。
EX: ALU运算单元在指令的调度下,使用之前通用寄存器中的数据根据不同指令完成不同运算。运算后将结果输出到reg_C中,并设置各个标记位的值。
MEM:这里主要是针对LOAD,STORE这类的读写指令要用到的。对于其他指令没有特别要求。
WB:回写把计算结果写入指令的左值中,这里除了跳转指令和load,store指令,都是写回第一个操作数(寄存器)
4 需要实现的汇编指令
5 分阶段实现优化过程
阶段一 : 实现最基本的cpu(没有指令存储器,数据存储器,没有解决hazard,没有板极验证,什么也没有…)
完成步骤:
步骤1:理解流水线工作原理
上课时流水线是听懂了,寄存器赋值,取数这些基本原理大概弄懂了。不过实际操作还是有很多要注意的。
步骤2:开始逐级编写基本代码
有了老师给的样例代码,整个框架基本构建好了。所以要做的就是弄清楚每个指令在每一级流水中需要做些什么,然后在每一级流水中再总结这些操作的共同点,不同点,需要特别注意的地方。
1)指令操作码的设计:
优化:这里考虑了一下控制指令的特殊性,因为控制指令在流水每一级操作中有很大相似性,所以为了简化电路,还有判断时代码的简化,把所有控制指令的前两位设为“11”,而其他指令前两位都不是“11”。这样编写代码的时候更容易理清逻辑,综合的时候电路也简单一些。
2)指令各级流水的设计:
这里思考了很多,设计每一条指令都需要特别小心,最重要的是在ID和EX还有ALUo中。
总结一下,主要就是整个16 bits的指令在流水线的每一步中起到“指挥”作用,然后每一级流水的工作就是在上一级完成后,利用
A ID中的寄存器赋值:
reg_A 和 reg_B中存储的是在指令右边将要送入ALU参与运算的值。
由于指令的右边通常声明的是寄存器的编号(因为用户只能对通用寄存器进行操作,无法直接访问内存)或者立即数,所以要通过编号找到确定的寄存器,再将该通用寄存器里的值赋给reg_A 或者 reg_B。
优化: 这里为了减少reg_A 和 reg_B的翻转,考虑了一些优化。比如把要用到立即数的指令的立即数不写入寄存器,而是在下一级流水中之间从指令中取数。
B ex中ALUo的计算
主要就是用verilog本身的运算直接操作。运算后赋值个ALUo寄存器和cf标志位。
优化:比如一些用立即数的操作,之前没有存入reg_A 和 reg_B,直接从ex_ir中取数计算。
C mem中的读写
主要要和data模块进行数据传输,指令就是load和store。
D wb中写回运算结果
由于算数指令和逻辑运算指令基本都是要写回到寄存器。所以这里判断为:非跳转指令,非store,load指令,则写回第一个操作数(寄存器)。
6 源代码
1).v
`timescale 1ns / 1ps ////////////////////////////////////////////////////////////////////////////////// // Company: // Engineer: // // Create Date: 14:15:05 12/18/2014 // Design Name: // Module Name: CPU // Project Name: // Target Devices: // Tool versions: // Description: // // Dependencies: // // Revision: // Revision 0.01 - File Created // Additional Comments: // ////////////////////////////////////////////////////////////////////////////////// // data transfer & Arithmetic `define NOP 5'b00000 `define HALT 5'b00001 `define LOAD 5'b00010 `define STORE 5'b00011 `define LDIH 5'b10000 `define ADD 5'b01000 `define ADDI 5'b01001 `define ADDC 5'b10001 `define SUB 5'b01011 `define SUBI 5'b10011 `define SUBC 5'b10111 `define CMP 5'b01100 // control `define JUMP 5'b11000 `define JMPR 5'b11001 `define BZ 5'b11010 `define BNZ 5'b11011 `define BN 5'b11100 `define BNN 5'b11101 `define BC 5'b11110 `define BNC 5'b11111 // logic / shift `define AND 5'b01101 `define OR 5'b01111 `define XOR 5'b01110 `define SLL 5'b00100 `define SRL 5'b00110 `define SLA 5'b00101 `define SRA 5'b00111 // general register `define gr0 3'b000 `define gr1 3'b001 `define gr2 3'b010 `define gr3 3'b011 `define gr4 3'b100 `define gr5 3'b101 `define gr6 3'b110 `define gr7 3'b111 // FSM `define idle 1'b0 `define exec 1'b1 /******* the whole module CPU is made of Instuction_Mem module, PCPU module and Data_Mem module ********/ module CPU( input wire clk, clock, enable, reset, start, input wire[3:0] select_y, output [7:0] select_segment, output [3:0] select_bit ); wire[15:0] d_datain; wire[15:0] i_datain; wire[7:0] d_addr; wire[7:0] i_addr; wire[15:0] d_dataout; wire d_we; wire[15:0] y; reg [20:0] count = 21'b0; Instruction_Mem instruction(clock,reset,i_addr,i_datain); PCPU pcpu(clock, enable, reset, start, d_datain, i_datain, select_y, i_addr, d_addr, d_dataout, d_we, y); Data_memory data(clock, reset, d_addr, d_dataout, d_we, d_datain); Board_eval eval(clk, y, select_segment, select_bit); endmodule /************************ Instruction memeory module *****************************/ module Instruction_Mem ( input wire clock, reset, input wire[7:0] i_addr, output [15:0] i_datain ); reg[15:0] i_data[255:0]; // 8 bits pc address to get instructions reg[15:0] temp; always@(negedge clock) begin if(!reset) begin i_data[0] <= {`LOAD, `gr1, 1'b0, `gr0, 4'b0000}; i_data[1] <= {`LOAD, `gr2, 1'b0, `gr0, 4'b0001}; i_data[2] <= {`ADD, `gr3, 1'b0, `gr1, 1'b0, `gr2}; i_data[3] <= {`SUB, `gr3, 1'b0, `gr1, 1'b0, `gr2}; i_data[4] <= {`CMP, `gr3, 1'b0, `gr2, 1'b0, `gr1}; i_data[5] <= {`ADDC, `gr3, 1'b0, `gr1, 1'b0, `gr2}; i_data[6] <= {`SUBC, `gr3, 1'b0, `gr1, 1'b0, `gr2}; i_data[7] <= {`SLL, `gr2, 1'b0, `gr3, 1'b0, 3'b001}; i_data[8] <= {`SRL, `gr3, 1'b0, `gr1, 1'b0, 3'b001}; i_data[9] <= {`SLA, `gr4, 1'b0, `gr1, 1'b0, 3'b001}; i_data[10] <= {`SRA, `gr5, 1'b0, `gr1, 1'b0, 3'b001}; i_data[11] <= {`STORE, `gr3, 1'b0, `gr0, 4'b0010}; i_data[12] <= {`HALT, 11'b000_0000_0000}; end else begin temp = i_data[i_addr[7:0]]; end end assign i_datain = temp; endmodule /**************************** PCPU module ***************************/ module PCPU( input wire clock, enable, reset, start, input wire [15:0] d_datain, // output from Data_Mem module input wire [15:0] i_datain, // output from Instruction_Mem module input wire [3:0] select_y, // for the board evaluation output [7:0] i_addr, output [7:0] d_addr, output [15:0] d_dataout, output d_we, output [15:0] y ); reg [15:0] gr [7:0]; reg nf, zf, cf; reg state, next_state; reg dw; reg [7:0] pc; reg[15:0] y_forboard; reg [15:0] id_ir; reg [15:0] wb_ir; reg [15:0] ex_ir; reg [15:0] mem_ir; reg [15:0] smdr = 0; reg [15:0] smdr1 = 0; reg signed [15:0] reg_C1; //有符号 reg signed [15:0] reg_A; reg signed [15:0] reg_B; reg signed [15:0] reg_C; reg signed [15:0] ALUo; //************* CPU control *************// always @(posedge clock) begin if (!reset) state <= `idle; else state <= next_state; end always @(*) begin case (state) `idle : if ((enable == 1'b1) && (start == 1'b1)) next_state <= `exec; else next_state <= `idle; `exec : if ((enable == 1'b0) || (wb_ir[15:11] == `HALT)) next_state <= `idle; else next_state <= `exec; endcase end assign i_addr = pc; // 准备下一条指令的地址 //************* IF *************// always @(posedge clock or negedge reset) begin if (!reset) begin id_ir <= 16'b0; pc <= 8'b0; end else if (state ==`exec) // Stall happens in IF stage, always compare id_ir with i_datain to decide pc and id_ir begin // 当即将被执行的指令要用到之前load写入的值时, stall two stages , id and ex. /* 指令中后第二、三个操作数均为寄存器时,需要判断LOAD的第一个操作数是否与这些指令的后两个寄存器有冲突 为一部分算数运算指令和逻辑运算指令 */ if((i_datain[15:11] == `ADD ||i_datain[15:11] == `ADDC ||i_datain[15:11] == `SUB ||i_datain[15:11] == `SUBC ||i_datain[15:11] == `CMP ||i_datain[15:11] == `AND ||i_datain[15:11] == `OR ||i_datain[15:11] == `XOR) &&( (id_ir[15:11] == `LOAD && (id_ir[10:8] == i_datain[6:4] || id_ir[10:8] == i_datain[2:0])) ||(ex_ir[15:11] == `LOAD && (ex_ir[10:8] == i_datain[6:4] || ex_ir[10:8] == i_datain[2:0])) ) ) // end if begin id_ir <= 16'bx; pc <= pc; // hold pc end /* 指令中第二个操作数为寄存器变量并参与运算时,需要判断LOAD的第一个操作数是否与这些指令的第二个操作数的寄存器有冲突 为移位指令和STORE指令 */ else if (( i_datain[15:11] == `SLL ||i_datain[15:11] == `SRL ||i_datain[15:11] == `SLA ||i_datain[15:11] == `SRA ||i_datain[15:11] == `STORE) &&((id_ir[15:11] == `LOAD &&(id_ir[10:8] == i_datain[6:4])) ||(ex_ir[15:11] == `LOAD &&(ex_ir[10:8] == i_datain[6:4])) ) ) begin id_ir <= 16'bx; pc <= pc; // hold pc end /* 跳转指令系列,id和ex阶段都需要stall,mem阶段跳转 */ else if(id_ir[15:14] == 2'b11 || ex_ir[15:14] == 2'b11) begin id_ir <= 16'bx; pc <= pc; // hold pc end /* mem阶段跳转 */ else begin // BZ & BNZ if(((mem_ir[15:11] == `BZ) && (zf == 1'b1)) || ((mem_ir[15:11] == `BNZ) && (zf == 1'b0))) begin id_ir <= 16'bx; pc <= reg_C[7:0]; end // BN & BNN else if(((mem_ir[15:11] == `BN) && (nf == 1'b1)) || ((mem_ir[15:11] == `BNN) && (nf == 1'b0))) begin id_ir <= 16'bx; pc <= reg_C[7:0]; end // BC & BNC else if(((mem_ir[15:11] == `BC) && (cf == 1'b1)) || ((mem_ir[15:11] == `BNC) && (cf == 1'b0))) begin id_ir <= 16'bx; pc <= reg_C[7:0]; end // JUMP else if((mem_ir[15:11] == `JUMP) || (mem_ir[15:11] == `JMPR)) begin id_ir <= 16'bx; pc <= reg_C[7:0]; end // 非跳转指令且没有检测到冲突 else begin id_ir <= i_datain; pc <= pc + 1; end end // end else end // else reset end // end always //************* ID *************// always @(posedge clock or negedge reset) begin if (!reset) begin ex_ir <= 16'b0; reg_A <= 16'b0; reg_B <= 16'b0; smdr <= 16'b0; end else if (state == `exec) //Data forwarding happens in ID stage, always check id_ir to decide reg_A/B begin ex_ir <= id_ir; // ********************reg_A 赋值******************* // /* 其他无冲突的情况 */ // reg_A <= r1: 要用到 r1 参与运算的指令,即除 "JUMP" 外的控制指令和一些运算指令,将寄存器r1中的值赋给reg_A if ((id_ir[15:14] == 2'b11 && id_ir[15:11] != `JUMP) || (id_ir[15:11] == `LDIH) || (id_ir[15:11] == `ADDI) || (id_ir[15:11] == `SUBI)) reg_A <= gr[id_ir[10:8]]; else if (id_ir[15:11] == `LOAD) reg_A <= gr[id_ir[6:4]]; // case for data forwarding, 当前指令第2个操作数用到之前指令第1个操作数的结果 else if(id_ir[6:4] == ex_ir[10:8]) reg_A <= ALUo; else if(id_ir[6:4] == wb_ir[10:8]) reg_A <= reg_C1; else if(id_ir[6:4] == mem_ir[10:8]) reg_A <= reg_C; //reg_A <= r2: 如果运算中不用到 r1,要用到 r2, 则将 gr[r2] else reg_A <= gr[id_ir[6:4]]; //************************* reg_B赋值************************// if (id_ir[15:11] == `STORE) begin reg_B <= {12'b0000_0000_0000, id_ir[3:0]}; //value3 smdr <= gr[id_ir[10:8]]; // r1 end // case for data forwarding, 当前指令第3个操作数用到之前指令第1个操作数的结果 else if(id_ir[2:0] == ex_ir[10:8]) reg_B <= ALUo; else if(id_ir[2:0] == wb_ir[10:8]) reg_B <= reg_C1; else if(id_ir[2:0] == mem_ir[10:8]) reg_B <= reg_C; /* 其他无冲突的情况 */ else if ((id_ir[15:11] == `ADD) || (id_ir[15:11] == `ADDC) || (id_ir[15:11] == `SUB) || (id_ir[15:11] == `SUBC) || (id_ir[15:11] == `CMP) || (id_ir[15:11] == `AND) || (id_ir[15:11] == `OR) || (id_ir[15:11] == `XOR)) reg_B <= gr[id_ir[2:0]]; end end //************* ALUo *************// always @ (*) begin // {val2, val3} if (ex_ir[15:11] == `JUMP) ALUo <= {8'b0, ex_ir[7:0]}; // 跳转指令 r1 + {val2, val3} else if (ex_ir[15:14] == 2'b11) ALUo <= reg_A + {8'b0, ex_ir[7:0]}; //算数运算,逻辑运算,计算结果到ALUo, 并计算cf标志位 else begin case(ex_ir[15:11]) `LOAD: ALUo <= reg_A + {12'b0000_0000_0000, ex_ir[3:0]}; `STORE: ALUo <= reg_A + reg_B; `LDIH: {cf, ALUo} <= reg_A + { ex_ir[7:0], 8'b0 }; `ADD: {cf, ALUo} <= reg_A + reg_B; `ADDI:{cf, ALUo} <= reg_A + { 8'b0, ex_ir[7:0] }; `ADDC: {cf, ALUo} <= reg_A + reg_B + cf; `SUB: {cf, ALUo} <= {{1'b0, reg_A} - reg_B}; `SUBI: {cf, ALUo} <= {1'b0, reg_A }- { 8'b0, ex_ir[7:0] }; `SUBC:{cf, ALUo} <= {{1'b0, reg_A} - reg_B - cf}; `CMP: {cf, ALUo} <= {{1'b0, reg_A} - reg_B}; `AND: {cf, ALUo} <= {1'b0, reg_A & reg_B}; `OR: {cf, ALUo} <= {1'b0, reg_A | reg_B}; `XOR: {cf, ALUo} <= {1'b0, reg_A ^ reg_B}; `SLL: {cf, ALUo} <= {reg_A[4'b1111 - ex_ir[3:0]], reg_A << ex_ir[3:0]}; `SRL: {cf, ALUo} <= {reg_A[ex_ir[3:0] - 4'b0001], reg_A >> ex_ir[3:0]}; `SLA: {cf, ALUo} <= {reg_A[ex_ir[3:0] - 4'b0001], reg_A <<< ex_ir[3:0]}; `SRA: {cf, ALUo} <= {reg_A[4'b1111 - ex_ir[3:0]], reg_A >>> ex_ir[3:0]}; default: begin end endcase end end //************* EX *************// always @(posedge clock or negedge reset) begin if (!reset) begin mem_ir <= 16'b0; reg_C <= 16'b0; dw <= 0; nf <= 0; zf <= 0; smdr1 <= 16'b0; end else if (state == `exec) begin mem_ir <= ex_ir; reg_C <= ALUo; if (ex_ir[15:11] == `STORE) begin dw <= 1'b1; smdr1 <= smdr; end // 设置标志位zf, nf, 算数和逻辑运算 else if(ex_ir[15:14] != 2'b11 && ex_ir[15:11] != `LOAD) begin zf <= (ALUo == 0)? 1:0; nf <= (ALUo[15] == 1'b1)? 1:0; dw <= 1'b0; end else dw <= 1'b0; end end // PCPU module 的输出 assign d_dataout = smdr1; assign d_we = dw; assign d_addr = reg_C[7:0]; //************* MEM *************// always @(posedge clock or negedge reset) begin if (!reset) begin wb_ir <= 16'b0; reg_C1 <= 16'b0; end else if (state == `exec) begin wb_ir <= mem_ir; if (mem_ir[15:11] == `LOAD) reg_C1 <= d_datain; else if(mem_ir[15:14] != 2'b11) reg_C1 <= reg_C; end end //************* WB *************// always @(posedge clock or negedge reset) begin if (!reset) begin gr[0] <= 16'b0; gr[1] <= 16'b0; gr[2] <= 16'b0; gr[3] <= 16'b0; gr[4] <= 16'b0; gr[5] <= 16'b0; gr[6] <= 16'b0; gr[7] <= 16'b0; end else if (state == `exec) begin // 回写到 r1 if ((wb_ir[15:14] != 2'b11) &&(wb_ir[15:11] != `STORE) &&(wb_ir[15:11] != `CMP) ) gr[wb_ir[10:8]] <= reg_C1; end end // 板极验证 assign y = y_forboard; // 板极验证需要的输出 always @(select_y) begin case(select_y) 4'b0000: y_forboard <= {8'B0,pc}; 4'b0001: y_forboard <= id_ir; 4'b0010: y_forboard <= reg_A; 4'b0011: y_forboard <= reg_B; 4'b0100: y_forboard <= smdr; 4'b0101: y_forboard <= ALUo; 4'b0110: y_forboard <= {15'b0, cf}; 4'b0111: y_forboard <= {15'b0, nf}; 4'b1000: y_forboard <= reg_C; 4'b1001: y_forboard <= reg_C1; 4'b1010: y_forboard <= gr[0]; 4'b1011: y_forboard <= gr[1]; 4'b1100: y_forboard <= gr[2]; 4'b1101: y_forboard<= gr[3]; 4'b1110: y_forboard <= gr[4]; 4'b1111: y_forboard <= gr[5]; endcase end endmodule /**************************** Data memory module ******************************/ module Data_memory ( input wire clock, reset, input wire [7:0] d_addr, input wire [15:0] d_dataout, input wire d_we, output [15:0] d_datain ); reg[15:0] temp; reg[15:0] d_data[255:0]; always@(negedge clock) begin if(!reset) begin d_data[0] <= 16'hFc00; d_data[1] <= 16'h00AB; end else if(d_we) begin d_data[d_addr] <= d_dataout; end else begin temp = d_data[d_addr]; end end assign d_datain = temp; endmodule /**************************** Board evaluation module ******************************/ module Board_eval ( input wire clock, input wire [15:0] y, output reg [7:0] select_segment, output reg [3:0] select_bit ); parameter SEG_NUM0 = 8'b00000011, SEG_NUM1 = 8'b10011111, SEG_NUM2 = 8'b00100101, SEG_NUM3 = 8'b00001101, SEG_NUM4 = 8'b10011001, SEG_NUM5 = 8'b01001001, SEG_NUM6 = 8'b01000001, SEG_NUM7 = 8'b00011111, SEG_NUM8 = 8'b00000001, SEG_NUM9 = 8'b00001001, SEG_A = 8'b00010001, SEG_B = 8'b11000001, SEG_C = 8'b01100011, SEG_D = 8'b10000101, SEG_E = 8'b01100001, SEG_F = 8'b01110001; // 位选 parameter BIT_3 = 4'b0111, BIT_2 = 4'b1011, BIT_1 = 4'b1101, BIT_0 = 4'b1110; reg [20:0] count = 0; always @ (posedge clock) begin count <= count + 1'b1; end always @ (posedge clock) begin case(count[19:18]) 2'b00: begin select_bit <= BIT_3; case(y[15:12]) 4'b0000: select_segment <= SEG_NUM0; 4'b0001: select_segment <= SEG_NUM1; 4'b0010: select_segment <= SEG_NUM2; 4'b0011: select_segment <= SEG_NUM3; 4'b0100: select_segment <= SEG_NUM4; 4'b0101: select_segment <= SEG_NUM5; 4'b0110: select_segment <= SEG_NUM6; 4'b0111: select_segment <= SEG_NUM7; 4'b1000: select_segment <= SEG_NUM8; 4'b1001: select_segment <= SEG_NUM9; 4'b1010: select_segment <= SEG_A; 4'b1011: select_segment <= SEG_B; 4'b1100: select_segment <= SEG_C; 4'b1101: select_segment <= SEG_D; 4'b1110: select_segment <= SEG_E; 4'b1111: select_segment <= SEG_F; endcase end 2'b01: begin select_bit <= BIT_2; case(y[11:8]) 4'b0000: select_segment <= SEG_NUM0; 4'b0001: select_segment <= SEG_NUM1; 4'b0010: select_segment <= SEG_NUM2; 4'b0011: select_segment <= SEG_NUM3; 4'b0100: select_segment <= SEG_NUM4; 4'b0101: select_segment <= SEG_NUM5; 4'b0110: select_segment <= SEG_NUM6; 4'b0111: select_segment <= SEG_NUM7; 4'b1000: select_segment <= SEG_NUM8; 4'b1001: select_segment <= SEG_NUM9; 4'b1010: select_segment <= SEG_A; 4'b1011: select_segment <= SEG_B; 4'b1100: select_segment <= SEG_C; 4'b1101: select_segment <= SEG_D; 4'b1110: select_segment <= SEG_E; 4'b1111: select_segment <= SEG_F; endcase end 2'b10: begin select_bit <= BIT_1; case(y[7:4]) 4'b0000: select_segment <= SEG_NUM0; 4'b0001: select_segment <= SEG_NUM1; 4'b0010: select_segment <= SEG_NUM2; 4'b0011: select_segment <= SEG_NUM3; 4'b0100: select_segment <= SEG_NUM4; 4'b0101: select_segment <= SEG_NUM5; 4'b0110: select_segment <= SEG_NUM6; 4'b0111: select_segment <= SEG_NUM7; 4'b1000: select_segment <= SEG_NUM8; 4'b1001: select_segment <= SEG_NUM9; 4'b1010: select_segment <= SEG_A; 4'b1011: select_segment <= SEG_B; 4'b1100: select_segment <= SEG_C; 4'b1101: select_segment <= SEG_D; 4'b1110: select_segment <= SEG_E; 4'b1111: select_segment <= SEG_F; endcase end 2'b11: begin select_bit <= BIT_0; case(y[3:0]) 4'b0000: select_segment <= SEG_NUM0; 4'b0001: select_segment <= SEG_NUM1; 4'b0010: select_segment <= SEG_NUM2; 4'b0011: select_segment <= SEG_NUM3; 4'b0100: select_segment <= SEG_NUM4; 4'b0101: select_segment <= SEG_NUM5; 4'b0110: select_segment <= SEG_NUM6; 4'b0111: select_segment <= SEG_NUM7; 4'b1000: select_segment <= SEG_NUM8; 4'b1001: select_segment <= SEG_NUM9; 4'b1010: select_segment <= SEG_A; 4'b1011: select_segment <= SEG_B; 4'b1100: select_segment <= SEG_C; 4'b1101: select_segment <= SEG_D; 4'b1110: select_segment <= SEG_E; 4'b1111: select_segment <= SEG_F; endcase end endcase end endmodule
2).test
`timescale 1ns / 1ps //////////////////////////////////////////////////////////////////////////////// // Company: // Engineer: // // Create Date: 21:22:32 12/29/2014 // Design Name: CPU // Module Name: C:/Users/liang/Desktop/embed/CPU/CPU/CPUTest.v // Project Name: CPU // Target Device: // Tool versions: // Description: // // Verilog Test Fixture created by ISE for module: CPU // // Dependencies: // // Revision: // Revision 0.01 - File Created // Additional Comments: // //////////////////////////////////////////////////////////////////////////////// module CPU_test; // Inputs reg clock; reg enable; reg reset; reg [3:0] select_y; reg start; // Outputs wire [15:0] y; // Instantiate the Unit Under Test (UUT) CPU cpu ( .clock(clock), .enable(enable), .reset(reset), .start(start), .select_y(select_y) ); initial begin // Initialize Inputs clock = 0; enable = 0; reset = 0; select_y = 0; start = 0; // Wait 100 ns for global reset to finish #100; forever begin #5 clock <= ~clock; end // Add stimulus here end initial begin // Wait 100 ns for global reset to finish #100; $display("pc: id_ir : ex_ir :reg_A: reg_B: reg_C: cf: nf: zf: regC1: gr1: gr2: gr3: gr4: gr5:"); $monitor("%h: %b: %b: %h: %h: %h: %h: %h: %h: %h: %h: %h: %h: %h: %h", cpu.pcpu.pc, cpu.pcpu.id_ir, cpu.pcpu.ex_ir, cpu.pcpu.reg_A, cpu.pcpu.reg_B, cpu.pcpu.reg_C, cpu.pcpu.cf, cpu.pcpu.nf, cpu.pcpu.zf, cpu.pcpu.reg_C1, cpu.pcpu.gr[1], cpu.pcpu.gr[2], cpu.pcpu.gr[3], cpu.pcpu.gr[4], cpu.pcpu.gr[5]); enable <= 1; start <= 0; select_y <= 0; #10 reset <= 0; #10 reset <= 1; #10 enable <= 1; #10 start <=1; #10 start <= 0; end endmodule
3).ucf
NET "select_segment[7]" LOC = "T17"; NET "select_segment[6]" LOC = "T18"; NET "select_segment[5]" LOC = "U17"; NET "select_segment[4]" LOC = "U18"; NET "select_segment[3]" LOC = "M14"; NET "select_segment[2]" LOC = "N14"; NET "select_segment[1]" LOC = "L14"; NET "select_segment[0]" LOC = "M13"; NET "select_bit[0]" LOC = "N16"; NET "select_bit[1]" LOC = "N15"; NET "select_bit[2]" LOC = "P18"; NET "select_bit[3]" LOC = "P17"; NET "clk" LOC = "V10"; NET "clock" LOC = "B8"; NET "clock" CLOCK_DEDICATED_ROUTE = FALSE; NET "reset" LOC = "T10"; NET "enable" LOC = "T9"; NET "start" LOC = "V9"; NET "select_y[3]" LOC = "T5"; NET "select_y[2]" LOC = "V8"; NET "select_y[1]" LOC = "U8"; NET "select_y[0]" LOC = "N8";
7 测试用的指令:
i_data[0] <= {`LOAD, `gr1, 1'b0, `gr0, 4'b0000};
i_data[1] <= {`LOAD, `gr2, 1'b0, `gr0, 4'b0001};
i_data[2] <= {`ADD, `gr3, 1'b0, `gr1, 1'b0, `gr2};
i_data[3] <= {`SUB, `gr3, 1'b0, `gr1, 1'b0, `gr2};
i_data[4] <= {`CMP, `gr3, 1'b0, `gr2, 1'b0, `gr1};
i_data[5] <= {`ADDC, `gr3, 1'b0, `gr1, 1'b0, `gr2};
i_data[6] <= {`SUBC, `gr3, 1'b0, `gr1, 1'b0, `gr2};
i_data[7] <= {`SLL, `gr2, 1'b0, `gr3, 1'b0, 3'b001};
i_data[8] <= {`SRL, `gr3, 1'b0, `gr1, 1'b0, 3'b001};
i_data[9] <= {`SLA, `gr4, 1'b0, `gr1, 1'b0, 3'b001};
i_data[10] <= {`SRA, `gr5, 1'b0, `gr1, 1'b0, 3'b001};
i_data[11] <= {`STORE, `gr3, 1'b0, `gr0, 4'b0010};
i_data[12] <= {`HALT, 11'b000_0000_0000};
pc: id_ir : ex_ir : reg_A reg_B reg_C cf nf zf regC1: gr1: gr2: gr3: gr4: gr5:
00: 0000000000000000: 0000000000000000: 0000: 0000: 0000: x: 0: 0: 0000: 0000: 0000: 0000: 0000: 0000
//以下两个指令是LOAD,将Data Memory中 第一个和第二个数取出,存储在gr1和gr2中
01: 0001000100000000: 0000000000000000: xxxx: xxxx: xxxx: x: x: x: 0000: 0000: 0000: 0000: 0000: 0000
02: 0001001000000001: 0001000100000000: 0000: xxxx: xxxx: x: x: x: xxxx: 0000: 0000: 0000: 0000: 0000
// Stall
02: xxxxxxxxxxxxxxxx: 0001001000000001: 0000: 0000: 0000: x: x: x: xxxx: 0000: 0000: 0000: 0000: 0000
02: xxxxxxxxxxxxxxxx: xxxxxxxxxxxxxxxx: xxxx: 0000: 0001: x: x: x: fc00: 0000: 0000: 0000: 0000: 0000
//开始ADD指令,gr3 <= gr1 + gr2
03: 0100001100010010: xxxxxxxxxxxxxxxx: xxxx: 0000: 0001: x: x: x: 00ab: fc00: 0000: 0000: 0000: 0000
//开始SUB指令
04: 0101101100010010: 0100001100010010: fc00: 00ab: 0001: 1: x: x: 00ab: fc00: 00ab: 0000: 0000: 0000、
//开始CMP指令 经过两级流水,此时ADD指令已经把加法运算结果写入到了reg_C ( gr1(fc00) – gr2(00ab))
05: 0110001100100001: 0101101100010010: fc00: 00ab: fcab: 0: 1: 0: 00ab: fc00: 00ab: 0000: 0000: 0000
//开始ADDC指令 经过两级流水,此时SUB指令已经把减法运算结果写入到了reg_C ( gr1(fc00 )- gr2(00ab)),有了借位,cf标志位为1
06: 1000101100010010: 0110001100100001: 00ab: fc00: fb55: 1: 1: 0: fcab: fc00: 00ab: 0000: 0000: 0000
//开始SUBC指令 经过两级流水,此时CMP指令已经把减法运算结果写入到了reg_C (gr2(00ab) – gr1(fc00)),00ab < fc00,cf标志位为0
07: 1011101100010010: 1000101100010010: fc00: 00ab: 04ab: 0: 0: 0: fb55: fc00: 00ab: fcab: 0000: 0000
//开始SLL指令 经过两级流水,此时ADDC指令已经把加法运算结果写入到了reg_C (gr1(fc00) + gr2(00ab) + CF(0))
08: 0010001000110001: 1011101100010010: fc00: 00ab: fcab: 0: 1: 0: 04ab: fc00: 00ab: fb55: 0000: 0000
//开始SRL指令 经过两级流水,此时SUBC指令已经把减法运算结果写入到了reg_C (gr1(fc00) - gr2(00ab) - CF(0))
09: 0011001100010001: 0010001000110001: fb55: 00ab: fb55: 1: 1: 0: fcab: fc00: 00ab: fb55: 0000: 0000
//开始SLA指令 经过两级流水,此时SLL指令已经把移位运算结果写入到了reg_C (gr3(fb55) 左移 1位)
0a: 0010110000010001: 0011001100010001: fc00: 00ab: f6aa: 0: 1: 0: fb55: fc00: 00ab: fcab: 0000: 0000
//开始SRA指令 经过两级流水,此时SRL指令已经把移位运算结果写入到了reg_C (gr1(fc00) 右移 1位)
0b: 0011110100010001: 0010110000010001: fc00: 00ab: 7e00: 0: 0: 0: f6aa: fc00: 00ab: fb55: 0000: 0000
//开始STORE指令 经过两级流水,此时SLA指令已经把移位运算结果写入到了reg_C (gr1(fc00) 算术左移 1位)
0c: 0001101100000010: 0011110100010001: fc00: 00ab: f800: 1: 1: 0: 7e00: fc00: f6aa: fb55: 0000: 0000
//开始HALT指令 经过两级流水,此时SRA指令已经把移位运算结果写入到了reg_C (gr1(fc00) 算术右移 1位)
0d: 0000100000000000: 0001101100000010: xxxx: 0002: fe00: 1: 1: 0: f800: fc00: f6aa: 7e00: 0000: 0000
0e: xxxxxxxxxxxxxxxx: 0000100000000000: xxxx: 0002: xxxx: 1: 1: 0: fe00: fc00: f6aa: 7e00: f800: 0000
0f: xxxxxxxxxxxxxxxx: xxxxxxxxxxxxxxxx: xxxx: 0002: xxxx: 1: x: x: xxxx: fc00: f6aa: 7e00: f800: fe00
10: xxxxxxxxxxxxxxxx: xxxxxxxxxxxxxxxx: xxxx: 0002: xxxx: 1: x: x: xxxx: fc00: f6aa: 7e00: f800: fe00
11: xxxxxxxxxxxxxxxx: xxxxxxxxxxxxxxxx: xxxx: 0002: xxxx: 1: x: x: xxxx: fc00: f6aa: 7e00: f800: fe00
以上以reg_C举例说明验证,其他寄存器验证相应的前移或后移一级流水线即可。