OpenRisc-66-基于ORPSoC对linux进行RTL仿真

引言

前面,我们介绍过对裸机程序进行RTL仿真,那些裸机程序规模比较小,只有几KB大小。

另外,我们也已经实现了针对O_board的SoC进行了RTL仿真(http://blog.csdn.net/rill_zhen/article/details/21190757),本小节,我们将实现在ML501平台上对linux进行RTL仿真。


1,DDR2仿真模型的修改

针对ML501的ORPSoC工程中,默认配置的DDR2的仿真模型与实际板子上使用的DDR2 SDRAM的参数不一致,我们要进行修改。

a,实际内存参数

要想对DDR2 SDRAM的仿真模型进行修改,我们首先要弄明白几个概念。

RANK,BANK,row,,column。这几个都是逻辑上的概念。

此外还有channel,module,chip,device等物理上的概念。

对于ML501使用的DDR2 SDRAM来说,其具体参数如下所示:

通过查看内存条,我们可以看到如下内容:MT4HTF3264HY-667F1     1RX16  256MB PC-5300S,

其中3263是指内存条的organization:32Megx64,x64表示整个内存条的数据线(DQ)宽度是64bit。

667表示内存条的speed grade。PC-5300也是speed grade。

1RX16表示内存条上面的4个device,每个数据宽度是16,16X4正好是64bit。

256MB,毫无疑问,表示内存条的容量是256M bytes。


通过内存条上面的标示,我们就可以获得很多信息,此外,通过查看其数据手册,我们会得到更详细的参数:

OpenRisc-66-基于ORPSoC对linux进行RTL仿真_第1张图片

OpenRisc-66-基于ORPSoC对linux进行RTL仿真_第2张图片

RANK:是single rank。

BANK:BA是2bit,说明bank数量是4,每个bank的大小是256MB/4=64MB。

row:宽度是[12:0],一共13bit。

column:宽度是[9:0],一共10bit。



b,仿真模型参数

确定了我们实际使用的内存条的参数之后,我们就可以修改仿真模型的具体参数了。

需要注意的是ddr2_model.v只是一个timing model,具体的storage,需要我们自己根据实际情况来定。

这里需要修改的是MEM_BITS,由于ddr2_model.v是一个device的仿真模型,每个device中包含4个四分之一的bank,共64MB,所以对于如下定义:


    // Memory Storage
`ifdef MAX_MEM
    reg     [BL_MAX*DQ_BITS-1:0] memory  [0:`MAX_SIZE-1];
`else//     [8      *  16   -1:0]        [0:(1<<22) -1]==>26bit==>64MB
    reg     [BL_MAX*DQ_BITS-1:0] memory  [0:`MEM_SIZE-1];
    reg     [`MAX_BITS-1:0]      address [0:`MEM_SIZE-1];
    reg     [MEM_BITS:0]         memory_index;
    reg     [MEM_BITS:0]         memory_used;
`endif

我们需要定义MEM_BITS为22,如下所示:

OpenRisc-66-基于ORPSoC对linux进行RTL仿真_第3张图片


完整的参数,如下所示:


/****************************************************************************************
*
*   Disclaimer   This software code and all associated documentation, comments or other 
*  of Warranty:  information (collectively "Software") is provided "AS IS" without 
*                warranty of any kind. MICRON TECHNOLOGY, INC. ("MTI") EXPRESSLY 
*                DISCLAIMS ALL WARRANTIES EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED 
*                TO, NONINFRINGEMENT OF THIRD PARTY RIGHTS, AND ANY IMPLIED WARRANTIES 
*                OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. MTI DOES NOT 
*                WARRANT THAT THE SOFTWARE WILL MEET YOUR REQUIREMENTS, OR THAT THE 
*                OPERATION OF THE SOFTWARE WILL BE UNINTERRUPTED OR ERROR-FREE. 
*                FURTHERMORE, MTI DOES NOT MAKE ANY REPRESENTATIONS REGARDING THE USE OR 
*                THE RESULTS OF THE USE OF THE SOFTWARE IN TERMS OF ITS CORRECTNESS, 
*                ACCURACY, RELIABILITY, OR OTHERWISE. THE ENTIRE RISK ARISING OUT OF USE 
*                OR PERFORMANCE OF THE SOFTWARE REMAINS WITH YOU. IN NO EVENT SHALL MTI, 
*                ITS AFFILIATED COMPANIES OR THEIR SUPPLIERS BE LIABLE FOR ANY DIRECT, 
*                INDIRECT, CONSEQUENTIAL, INCIDENTAL, OR SPECIAL DAMAGES (INCLUDING, 
*                WITHOUT LIMITATION, DAMAGES FOR LOSS OF PROFITS, BUSINESS INTERRUPTION, 
*                OR LOSS OF INFORMATION) ARISING OUT OF YOUR USE OF OR INABILITY TO USE 
*                THE SOFTWARE, EVEN IF MTI HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH 
*                DAMAGES. Because some jurisdictions prohibit the exclusion or 
*                limitation of liability for consequential or incidental damages, the 
*                above limitation may not apply to you.
*
*                Copyright 2003 Micron Technology, Inc. All rights reserved.
*
****************************************************************************************/

    // Parameters current with 512Mb datasheet rev N

    // Timing parameters based on Speed Grade

                                          // SYMBOL UNITS DESCRIPTION
`define sg37E 
`define x16
//`define MAX_MEM

`ifdef sg37E
    parameter TCK_MIN          =    3750; // tCK    ps    Minimum Clock Cycle Time
    parameter TJIT_PER         =     125; // tJIT(per)  ps Period JItter
    parameter TJIT_DUTY        =     125; // tJIT(duty) ps Half Period Jitter
    parameter TJIT_CC          =     250; // tJIT(cc)   ps Cycle to Cycle jitter
    parameter TERR_2PER        =     175; // tERR(nper) ps Accumulated Error (2-cycle)
    parameter TERR_3PER        =     225; // tERR(nper) ps Accumulated Error (3-cycle)
    parameter TERR_4PER        =     250; // tERR(nper) ps Accumulated Error (4-cycle)
    parameter TERR_5PER        =     250; // tERR(nper) ps Accumulated Error (5-cycle)
    parameter TERR_N1PER       =     350; // tERR(nper) ps Accumulated Error (6-10-cycle)
    parameter TERR_N2PER       =     450; // tERR(nper) ps Accumulated Error (11-50-cycle)
    parameter TQHS             =     400; // tQHS   ps    Data hold skew factor
    parameter TAC              =     500; // tAC    ps    DQ output access time from CK/CK#
    parameter TDS              =     100; // tDS    ps    DQ and DM input setup time relative to DQS
    parameter TDH              =     225; // tDH    ps    DQ and DM input hold time relative to DQS
    parameter TDQSCK           =     450; // tDQSCK ps    DQS output access time from CK/CK#
    parameter TDQSQ            =     300; // tDQSQ  ps    DQS-DQ skew, DQS to last DQ valid, per group, per access
    parameter TIS              =     250; // tIS    ps    Input Setup Time
    parameter TIH              =     375; // tIH    ps    Input Hold Time
    parameter TRC              =   55000; // tRC    ps    Active to Active/Auto Refresh command time
    parameter TRCD             =   15000; // tRCD   ps    Active to Read/Write command time
    parameter TWTR             =    7500; // tWTR   ps    Write to Read command delay
    parameter TRP              =   15000; // tRP    ps    Precharge command period
    parameter TRPA             =   15000; // tRPA   ps    Precharge All period
    parameter TXARDS           =       6; // tXARDS tCK   Exit low power active power down to a read command
    parameter TXARD            =       2; // tXARD  tCK   Exit active power down to a read command
    parameter TXP              =       2; // tXP    tCK   Exit power down to a non-read command
    parameter TANPD            =       3; // tANPD  tCK   ODT to power-down entry latency
    parameter TAXPD            =       8; // tAXPD  tCK   ODT power-down exit latency
    parameter CL_TIME          =   15000; // CL     ps    Minimum CAS Latency
`endif                                          // ------ ----- -----------


`ifdef x16
  `ifdef sg37E
    parameter TFAW             =   50000; // tFAW  ps     Four Bank Activate window
  `endif
`endif

    // Timing Parameters

    // Mode Register
    parameter AL_MIN           =       0; // AL     tCK   Minimum Additive Latency
    parameter AL_MAX           =       6; // AL     tCK   Maximum Additive Latency
    parameter CL_MIN           =       3; // CL     tCK   Minimum CAS Latency
    parameter CL_MAX           =       7; // CL     tCK   Maximum CAS Latency
    parameter WR_MIN           =       2; // WR     tCK   Minimum Write Recovery
    parameter WR_MAX           =       8; // WR     tCK   Maximum Write Recovery
    parameter BL_MIN           =       4; // BL     tCK   Minimum Burst Length
    parameter BL_MAX           =       8; // BL     tCK   Minimum Burst Length
    // Clock
    parameter TCK_MAX          =    8000; // tCK    ps    Maximum Clock Cycle Time
    parameter TCH_MIN          =    0.48; // tCH    tCK   Minimum Clock High-Level Pulse Width
    parameter TCH_MAX          =    0.52; // tCH    tCK   Maximum Clock High-Level Pulse Width
    parameter TCL_MIN          =    0.48; // tCL    tCK   Minimum Clock Low-Level Pulse Width
    parameter TCL_MAX          =    0.52; // tCL    tCK   Maximum Clock Low-Level Pulse Width
    // Data
    parameter TLZ              =     TAC; // tLZ    ps    Data-out low-impedance window from CK/CK#
    parameter THZ              =     TAC; // tHZ    ps    Data-out high impedance window from CK/CK#
    parameter TDIPW            =    0.35; // tDIPW  tCK   DQ and DM input Pulse Width
    // Data Strobe
    parameter TDQSH            =    0.35; // tDQSH  tCK   DQS input High Pulse Width
    parameter TDQSL            =    0.35; // tDQSL  tCK   DQS input Low Pulse Width
    parameter TDSS             =    0.20; // tDSS   tCK   DQS falling edge to CLK rising (setup time)
    parameter TDSH             =    0.20; // tDSH   tCK   DQS falling edge from CLK rising (hold time)
    parameter TWPRE            =    0.35; // tWPRE  tCK   DQS Write Preamble
    parameter TWPST            =    0.40; // tWPST  tCK   DQS Write Postamble
    parameter TDQSS            =    0.25; // tDQSS  tCK   Rising clock edge to DQS/DQS# latching transition
    // Command and Address
    parameter TIPW             =     0.6; // tIPW   tCK   Control and Address input Pulse Width  
    parameter TCCD             =       2; // tCCD   tCK   Cas to Cas command delay
    parameter TRAS_MIN         =   40000; // tRAS   ps    Minimum Active to Precharge command time
    parameter TRAS_MAX         =70000000; // tRAS   ps    Maximum Active to Precharge command time
    parameter TRTP             =    7500; // tRTP   ps    Read to Precharge command delay
    parameter TWR              =   15000; // tWR    ps    Write recovery time
    parameter TMRD             =       2; // tMRD   tCK   Load Mode Register command cycle time
    parameter TDLLK            =     200; // tDLLK  tCK   DLL locking time
    // Refresh
    parameter TRFC_MIN         =  105000; // tRFC   ps    Refresh to Refresh Command interval minimum value
    parameter TRFC_MAX         =70000000; // tRFC   ps    Refresh to Refresh Command Interval maximum value
    // Self Refresh
    parameter TXSNR   = TRFC_MIN + 10000; // tXSNR  ps    Exit self refesh to a non-read command
    parameter TXSRD            =     200; // tXSRD  tCK   Exit self refresh to a read command
    parameter TISXR            =     TIS; // tISXR  ps    CKE setup time during self refresh exit.
    // ODT
    parameter TAOND            =       2; // tAOND  tCK   ODT turn-on delay
    parameter TAOFD            =     2.5; // tAOFD  tCK   ODT turn-off delay
    parameter TAONPD           =    2000; // tAONPD ps    ODT turn-on (precharge power-down mode)
    parameter TAOFPD           =    2000; // tAOFPD ps    ODT turn-off (precharge power-down mode)
    parameter TMOD             =   12000; // tMOD   ps    ODT enable in EMR to ODT pin transition
    // Power Down
    parameter TCKE             =       3; // tCKE   tCK   CKE minimum high or low pulse width

    // Size Parameters based on Part Width

`ifdef x16
    parameter ADDR_BITS        =      13; // Address Bits
    parameter ROW_BITS         =      13; // Number of Address bits
    parameter COL_BITS         =      10; // Number of Column bits
    parameter DM_BITS          =       2; // Number of Data Mask bits
    parameter DQ_BITS          =      16; // Number of Data bits
    parameter DQS_BITS         =       2; // Number of Dqs bits
    parameter TRRD             =   10000; // tRRD   Active bank a to Active bank b command time
`endif

`ifdef QUAD_RANK
    `define DUAL_RANK // also define DUAL_RANK
    parameter CS_BITS          =       4; // Number of Chip Select Bits
    parameter RANKS            =       4; // Number of Chip Select Bits
`else `ifdef DUAL_RANK
    parameter CS_BITS          =       2; // Number of Chip Select Bits
    parameter RANKS            =       2; // Number of Chip Select Bits
`else
    parameter CS_BITS          =       2; // Number of Chip Select Bits
    parameter RANKS            =       1; // Number of Chip Select Bits
`endif `endif

    // Size Parameters
    parameter BA_BITS          =       2; // Set this parmaeter to control how many Bank Address bits
// if MEM_BITS== 14, a DQ=16 each part, DQ=64 total (4 parts) => 1MB total (256KB each)
// if MEM_BITS== 15, a DQ=16 each part, DQ=64 total (4 parts) => 2MB total (512KB each)
// if MEM_BITS== 16, a DQ=16 each part, DQ=64 total (4 parts) => 4MB total (1MB each)
// if MEM_BITS== 17, a DQ=16 each part, DQ=64 total (4 parts) => 8MB total (2MB each)
//parameter MEM_BITS         =      14; // Number of write data bursts can be stored in memory.  The default is 2^10=1024.
   parameter MEM_BITS         =      22; // Number of write data bursts can be stored in memory.  //256MB total(64MB each),Rill modify from 17 to 22 140410
    parameter AP               =      10; // the address bit that controls auto-precharge and precharge-all
    parameter BL_BITS          =       3; // the number of bits required to count to MAX_BL
    parameter BO_BITS          =       2; // the number of Burst Order Bits

    // Simulation parameters
    parameter STOP_ON_ERROR    =       1; // If set to 1, the model will halt on command sequence/major errors
    parameter DEBUG            =       0; // Turn on Debug messages
    parameter BUS_DELAY        =       0; // delay in nanoseconds
    parameter RANDOM_OUT_DELAY =       0; // If set to 1, the model will put a random amount of delay on DQ/DQS during reads
    parameter RANDOM_SEED      = 711689044; //seed value for random generator.

    parameter RDQSEN_PRE       =       2; // DQS driving time prior to first read strobe
    parameter RDQSEN_PST       =       1; // DQS driving time after last read strobe
    parameter RDQS_PRE         =       2; // DQS low time prior to first read strobe
    parameter RDQS_PST         =       1; // DQS low time after last valid read strobe
    parameter RDQEN_PRE        =       0; // DQ/DM driving time prior to first read data
    parameter RDQEN_PST        =       0; // DQ/DM driving time after last read data
    parameter WDQS_PRE         =       1; // DQS half clock periods prior to first write strobe
    parameter WDQS_PST         =       1; // DQS half clock periods after last valid write strobe




c,preload的修改

目前,我们已经建立的和实际硬件一致的仿真模型,但是我们在仿真前,要把linux的镜像实现load到仿真模型中才行,这就需要了解DDR2 SDRAM的内部组织结构,了解BL_MAX,BL_BITS,DQ_BITS等参数的具体含义,了解DDR2 SDRAM的读写过程和时序。这些内容请参考《memory system - cache dram disk》一书。这里不再赘述。

对于仿真linux而言,由于编译时指定的内存大小是32MB,所以,我在preload时也只load32MB,一个bank是64MB,所以我们只需要load bank0即可,但是bank0是分布在4个device里的。

下面是修改后的orpsoc_testbench.v的部分代码:


`ifdef XILINX_DDR2
 `ifndef GATE_SIM
   defparam dut.xilinx_ddr2_0.xilinx_ddr2_if0.ddr2_mig0.SIM_ONLY = 1;
 `endif

   always @( * ) begin
      ddr2_ck_sdram        <=  #(TPROP_PCB_CTRL) ddr2_ck_fpga;
      ddr2_ck_n_sdram      <=  #(TPROP_PCB_CTRL) ddr2_ck_n_fpga;
      ddr2_a_sdram    <=  #(TPROP_PCB_CTRL) ddr2_a_fpga;
      ddr2_ba_sdram         <=  #(TPROP_PCB_CTRL) ddr2_ba_fpga;
      ddr2_ras_n_sdram      <=  #(TPROP_PCB_CTRL) ddr2_ras_n_fpga;
      ddr2_cas_n_sdram      <=  #(TPROP_PCB_CTRL) ddr2_cas_n_fpga;
      ddr2_we_n_sdram       <=  #(TPROP_PCB_CTRL) ddr2_we_n_fpga;
      ddr2_cs_n_sdram       <=  #(TPROP_PCB_CTRL) ddr2_cs_n_fpga;
      ddr2_cke_sdram        <=  #(TPROP_PCB_CTRL) ddr2_cke_fpga;
      ddr2_odt_sdram        <=  #(TPROP_PCB_CTRL) ddr2_odt_fpga;
      ddr2_dm_sdram_tmp     <=  #(TPROP_PCB_DATA) ddr2_dm_fpga;//DM signal generation
   end // always @ ( * )
   
   // Model delays on bi-directional BUS
   genvar dqwd;
   generate
      for (dqwd = 0;dqwd < DQ_WIDTH;dqwd = dqwd+1) begin : dq_delay
	 wiredelay #
	   (
            .Delay_g     (TPROP_PCB_DATA),
            .Delay_rd    (TPROP_PCB_DATA_RD)
	    )
	 u_delay_dq
	   (
            .A           (ddr2_dq_fpga[dqwd]),
            .B           (ddr2_dq_sdram[dqwd]),
            .reset       (rst_n)
	    );
      end
   endgenerate
   
   genvar dqswd;
   generate
      for (dqswd = 0;dqswd < DQS_WIDTH;dqswd = dqswd+1) begin : dqs_delay
	 wiredelay #
	   (
            .Delay_g     (TPROP_DQS),
            .Delay_rd    (TPROP_DQS_RD)
	    )
	 u_delay_dqs
	   (
            .A           (ddr2_dqs_fpga[dqswd]),
            .B           (ddr2_dqs_sdram[dqswd]),
            .reset       (rst_n)
	    );
	 
	 wiredelay #
	   (
            .Delay_g     (TPROP_DQS),
            .Delay_rd    (TPROP_DQS_RD)
	    )
	 u_delay_dqs_n
	   (
            .A           (ddr2_dqs_n_fpga[dqswd]),
            .B           (ddr2_dqs_n_sdram[dqswd]),
            .reset       (rst_n)
	    );
      end
   endgenerate
   
   assign ddr2_dm_sdram = ddr2_dm_sdram_tmp;
   //parameter NUM_PROGRAM_WORDS=1048576; 
parameter NUM_PROGRAM_WORDS=8388608;   //Rill modify from 1048576
   integer ram_ptr, program_word_ptr, k;
   reg [31:0] tmp_program_word;
   reg [31:0] program_array [0:NUM_PROGRAM_WORDS-1]; // 1M words = 4MB//8M words = 32MB
   reg [8*16-1:0] ddr2_ram_mem_line; //8*16-bits= 8 shorts (half-words)
   genvar 	  i, j;
   generate
      // if the data width is multiple of 16
      for(j = 0; j < CS_NUM; j = j+1) begin : gen_cs // Loop of 1
         for(i = 0; i < DQS_WIDTH/2; i = i+1) begin : gen // Loop of 4 (DQS_WIDTH=8)
	    initial
	      begin

 `ifdef PRELOAD_RAM
  `include "ddr2_model_preload.v"
 `endif
	      end
	    
	    ddr2_model u_mem0
	      (
	       .ck        (ddr2_ck_sdram[CLK_WIDTH*i/DQS_WIDTH]),
	       .ck_n      (ddr2_ck_n_sdram[CLK_WIDTH*i/DQS_WIDTH]),
	       .cke       (ddr2_cke_sdram[j]),
	       .cs_n      (ddr2_cs_n_sdram[CS_WIDTH*i/DQS_WIDTH]),
	       .ras_n     (ddr2_ras_n_sdram),
	       .cas_n     (ddr2_cas_n_sdram),
	       .we_n      (ddr2_we_n_sdram),
	       .dm_rdqs   (ddr2_dm_sdram[(2*(i+1))-1 : i*2]),
	       .ba        (ddr2_ba_sdram),
	       .addr      (ddr2_a_sdram),
	       .dq        (ddr2_dq_sdram[(16*(i+1))-1 : i*16]),
	       .dqs       (ddr2_dqs_sdram[(2*(i+1))-1 : i*2]),
	       .dqs_n     (ddr2_dqs_n_sdram[(2*(i+1))-1 : i*2]),
	       .rdqs_n    (),
	       .odt       (ddr2_odt_sdram[ODT_WIDTH*i/DQS_WIDTH])
	       );
         end
      end
   endgenerate
   
`endif

下面是ddr2_model_preload.v的修改后的代码:


// File intended to be included in the generate statement for each DDR2 part.
// The following loads a vmem file, "sram.vmem" by default, into the SDRAM.

// Wait until the DDR memory is initialised, and then magically
// load it
$display("%t: wait phy_init_done",$time);
@(posedge dut.xilinx_ddr2_0.xilinx_ddr2_if0.phy_init_done);
$display("%t: Loading DDR2",$time);

$readmemh("sram.vmem", program_array);
/* Now dish it out to the DDR2 model's memory */
for(ram_ptr = 0 ; ram_ptr < 64*1024/*4096*/ ; ram_ptr = ram_ptr + 1)
  begin

     // Construct the burst line, with every second word from where we
     // started, and picking the correct half of the word with i%2
     program_word_ptr = ram_ptr * 16 + (i/2) ; // Start on word0 or word1

     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[15:0] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];

     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr]; 
     ddr2_ram_mem_line[31:16] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[47:32] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[63:48] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[79:64] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[95:80] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[111:96] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     program_word_ptr = program_word_ptr + 2;
     tmp_program_word = program_array[program_word_ptr];
     ddr2_ram_mem_line[127:112] = tmp_program_word[15 + ((i%2)*16):((i%2)*16)];
     
     // Put this assembled line into the RAM using its memory writing TASK
     //                 (bank ,row          , { col               }, data
     u_mem0.memory_write(2'b00,ram_ptr[19:7], {ram_ptr[6:0],3'b000},ddr2_ram_mem_line);
     
     //$display("Writing 0x%h, ramline=%d",ddr2_ram_mem_line, ram_ptr);
     
  end // for (ram_ptr = 0 ; ram_ptr < ...
$display("(%t) * DDR2 RAM %1d preloaded",$time, i);

这里有两点需要注意:

首先,program_array[]是连续线性的,但是4个device的组织不是连续线性的,所以在调用memory_write()之前一定要变成DDR2 SDRAM实际的组织形式。

此外,由于我们只preload了32MB,小于一个bank,所以bank的地址我们一直是2‘b00,如果以后需要仿真的程序规模超过一个bank的大小了,那么就需要修改bank地址了。



2,验证

修改orpsocv2/sw/makefile.inc中,使之使用现成的elf文件,生成vmem文件。具体修改方法和操作步骤,之前已经介绍过了,这里不再赘述。

执行:make rtl-test TEST=linux PRELOAD_RAM=1

即可得到linux的仿真结果,和实际下板的结果相同。

毫无疑问,由于linux程序规模很大,如果要等到linux启动完成,需要等待很久。

下面是部分输出:

OpenRisc-66-基于ORPSoC对linux进行RTL仿真_第4张图片


下面是仿真一晚上的结果:


# vsim -do {set StdArithNoWarnings 1; run -all; exit} -c -quiet -suppress 8598 tb 
# //  ModelSim SE 10.1c Jul 27 2012 Linux 3.5.0-43-generic
# //
# //  Copyright 1991-2012 Mentor Graphics Corporation
# //  All Rights Reserved.
# //
# //  THIS WORK CONTAINS TRADE SECRET AND PROPRIETARY INFORMATION
# //  WHICH IS THE PROPERTY OF MENTOR GRAPHICS CORPORATION OR ITS
# //  LICENSORS AND IS SUBJECT TO LICENSE TERMS.
# //
# set StdArithNoWarnings 1 
# 1
#  run -all 
# Block Memory Generator CORE Generator module orpsoc_testbench.dut.xilinx_ddr2_0.xilinx_ddr2_if0.cache_mem0.inst is using a behavioral model for simulation which will not precisely model memory collision behavior.
# Xilinx DDR2 MIGed controller at orpsoc_testbench.dut.xilinx_ddr2_0.xilinx_ddr2_if0.ddr2_mig0
# 
# 
# * Starting simulation of design RTL.
# * Test: linux
# 
#      0.00 ns: wait phy_init_done
# 0.0 ps: wait phy_init_done
# 0.0 ps: wait phy_init_done
# 0.0 ps: wait phy_init_done
# (1000.0 ps)(orpsoc_testbench.eth_phy0)PHY configured to 100 Mbps!
# (1000.0 ps)(orpsoc_testbench.eth_phy0)Ethernet link is up!
# Input Error : RST on instance orpsoc_testbench.dut.xilinx_ddr2_0.xilinx_ddr2_if0.ddr2_mig0.u_ddr2_infrastructure.gen_pll_adv.u_pll_adv at time 112500.0 ps must be asserted at least for 10 ns.
# DEBUG i2c_slave; stop condition detected at 174000.0 ps
# orpsoc_testbench.gen_cs[0].gen[0].u_mem0.cmd_task: at time 8525725.0 ps WARNING: 200 us is required before CKE goes active.
# orpsoc_testbench.gen_cs[0].gen[1].u_mem0.cmd_task: at time 8525725.0 ps WARNING: 200 us is required before CKE goes active.
# orpsoc_testbench.gen_cs[0].gen[2].u_mem0.cmd_task: at time 8525725.0 ps WARNING: 200 us is required before CKE goes active.
# orpsoc_testbench.gen_cs[0].gen[3].u_mem0.cmd_task: at time 8525725.0 ps WARNING: 200 us is required before CKE goes active.
# First Stage Calibration completed at time 25988000.0 ps
# Second Stage Calibration completed at time 32918000.0 ps
# Third Stage Calibration completed at time 40320000.0 ps
# Fourth Stage Calibration completed at time 51338000.0 ps
# Calibration completed at time 51338000.0 ps
# 53430000.0 ps: Loading DDR2
# (53430000.0 ps) * DDR2 RAM 3 preloaded
# 53430000.0 ps: Loading DDR2
# (53430000.0 ps) * DDR2 RAM 2 preloaded
# 53430000.0 ps: Loading DDR2
# (53430000.0 ps) * DDR2 RAM 1 preloaded
# 53430000.0 ps: Loading DDR2
# (53430000.0 ps) * DDR2 RAM 0 preloaded
# Or1200 IC enabled at 7808632500.0 ps
# Or1200 DC enabled at 7919317500.0 ps
# Compiled-in FDT at 0xc026b8a0
# 
# Linux version 3.1.0-rc6-00002-g0da8eb1-dirty (openrisc@openrisc-VirtualBox) (gcc version 4.5.1-or32-1.0rc4 (OpenRISC 32-bit toolchain for or32-linux (built 20111017)) ) #107 Mon Mar 3 08:13:04 CET 2014
# 
# CPU: OpenRISC-12 (revision 8) @66 MHz
# 
# -- dcache: 32768 bytes total, 32 bytes/line, 1 way(s)
# 
# -- icache: 32768 bytes total, 32 bytes/line, 1 way(s)
# 
# -- dmmu:   64 entries, 1 way(s)
# 
# -- immu:   64 entries, 1 way(s)
# 
# -- additional features:
# 
# -- debug unit
# 
# -- PIC
# 
# -- timer
# 
# setup_memory: Memory: 0x0-0x2000000
# 
# Reserved - 0x01ffd9dc-0x00002624
# 
# Setting up paging and PTEs.
# 
# map_ram: Memory: 0x0-0x2000000
# 
# On node 0 totalpages: 4096
# 
# free_area_init_node: node 0, pgdat c02525b8, node_mem_map c03cc000
# 
#   Normal zone: 16 pages used for memmap
# 
#   Normal zone: 0 pages reserved
# 
#   Normal zone: 4080 pages, LIFO batch:0
# 
# dtlb_miss_handler c0002000
# 
# itlb_miss_handler c0002108
# 
# OpenRISC Linux -- http://openrisc.net
# 
# pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
# 
# pcpu-alloc: [0] 0 
# 
# Built 1 zonelists in Zone order, mobility grouping off.  Total pages: 4080
# 
# Kernel command line: console=uart,mmio,0x90000000,115200
# 
# Early serial console at MMIO 0x90000000 (options '115200')
# 
# bootconsole [uart0] enabled
# 
# PID hash table entries: 128 (order: -4, 512 bytes)
# 
# Dentry cache hash table entries: 4096 (order: 1, 16384 bytes)
# 
# Inode-cache hash table entries: 2048 (order: 0, 8192 bytes)
# 
# Memory: 28648k/32768k available (2064k kernel code, 4120k reserved, 316k data, 1416k init, 0k highmem)
# 
# mem_init_done ...........................................
# 
# NR_IRQS:32
# 
# 133.33 BogoMIPS (lpj=666666)
# 
# pid_max: default: 32768 minimum: 301
# 
# Mount-cache hash table entries: 1024
# 
# devtmpfs: initialized
# 
# NET: Registered protocol family 16
# 
# Switching to clocksource openrisc_timer
# 
# Switched to NOHz mode on CPU #0
# 
# NET: Registered protocol family 2
# 
# IP route cache hash table entries: 2048 (order: 0, 8192 bytes)
# 
# TCP established hash table entries: 1024 (order: 0, 8192 bytes)
# 
# TCP bind hash table entries: 1024 (order: -1, 4096 bytes)
# 
# TCP: Hash tables configured (established 1024 bind 1024)
# 
# TCP reno registered
# 
# UDP hash table entries: 512 (order: 0, 8192 bytes)
# 
# UDP-Lite hash table entries: 512 (order: 0, 8192 bytes)
# 
# NET: Registered protocol family 1
# 
# RPC: Registered named UNIX socket transport module.
# 
# RPC: Registered udp transport module.
# 
# RPC: Registered tcp transport module.
# 
# RPC: Registered tcp NFSv4.1 backchannel transport module.
# 
# Unpacking initramfs
# 
# Break key hit 
# Break at an unknown location
#  exit 



3,小结

之前搞嵌入式,linux的启动信息很熟悉,但是如果想知道linux启动过程中,硬件的具体工作时序,几乎是不可能的,现在板子上所有设备的每个clock的状态,通过RTL仿真,即可实现。

enjoy!

你可能感兴趣的:(OpenRisc-66-基于ORPSoC对linux进行RTL仿真)