Getting Started with the Open-Source EDA Simulator Verilator, Part 8: Performance Comparison Simulating XuanTie with the Latest Verilator 5.0

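Verilator has recently been upgraded to 5.005. The new version is functionally more complete and adds support for the scheduling semantics defined by the standard, so some loss of efficiency is to be expected. Continuing from the previous part, this part measures how performance changes with the latest version. We start by measuring the original Verilator version. To check which version is currently installed, run: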

verilator --version

The result is:

Verilator 4.220 2022-03-12 rev v4.220

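Next, taking XuanTie as the example (see the previous part for the detailed steps), we rewrite the main simulation file as sim_main1.cpp. Its full contents are: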

// DESCRIPTION: Verilator: Verilog example module
//
// This file ONLY is placed under the Creative Commons Public Domain, for
// any use, without warranty, 2017 by Wilson Snyder.
// SPDX-License-Identifier: CC0-1.0
//======================================================================
#include <iostream>
// For std::unique_ptr
#include <memory>

// Include common routines
#include <verilated.h>

// Include model header, generated from Verilating "top.v"
#include "Vtop.h"

// For gettimeofday()
#include <sys/time.h>

// Legacy function required only so linking works on Cygwin and MSVC++
double sc_time_stamp() { return 0; }

int main(int argc, char** argv, char** env) {
    // This is a more complicated example, please also see the simpler examples/make_hello_c.

    // Prevent unused variable warnings
    if (false && argc && argv && env) {}

    // Create logs/ directory in case we have traces to put under it
    Verilated::mkdir("logs");

    // Construct a VerilatedContext to hold simulation time, etc.
    // Multiple modules (made later below with Vtop) may share the same
    // context to share time, or modules may have different contexts if
    // they should be independent from each other.

    // Using unique_ptr is similar to
    // "VerilatedContext* contextp = new VerilatedContext" then deleting at end.
    const std::unique_ptr<VerilatedContext> contextp{new VerilatedContext};

    // Set debug level, 0 is off, 9 is highest presently used
    // May be overridden by commandArgs argument parsing
    contextp->debug(0);

    // Randomization reset policy
    // May be overridden by commandArgs argument parsing
    contextp->randReset(2);

    // Verilator must compute traced signals
    contextp->traceEverOn(true);

    // Pass arguments so Verilated code can see them, e.g. $value$plusargs
    // This needs to be called before you create any model
    contextp->commandArgs(argc, argv);

    // Construct the Verilated model, from Vtop.h generated from Verilating "top.v".
    // Using unique_ptr is similar to "Vtop* top = new Vtop" then deleting at end.
    // "TOP" will be the hierarchical name of the module.
    const std::unique_ptr<Vtop> top{new Vtop{contextp.get(), "TOP"}};

    // Set Vtop's input signals
    int cntNum = 100000;
    top->clk = 0;
    int i = 0;
    struct timeval StartTime;
    struct timeval EndTime;
    double TimeUse=0;

    gettimeofday(&StartTime, NULL);

    // Simulate until $finish
    while (!contextp->gotFinish()) {
        // Historical note, before Verilator 4.200 Verilated::gotFinish()
        // was used above in place of contextp->gotFinish().
        // Most of the contextp-> calls can use Verilated:: calls instead;
        // the Verilated:: versions simply assume there's a single context
        // being used (per thread).  It's faster and clearer to use the
        // newer contextp-> versions.

        contextp->timeInc(1);  // 1 timeprecision period passes...
        // Historical note, before Verilator 4.200 a sc_time_stamp()
        // function was required instead of using timeInc.  Once timeInc()
        // is called (with non-zero), the Verilated libraries assume the
        // new API, and sc_time_stamp() will no longer work.

        // Toggle a fast (time/2 period) clock
        top->clk = !top->clk;

        // Toggle control signals on an edge that doesn't correspond
        // to where the controls are sampled; in this example we do
        // this only on a negedge of clk, because we know
        // reset is not sampled there.
        //if (!top->clk) {
        //    if (contextp->time() > 1 && contextp->time() < 10) {
        //        top->reset_l = !1;  // Assert reset
        //    } else {
        //        top->reset_l = !0;  // Deassert reset
        //    }
        //    // Assign some other inputs
        //    top->in_quad += 0x12;
        //}

        // Evaluate model
        // (If you have multiple models being simulated in the same
        // timestep then instead of eval(), call eval_step() on each, then
        // eval_end_step() on each. See the manual.)
        top->eval();
        if (i > cntNum)
            break;
        if ( (i % 1000) == 0) {
            std::cout << i << " cycles have run" << std::endl;
        }
        i++;
        // Read outputs
        //VL_PRINTF("[%" VL_PRI64 "d] clk=%x rstl=%x iquad=%" VL_PRI64 "x"
        //          " -> oquad=%" VL_PRI64 "x owide=%x_%08x_%08x\n",
        //          contextp->time(), top->clk, top->reset_l, top->in_quad, top->out_quad,
        //          top->out_wide[2], top->out_wide[1], top->out_wide[0]);
    }
    gettimeofday(&EndTime, NULL);
    TimeUse = 1000000*(EndTime.tv_sec-StartTime.tv_sec)+EndTime.tv_usec-StartTime.tv_usec;
    std::cout << "The time was: " <<  (double)(TimeUse / 1000000) << "s"<< std::endl;
    // Final model cleanup
    top->final();

    // Coverage analysis (calling write only after the test is known to pass)
#if VM_COVERAGE
    Verilated::mkdir("logs");
    contextp->coveragep()->write("logs/coverage.dat");
#endif

    // Return good completion status
    // Don't use exit() or destructor won't get called
    return 0;
}
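The timing in sim_main1.cpp relies on gettimeofday(), which reads the wall clock and is POSIX-specific. As a minimal sketch, assuming a C++11 compiler, the same measurement could be taken with std::chrono::steady_clock, which is monotonic. The names Stopwatch and timevalDiffSeconds are hypothetical helpers introduced here for illustration; they are not part of Verilator or of the original file.

```cpp
#include <chrono>

// Hypothetical helper (not part of Verilator): a stopwatch on a monotonic clock.
class Stopwatch {
public:
    // Record the starting time point at construction.
    Stopwatch() : start_(std::chrono::steady_clock::now()) {}

    // Seconds elapsed since construction, as a double.
    double elapsedSeconds() const {
        const std::chrono::duration<double> d =
            std::chrono::steady_clock::now() - start_;
        return d.count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical helper: the same arithmetic the listing applies to the two
// struct timeval values (seconds plus microseconds), written as a pure function.
constexpr double timevalDiffSeconds(long startSec, long startUsec,
                                    long endSec, long endUsec) {
    return (1000000.0 * (endSec - startSec) + (endUsec - startUsec)) / 1000000.0;
}
```

In the listing above, constructing a `Stopwatch` would take the place of the first gettimeofday() call, and `elapsedSeconds()` would take the place of the TimeUse computation after the loop.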

Running the simulation program Vtop produces the following output:

-Info: /home/s/shenzhou/xuantie_test/openc910/smart_run/logical/tb/tb_verilator.v:371: $dumpvar ignored, as Verilated without --trace
	********* Init Program *********
	********* Wipe memory to 0 *********
	********* Read program *********
%Warning: inst.pat:0: $readmem file not found
%Warning: data.pat:0: $readmem file not found
	********* Load program to memory *********
0 cycles have run
1000 cycles have run
2000 cycles have run
3000 cycles have run
4000 cycles have run
5000 cycles have run
6000 cycles have run
7000 cycles have run
8000 cycles have run
9000 cycles have run
10000 cycles have run
11000 cycles have run
12000 cycles have run
13000 cycles have run
14000 cycles have run
15000 cycles have run
16000 cycles have run
17000 cycles have run
18000 cycles have run
19000 cycles have run
20000 cycles have run
21000 cycles have run
22000 cycles have run
23000 cycles have run
24000 cycles have run
25000 cycles have run
26000 cycles have run
27000 cycles have run
28000 cycles have run
29000 cycles have run
30000 cycles have run
31000 cycles have run
32000 cycles have run
33000 cycles have run
34000 cycles have run
35000 cycles have run
36000 cycles have run
37000 cycles have run
38000 cycles have run
39000 cycles have run
40000 cycles have run
41000 cycles have run
42000 cycles have run
43000 cycles have run
44000 cycles have run
45000 cycles have run
46000 cycles have run
47000 cycles have run
48000 cycles have run
49000 cycles have run
50000 cycles have run
51000 cycles have run
52000 cycles have run
53000 cycles have run
54000 cycles have run
55000 cycles have run
56000 cycles have run
57000 cycles have run
58000 cycles have run
59000 cycles have run
60000 cycles have run
61000 cycles have run
62000 cycles have run
63000 cycles have run
64000 cycles have run
65000 cycles have run
66000 cycles have run
67000 cycles have run
68000 cycles have run
69000 cycles have run
70000 cycles have run
71000 cycles have run
72000 cycles have run
73000 cycles have run
74000 cycles have run
75000 cycles have run
76000 cycles have run
77000 cycles have run
78000 cycles have run
79000 cycles have run
80000 cycles have run
81000 cycles have run
82000 cycles have run
83000 cycles have run
84000 cycles have run
85000 cycles have run
86000 cycles have run
87000 cycles have run
88000 cycles have run
89000 cycles have run
90000 cycles have run
91000 cycles have run
92000 cycles have run
93000 cycles have run
94000 cycles have run
95000 cycles have run
96000 cycles have run
97000 cycles have run
98000 cycles have run
99000 cycles have run
100000 cycles have run
The time was: 10.9586s

In other words, 100000 cycles take roughly 11 s (the loop counts each iteration, which toggles clk once, as one cycle).

Now upgrade Verilator (see the installation and debugging part for details). On the master branch of the Verilator source tree, run:

git pull

Then run:

git tag

The latest tags are now visible:

v4.218
v4.220
v4.222
v4.224
v4.226
v4.228
v5.002
v5.004
v5.005

Switch to the latest tag, then build and install as described earlier (gcc >= 9.0). After installation, the version reads:

Verilator 5.005 devel rev v5.004-58-g5fce23e90

First, use Verilator to compile the design and generate the files:

verilator --no-timing --timescale 1ns/100fs -Os -x-assign 0 --threads 4 -Wno-fatal  --cc --exe --top-module top ../vrlt_cfg.vlt -f ../logical/filelists/sim_verilator.fl ../logical/tb/sim_main1.cpp

A Verilator configuration file, vrlt_cfg.vlt, has been added here. Its contents are:

`verilator_config
split_var -module "ct_ifu_ipctrl" -var "missigned_bry_vld"
split_var -module "ct_idu_dep_reg_src2_entry" -var "x_read_data"
split_var -module "ct_l2c_data" -var "data_ram_cen"
split_var -module "ct_cp0_regs" -var "local_icg_en"
split_var -module "plic_hreg_busif" -var "hart_int_cmplt_vld_tmp"
split_var -module "plic_hreg_busif" -var "hart_int_claim_vld_tmp"
split_var -module "plic_hreg_busif" -var "mie_lst_read_tmp"
split_var -module "plic_hreg_busif" -var "sie_lst_read_tmp"
split_var -module "plic_32to1_arb" -var "tmp_sel_out"
split_var -module "plic_kid_busif" -var "prio_lst_read_tmp"
split_var -module "plic_granu_arb" -var "tmp_out"
split_var -module "plic_hreg_busif" -var "hart_claim_read_data_tmp"
split_var -module "plic_hreg_busif" -var "hart_ict_read_data_tmp"
split_var -module "csky_apb_1tox_matrix" -var "slv_pready_data_pre"
split_var -module "csky_apb_1tox_matrix" -var "slv_pready_pslverr_pre"
split_var -module "plic_kid_busif" -var "ip_read_data_tmp"
split_var -module "plic_granu_arb" -var "tmp_pos_out"
split_var -module "ct_idu_rf_pipe0_decd" -var "decd_imm_sel"
split_var -module "ct_idu_rf_pipe1_decd" -var "decd_imm_sel"
split_var -module "ct_idu_rf_prf_eregfile" -var "fesr_acc_with_fcr"
split_var -module "ct_idu_id_decd" -var "decd_sel"
split_var -module "ct_iu_bju_pcfifo" -var "pcfifo_pop2_bypass_sel"
split_var -module "ct_iu_bju_pcfifo" -var "pcfifo_pop1_bypass_sel"
split_var -module "ct_iu_bju_pcfifo" -var "pcfifo_pop0_bypass_sel"

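This uses Verilator's split_var option to split the listed variables, which are signals with complex dimensions in the corresponding modules. Without the split there are problems: the simulation program produced by the multithreaded build crashes with a core dump. After building the simulation program, run it. The final output is: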

0 cycles have run
1000 cycles have run
2000 cycles have run
3000 cycles have run
4000 cycles have run
5000 cycles have run
6000 cycles have run
7000 cycles have run
8000 cycles have run
9000 cycles have run
10000 cycles have run
11000 cycles have run
12000 cycles have run
13000 cycles have run
14000 cycles have run
15000 cycles have run
16000 cycles have run
17000 cycles have run
18000 cycles have run
19000 cycles have run
20000 cycles have run
21000 cycles have run
22000 cycles have run
23000 cycles have run
24000 cycles have run
25000 cycles have run
26000 cycles have run
27000 cycles have run
28000 cycles have run
29000 cycles have run
30000 cycles have run
31000 cycles have run
32000 cycles have run
33000 cycles have run
34000 cycles have run
35000 cycles have run
36000 cycles have run
37000 cycles have run
38000 cycles have run
39000 cycles have run
40000 cycles have run
41000 cycles have run
42000 cycles have run
43000 cycles have run
44000 cycles have run
45000 cycles have run
46000 cycles have run
47000 cycles have run
48000 cycles have run
49000 cycles have run
50000 cycles have run
51000 cycles have run
52000 cycles have run
53000 cycles have run
54000 cycles have run
55000 cycles have run
56000 cycles have run
57000 cycles have run
58000 cycles have run
59000 cycles have run
60000 cycles have run
61000 cycles have run
62000 cycles have run
63000 cycles have run
64000 cycles have run
65000 cycles have run
66000 cycles have run
67000 cycles have run
68000 cycles have run
69000 cycles have run
70000 cycles have run
71000 cycles have run
72000 cycles have run
73000 cycles have run
74000 cycles have run
75000 cycles have run
76000 cycles have run
77000 cycles have run
78000 cycles have run
79000 cycles have run
80000 cycles have run
81000 cycles have run
82000 cycles have run
83000 cycles have run
84000 cycles have run
85000 cycles have run
86000 cycles have run
87000 cycles have run
88000 cycles have run
89000 cycles have run
90000 cycles have run
91000 cycles have run
92000 cycles have run
93000 cycles have run
94000 cycles have run
95000 cycles have run
96000 cycles have run
97000 cycles have run
98000 cycles have run
99000 cycles have run
100000 cycles have run
The time was: 13.2089s

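As can be seen, the latest version of Verilator performs somewhat worse because of the added support: 13.2089 s versus 10.9586 s for the same loop, roughly 20% slower. Overall, though, the gap is not large.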
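The comparison can be made concrete with a little arithmetic on the two reported times. The following is a minimal sketch; the function names slowdownRatio and itersPerSecond are hypothetical, and the iteration count of 100000 is the approximate number of loop iterations taken from the printed output.

```cpp
// Hypothetical helper: ratio of the candidate run time to the baseline run
// time. 1.0 means equal speed; values above 1.0 mean the candidate is slower.
constexpr double slowdownRatio(double baselineSec, double candidateSec) {
    return candidateSec / baselineSec;
}

// Hypothetical helper: loop iterations simulated per wall-clock second.
constexpr double itersPerSecond(long iters, double seconds) {
    return iters / seconds;
}

// Times reported by the two runs in this post.
constexpr double kTime4220 = 10.9586;  // Verilator 4.220
constexpr double kTime5005 = 13.2089;  // Verilator 5.005 devel
constexpr long kIters = 100000;        // approximate loop iterations
```

With these inputs, `slowdownRatio(kTime4220, kTime5005)` comes out at roughly 1.2, and `itersPerSecond` gives on the order of 9100 iterations per second for 4.220 against about 7600 for 5.005.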
