加法器的verilog实现(串行进位、并联、超前进位、流水线)

    
加法器的verilog实现(串行进位、并联、超前进位、流水线)

 

总结:

从下面的Timing summary来看,流水线的频率最高、并行加法器次之,串行进位加法器再次,超前进位加法器最慢。

按理论,超前进位加法器应该比串行进位加法器快,此处为何出现这种情况,原因未知。

并行加法器因为使用加法符号实现的,从RTL图上也可以看到,具体是用加法器实现的,这个加法器是经过优化的,因此速度比较快。

流水线因为减小了组合逻辑的延时,因此可以达到较高的运行频率(注意运行速度与响应速度是不同的概念)。具体是通过缓存中间结果,从而分割组合逻辑实现流水线的。

相应地,串行进位加法器组合逻辑时延较大,因此速度较慢。

资源占用因为涉及到不同类型资源的比较,不较复杂,此处不再分析。

 注意几点:

  1. 有符号数的加法,要进行符号位扩展;
  2. 对流水线加法,注意最后一级是符号扩展,前几级是0扩展;

代码以及综合结果如下。

8bit级联(即串行进位)加法器:

复制代码
  1 module adder_serial(rst_n,
  2                             clk,
  3                             a,
  4                             b,
  5                             cin,
  6                             sum
  7                                  );
  8      
  9 parameter     DATA_SIZE = 8;
 10      
 11 input rst_n;
 12 input clk;
 13 input [DATA_SIZE - 1 : 0] a;
 14 input [DATA_SIZE - 1 : 0] b;    
 15 input cin;
 16 
 17 output [DATA_SIZE : 0] sum;
 18 
 19 reg [DATA_SIZE - 1 : 0] a_r;
 20 reg [DATA_SIZE - 1 : 0] b_r;
 21 reg cin_r;
 22 
 23 wire [DATA_SIZE : 0] sum_tmp;
 24 wire     [DATA_SIZE - 1 : 0] cout_tmp;
 25 
 26 reg [DATA_SIZE : 0] sum;
 27 
 28 always@(posedge clk)
 29     if(!rst_n)
 30         begin
 31             a_r <= 8'd0;
 32             b_r <= 8'd0;
 33             cin_r <= 1'b0;
 34         end
 35     else
 36         begin
 37             a_r <= a;
 38             b_r <= b;
 39             cin_r <= cin;
 40         end
 41 //级联加法器,本级的进位输出作为下一级的进位输入    
 42 full_adder_1bit u0_full_adder_1bit (
 43     .a(a_r[0]), 
 44     .b(b_r[0]), 
 45     .cin(cin), 
 46     .sum(sum_tmp[0]), 
 47     .cout(cout_tmp[0])
 48     );
 49      
 50 full_adder_1bit u1_full_adder_1bit (
 51     .a(a_r[1]), 
 52     .b(b_r[1]), 
 53     .cin(cout_tmp[0]), 
 54     .sum(sum_tmp[1]), 
 55     .cout(cout_tmp[1])
 56     );
 57  
 58 full_adder_1bit u2_full_adder_1bit (
 59     .a(a_r[2]), 
 60     .b(b_r[2]), 
 61     .cin(cout_tmp[1]), 
 62     .sum(sum_tmp[2]), 
 63     .cout(cout_tmp[2])
 64     );
 65 
 66 full_adder_1bit u3_full_adder_1bit (
 67     .a(a_r[3]), 
 68     .b(b_r[3]), 
 69     .cin(cout_tmp[2]), 
 70     .sum(sum_tmp[3]), 
 71     .cout(cout_tmp[3])
 72     );
 73      
 74 full_adder_1bit u4_full_adder_1bit (
 75     .a(a_r[4]), 
 76     .b(b_r[4]), 
 77     .cin(cout_tmp[3]), 
 78     .sum(sum_tmp[4]), 
 79     .cout(cout_tmp[4])
 80     );
 81 
 82 full_adder_1bit u5_full_adder_1bit (
 83     .a(a_r[5]), 
 84     .b(b_r[5]), 
 85     .cin(cout_tmp[4]), 
 86     .sum(sum_tmp[5]), 
 87     .cout(cout_tmp[5])
 88     );
 89      
 90 full_adder_1bit u6_full_adder_1bit (
 91     .a(a_r[6]), 
 92     .b(b_r[6]), 
 93     .cin(cout_tmp[5]), 
 94     .sum(sum_tmp[6]), 
 95     .cout(cout_tmp[6])
 96     );
 97 
 98 full_adder_1bit u7_full_adder_1bit (
 99     .a(a_r[7]), 
100     .b(b_r[7]), 
101     .cin(cout_tmp[6]), 
102     .sum(sum_tmp[7]), 
103     .cout(cout_tmp[7])
104     );
105 
106 full_adder_1bit u8_full_adder_1bit (  //为计算有符号数的加法,加上符号扩展
107     .a(a_r[7]), 
108     .b(b_r[7]), 
109     .cin(cout_tmp[7]), 
110     .sum(sum_tmp[8]), 
111     .cout()
112     );
113 
114 always@(posedge clk)
115     if(!rst_n)
116         begin
117             sum <= 9'd0;
118         end
119     else
120         begin
121             //sum <= {cout_tmp[7],sum_tmp};    //无符号相加时的结果
122             sum <= sum_tmp;        //有符号相加时的结果            
123         end
124 
125 endmodule
复制代码

testbench:

复制代码
module adder_serial_tb;
// Inputs
reg rst_n;
reg clk;
reg [7:0] a;
reg [7:0] b;
reg cin;

// Outputs
wire [8:0] sum;

// Instantiate the Unit Under Test (UUT)

adder_serial uut (
.rst_n(rst_n),
.clk(clk),
.cin(cin),
.a(a),
.b(b),
.sum(sum)
);

parameter CLK_PERIOD = 10;

initial begin
    rst_n = 0;
    clk = 1;
    cin = 0;

    #100;
    rst_n = 1;

end

always #(CLK_PERIOD/2) clk =~clk;

always@(posedge clk)
if(!rst_n)
    begin
        a = 0;
        b = 0;
    end
else
    begin
        a = a + 1;
        b = b - 2;
    end    

endmodule

复制代码

仿真结果:

观察负数、正数相加结果:

加法器的verilog实现(串行进位、并联、超前进位、流水线)_第1张图片

可以看到,结果正确。

器件与仿真工具:

Target Device:

xc5vsx95t-1ff1136

Product Version:

ISE 13.1

综合与静态时序报告分析

RTL电路:

加法器的verilog实现(串行进位、并联、超前进位、流水线)_第2张图片

资源占用:

Device Utilization Summary

[-]

Slice Logic Utilization

Used

Available

Utilization

Note(s)

Number of Slice Registers

25

58,880

1%

 

    Number used as Flip Flops

25

 

 

 

Number of Slice LUTs

17

58,880

1%

 

    Number used as logic

17

58,880

1%

 

        Number using O6 output only

17

 

 

 

Number of occupied Slices

8

14,720

1%

 

Number of LUT Flip Flop pairs used

30

 

 

 

    Number with an unused Flip Flop

5

30

16%

 

    Number with an unused LUT

13

30

43%

 

    Number of fully used LUT-FF pairs

12

30

40%

 

    Number of unique control sets

1

 

 

 

    Number of slice register sites lost
        to control set restrictions

3

58,880

1%

 

Number of bonded IOBs

28

640

4%

 

Number of BUFG/BUFGCTRLs

1

32

3%

 

    Number used as BUFGs

1

 

 

 

Average Fanout of Non-Clock Nets

2.54

 

 

 

最快频率:

Timing summary:

 --------------- 
 
  
 Timing errors: 0  Score: 0  (Setup/Max: 0, Hold: 0) 
 
  
 Constraints cover 112 paths, 0 nets, and 87 connections 
 
  
Design statistics: 
    Minimum period:   2.431ns{1}   (Maximum frequency: 411.353MHz) 


并行加法器
思路 :直接用加法运算符“+”实现。
代码:
按 Ctrl+C 复制代码
按 Ctrl+C 复制代码
综合后RTL级电路:
加法器的verilog实现(串行进位、并联、超前进位、流水线)_第3张图片


仿真结果与级联加法器一致。

adder_serial Project Status (06/08/2013 - 16:09:02)

Project File:

adder_multiplier_fir_2013_06_06.xise

Parser Errors:

No Errors

Module Name:

adder_parrel

Implementation State:

Placed and Routed

Target Device:

xc5vsx95t-1ff1136

  • Errors:

No Errors

Product Version:

ISE 13.1

  • Warnings:

2 Warnings (2 new)

Design Goal:

Balanced

  • Routing Results:

All Signals Completely Routed

Design Strategy:

Xilinx Default (unlocked)

  • Timing Constraints:

All Constraints Met

Environment:

System Settings

  • Final Timing Score:

0  (Timing Report)

 
  

 

 
  

Device Utilization Summary

[-]

Slice Logic Utilization

Used

Available

Utilization

Note(s)

Number of Slice Registers

25

58,880

1%

 

    Number used as Flip Flops

25

 

 

 

Number of Slice LUTs

9

58,880

1%

 

    Number used as logic

9

58,880

1%

 

        Number using O6 output only

9

 

 

 

Number of occupied Slices

7

14,720

1%

 

Number of LUT Flip Flop pairs used

25

 

 

 

    Number with an unused Flip Flop

0

25

0%

 

    Number with an unused LUT

16

25

64%

 

    Number of fully used LUT-FF pairs

9

25

36%

 

    Number of unique control sets

1

 

 

 

    Number of slice register sites lost
        to control set restrictions

3

58,880

1%

 

Number of bonded IOBs

27

640

4%

 

Number of BUFG/BUFGCTRLs

1

32

3%

 

    Number used as BUFGs

1

 

 

 

Average Fanout of Non-Clock Nets

1.76

 

 

 


Timing summary: 
 --------------- 
 
  
 
  
 
  
 Timing errors: 0  Score: 0  (Setup/Max: 0, Hold: 0) 
 
  
 
  
 
  
 Constraints cover 126 paths, 0 nets, and 36 connections 
 
  
 
  
 
  
Design statistics: 
 
  
    Minimum period:   2.261ns{1}   (Maximum frequency: 442.282MHz) 

超前进位加法器:

代码:

复制代码
  1 module adder_carry_ahead(rst_n,
  2                                 clk,
  3                                 a,
  4                                 b,
  5                                 cin,
  6                                 sum
  7                                      );
  8      
  9 parameter     DATA_SIZE = 8;
 10      
 11 input rst_n;
 12 input clk;
 13 input [DATA_SIZE - 1 : 0] a;
 14 input [DATA_SIZE - 1 : 0] b;    
 15 input cin;
 16 
 17 output [DATA_SIZE : 0] sum;
 18 
 19 reg [DATA_SIZE - 1 : 0] a_r;
 20 reg [DATA_SIZE - 1 : 0] b_r;
 21 reg cin_r;
 22 
 23 wire [DATA_SIZE - 1 : 0] G;
 24 wire [DATA_SIZE - 1 : 0] P;
 25 
 26 wire [DATA_SIZE: 0] sum_tmp;
 27 wire     [DATA_SIZE - 1 : 0] cout_tmp;
 28 
 29 reg [DATA_SIZE : 0] sum;
 30 
 31 always@(posedge clk)
 32     if(!rst_n)
 33         begin
 34             a_r <= 8'd0;
 35             b_r <= 8'd0;
 36             cin_r <= 1'b0;
 37         end
 38     else
 39         begin
 40             a_r <= a;
 41             b_r <= b;
 42             cin_r <= cin;
 43         end
 44 
 45 always@(posedge clk)
 46     if(!rst_n)
 47         begin
 48             sum <= 9'd0;
 49         end
 50     else
 51         begin
 52             //sum <= {cout_tmp[7],sum_tmp};    //无符号相加时的结果
 53             sum <= sum_tmp;        //有符号相加时的结果    
 54         end
 55 /*
 56 assign G[0] =  a_r[0] & b_r[0];        //第0bit加法
 57 assign P[0] =  a_r[0] + b_r[0];
 58 assign sum_tmp[0] = a_r[0] ^ b_r[0] ^ cin_r;
 59 assign cout_tmp[0] =  G[0] + P[0] & cin_r;     
 60            
 61 assign G[1] =  a_r[1] & b_r[1];        //第1bit加法
 62 assign P[1] =  a_r[1] + b_r[1];
 63 assign sum_tmp[1] = a_r[1] ^ b_r[1] ^ cout_tmp[0];
 64 assign cout_tmp[1] =  G[1] + P[1] & cout_tmp[0];
 65 
 66 
 67 assign G[2] =  a_r[2] & b_r[2];        //第2bit加法
 68 assign P[2] =  a_r[2] + b_r[2];
 69 assign sum_tmp[2] = a_r[2] ^ b_r[2] ^ cout_tmp[1];
 70 assign cout_tmp[2] =  G[2] + P[2] & cout_tmp[1];     
 71 
 72 assign G[3] =  a_r[3] & b_r[3];        //第3bit加法
 73 assign P[3] =  a_r[3] + b_r[3];
 74 assign sum_tmp[3] = a_r[3] ^ b_r[3] ^ cout_tmp[2];
 75 assign cout_tmp[3] =  G[3] + P[3] & cout_tmp[2]; 
 76 
 77 assign G[4] =  a_r[4] & b_r[4];        //第4bit加法
 78 assign P[4] =  a_r[4] + b_r[4];
 79 assign sum_tmp[4] = a_r[4] ^ b_r[4] ^ cout_tmp[3];
 80 assign cout_tmp[4] =  G[4] + P[4] & cout_tmp[3]; 
 81 
 82 assign G[5] =  a_r[5] & b_r[5];        //第5bit加法
 83 assign P[5] =  a_r[5] + b_r[5];
 84 assign sum_tmp[5] = a_r[5] ^ b_r[5] ^ cout_tmp[4];
 85 assign cout_tmp[5] =  G[5] + P[5] & cout_tmp[4];
 86 
 87 assign G[6] =  a_r[6] & b_r[6];        //第6bit加法
 88 assign P[6] =  a_r[6] + b_r[6];
 89 assign sum_tmp[6] = a_r[6] ^ b_r[6] ^ cout_tmp[5];
 90 assign cout_tmp[6] =  G[6] + P[6] & cout_tmp[5];
 91 
 92 assign G[7] =  a_r[7] & b_r[7];        //第7bit加法
 93 assign P[7] =  a_r[7] + b_r[7];
 94 assign sum_tmp[7] = a_r[7] ^ b_r[7] ^ cout_tmp[6];
 95 assign cout_tmp[7] =  G[7] + P[7] & cout_tmp[6];
 96 
 97 assign sum_tmp[8] = a_r[7] ^ b_r[7] ^ cout_tmp[7];
 98 */
 99 
100 //+与|的功能是等价的,但是若按照上面的方式
101 //仿真结果错误
102 assign G[0] =  a_r[0] & b_r[0];        //第0bit加法
103 assign P[0] =  a_r[0] | b_r[0];
104 assign sum_tmp[0] = a_r[0] ^ b_r[0] ^ cin_r;
105 assign cout_tmp[0] =  G[0] | P[0] & cin_r;     
106            
107 assign G[1] =  a_r[1] & b_r[1];        //第1bit加法
108 assign P[1] =  a_r[1] | b_r[1];
109 assign sum_tmp[1] = a_r[1] ^ b_r[1] ^ cout_tmp[0];
110 assign cout_tmp[1] =  G[1] | P[1] & cout_tmp[0];
111 
112 assign G[2] =  a_r[2] & b_r[2];        //第2bit加法
113 assign P[2] =  a_r[2] | b_r[2];
114 assign sum_tmp[2] = a_r[2] ^ b_r[2] ^ cout_tmp[1];
115 assign cout_tmp[2] =  G[2] | P[2] & cout_tmp[1];     
116 
117 assign G[3] =  a_r[3] & b_r[3];        //第3bit加法
118 assign P[3] =  a_r[3] | b_r[3];
119 assign sum_tmp[3] = a_r[3] ^ b_r[3] ^ cout_tmp[2];
120 assign cout_tmp[3] =  G[3] | P[3] & cout_tmp[2]; 
121 
122 assign G[4] =  a_r[4] & b_r[4];        //第4bit加法
123 assign P[4] =  a_r[4] | b_r[4];
124 assign sum_tmp[4] = a_r[4] ^ b_r[4] ^ cout_tmp[3];
125 assign cout_tmp[4] =  G[4] | P[4] & cout_tmp[3]; 
126 
127 assign G[5] =  a_r[5] & b_r[5];        //第5bit加法
128 assign P[5] =  a_r[5] | b_r[5];
129 assign sum_tmp[5] = a_r[5] ^ b_r[5] ^ cout_tmp[4];
130 assign cout_tmp[5] =  G[5] | P[5] & cout_tmp[4];
131 
132 assign G[6] =  a_r[6] & b_r[6];        //第6bit加法
133 assign P[6] =  a_r[6] | b_r[6];
134 assign sum_tmp[6] = a_r[6] ^ b_r[6] ^ cout_tmp[5];
135 assign cout_tmp[6] =  G[6] | P[6] & cout_tmp[5];
136 
137 assign G[7] =  a_r[7] & b_r[7];        //第7bit加法
138 assign P[7] =  a_r[7] | b_r[7];
139 assign sum_tmp[7] = a_r[7] ^ b_r[7] ^ cout_tmp[6];
140 assign cout_tmp[7] =  G[7] | P[7] & cout_tmp[6];
141 
142 //为了计算有符号数的加法,加上符号扩展
143 assign sum_tmp[8] = a_r[7] ^ b_r[7] ^ cout_tmp[7];
144 
145 endmodule
复制代码

仿真结果前两种加法器。

RTL级电路:

加法器的verilog实现(串行进位、并联、超前进位、流水线)_第4张图片

资源占用:

adder_carry_ahead Project Status

Project File:

adder_multiplier_fir_2013_06_06.xise

Parser Errors:

No Errors

Module Name:

adder_carry_ahead

Implementation State:

Placed and Routed

Target Device:

xc5vsx95t-1ff1136

  • Errors:

No Errors

Product Version:

ISE 13.1

  • Warnings:

1 Warning (0 new)

Design Goal:

Balanced

  • Routing Results:

All Signals Completely Routed

Design Strategy:

Xilinx Default (unlocked)

  • Timing Constraints:

All Constraints Met

Environment:

System Settings

  • Final Timing Score:

0  (Timing Report)

 

Device Utilization Summary

[-]

Slice Logic Utilization

Used

Available

Utilization

Note(s)

Number of Slice Registers

26

58,880

1%

 

    Number used as Flip Flops

26

 

 

 

Number of Slice LUTs

17

58,880

1%

 

    Number used as logic

17

58,880

1%

 

        Number using O6 output only

17

 

 

 

Number of occupied Slices

10

14,720

1%

 

Number of LUT Flip Flop pairs used

32

 

 

 

    Number with an unused Flip Flop

6

32

18%

 

    Number with an unused LUT

15

32

46%

 

    Number of fully used LUT-FF pairs

11

32

34%

 

    Number of unique control sets

1

 

 

 

    Number of slice register sites lost
        to control set restrictions

2

58,880

1%

 

Number of bonded IOBs

28

640

4%

 

Number of BUFG/BUFGCTRLs

1

32

3%

 

    Number used as BUFGs

1

 

 

 

Average Fanout of Non-Clock Nets

2.53

 

 

 

 

 Timing summary:

 --------------- 
 
  
 Timing errors: 0  Score: 0  (Setup/Max: 0, Hold: 0) 
 
  
 Constraints cover 121 paths, 0 nets, and 90 connections 
 
  
Design statistics: 
    Minimum period:   3.309ns{1}   (Maximum frequency: 302.206MHz) 

流水线加法器:
按 Ctrl+C 复制代码
按 Ctrl+C 复制代码
仿真结果与前三种加法器一致。
RTL级电路图:
加法器的verilog实现(串行进位、并联、超前进位、流水线)_第5张图片

资源占用:

adder_pipeline Project Status

Project File:

adder_multiplier_fir_2013_06_06.xise

Parser Errors:

No Errors

Module Name:

adder_pipeline

Implementation State:

Placed and Routed

Target Device:

xc5vsx95t-1ff1136

  • Errors:

No Errors

Product Version:

ISE 13.1

  • Warnings:

50 Warnings (50 new)

Design Goal:

Balanced

  • Routing Results:

All Signals Completely Routed

Design Strategy:

Xilinx Default (unlocked)

  • Timing Constraints:

All Constraints Met

Environment:

System Settings

  • Final Timing Score:

0  (Timing Report)

 

Device Utilization Summary

[-]

Slice Logic Utilization

Used

Available

Utilization

Note(s)

Number of Slice Registers

65

58,880

1%

 

    Number used as Flip Flops

65

 

 

 

Number of Slice LUTs

24

58,880

1%

 

    Number used as logic

18

58,880

1%

 

        Number using O6 output only

18

 

 

 

    Number used as Memory

6

24,320

1%

 

        Number used as Shift Register

6

 

 

 

            Number using O6 output only

6

 

 

 

Number of occupied Slices

22

14,720

1%

 

Number of LUT Flip Flop pairs used

65

 

 

 

    Number with an unused Flip Flop

0

65

0%

 

    Number with an unused LUT

41

65

63%

 

    Number of fully used LUT-FF pairs

24

65

36%

 

    Number of unique control sets

3

 

 

 

    Number of slice register sites lost
        to control set restrictions

5

58,880

1%

 

Number of bonded IOBs

28

640

4%

 

Number of BUFG/BUFGCTRLs

1

32

3%

 

    Number used as BUFGs

1

 

 

 

Average Fanout of Non-Clock Nets

2.13

 

 

 

 Timing summary:

 --------------- 
 
  
 Timing errors: 0  Score: 0  (Setup/Max: 0, Hold: 0) 
 
  
 Constraints cover 93 paths, 0 nets, and 116 connections 
 
  
Design statistics: 
    Minimum period:   2.221ns{1}   (Maximum frequency: 450.248MHz) 
 
  

你可能感兴趣的:(数字集成电路)