FPGA High-Speed Design Tips

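In FPGA design, "speed" covers three things: throughput, latency, and timing. Throughput is the amount of data processed per clock cycle. Latency is the number of clock cycles between a data item entering the design and its processed result appearing at the output. Timing refers to the delay between sequential elements: when we say a design does not meet timing, we mean that the delay of the critical path, that is, the longest delay between two flip-flops, exceeds the clock period.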

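1. High throughput

Pipelining raises the throughput of a design at the cost of extra area. As an example, here is an iterative implementation that computes the cube (third power) of X: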

    // Iterative cube: one multiplier is reused over three clocks.
    module power3(
        output reg [7:0] XPower,
        output           finished,
        input      [7:0] X,        // must be held stable until finished
        input            clk,
        input            start);   // start is asserted for a single clock

        reg [7:0] ncount;          // multiplications still to perform

        assign finished = (ncount == 0);

        always @(posedge clk)
            if (start) begin
                XPower <= X;
                ncount <= 2;
            end
            else if (!finished) begin
                ncount <= ncount - 1;
                XPower <= XPower * X;
            end
    endmodule
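
The synthesis result is as follows:

[image: synthesis result]

This design has a throughput of 8/3, or about 2.7 bits per clock. Its latency is 3 clocks, and its timing is set by a critical path of one multiplier delay.

Unrolling the iteration loop gives the following pipelined implementation:
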
    // Pipelined cube: a new operand is accepted on every clock.
    module power3(
        output reg [7:0] XPower,
        input            clk,
        input      [7:0] X);

        reg [7:0] XPower1, XPower2;
        reg [7:0] X1, X2;

        always @(posedge clk) begin
            // Pipeline stage 1
            X1      <= X;
            XPower1 <= X;
            // Pipeline stage 2
            X2      <= X1;
            XPower2 <= XPower1 * X1;
            // Pipeline stage 3
            XPower  <= XPower2 * X2;
        end
    endmodule

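The synthesis result is as follows:

[image: synthesis result]

This design has a throughput of 8 bits per clock and a latency of 3 clocks. The critical path is still one multiplier delay.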
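
The latency and throughput figures can be observed in simulation. The following is a minimal testbench sketch, not part of the original design: the module name `power3_tb`, the 10 ns clock period, and the operand sequence are all assumptions chosen for illustration. It assumes the pipelined `power3` listing above is compiled alongside it.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench for the pipelined power3 module listed above.
module power3_tb;
    reg        clk = 0;
    reg  [7:0] X   = 0;
    wire [7:0] XPower;
    integer    i;

    // Device under test: the pipelined implementation
    power3 dut (.XPower(XPower), .clk(clk), .X(X));

    // 10 ns clock period (arbitrary choice)
    always #5 clk = ~clk;

    initial begin
        // Apply a new operand before every rising edge: X = 1, 2, ..., 6.
        // 6^3 = 216 is the largest cube that still fits in 8 bits.
        for (i = 1; i <= 6; i = i + 1) begin
            @(negedge clk) X = i;
        end
        // Let the last operands drain through the pipeline
        repeat (4) @(negedge clk);
        $finish;
    end

    // $strobe prints after the nonblocking updates of the current edge,
    // so each line shows the register values produced by that edge.
    always @(posedge clk)
        $strobe("t=%0t X=%0d XPower=%0d", $time, X, XPower);
endmodule
```

In the printed trace, the cube of an operand should appear on the third rising edge after the operand was applied, and a new result should follow on every edge after that, which is what "latency of 3 clocks, throughput of 8 bits per clock" means in practice. The first few lines show unknown values because the pipeline registers have no reset.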

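2. Low latency

A low-latency design passes data from input to output as quickly as possible by minimizing the intermediate processing delays. The techniques are to use parallelism, remove pipelining, and shorten the logic. For example, the pipeline registers can be removed from the pipelined cube implementation above so that the input-to-output delay is minimized.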

    // Combinational cube: no registers between input and output.
    module power3(
        output [7:0] XPower,
        input  [7:0] X);

        reg [7:0] XPower1, XPower2;
        reg [7:0] X1, X2;

        assign XPower = XPower2 * X2;

        always @* begin
            X1      = X;
            XPower1 = X;
        end

        always @* begin
            X2      = X1;
            XPower2 = XPower1 * X1;
        end
    endmodule
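
The synthesis result is as follows:

[image: synthesis result]

Throughput is 8 bits per clock. Latency is between one and two multiplier delays, which is 0 clocks. The critical path delay is two multipliers.

Removing the pipeline therefore reduces latency but increases the combinational delay.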

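3. Timing

The delay of the critical path determines the maximum system clock frequency. That delay is made up of the clock-to-output delay of the launching flip-flop, the combinational logic delay, the routing delay, the setup time of the capturing flip-flop, and the clock skew between the two flip-flops.

[image]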
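
The delay components listed above are commonly combined into a single expression for the minimum clock period. The symbol names below are assumptions introduced here for illustration, and $T_{skew}$ is taken as the amount by which the clock arrives later at the capturing flip-flop than at the launching one, which is why it is subtracted:

```latex
\[
T_{\min} = T_{clk \to q} + T_{logic} + T_{routing} + T_{setup} - T_{skew},
\qquad
F_{\max} = \frac{1}{T_{\min}}
\]
```

Every term except $T_{skew}$ is paid on each register-to-register path, so shortening $T_{logic}$ on the critical path is the most direct way to raise $F_{\max}$.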

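Method 1: add registers. When the extra latency does not affect the design, inserting registers into the critical path shortens the critical delay and raises the clock speed. Consider the following FIR filter example.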

    // FIR filter: multiply and add in a single clock.
    module fir(
        output reg [7:0] Y,
        input      [7:0] A, B, C, X,
        input            clk,
        input            validsample);

        reg [7:0] X1, X2;          // delayed input samples

        always @(posedge clk)
            if (validsample) begin
                X1 <= X;
                X2 <= X1;
                Y  <= A * X + B * X1 + C * X2;
            end
    endmodule

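The synthesis result is as follows:

[image: synthesis result]

Here the critical path delay is one multiplier delay plus one adder delay. Inserting a register between the multipliers and the adder shortens the critical path to a single multiplier delay, as in the following version: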

    // FIR filter with a register stage between multipliers and adder.
    module fir(
        output reg [7:0] Y,
        input      [7:0] A, B, C, X,
        input            clk,
        input            validsample);

        reg [7:0] X1, X2;                 // delayed input samples
        reg [7:0] prod1, prod2, prod3;    // registered products

        always @(posedge clk) begin
            if (validsample) begin
                X1    <= X;
                X2    <= X1;
                prod1 <= A * X;
                prod2 <= B * X1;
                prod3 <= C * X2;
            end
            // The sum is registered on every clock, one cycle after the products
            Y <= prod1 + prod2 + prod3;
        end
    endmodule

The synthesis result is as follows:

[image: synthesis result]

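Other ways to shorten the critical path include restructuring the system as a parallel design split into sub-units, and removing priority from designs that contain priority encoding.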
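
Removing priority can be illustrated with a small register-write block. This is a sketch under stated assumptions rather than code from the original design: the module names `regwrite_priority` and `regwrite_flat`, the port names, and the 4-bit width are all hypothetical. The two versions behave the same only when at most one bit of `ctrl` is set at a time.

```verilog
// Hypothetical example: names and widths are illustrative only.

// Priority version: each branch depends on all earlier conditions,
// so the write enable of rout[3] is gated by ctrl[0], ctrl[1] and ctrl[2].
module regwrite_priority(
    output reg [3:0] rout,
    input            clk,
    input            in,
    input      [3:0] ctrl);

    always @(posedge clk)
        if      (ctrl[0]) rout[0] <= in;
        else if (ctrl[1]) rout[1] <= in;
        else if (ctrl[2]) rout[2] <= in;
        else if (ctrl[3]) rout[3] <= in;
endmodule

// Flat version: each bit is enabled by its own control bit only.
// Matches the priority version whenever at most one ctrl bit is set.
module regwrite_flat(
    output reg [3:0] rout,
    input            clk,
    input            in,
    input      [3:0] ctrl);

    always @(posedge clk) begin
        if (ctrl[0]) rout[0] <= in;
        if (ctrl[1]) rout[1] <= in;
        if (ctrl[2]) rout[2] <= in;
        if (ctrl[3]) rout[3] <= in;
    end
endmodule
```

In the flat version the enable logic for each register bit no longer chains through the other control bits, so the logic in front of the last register is shallower. If several `ctrl` bits can be set at once, the two modules differ and the priority form has to stay.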

 

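Summary: speed and area conflict with and constrain each other. Raising speed costs area, and reducing area lowers speed, so a design has to weigh these factors against each other and optimize accordingly.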

