HDL4SE: Learning Verilog for Software Engineers (10)

10 State Machines

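After the previous chapters you should know the basics of Verilog. Beginners, however, often find to their surprise that they still cannot build anything: faced with a moderately complex problem, they do not know where to start. This is normal. Plenty of drivers with a fresh license are afraid to get on the road; I have met top students who scored full marks on a music exam yet could not sing from numbered notation; and many classmates who passed CET-4 and CET-6 still cannot get a word out in front of a foreigner. Put simply, what is missing is hands-on practice. There is a second factor: a lack of higher-level concepts to lean on. When people hear that I studied mathematics and then became an engineer, many feel obliged to comfort me by saying that all engineering comes from mathematics anyway. Yet every computer program is equivalent to a tape program on a Turing machine, and nobody writes a game on a Turing machine tape. Engineering has its own foundations, and not everything is derived from mathematical first principles; anyone who has memorized the multiplication table can calculate without understanding the mathematics behind it. So for an engineer, lacking practice, lacking engineering concepts, and lacking the mature methods of the field are all fatal.
This chapter introduces the state machine, which supports the RTL implementation of algorithmic and procedural behavior. We continue with the Tetris game from the earlier chapters and convert its controller step by step until it is written entirely in Verilog, to get a feel for how a Verilog design is developed.
In a sense, writing software is building state machines; a Turing machine is a state machine, after all. Once the compiled program sits in memory and execution reaches the code at some address, we say the computer is in that state, and the state is the value of the CPU's PC register. Each PC value stands for one state of the CPU state machine, and the action for that state is to execute the instruction at the address the PC points to. A jump instruction determines the next state itself; for any other instruction the CPU steps automatically to the next instruction. A Turing machine works the same way: it reads the tape, moves the head according to the symbol read and the current state, possibly writes the tape, switches state, and repeats until a halting condition is reached.
In a digital circuit a state machine likewise uses a state register, analogous to the PC, to hold the state. In each state the corresponding combinational logic computes and updates the task-related registers, and under certain conditions selects the next state; in the following cycle the circuit works in the new state. That is the basic idea of a state machine. With it, we can decompose a task, just as in software, into instructions executed step by step, and let the state machine arrange them into sequences, branches, and loops until the required task is complete.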
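As a rough illustration of the state-register idea, the following C sketch models a tiny machine in the same style as clocked hardware: each call stands for one clock edge, next values are computed from the current ones, and then everything is committed together. The names `Machine`, `machine_tick`, and the three states are hypothetical and are not part of HDL4SE.

```c
#include <stdint.h>

/* Hypothetical three-state machine: wait for start, count up to a limit, report done. */
typedef enum { S_IDLE, S_COUNT, S_DONE } State;

typedef struct {
    State    state;   /* plays the role of the state register */
    uint32_t counter; /* a task-related register */
} Machine;

void machine_reset(Machine *m)
{
    m->state = S_IDLE;
    m->counter = 0;
}

/* One call corresponds to one clock edge. */
void machine_tick(Machine *m, int start, uint32_t limit)
{
    /* "Combinational" part: derive next values from current values only. */
    State    next_state   = m->state;
    uint32_t next_counter = m->counter;

    switch (m->state) {
    case S_IDLE:
        if (start) {
            next_state = S_COUNT;
            next_counter = 0;
        }
        break;
    case S_COUNT:
        next_counter = m->counter + 1;
        if (next_counter >= limit)
            next_state = S_DONE;
        break;
    case S_DONE:
        next_state = S_IDLE;
        break;
    }

    /* "Register" part: commit both registers at the same time. */
    m->state = next_state;
    m->counter = next_counter;
}
```

Keeping the next-value computation separate from the commit step is a design choice that mirrors how a Verilog `always @(posedge clk)` block behaves, which makes such C models easier to translate later.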
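State machines can be nested: while a large state machine runs, it can start an inner state machine, much as software calls a subroutine.
This is still too abstract to be practical, so we take the Tetris controller from the earlier chapters as an example and convert it step by step from a C implementation to a Verilog implementation.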

10.1 Top-Level Flowchart

The top-level flowchart of the Tetris game is shown below:
[Figure: top-level flowchart of the Tetris game controller]

As the flowchart shows, the top-level state machine of the Tetris controller has the following states:

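  1. Initialization: entered when the system starts. The controller initializes its internal state and the display contents, score, level, speed, and so on. Whenever the screen fills up, the game returns to this state and restarts.
  2. Flush to screen: whenever the screen contents change, the frame buffer must be flushed to the display.
  3. Key detection: checks for keys of interest. If a key is pressed, go to key handling; otherwise go to drop control.
  4. Key handling: entered when a key is pressed; each key is handled differently. A sub state machine runs here, described in detail later.
  5. Bottom detection: checks whether the current block is blocked by the blocks below it; if so, go to the block-freeze state.
  6. Block freeze: fixes the current block onto the panel at its position.
  7. Full-line detection and removal: checks each row of the panel; full rows are removed and the score is updated according to the number of rows removed.
  8. New block generation: the previously generated next block becomes the current block and a new next block is generated. If the screen is full, go to initialization; otherwise go to flush to screen.
  9. Drop control: entered when no key is pressed. An internal counter is incremented and compared with a limit computed from the current speed. When the counter reaches the limit, go to the drop-one-row state; otherwise go back to key detection (the screen needs no update in this case).
  10. Drop one row: checks whether the block can move down one row. If so, move it and go to flush to screen; otherwise go to block freeze.

Below we convert this process to Verilog step by step.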
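The list above can be read as a next-state function. The following C sketch encodes it under stated assumptions: every identifier (`TopState`, `TopInputs`, `top_next_state`) is hypothetical, and the transitions marked "assumed" are not spelled out in the list and are filled in only to make the function total. It is not the author's controller.

```c
/* Hypothetical names for the ten top-level states listed above. */
typedef enum {
    TS_INIT, TS_FLUSH, TS_CHECK_KEY, TS_KEY_PROC, TS_TOUCH_CHECK,
    TS_FREEZE, TS_CLEAR_LINES, TS_NEW_BLOCK, TS_DROP_CTRL, TS_DROP_ONE
} TopState;

/* Conditions that the list uses to choose a branch. */
typedef struct {
    int key_pressed;   /* a key of interest is pending */
    int blocked;       /* current block rests on blocks below it */
    int screen_full;   /* the new block does not fit */
    int drop_due;      /* the drop counter reached its speed-dependent limit */
    int can_move_down; /* the block can move down one row */
} TopInputs;

TopState top_next_state(TopState s, const TopInputs *in)
{
    switch (s) {
    case TS_INIT:        return TS_FLUSH;       /* assumed: show the cleared panel */
    case TS_FLUSH:       return TS_CHECK_KEY;   /* assumed: poll keys after a flush */
    case TS_CHECK_KEY:   return in->key_pressed ? TS_KEY_PROC : TS_DROP_CTRL;
    case TS_KEY_PROC:    return TS_FLUSH;       /* assumed: redraw after a key */
    case TS_TOUCH_CHECK: return in->blocked ? TS_FREEZE : TS_FLUSH; /* else branch assumed */
    case TS_FREEZE:      return TS_CLEAR_LINES; /* assumed: list order */
    case TS_CLEAR_LINES: return TS_NEW_BLOCK;   /* assumed: list order */
    case TS_NEW_BLOCK:   return in->screen_full ? TS_INIT : TS_FLUSH;
    case TS_DROP_CTRL:   return in->drop_due ? TS_DROP_ONE : TS_CHECK_KEY;
    case TS_DROP_ONE:    return in->can_move_down ? TS_FLUSH : TS_FREEZE;
    }
    return TS_INIT; /* unknown state: restart */
}
```

Writing the transitions as a pure function of the current state and a few condition flags keeps the branching logic in one place, which is also how the next-state logic of an RTL state machine is usually organized.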

10.2 Implementing the Frame Buffer with a RAM Primitive

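First, the frame buffer inside the controller is moved to an external RAM instead of being implemented in C. When the game controller was written in C, the panel data lived in a C array: 24 rows, 16 cells per row, 4 bits per cell for 16 colors, 1536 bits in total. We store it in a 64-bit-wide RAM, one row per word, which takes 24 words, so a 5-bit address would be enough (the ports in the code below are declared 6 bits wide).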
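The packing arithmetic can be checked with a few lines of C. This is a sketch under assumptions: the helper names `panel_get` and `panel_set` are hypothetical, and cell x is placed at bits [4x+3:4x] of the row word, which follows the `line >>= xx * 4; line &= 0xF;` access pattern that appears in the controller code later in this chapter.

```c
#include <stdint.h>

#define PANEL_ROWS     24  /* rows in the game panel */
#define CELLS_PER_ROW  16  /* cells per row */
#define BITS_PER_CELL   4  /* 16 colors per cell */

/* Read the 4-bit color of one cell from a panel stored as one 64-bit word per row. */
unsigned panel_get(const uint64_t *panel, int row, int col)
{
    return (unsigned)((panel[row] >> (col * BITS_PER_CELL)) & 0xF);
}

/* Write the 4-bit color of one cell, leaving the other 15 cells of the row unchanged. */
void panel_set(uint64_t *panel, int row, int col, unsigned color)
{
    uint64_t mask = (uint64_t)0xF << (col * BITS_PER_CELL);
    panel[row] = (panel[row] & ~mask)
               | ((uint64_t)(color & 0xF) << (col * BITS_PER_CELL));
}
```

One row fills a 64-bit word exactly (16 cells of 4 bits), which is why a 64-bit data width lets the hardware fetch a whole row in a single read.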
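For the RAM we use a one-read, one-write RAM (one with more ports would also work, but one-read, one-write RAMs are the most common in FPGAs and ASICs). We also move the flush-to-screen function into Verilog. The controller's current block, next block, score, removed-line count, speed, and other parameters are brought out through ports. To pull the flush-to-screen module out on its own, the state variable and state control signals are brought out as well, and when the flush state machine finishes it reports completion back to the controller. A Verilog module arbitrates the shared RAM read and write ports. The C part adjusts its internal logic too: because the memory is now external and only one word can be read per cycle, the control method changes considerably, and the C side must be written entirely as a state machine.
The Verilog interface of the game controller written in C is therefore defined as follows: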

(* 
  HDL4SE="LCOM", 
  CLSID="158fa52-ca8b-4551-9b87-fc7cff466e2a", 
  softmodule="hdl4se" 
*) 
module teris_ctrl
  (
    input           wClk,
    input           nwReset,
    output          wWrite,
    output [5:0]    bWriteAddr,
    output [63:0]   bWriteData,
    output [5:0]    bReadAddr,
    input  [63:0]   bReadData,
    input  [31:0]   bKeyData,
    input           wStateComplete,
    output          wStateChange,
    output [3:0]    bState,
    output [31:0]   bScore,
    output [31:0]   bSpeed,
    output [31:0]   bLevel,
    output [63:0]   bNextBlock,
    output [63:0]   bCurBlock,
    output [15:0]   bCurBlockPos
  );
endmodule

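The name is the same as in the previous version, but the CLSID has changed, so the compiler actually generates the new version.
Note that the module written in C controls the top-level state machine, so it brings the state-change signal and the current state out through ports. The Verilog code handles the flush-to-screen state and, when it has finished, notifies the top-level state controller through the wStateComplete signal. This demonstrates modules written separately in C and in Verilog working together. Previously the controller was written entirely in C; now the frame buffer and the handling of the flush-to-screen state have been moved outside. Later, more states will be moved out and implemented in Verilog, until everything is in Verilog.
Flush to screen is implemented as one module with the following interface: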

module flushtodisp(
    input           wClk,
    input   [3:0]   bCtrlState,
    output          wCtrlStateComplete,
    output  [5:0]   bFlushReadAddr,
    input   [63:0]  bFlushReadData,
    output          wWrite,
    output  [31:0]  bWriteAddr,
    output  [31:0]  bWriteData,
    input   [31:0]  bCtrlSpeed,
    input   [31:0]  bCtrlLevel,
    input   [31:0]  bCtrlScore,
    input   [63:0]  bNextBlock,
    input   [63:0]  bCurBlock,
    input   [15:0]  bCurBlockPos
);
endmodule

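It takes wClk because internally it is a state machine built from sequential logic; flushing to the screen cannot finish within one cycle. It takes the controller's bCtrlState signal, which carries the controller's current state: when the state is ST_FLUSHTODISP (1), flushtodisp starts working; otherwise it waits. Once running, it reads the RAM through bFlushReadAddr and bFlushReadData and writes the display panel through wWrite, bWriteAddr, and bWriteData. The module also receives the controller's Speed, Level, Score, NextBlock, CurrentBlock, and CurrentBlockPos outputs and uses them to generate the data the game display panel needs.
This makes the main module fairly simple: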

module main(
    input wClk, nwReset,
    output wWrite,
    output [31:0] bWriteAddr,
    output [31:0] bWriteData,
    output [3:0]  bWriteMask,
    output wRead,
    output [31:0] bReadAddr,
    input [31:0]  bReadData);

    wire        wram_Write;
    wire [5:0]  bram_WriteAddr;
    wire [63:0] bram_WriteData;
    wire [5:0]  bram_ReadAddr;
    wire [63:0] bram_ReadData;

/* frame buffer memory */
    hdl4se_ram1p  #(64, 6) ram_0(
      wClk,
      wram_Write,
      bram_WriteAddr,
      bram_WriteData,
      bram_ReadAddr,
      bram_ReadData
    );

/* game controller */
    wire          wCtrlWrite;
    wire [5:0]    bCtrlWriteAddr;
    wire [63:0]   bCtrlWriteData;
    wire [5:0]    bCtrlReadAddr;
    wire [63:0]   bCtrlReadData;
    wire [31:0]   bCtrlKeyData;
    wire          wCtrlStateComplete;
    wire          wCtrlStateChange;
    wire [3:0]    bCtrlState;
    wire [31:0]   bCtrlSpeed;
    wire [31:0]   bCtrlLevel;
    wire [31:0]   bCtrlScore;
    wire [63:0]   bNextBlock;
    wire [63:0]   bCurBlock;
    wire [15:0]   bCurBlockPos;

	teris_ctrl ctrl(wClk, nwReset, wCtrlWrite, bCtrlWriteAddr, bCtrlWriteData, 
                    bCtrlReadAddr,bCtrlReadData, bCtrlKeyData,
                    wCtrlStateComplete, wCtrlStateChange, bCtrlState,
                    bCtrlScore, bCtrlSpeed, bCtrlLevel, 
                    bNextBlock, bCurBlock, bCurBlockPos);

    wire [5:0]  bFlushReadAddr;
    wire [63:0] bFlushReadData;    
    
    /* screen flush */
    flushtodisp flusher(wClk,
            bCtrlState, wCtrlStateComplete,
            bFlushReadAddr, bFlushReadData,
            wWrite, bWriteAddr, bWriteData,
            bCtrlSpeed, bCtrlLevel, bCtrlScore, 
            bNextBlock, bCurBlock, bCurBlockPos);

    /* RAM read/write port arbitration */
    assign wram_Write = (bCtrlState == `ST_FLUSHTODISP) ? 1'b0 : wCtrlWrite; /* the flush module never writes the RAM */
    assign bram_WriteAddr = (bCtrlState == `ST_FLUSHTODISP) ? 6'b0 : bCtrlWriteAddr;
    assign bram_WriteData = (bCtrlState == `ST_FLUSHTODISP) ? 64'b0 : bCtrlWriteData;
    assign bram_ReadAddr = (bCtrlState == `ST_FLUSHTODISP) ? bFlushReadAddr : bCtrlReadAddr;
    assign bCtrlReadData = bram_ReadData;
    assign bFlushReadData = bram_ReadData;

/* the key state is read continuously */
    assign wRead = 1'b1;
    assign bReadAddr = 32'hF000_0000;
    assign bCtrlKeyData = bReadData; /* key information goes straight into the controller */

endmodule

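It uses a 64-bit-wide RAM (instantiated as hdl4se_ram1p #(64, 6), of which the panel needs only 24 words) as the frame buffer that holds the block information of the game panel. The main module also instantiates the game controller written in C, which controls the progress of the game and writes the panel's block information into the RAM, and the flush-to-display module written in Verilog. Under the controller's direction, the flush module reads the information out of the RAM, combines it with the speed, level, score, next block, current block, and current block position output by the controller, and generates the signals the game display panel needs.
The main module also arbitrates the RAM ports: when the controller state machine is in the ST_FLUSHTODISP state, the RAM uses the flush-to-display module's signals; otherwise it uses the game controller's signals.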
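The arbitration rule is just a multiplexer selected by the controller state. A minimal C model, assuming a hypothetical `RamPorts` struct and function name, and using the value 1 for the flush state as given in the text:

```c
#include <stdint.h>

#define ST_FLUSHTODISP 1  /* the flush state is state 1 in the text */

/* Hypothetical bundle of the signals that drive the RAM. */
typedef struct {
    int      write;
    uint8_t  write_addr;
    uint64_t write_data;
    uint8_t  read_addr;
} RamPorts;

/* Select which requester drives the RAM for the current controller state. */
RamPorts ram_arbitrate(unsigned ctrl_state, RamPorts ctrl, uint8_t flush_read_addr)
{
    RamPorts ram;
    if (ctrl_state == ST_FLUSHTODISP) {
        /* The flush module only reads, so the write port is held idle. */
        ram.write = 0;
        ram.write_addr = 0;
        ram.write_data = 0;
        ram.read_addr = flush_read_addr;
    } else {
        /* In every other state the controller owns both ports. */
        ram = ctrl;
    }
    return ram;
}
```

Because only one requester is active in any given state, a plain state-selected multiplexer is sufficient here and no request/grant handshake is needed.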

10.3 Implementing the Controller Module

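Note that although the controller module is written in C, its implementation had to change substantially in order to share the RAM with the flush-to-display module. Architecturally it is now implemented entirely as a state machine. The controller states are defined as follows: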

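enum {
	ST_INIT, /* initialization */
	ST_FLUSHTODISP, /* flush to display */
	ST_CHECKKEY, /* key detection */
	ST_CHECKBLOCKCANSETTO, /* test whether the current block can be placed at a given position */
	ST_BLOCKWRITE, /* write the current block into the panel so that it becomes part of the panel */
	ST_CHECKLINE, /* detect rows made up entirely of blocks and update the score and related data */
	ST_COPYLINES, /* remove rows made up entirely of blocks */
};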

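When writing a state machine in C, note that it is driven by the external clock, so the state transition code goes in ClkTick, a function of the IHDL4SEUnit interface: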

static int terrisctrl1_hdl4se_unit_ClkTick(HOBJECT object)
{
	sTerrisCtrl1* pobj;
	unsigned int key;
	unsigned int statecomplete;

	pobj = (sTerrisCtrl1*)objectThis(object);

	pobj->write = 0;
	if (pobj->state == ST_INIT) {
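	   /* 
	   Initialization state: the C code initializes the score, speed, level and so on,
	   and generates a new current block and next block. The state machine writes zeros
	   to all display panel data to produce an empty panel. This is implemented by calling
	   terrisctrl1_hdl4se_unit_Init on every clock cycle, which keeps generating RAM
	   write signals until the display RAM is cleared.
	   */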
		terrisctrl1_hdl4se_unit_Init(pobj);
		if (pobj->genContext.complete) {
			pobj->state = ST_FLUSHTODISP;
		}
	}
	else if (pobj->state == ST_FLUSHTODISP) {
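	    /* In this state the work is done by external Verilog code. Here we keep watching the
	    statecomplete signal of the external module; once it is asserted, flushing to the
	    display has finished and we move to the key-detection state. */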
		objectCall3(pobj->statecomplete_unit, GetValue, pobj->statecomplete_index, 32, pobj->statecompletedata);
		objectCall1(pobj->statecompletedata, GetUint32, &statecomplete);
		if (statecomplete != 0)
			pobj->state = ST_CHECKKEY;
	}
	else if (pobj->state == ST_CHECKKEY) {
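	    /* The key-detection state checks for key information from outside. If there is any,
	    it starts the corresponding handling; otherwise terrisctrl1_hdl4se_unit_Tick decides
	    whether to drop the block one row automatically.
	    The general procedure for a key is: compute the block position and shape after the
	    key (for example a rotation), then test whether the block can be placed there. If it
	    can, the position and shape are updated; if not, the key is ignored. Afterward the
	    machine enters the flush-to-display state. The score is also updated according to the key.
	    */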
		objectCall3(pobj->keydata_unit, GetValue, pobj->keydata_index, 32, pobj->keydata);
		if (!objectCall1(pobj->keydata, IsEQ, pobj->lastkeydata)) {
			objectCall1(pobj->keydata, GetUint32, &key);
			objectCall1(pobj->lastkeydata, Assign, pobj->keydata);
			if (key & 1)
				terrisctrl1_hdl4se_unit_PressKeyStart(pobj, TK_RIGHT);
			if (key & 2)
				terrisctrl1_hdl4se_unit_PressKeyStart(pobj, TK_LEFT);
			if (key & 4)
				terrisctrl1_hdl4se_unit_PressKeyStart(pobj, TK_DOWN);
			if (key & 8)
				terrisctrl1_hdl4se_unit_PressKeyStart(pobj, TK_TURNLEFT);
		}
		else {
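		    /* With no key information, the function below increments a counter. When the
		    counter reaches the threshold tied to the game speed, it tries to move the block
		    down one row. If the block cannot move, it has hit the bottom and we enter the
		    state that freezes it onto the panel; otherwise it moves down one row and we
		    enter the flush-to-display state.
		    */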
			terrisctrl1_hdl4se_unit_Tick(pobj);
		}	
	}
	else if (pobj->state == ST_CHECKBLOCKCANSETTO) {
	   /*
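	   The state machine that tests whether a block can be placed at a given position is
	   special: it can be entered from several places and must return to the right place
	   when it finishes, like a subroutine call. We therefore designed a dedicated control
	   data structure: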
	   struct tagBlockCanSetToContext {
		int nextstate;
		int param;
		int result;
		int index;
		int x;
		int y;
		int complete;
		void (*BlockCanSetToPro)(sTerrisCtrl1 * pobj);
	}blockCanSetToContext;
	   
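		It contains a callback to invoke when the state completes and the state to
		enter afterward. Here we keep calling terrisctrl1_hdl4se_unit_BlockCanSetTo to
		check each cell; on completion the state is set to the requested value and the
		callback is invoked.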
		*/
		
		terrisctrl1_hdl4se_unit_BlockCanSetTo(pobj, &currentblock, pobj->blockCanSetToContext.x, pobj->blockCanSetToContext.y);
		if (pobj->blockCanSetToContext.complete) {
			pobj->state = pobj->blockCanSetToContext.nextstate;
			pobj->blockCanSetToContext.BlockCanSetToPro(pobj);
		}
	}
	else if (pobj->state == ST_BLOCKWRITE) {
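	    /* This state writes the block into the panel once it can no longer move down one row.
		When done, it enters ST_CHECKLINE to check whether any row consists entirely of blocks.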
		*/
		terrisctrl1_hdl4se_unit_BlockWrite(pobj);
		if (pobj->genContext.complete) {
			pobj->state = ST_CHECKLINE;
			pobj->genContext.complete = 0;
			pobj->genContext.index = 0;
			pobj->genContext.count = 0;
		}
	}
	else if (pobj->state == ST_CHECKLINE) {
		/*
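		This state checks whether each row consists entirely of blocks. If one does, go to
		ST_COPYLINES to move every row above it down by one; otherwise enter the flush state
		to refresh the display. The score and related data are updated according to the
		number of full rows found in this pass.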
		*/
		terrisctrl1_hdl4se_unit_CheckLine(pobj);
	}
	else if (pobj->state == ST_COPYLINES) {
		/*
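		This state moves the rows above a full row down by one and clears the top row,
		which removes the full row. It then returns to the row-check state to look for
		further full rows.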
		*/
		terrisctrl1_hdl4se_unit_CopyLines(pobj);
	}
	else {
		pobj->state = ST_FLUSHTODISP;
	}
	return 0;
}
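The ST_CHECKBLOCKCANSETTO mechanism, a sub state machine with a stored return state and a completion callback, can be reduced to the following C sketch. It is a simplified illustration rather than the author's code: `Ctrl`, `sub_start`, `ctrl_tick`, the three states, and the fixed four-cycle workload are all hypothetical.

```c
typedef struct Ctrl Ctrl;
typedef void (*DoneFn)(Ctrl *c);

enum { ST_MAIN, ST_SUB, ST_AFTER };

struct Ctrl {
    int state;
    struct {
        int    nextstate; /* state to enter when the sub machine finishes */
        int    index;     /* progress counter of the sub machine */
        int    result;    /* value handed back to the caller */
        int    complete;  /* set when the sub machine is done */
        DoneFn done;      /* callback invoked on completion */
    } sub;
    int accepted;         /* where the caller stores the result */
};

/* "Call" the sub machine: record where to return, then switch state. */
void sub_start(Ctrl *c, int nextstate, DoneFn done)
{
    c->sub.nextstate = nextstate;
    c->sub.done = done;
    c->sub.index = 0;
    c->sub.result = 0;
    c->sub.complete = 0;
    c->state = ST_SUB;
}

/* Caller-specific completion handler. */
void on_check_done(Ctrl *c)
{
    c->accepted = c->sub.result;
}

/* One clock tick of the whole controller. */
void ctrl_tick(Ctrl *c)
{
    switch (c->state) {
    case ST_MAIN:
        sub_start(c, ST_AFTER, on_check_done);
        break;
    case ST_SUB:
        c->sub.index++;              /* one unit of work per tick */
        if (c->sub.index >= 4) {     /* pretend the job takes four ticks */
            c->sub.result = 1;
            c->sub.complete = 1;
        }
        if (c->sub.complete) {
            c->state = c->sub.nextstate; /* "return" to the caller */
            c->sub.done(c);
        }
        break;
    case ST_AFTER:
        break;
    }
}
```

Storing the return state and callback in a context struct plays the role of a one-level call stack; it works as long as the sub machine is never re-entered while it is already running.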

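Writing C in state-machine style is actually common in engineering. In serial protocol code, for example, each received character updates the current state, which checks whether the protocol is satisfied and what kind of packet is being formed. It is also common in compilers: a parser generated by bison is a state machine that is driven by the input tokens and executes the code attached to a state, building data structures such as a syntax tree.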
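The serial-protocol case can be sketched as a byte-driven parser. The frame format used here (start byte 0x7E, one length byte, payload, one checksum byte equal to the payload sum modulo 256) is invented purely for illustration, as are all identifiers.

```c
#include <stdint.h>

typedef enum { P_WAIT_START, P_LENGTH, P_PAYLOAD, P_CHECKSUM } ParseState;

typedef struct {
    ParseState state;
    uint8_t    len;          /* announced payload length */
    uint8_t    pos;          /* payload bytes received so far */
    uint8_t    sum;          /* running checksum */
    uint8_t    payload[255];
} Parser;

void parser_init(Parser *p)
{
    p->state = P_WAIT_START;
    p->len = 0;
    p->pos = 0;
    p->sum = 0;
}

/* Feed one received byte; returns 1 when a complete, valid frame has just ended. */
int parser_feed(Parser *p, uint8_t byte)
{
    switch (p->state) {
    case P_WAIT_START:
        if (byte == 0x7E)
            p->state = P_LENGTH;
        break;
    case P_LENGTH:
        p->len = byte;
        p->pos = 0;
        p->sum = 0;
        p->state = (byte == 0) ? P_CHECKSUM : P_PAYLOAD;
        break;
    case P_PAYLOAD:
        p->payload[p->pos++] = byte;
        p->sum = (uint8_t)(p->sum + byte);
        if (p->pos == p->len)
            p->state = P_CHECKSUM;
        break;
    case P_CHECKSUM:
        p->state = P_WAIT_START;
        return byte == p->sum;
    }
    return 0;
}
```

The parser never blocks waiting for input: all of its memory lives in the `Parser` struct, so it can be called from an interrupt handler or a polling loop one byte at a time.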
Let us look at a typical routine:

static void terrisctrl1_hdl4se_unit_BlockCanSetTo(sTerrisCtrl1* pobj, TerrisBlock* pBlock, int x, int y)
{
#define RETURNRESULT(res)  \
do { \
	pobj->blockCanSetToContext.result = res; \
	pobj->blockCanSetToContext.complete = 1; \
	goto BlockCanSetTo_return; \
} while (0)
	int i;
	int j;
	int yy;
	i = pobj->blockCanSetToContext.index / BLOCKSIZE;
	yy = y - BLOCKSIZE / 2 + i + 1;
	pobj->readaddr = YCOUNT - 1 - yy;
	pobj->write = 0;
	pobj->blockCanSetToContext.complete = 0;
	if (pobj->blockCanSetToContext.index > 1) {
		/* Start checking from the second cycle after entering this state; by then the data has arrived on the port */
		i = (pobj->blockCanSetToContext.index-2) / BLOCKSIZE;
		j = (pobj->blockCanSetToContext.index-2) % BLOCKSIZE;
		if (pBlock->subblock[i][j] != 0) {
			int xx, yy;
			unsigned long long line;
			xx = x - BLOCKSIZE / 2 + j;
			yy = y - BLOCKSIZE / 2 + i;
			if (yy < 0)
				goto BlockCanSetTo_return;
			if (yy >= PANELHEIGHT-1)
				RETURNRESULT(0);
			if (xx < 0)
				RETURNRESULT(0);
			if (xx >= PANELWIDTH)
				RETURNRESULT(0);
			objectCall3(pobj->readdata_unit, GetValue, pobj->readdata_index, 64, pobj->readdata);
			objectCall1(pobj->readdata, GetUint64, &line);
			line >>= xx * 4;
			line &= 0xF;
			if (line != 0)
				RETURNRESULT(0);
		}
	}
	if (pobj->blockCanSetToContext.index > BLOCKSIZE * BLOCKSIZE) {
		pobj->blockCanSetToContext.complete = 1;
		pobj->blockCanSetToContext.result = 1;
	}
BlockCanSetTo_return :
	pobj->blockCanSetToContext.index++;
	return;
}

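This routine decides whether the block can be placed at the given position; the relevant information has already been prepared in blockCanSetToContext. Our task is to check the 4x4 block cell by cell. First, every colored cell must lie within the panel; a colored cell may not be moved off screen. This is determined from its coordinates. Second, we read the panel at the position of each colored cell; if the panel already has a colored cell there, the block cannot be placed. If all 16 cells pass, the result is that the whole block can be placed at the given position. Note that the RAM read address generated here goes out one cycle late (on the Verilog side it is output through a register), and the data read comes back one more cycle later, so the value for an address issued in a given cycle only arrives two cycles afterward.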
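The two-cycle read latency can be made concrete with a small C model. This is a sketch under assumptions: `ReadPipe` and `readpipe_tick` are hypothetical names, and the model simply chains an address register and a data register, which is one plausible reading of the timing described in the text.

```c
#include <stdint.h>

#define RAM_WORDS 64

typedef struct {
    uint64_t mem[RAM_WORDS];
    uint8_t  addr_reg; /* registered read address driven by the controller */
    uint64_t data_reg; /* registered read data coming back from the RAM */
} ReadPipe;

/*
 * One clock tick. The controller presents new_addr and samples the read
 * data port. All registers update together, so the value sampled in this
 * tick belongs to the address presented two ticks earlier.
 */
uint64_t readpipe_tick(ReadPipe *p, uint8_t new_addr)
{
    uint64_t sampled = p->data_reg;                 /* what the controller sees now */
    p->data_reg = p->mem[p->addr_reg % RAM_WORDS];  /* RAM answers the previous address */
    p->addr_reg = new_addr;                         /* new address leaves through a register */
    return sampled;
}
```

This is why the routine above uses `index - 2` when interpreting the data: the address for cell n is issued while the data for cell n-2 is being examined.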
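Programming in state-machine style is certainly more work than the usual style, and the way of thinking is different. But modules built this way are closer to hardware and can work together with RTL code written in Verilog, so the change of mindset is worth it.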

10.4 Implementing the Flush-to-Display Module

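The flush-to-display module is implemented in Verilog. We single it out here because it used to be written in C and is now written in Verilog, so the two versions can be compared to see how writing in Verilog differs. It also lays a foundation for implementing more modules in Verilog later.
The code of the module is as follows: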

module flushtodisp(
    input           wClk,
    input   [3:0]   bCtrlState,
    output          wCtrlStateComplete,
    output  [5:0]   bFlushReadAddr,
    input   [63:0]  bFlushReadData,
    output          wWrite,
    output  [31:0]  bWriteAddr,
    output  [31:0]  bWriteData,
    input   [31:0]  bCtrlSpeed,
    input   [31:0]  bCtrlLevel,
    input   [31:0]  bCtrlScore,
    input   [63:0]  bNextBlock,
    input   [63:0]  bCurBlock,
    input   [15:0]  bCurBlockPos
);

    wire [31:0]   bNextBlockLo = bNextBlock[31:0];
    wire [31:0]   bNextBlockHi = bNextBlock[63:32];
    wire [31:0]   bCurBlockLo =  bCurBlock[31:0];
    wire [31:0]   bCurBlockHi = bCurBlock[63:32];
    wire [4:0]    bCurBlockX = bCurBlockPos[4:0];
    wire [4:0]    bCurBlockY = bCurBlockPos[12:8];

    /* The compiler does not yet support reg and always blocks, so registers are built directly from primitives */
    wire [7:0] wirein_readaddr, wireout_readaddr, wireout_readaddr_delay_1;
    wire[31:0] wireout_readaddr2;
    hdl4se_reg  #(6) ramreadaddr(wClk, wirein_readaddr, wireout_readaddr);
    hdl4se_reg  #(6) ramreadaddr_delay_1(wClk, wireout_readaddr, wireout_readaddr_delay_1);
    assign wirein_readaddr = (bCtrlState == `ST_FLUSHTODISP) ? wireout_readaddr + 1 : 6'b0;
    assign bFlushReadAddr = wireout_readaddr[6:1];
    assign wCtrlStateComplete = wireout_readaddr == 6'd60; 
    assign bWriteAddr = 32'hf000_0010 + wireout_readaddr_delay_1 * 4;
    assign wWrite = (bCtrlState == `ST_FLUSHTODISP) ? 1 : 0;
    wire [2:0] bWriteDataSel =  (wireout_readaddr_delay_1 < 6'd52)?3'd7:(wireout_readaddr_delay_1 - 6'd52);
    /* 
        0 -- 47: panel content --> 7
        52, 53:  nextblock0, 1 --> 0, 1
        56 : score             --> 4
        57 : level             --> 5
        58 : speed             --> 6
    */ 
    wire [4:0]  line = wireout_readaddr_delay_1[5:1];
    wire        right = wireout_readaddr_delay_1[0];
    wire [15:0] line3 = ((line + 2) == bCurBlockY) ? 16'hffff:16'b0;
    wire [15:0] line2 = ((line + 1) == bCurBlockY) ? 16'hffff:16'b0;
    wire [15:0] line1 = (line == bCurBlockY) ? 16'hffff:16'b0;
    wire [15:0] line0 = (line == (bCurBlockY+1)) ? 16'hffff:16'b0;
    wire [15:0] curblockline = (line0 & bCurBlock[15:0]) | (line1 & bCurBlock[31:16]) | (line2 & bCurBlock[47:32]) | (line3 & bCurBlock[63:48]);
    wire [31:0] curline;
    wire [31:0] selecteddata;
    
    hdl4se_bind2 #(16, 16) curlinebind(curblockline, selecteddata[31:16], curline);

    wire [31:0] leftline_0_15;
    wire [31:0] leftline = bCurBlockX[5:4]?selecteddata:(leftline_0_15 | selecteddata);
    wire [31:0] leftline_1, leftline0, leftline1, leftline2, leftline3, leftline4, leftline5, leftline6, leftline7, leftline8, leftline9;
    hdl4se_bind2 #(4, 28)       leftline_1_gen(curblockline[15:12], 28'b0, leftline_1);
    hdl4se_bind2 #(8, 24)       leftline0_gen(curblockline[15:8], 24'b0, leftline0);
    hdl4se_bind2 #(12, 20)      leftline1_gen(curblockline[15:4], 20'b0, leftline1);
    hdl4se_bind2 #(16, 16)      leftline2_gen(curblockline[15:0], 16'b0, leftline2);
    hdl4se_bind3 #(4, 16, 12)   leftline3_gen(4'b0, curblockline[15:0], 12'b0, leftline3);
    hdl4se_bind3 #(8, 16, 8)    leftline4_gen(8'b0, curblockline[15:0], 8'b0, leftline4);
    hdl4se_bind3 #(12,16, 4)    leftline5_gen(12'b0, curblockline[15:0], 4'b0, leftline5);
    hdl4se_bind2 #(16, 16)      leftline6_gen(16'b0, curblockline[15:0], leftline6);
    hdl4se_bind2 #(20, 12)      leftline7_gen(20'b0, curblockline[11:0], leftline7);
    hdl4se_bind2 #(24, 8)       leftline8_gen(24'b0, curblockline[7:0], leftline8);
    hdl4se_bind2 #(28, 4)       leftline9_gen(28'b0, curblockline[3:0], leftline9);

    hdl4se_mux16 #(32) selectleftline(
        bCurBlockX[3:0],
        leftline_1,
        leftline0,
        leftline1,
        leftline2,
        leftline3,
        leftline4,
        leftline5,
        leftline6,
        leftline7,
        leftline8,
        leftline9,
        selecteddata,
        selecteddata,
        selecteddata,
        selecteddata,
        selecteddata,
        leftline_0_15
    );

    wire [31:0] rightline_3_18;
    wire [31:0] rightline = (bCurBlockX[5:0]>=3)?(rightline_3_18|selecteddata):selecteddata;
    wire [31:0] rightline0, rightline1, rightline2, rightline3, rightline4, rightline5, rightline6, rightline7, rightline8, rightline9, rightline10;
 
    hdl4se_bind2 #(4, 28)       rightline0_gen(curblockline[15:12], 28'b0, rightline0);
    hdl4se_bind2 #(8, 24)       rightline1_gen(curblockline[15:8], 24'b0, rightline1);
    hdl4se_bind2 #(12, 20)      rightline2_gen(curblockline[15:4], 20'b0, rightline2);
    hdl4se_bind2 #(16, 16)      rightline3_gen(curblockline[15:0], 16'b0,rightline3);
    hdl4se_bind3 #(4, 16, 12)   rightline4_gen(4'b0, curblockline[15:0], 12'b0, rightline4);
    hdl4se_bind3 #(8, 16, 8)    rightline5_gen(8'b0, curblockline[15:0], 8'b0, rightline5);
    hdl4se_bind3 #(12, 16, 4)   rightline6_gen(12'b0, curblockline[15:0], 4'b0, rightline6);
    hdl4se_bind2 #(16, 16)      rightline7_gen(16'b0, curblockline[15:0], rightline7);
    hdl4se_bind2 #(20, 12)      rightline8_gen(20'b0, curblockline[11:0], rightline8);
    hdl4se_bind2 #(24, 8)       rightline9_gen(24'b0,  curblockline[7:0], rightline9);
    hdl4se_bind2 #(28, 4)       rightline10_gen(28'b0,  curblockline[3:0], rightline10);
    wire [5:0] blockx_3 = bCurBlockX[5:0] - 6'd3;
    hdl4se_mux16 #(32) selectrightline(
        blockx_3[3:0],
        selecteddata,
        selecteddata,
        selecteddata,
        selecteddata,       
        selecteddata,       
        rightline0,
        rightline1,
        rightline2,
        rightline3,
        rightline4,
        rightline5,
        rightline6,
        rightline7,
        rightline8,
        rightline9,
        rightline10,
        rightline_3_18
    );

    assign bWriteData = (curblockline != 16'b0)?(right?rightline:leftline):selecteddata;
    hdl4se_mux8 #(32) writedatasel(
        bWriteDataSel,
        bNextBlockLo,
        bNextBlockHi,
        32'b0,
        32'b0,
        bCtrlScore,
        bCtrlLevel,
        bCtrlSpeed,
        right?bFlushReadData[63:32]:bFlushReadData[31:0],
        selecteddata
    );
endmodule

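Its main job is to read data from the RAM and write it to the display panel; at certain addresses it instead sends the score, speed, level, next block, and so on directly. The more complicated part is displaying the current block overlaid on the panel's block data. The code is interspersed with many HDL4SE primitives; the pure Verilog version looks like this: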

module flushtodisp(
    input           wClk,
    input   [3:0]   bCtrlState,
    output          wCtrlStateComplete,
    output  [5:0]   bFlushReadAddr,
    input   [63:0]  bFlushReadData,
    output          wWrite,
    output  [31:0]  bWriteAddr,
    output  [31:0]  bWriteData,
    input   [31:0]  bCtrlSpeed,
    input   [31:0]  bCtrlLevel,
    input   [31:0]  bCtrlScore,
    input   [63:0]  bNextBlock,
    input   [63:0]  bCurBlock,
    input   [15:0]  bCurBlockPos
);

    wire [31:0]   bNextBlockLo = bNextBlock[31:0];
    wire [31:0]   bNextBlockHi = bNextBlock[63:32];
    wire [31:0]   bCurBlockLo =  bCurBlock[31:0];
    wire [31:0]   bCurBlockHi = bCurBlock[63:32];
    wire [5:0]    bCurBlockX = bCurBlockPos[5:0]; /* 6 bits wide, because bits [5:4] and [5:0] are referenced below */
    wire [4:0]    bCurBlockY = bCurBlockPos[12:8];

    reg [5:0] ramreadaddr;
    reg[5:0]  ramreadaddr_delay_1;
    always @(posedge wClk) begin
      ramreadaddr <= (bCtrlState == `ST_FLUSHTODISP) ? ramreadaddr + 1 : 6'b0;
      ramreadaddr_delay_1 <= ramreadaddr;
    end
    assign bFlushReadAddr = {1'b0, ramreadaddr[5:1]}; /* ramreadaddr is only 6 bits wide, so there is no bit 6 */
    assign wCtrlStateComplete = ramreadaddr == 6'd60; 
    assign bWriteAddr = 32'hf000_0010 + ramreadaddr_delay_1 * 4;
    assign wWrite = (bCtrlState == `ST_FLUSHTODISP) ? 1 : 0;
    wire [2:0] bWriteDataSel =  (ramreadaddr_delay_1 < 6'd52)?3'd7:(ramreadaddr_delay_1 - 6'd52);
    /* 
        0 -- 47: panel content --> 7
        52, 53:  nextblock0, 1 --> 0, 1
        56 : score             --> 4
        57 : level             --> 5
        58 : speed             --> 6
    */ 
    wire [4:0]  line = ramreadaddr_delay_1[5:1];
    wire        right = ramreadaddr_delay_1[0];
    wire [15:0] line3 = ((line + 2) == bCurBlockY) ? 16'hffff:16'b0;
    wire [15:0] line2 = ((line + 1) == bCurBlockY) ? 16'hffff:16'b0;
    wire [15:0] line1 = (line == bCurBlockY) ? 16'hffff:16'b0;
    wire [15:0] line0 = (line == (bCurBlockY+1)) ? 16'hffff:16'b0;
    wire [15:0] curblockline = (line0 & bCurBlock[15:0]) | (line1 & bCurBlock[31:16]) | (line2 & bCurBlock[47:32]) | (line3 & bCurBlock[63:48]);
    reg  [31:0] selecteddata; /* assigned in the always block below, so it must be a reg */
    wire [31:0] curline = {selecteddata[31:16], curblockline};
    
    reg [31:0] leftline_0_15;
    wire [31:0] leftline = bCurBlockX[5:4]?selecteddata:(leftline_0_15 | selecteddata);
   
    always @* 
    case (bCurBlockX[3:0])
        4'd00: leftline_0_15 = {28'b0, curblockline[15:12]};
        4'd01: leftline_0_15 = {24'b0, curblockline[15:8]};
        4'd02: leftline_0_15 = {20'b0, curblockline[15:4]};
        4'd03: leftline_0_15 = {16'b0, curblockline[15:0]};
        4'd04: leftline_0_15 = {12'b0, curblockline[15:0], 4'b0};
        4'd05: leftline_0_15 = { 8'b0, curblockline[15:0], 8'b0};
        4'd06: leftline_0_15 = { 4'b0, curblockline[15:0], 12'b0};
        4'd07: leftline_0_15 = {curblockline[15:0], 16'b0};
        4'd08: leftline_0_15 = {curblockline[11:0], 20'b0};
        4'd09: leftline_0_15 = {curblockline[7:0], 24'b0};
        4'd10: leftline_0_15 = {curblockline[3:0], 28'b0};
        4'd11: leftline_0_15 = 32'b0;
        4'd12: leftline_0_15 = 32'b0;
        4'd13: leftline_0_15 = 32'b0;
        4'd14: leftline_0_15 = 32'b0;
        4'd15: leftline_0_15 = 32'b0;
    endcase

    reg [31:0] rightline_3_18;
    wire [31:0] rightline = (bCurBlockX[5:0]>=3)?(rightline_3_18|selecteddata):selecteddata;
    wire [5:0] blockx_3 = bCurBlockX[5:0] - 6'd3;

    always @* 
    case (blockx_3[3:0])
        4'd00: rightline_3_18 = 32'b0;
        4'd01: rightline_3_18 = 32'b0;
        4'd02: rightline_3_18 = 32'b0;
        4'd03: rightline_3_18 = 32'b0;
        4'd04: rightline_3_18 = 32'b0;        
        4'd05: rightline_3_18 = {28'b0, curblockline[15:12]};
        4'd06: rightline_3_18 = {24'b0, curblockline[15:8]};
        4'd07: rightline_3_18 = {20'b0, curblockline[15:4]};
        4'd08: rightline_3_18 = {16'b0, curblockline[15:0]};
        4'd09: rightline_3_18 = {12'b0, curblockline[15:0], 4'b0};
        4'd10: rightline_3_18 = { 8'b0, curblockline[15:0], 8'b0};
        4'd11: rightline_3_18 = { 4'b0, curblockline[15:0], 12'b0};
        4'd12: rightline_3_18 = {curblockline[15:0], 16'b0};
        4'd13: rightline_3_18 = {curblockline[11:0], 20'b0};
        4'd14: rightline_3_18 = {curblockline[7:0], 24'b0};
        4'd15: rightline_3_18 = {curblockline[3:0], 28'b0};
    endcase

  
    assign bWriteData = (curblockline != 16'b0)?(right?rightline:leftline):selecteddata;
    always @(*)
    case (bWriteDataSel)
        3'd0: selecteddata = bNextBlockLo;
        3'd1: selecteddata = bNextBlockHi;
        3'd2: selecteddata = 32'b0;
        3'd3: selecteddata = 32'b0;
        3'd4: selecteddata = bCtrlScore;
        3'd5: selecteddata = bCtrlLevel;
        3'd6: selecteddata = bCtrlSpeed;
        3'd7: selecteddata = right?bFlushReadData[63:32]:bFlushReadData[31:0];
    endcase
endmodule

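It was written in the earlier form mainly because our compiler has not caught up yet: always blocks, signal concatenation, and reg declarations are not supported.
The original C code obtained each overlaid row of data with the following function: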

static unsigned int get_line_data(sTerrisCtrl* pobj, int line, int right)
{
	unsigned int data;
	int i;
	data = 0;
	line = YCOUNT - 1 - line;
	for (i = 0; i < 8; i++) {
		int xx, yy, ii, jj;
		unsigned int c;
		xx = 7 - i + right * 8;
		yy = line;
		c = 0;
		jj = xx - currentblock.posx + BLOCKSIZE / 2;
		ii = yy - currentblock.posy + BLOCKSIZE / 2;
		if (ii >= 0 && ii < BLOCKSIZE && jj >= 0 && jj < BLOCKSIZE) {
			c = currentblock.subblock[ii][jj];
		}

		data <<= 4;
		if (c == 0)
			data |= terrisPanel[line][7 - i + right * 8] & 0xf;
		else
			data |= c & 0xf;

	}
	return data;
}

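As you can see, the C version works step by step in a for loop, which, once compiled to target code, amounts to a fairly complex state machine. The Verilog version uses a single combinational circuit. The intermediate signal generation looks complicated, partly perhaps because the current compiler lacks some features (such as concatenation), but in any case writing the same function as Verilog RTL is somewhat more complex.
To run it, the Verilog code above is compiled into the target code topmodule.c, which is then compiled together with the controller code and the rest into a Tetris game. It plays much the same as the earlier version written in C.
For comparison, we also implemented a complete flush-to-display module in C: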

/*
** HDL4SE: a software Verilog synthesis and simulation platform
** Copyright (C) 2021-2021, raoxianhong
** LCOM: Lightweight Component Object Model
** Copyright (C) 2021-2021, raoxianhong
** All rights reserved.
**
** Redistribution and use in source and binary forms, with or without
** modification, are permitted provided that the following conditions are met:
**
** * Redistributions of source code must retain the above copyright notice,
**   this list of conditions and the following disclaimer.
** * Redistributions in binary form must reproduce the above copyright notice,
**   this list of conditions and the following disclaimer in the documentation
**   and/or other materials provided with the distribution.
** * The name of the author may be used to endorse or promote products
**   derived from this software without specific prior written permission.
**
** THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
** AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
** IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
** ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
** LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
** CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
** SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
** INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
** CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
** ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
** THE POSSIBILITY OF SUCH DAMAGE.
*/

/*
* flushtodisp.c
  Change log:
	202106220539: rxh, initial version
*/
#include "stdlib.h" 
#include "stdio.h"
#include "string.h"
#include "object.h"
#include "dlist.h"
#include "bignumber.h"
#include "hdl4secell.h"
#include "terris.h"

/*
00, input           wClk,
01, input[3:0]   bCtrlState,
02, output          wCtrlStateComplete,
03, output[5:0]   bFlushReadAddr,
04, input[63:0]  bFlushReadData,
05, output          wWrite,
06, output[31:0]  bWriteAddr,
07, output[31:0]  bWriteData,
08, input[31:0]  bCtrlSpeed,
09, input[31:0]  bCtrlLevel,
10, input[31:0]  bCtrlScore,
11, input[63:0]  bNextBlock,
12, input[63:0]  bCurBlock,
13, input[15:0]  bCurBlockPos
*/

/* wClk is not counted */
#define INPUTPORTCOUNT 8
typedef struct _sTerrisFlushToDisp {
	OBJECT_HEADER
	INTERFACE_DECLARE(IHDL4SEUnit)
	HDL4SEUNIT_VARDECLARE
	DLIST_VARDECLARE
	
	IHDL4SEModule** parent;
	char* name;

	IBigNumber**  inputdata;
	IHDL4SEUnit** input_unit[INPUTPORTCOUNT];
	int           input_index[INPUTPORTCOUNT];
	
	int index;
	int flushreadaddr; /* models the read address register, one cycle behind index */
	int flushreadaddr_last; /* models the delayed read address register, one cycle behind flushreadaddr */

}sTerrisFlushToDisp;

OBJECT_FUNCDECLARE(flushtodisp, CLSID_TERRIS_FLUSHTODISP);
HDL4SEUNIT_FUNCDECLARE(flushtodisp, CLSID_TERRIS_FLUSHTODISP, sTerrisFlushToDisp);
DLIST_FUNCIMPL(flushtodisp, CLSID_TERRIS_FLUSHTODISP, sTerrisFlushToDisp);

OBJECT_FUNCIMPL(flushtodisp, sTerrisFlushToDisp, CLSID_TERRIS_FLUSHTODISP);


QUERYINTERFACE_BEGIN(flushtodisp, CLSID_TERRIS_FLUSHTODISP)
QUERYINTERFACE_ITEM(IID_HDL4SEUNIT, IHDL4SEUnit, sTerrisFlushToDisp)
QUERYINTERFACE_ITEM(IID_DLIST, IDList, sTerrisFlushToDisp)
QUERYINTERFACE_END

static const char* flushtodispModuleInfo()
{
	return "1.0.0-20210622.0539 Terris Flush to Disp module";
}

static int flushtodispCreate(const PARAMITEM* pParams, int paramcount, HOBJECT* pObject)
{
	sTerrisFlushToDisp* pobj;
	int i;
	pobj = (sTerrisFlushToDisp*)malloc(sizeof(sTerrisFlushToDisp));
	if (pobj == NULL)
		return -1;
	*pObject = 0;
	HDL4SEUNIT_VARINIT(pobj, CLSID_TERRIS_FLUSHTODISP);
	INTERFACE_INIT(IHDL4SEUnit, pobj, flushtodisp, hdl4se_unit);
	DLIST_VARINIT(pobj, flushtodisp);
	
	pobj->name = NULL;
	pobj->parent = NULL;
	
	for (i = 0;i< INPUTPORTCOUNT;i++)
		pobj->input_unit[i] = NULL;

	pobj->inputdata = bigintegerCreate(64);

	pobj->index = 0;

	for (i = 0; i < paramcount; i++) {
		if (pParams[i].name == PARAMID_HDL4SE_UNIT_NAME) {
			if (pobj->name != NULL)
				free(pobj->name);
			pobj->name = strdup(pParams[i].pvalue);
		} 
		else if (pParams[i].name == PARAMID_HDL4SE_UNIT_PARENT) {
			pobj->parent = (IHDL4SEModule **)pParams[i].pvalue;
		}
	}

	/* return the created object */
	OBJECT_RETURN_GEN(flushtodisp, pobj, pObject, CLSID_TERRIS_FLUSHTODISP);
	return EIID_OK;
}

static void flushtodispDestroy(HOBJECT object)
{
	sTerrisFlushToDisp* pobj;
	int i;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	if (pobj->name != NULL)
		free(pobj->name);
	for (i = 0; i < INPUTPORTCOUNT; i++)
		objectRelease(pobj->input_unit[i]);
	objectRelease(pobj->inputdata);

	memset(pobj, 0, sizeof(sTerrisFlushToDisp));
	free(pobj);
}

static int flushtodispValid(HOBJECT object)
{
	sTerrisFlushToDisp* pobj;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	return 1;
}

static int flushtodisp_hdl4se_unit_GetName(HOBJECT object, const char** pname)
{
	sTerrisFlushToDisp* pobj;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	*pname = pobj->name;
	return 0;
}

static int flushtodisp_hdl4se_unit_Connect(HOBJECT object, int index, HOBJECT from, int fromindex)
{
#define CONNECTPORT(ind, innerind) \
	if (index == ind) { \
		if (0 == objectQueryInterface(from, IID_HDL4SEUNIT, (void**)&unit)) { \
			objectRelease(pobj->input_unit[innerind]); \
			pobj->input_unit[innerind] = unit; \
			pobj->input_index[innerind] = fromindex; \
		} \
	}

	sTerrisFlushToDisp* pobj;
	IHDL4SEUnit** unit = NULL;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	CONNECTPORT( 1, 0); /* bCtrlState */
	CONNECTPORT( 4, 1); /* bFlushReadData */
	CONNECTPORT( 8, 2); /* bCtrlSpeed */
	CONNECTPORT( 9, 3); /* bCtrlLevel */
	CONNECTPORT(10, 4); /* bCtrlScore */
	CONNECTPORT(11, 5); /* bNextBlock */
	CONNECTPORT(12, 6); /* bCurBlock */
	CONNECTPORT(13, 7); /* bCurBlockPos */
	return 0;
}

static int flushtodisp_hdl4se_unit_ConnectPart(HOBJECT object, int index, int start, int width, HOBJECT from, int fromindex)
{
	sTerrisFlushToDisp* pobj;
	IHDL4SEUnit** unit = NULL;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	return 0;
}

static unsigned int flushtodisp_hdl4se_unit_GetWriteData(sTerrisFlushToDisp* pobj)
{
	/*
	* Selected by the value of flushreadaddr_last:
		0 -- 47: panel content --> 7
		52, 53:  nextblock0, 1 --> 0, 1
		56 : score             --> 4
		57 : level             --> 5
		58 : speed             --> 6
	*/
	if (pobj->flushreadaddr_last < 48) {
		unsigned long long data;
		unsigned long long curblockdata;
		unsigned long long curblockline;
		unsigned int blockpos;
		unsigned int blockx, blocky;
		int i;

		int y = pobj->flushreadaddr_last >> 1;
		
		objectCall3(pobj->input_unit[7], GetValue, pobj->input_index[7], 16, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint32, &blockpos);
		blockx = blockpos & 0xff;
		blocky = blockpos >> 8;
		
		objectCall3(pobj->input_unit[6], GetValue, pobj->input_index[6], 64, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint64, &curblockdata);

		objectCall3(pobj->input_unit[1], GetValue, pobj->input_index[1], 64, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint64, &data);

		curblockline = 0;
		for (i = 0; i < 4; i++) {
			if (y == blocky + 2 - i) {
				curblockline = (curblockdata >> (i * 16)) & 0xffff;
				if (blockx < 3)
					curblockline >>= ((3-blockx) * 4);
				else
					curblockline <<= (blockx-3) * 4;
				break;
			}
		}
		data |= curblockline;
		if (pobj->flushreadaddr_last & 1) {
			return data >> 32;
		}
		else {
			return data & 0xffffffff;
		}
	}
	else if (pobj->flushreadaddr_last == 52 || pobj->flushreadaddr_last == 53) {
		/*nextblock*/
		unsigned long long nextblock;
		objectCall3(pobj->input_unit[5], GetValue, pobj->input_index[5], 64, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint64, &nextblock);
		if (pobj->flushreadaddr_last == 52) {
			return nextblock & 0xffffffff;
		}
		else {
			return nextblock >> 32;
		}
	}
	else if (pobj->flushreadaddr_last == 56) {
		/*score*/
		unsigned int data;
		objectCall3(pobj->input_unit[4], GetValue, pobj->input_index[4], 32, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint32, &data);
		return data;
	}
	else if (pobj->flushreadaddr_last == 57) {
		/*level*/
		unsigned int data;
		objectCall3(pobj->input_unit[3], GetValue, pobj->input_index[3], 32, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint32, &data);
		return data;
	}
	else if (pobj->flushreadaddr_last == 58) {
		/*speed*/
		unsigned int data;
		objectCall3(pobj->input_unit[2], GetValue, pobj->input_index[2], 32, pobj->inputdata);
		objectCall1(pobj->inputdata, GetUint32, &data);
		return data;
	}
	return 0;
}

static int flushtodisp_hdl4se_unit_GetValue(HOBJECT object, int index, int width, IBigNumber ** value)
{
	int i;
	sTerrisFlushToDisp* pobj;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	unsigned int ctrlstate;
	objectCall3(pobj->input_unit[0], GetValue, pobj->input_index[0], 32, pobj->inputdata);
	objectCall1(pobj->inputdata, GetUint32, &ctrlstate);
	if (index == 2) { /* wCtrlStateComplete */
		objectCall1(value, AssignUint32, (pobj->flushreadaddr == 60)?1:0); 
	}
	else if (index == 3) {/* bFlushReadAddr */
		objectCall1(value, AssignUint32, (pobj->flushreadaddr >> 1)); 
	}
	else if (index == 5) {/* wWrite */
		objectCall1(value, AssignUint32, (ctrlstate == ST_FLUSHTODISP ? 1 : 0));
	}
	else if (index == 6) {/* wWriteAddr */
		objectCall1(value, AssignUint32, 0xf0000010 + pobj->flushreadaddr_last * 4);
	}
	else if (index == 7) {/* wWriteData */
		objectCall1(value, AssignUint32, flushtodisp_hdl4se_unit_GetWriteData(pobj));
	}
	return 0;
}

static int flushtodisp_hdl4se_unit_ClkTick(HOBJECT object)
{
	sTerrisFlushToDisp* pobj;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	unsigned int ctrlstate;
	objectCall3(pobj->input_unit[0], GetValue, pobj->input_index[0], 32, pobj->inputdata);
	objectCall1(pobj->inputdata, GetUint32, &ctrlstate);
	if (ctrlstate == ST_FLUSHTODISP) {
		pobj->index++;
	}
	else {
		pobj->index = 0;
	}
	return 0;
}

static int flushtodisp_hdl4se_unit_Setup(HOBJECT object)
{
	sTerrisFlushToDisp* pobj;
	pobj = (sTerrisFlushToDisp*)objectThis(object);
	pobj->flushreadaddr_last = pobj->flushreadaddr;
	pobj->flushreadaddr = pobj->index;
	return 0;
}

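Adding an attribute list that declares the LCOM implementation in front of the Verilog module is enough for the compiler to bind to the module implemented in C: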

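/* 
    We have implemented both a Verilog version and a C version of the
    flush-to-display module. The attribute list below gives the CLSID of
    the C implementation. Comment out the HDL4SE="LCOM" line and the
    compiler links the Verilog version; leave the line in place and it
    links the C version.
*/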
(* 
  HDL4SE="LCOM", 
  CLSID="d588064-fcd3-43cc-b131-1a64c74d9e86", 
  softmodule="hdl4se" 
*) 
module flushtodisp(
    input           wClk,
    input   [3:0]   bCtrlState,
    output          wCtrlStateComplete,
    output  [5:0]   bFlushReadAddr,
    input   [63:0]  bFlushReadData,
    output          wWrite,
    output  [31:0]  bWriteAddr,
    output  [31:0]  bWriteData,
    input   [31:0]  bCtrlSpeed,
    input   [31:0]  bCtrlLevel,
    input   [31:0]  bCtrlScore,
    input   [63:0]  bNextBlock,
    input   [63:0]  bCurBlock,
    input   [15:0]  bCurBlockPos
);
endmodule

Comparing the Verilog version with the C version side by side helps in understanding how a Verilog module can be implemented in C.
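As a rough illustration of what the C version computes for each display write, the sketch below isolates three small pieces of the flushtodisp listing: the display write address, the RAM read address, and the choice between the low and high 32-bit half of a 64-bit RAM word. The helper names are hypothetical and exist only for this example; the constant 0xf0000010 and the shift and mask logic are taken from the listing.

```c
#include <stdint.h>

/* Hypothetical helper: display address for a given flush index,
   mirroring 0xf0000010 + flushreadaddr_last * 4 in the listing. */
static uint32_t flush_write_addr(uint32_t flushaddr)
{
	return 0xf0000010u + flushaddr * 4u;
}

/* Hypothetical helper: the RAM read address is the flush index
   divided by two, because one 64-bit word feeds two 32-bit writes. */
static uint32_t flush_read_addr(uint32_t flushaddr)
{
	return flushaddr >> 1;
}

/* Hypothetical helper: even flush indices carry the low half of the
   64-bit word, odd indices carry the high half. */
static uint32_t flush_select_half(uint64_t line, uint32_t flushaddr)
{
	if (flushaddr & 1u)
		return (uint32_t)(line >> 32);
	return (uint32_t)(line & 0xffffffffu);
}
```

In the real module the read address is taken from flushreadaddr and the write side from flushreadaddr_last, one cycle later; the sketch leaves that timing out and only shows the arithmetic.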

10.5 Writing the current block to the board: how to write a module in C that Verilog can call

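In this section we pick a simple module to implement. When the block can no longer move down a row, the current block has to be written into the board; the corresponding state of the state machine is ST_BLOCKWRITE. We implement the C version first, and use it to show step by step how to write a C module that can interoperate with Verilog. The goal is an LCOM object that implements the IHDL4SEUnit interface and carries out this module's function.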

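10.5.1 Creating the GUID

First, generate a new GUID with GUIDGEN. I used the Create GUID tool built into Microsoft Visual Studio Community 2019:
[Figures 2 and 3: screenshots of the Create GUID tool in Visual Studio]
When this dialog appears, the GUID has already been generated. Select format 2 (DEFINE_GUID), copy it to the clipboard, paste it into terris.h, and edit it into a one-line CLSID definition. This CLSID is the unique identifier of the BLOCKWRITE object in the system. The code I ended up with is: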

DEFINE_GUID(CLSID_TERRIS_FLUSHTODISP, 0xd588064, 0xfcd3, 0x43cc, 0xb1, 0x31, 0x1a, 0x64, 0xc7, 0x4d, 0x9e, 0x86);
DEFINE_GUID(CLSID_TERRIS_BLOCKWRITE, 0xb0d75037, 0x831, 0x49e5, 0xbb, 0xd0, 0xf6, 0xb5, 0xe0, 0x7c, 0xbb, 0x51);

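10.5.2 Designing the interface of the Verilog model for the class

Add a Verilog source file, blockwrite.v, and carefully convert the CLSID generated above into the format used in Verilog. Every group of digits must match exactly, otherwise our compiler cannot find the class written in C: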

/* blockwrite.v */

(* 
  HDL4SE="LCOM", 
  CLSID="b0d75037-0831-49e5-bbd0-f6b5e07cbb51", 
  softmodule="hdl4se" 
*)
module blockwrite(
    input           wClk,
    input   [3:0]   bCtrlState,
    output          wCtrlStateComplete,
    output  [5:0]   bBWReadAddr,
    input   [63:0]  bBWReadData,
    output          wBWWrite,
    output  [5:0]   bBWWriteAddr,
    output  [63:0]  bBWWriteData,
    input   [63:0]  bCurBlock,
    input   [15:0]  bCurBlockPos
);
endmodule

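This module writes the current block into the board, so it must receive the state of the state machine to make sure it only works in the ST_BLOCKWRITE state. It outputs wCtrlStateComplete to indicate that its task is done; the main controller monitors this signal and performs the state transition once it is asserted.
Note that although the operation is a write of the current block, our RAM has no per-bit write mask, so it is really a read-modify-write: each board row is read, merged with the block row, and written back. Both the read and the write signals of the RAM therefore have to be connected.
Finally, the module also needs the current block shape and position provided by the main controller. That gives the 10 ports listed above.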
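The hand conversion from the DEFINE_GUID fields to the attribute string can be sketched as a small formatting routine. This is only an illustration, not part of HDL4SE: guid_to_string is a hypothetical helper, and it assumes the zero-padded 8-4-4-4-12 layout used for the blockwrite CLSID above (0x831 becomes 0831).

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: format DEFINE_GUID-style fields as a
   zero-padded 8-4-4-4-12 lowercase hex string.
   Returns the number of characters snprintf would write. */
static int guid_to_string(char *buf, size_t size,
                          uint32_t d1, uint16_t d2, uint16_t d3,
                          const uint8_t d4[8])
{
	return snprintf(buf, size,
		"%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
		(unsigned)d1, (unsigned)d2, (unsigned)d3,
		(unsigned)d4[0], (unsigned)d4[1], (unsigned)d4[2], (unsigned)d4[3],
		(unsigned)d4[4], (unsigned)d4[5], (unsigned)d4[6], (unsigned)d4[7]);
}
```

Called with the fields of CLSID_TERRIS_BLOCKWRITE and a buffer of at least 37 bytes, it produces the string used in blockwrite.v. Generating the string this way avoids dropping a leading zero by hand.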

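10.5.3 Adjusting the main module's connections and the RAM access arbitration

Since there is now one more external module, the connections in the main module and the RAM arbitration code have to be adjusted. This is easy to follow, so here is the code directly: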

`include "hdl4secell.v"

/* Tetris controller V1, written in C */
(* 
  HDL4SE="LCOM", 
  CLSID="158fa52-ca8b-4551-9b87-fc7cff466e2a", 
  softmodule="hdl4se" 
*) 
module teris_ctrl
  (
    input           wClk,
    input           nwReset,
    output          wWrite,
    output [5:0]    bWriteAddr,
    output [63:0]   bWriteData,
    output [5:0]    bReadAddr,
    input  [63:0]   bReadData,
    input  [31:0]   bKeyData,
    input           wStateComplete,
    output          wStateChange,
    output [3:0]    bState,
    output [31:0]   bScore,
    output [31:0]   bSpeed,
    output [31:0]   bLevel,
    output [63:0]   bNextBlock,
    output [63:0]   bCurBlock,
    output [15:0]   bCurBlockPos
  );
endmodule

`define ST_INIT                 0
`define ST_FLUSHTODISP          1
`define ST_CHECKKEY             2
`define ST_CHECKBLOCKCANSETTO   3
`define ST_BLOCKWRITE           4
`define ST_CHECKLINE            5
`define ST_COPYLINES            6

`include "flushtodisp.v"
`include "blockwrite.v"

module main(
    input wClk, nwReset,
    output wWrite,
    output [31:0] bWriteAddr,
    output [31:0] bWriteData,
    output [3:0]  bWriteMask,
    output wRead,
    output [31:0] bReadAddr,
    input [31:0]  bReadData);

    wire        wram_Write;
    wire [5:0]  bram_WriteAddr;
    wire [63:0] bram_WriteData;
    wire [5:0]  bram_ReadAddr;
    wire [63:0] bram_ReadData;

/* Frame buffer memory */
    hdl4se_ram1p  #(64, 5) ram_0(
      wClk,
      wram_Write,
      bram_WriteAddr,
      bram_WriteData,
      bram_ReadAddr,
      bram_ReadData
    );

/* Game controller */
    wire          wCtrlWrite;
    wire [5:0]    bCtrlWriteAddr;
    wire [63:0]   bCtrlWriteData;
    wire [5:0]    bCtrlReadAddr;
    wire [63:0]   bCtrlReadData;
    wire [31:0]   bCtrlKeyData;
    wire          wCtrlStateComplete;
    wire          wCtrlStateChange;
    wire [3:0]    bCtrlState;
    wire [31:0]   bCtrlSpeed;
    wire [31:0]   bCtrlLevel;
    wire [31:0]   bCtrlScore;
    wire [63:0]   bNextBlock;
    wire [63:0]   bCurBlock;
    wire [15:0]   bCurBlockPos;

	teris_ctrl ctrl(wClk, nwReset, wCtrlWrite, bCtrlWriteAddr, bCtrlWriteData, 
                    bCtrlReadAddr,bCtrlReadData, bCtrlKeyData,
                    wCtrlStateComplete, wCtrlStateChange, bCtrlState,
                    bCtrlScore, bCtrlSpeed, bCtrlLevel, 
                    bNextBlock, bCurBlock, bCurBlockPos);

    wire [5:0]  bFlushReadAddr;
    wire [63:0] bFlushReadData; 
    wire        wFlushCtrlStateComplete;
                
    /* Screen refresh */
    flushtodisp flusher(wClk,
            bCtrlState, wFlushCtrlStateComplete,
            bFlushReadAddr, bFlushReadData,
            wWrite, bWriteAddr, bWriteData,
            bCtrlSpeed, bCtrlLevel, bCtrlScore, 
            bNextBlock, bCurBlock, bCurBlockPos
            );

    wire [5:0] bBWReadAddr, bBWWriteAddr;
    wire [63:0] bBWReadData, bBWWriteData;
    wire wBWCtrlStateComplete, wBWWrite;
    /* Write the current block to the board */
    blockwrite blockwriter(wClk,
            bCtrlState, wBWCtrlStateComplete,
            bBWReadAddr, bBWReadData,
            wBWWrite, bBWWriteAddr, bBWWriteData,
            bCurBlock, bCurBlockPos
            );
    /* State-complete selection; the instance name must differ from mux_ramWrite below */
    hdl4se_mux8 #(1) mux_ctrlStateComplete(
        bCtrlState,
        1'b0,                       // 0: ST_INIT
	    wFlushCtrlStateComplete,    // 1: ST_FLUSHTODISP,
	    1'b0,                       // 2: ST_CHECKKEY,
	    1'b0,                       // 3: ST_CHECKBLOCKCANSETTO,
	    wBWCtrlStateComplete,       // 4: ST_BLOCKWRITE,
	    1'b0,                       // 5: ST_CHECKLINE,
        1'b0,                       // 6: ST_COPYLINES
        1'b0,                       // 7: empty
        wCtrlStateComplete
        );


    /* Arbitration of the RAM read and write ports */
    hdl4se_mux8 #(1) mux_ramWrite(
        bCtrlState,
        wCtrlWrite,     // 0: ST_INIT
	    1'b0,           // 1: ST_FLUSHTODISP,
	    wCtrlWrite,     // 2: ST_CHECKKEY,
	    wCtrlWrite,     // 3: ST_CHECKBLOCKCANSETTO,
	    wBWWrite,       // 4: ST_BLOCKWRITE,
	    wCtrlWrite,     // 5: ST_CHECKLINE,
        wCtrlWrite,     // 6: ST_COPYLINES
        1'b0,           // 7: empty
        wram_Write
        );

    hdl4se_mux8 #(6) mux_ramWriteAddr(
        bCtrlState,
        bCtrlWriteAddr, // 0: ST_INIT
	    6'b0,           // 1: ST_FLUSHTODISP,
	    bCtrlWriteAddr, // 2: ST_CHECKKEY,
	    bCtrlWriteAddr, // 3: ST_CHECKBLOCKCANSETTO,
	    bBWWriteAddr,   // 4: ST_BLOCKWRITE,
	    bCtrlWriteAddr, // 5: ST_CHECKLINE,
        bCtrlWriteAddr, // 6: ST_COPYLINES
        6'b0,           // 7: empty
        bram_WriteAddr
        );

    hdl4se_mux8 #(64) mux_ramWriteData(
        bCtrlState,
        bCtrlWriteData, // 0: ST_INIT
	    64'b0,           // 1: ST_FLUSHTODISP,
	    bCtrlWriteData, // 2: ST_CHECKKEY,
	    bCtrlWriteData, // 3: ST_CHECKBLOCKCANSETTO,
	    bBWWriteData,   // 4: ST_BLOCKWRITE,
	    bCtrlWriteData, // 5: ST_CHECKLINE,
        bCtrlWriteData, // 6: ST_COPYLINES
        64'b0,           // 7: empty
        bram_WriteData
        );

    hdl4se_mux8 #(6) mux_ramReadAddr(
        bCtrlState,
        bCtrlReadAddr, // 0: ST_INIT
	    bFlushReadAddr,// 1: ST_FLUSHTODISP,
	    bCtrlReadAddr, // 2: ST_CHECKKEY,
	    bCtrlReadAddr, // 3: ST_CHECKBLOCKCANSETTO,
	    bBWReadAddr,   // 4: ST_BLOCKWRITE,
	    bCtrlReadAddr, // 5: ST_CHECKLINE,
        bCtrlReadAddr, // 6: ST_COPYLINES
        6'b0,          // 7: empty
        bram_ReadAddr
        );

    assign bCtrlReadData = bram_ReadData;
    assign bFlushReadData = bram_ReadData;
    assign bBWReadData = bram_ReadData;

/* The key state is read continuously */
    assign wRead = 1'b1;
    assign bReadAddr = 32'hF000_0000;
    assign bCtrlKeyData = bReadData; /* key data goes straight into the controller */

endmodule

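The arbitration logic is where most of the changes are. We have left slots in it for the modules that will be moved out of the controller one after another.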
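To make the arbitration easier to follow for readers coming from software, here is a small C model of the write-enable mux. It is a sketch under assumptions: mux8 and ram_write_enable are hypothetical names, and masking the select value to three bits is my assumption about how a 4-bit bCtrlState drives an 8-way mux. The per-state input table is copied from the wiring of mux_ramWrite in the listing.

```c
#include <stdint.h>

enum {
	ST_INIT = 0, ST_FLUSHTODISP, ST_CHECKKEY, ST_CHECKBLOCKCANSETTO,
	ST_BLOCKWRITE, ST_CHECKLINE, ST_COPYLINES
};

/* Hypothetical software model of an 8-way multiplexer: the low three
   bits of sel pick one of the eight inputs (assumption, see lead-in). */
static uint64_t mux8(unsigned sel, const uint64_t in[8])
{
	return in[sel & 7u];
}

/* RAM write-enable arbitration as wired in main: the block writer owns
   the write port in ST_BLOCKWRITE, nobody writes in ST_FLUSHTODISP or
   in the unused slot 7, and the controller owns the port otherwise. */
static unsigned ram_write_enable(unsigned state, unsigned ctrl_write, unsigned bw_write)
{
	const uint64_t in[8] = {
		ctrl_write, 0, ctrl_write, ctrl_write,
		bw_write, ctrl_write, ctrl_write, 0
	};
	return (unsigned)mux8(state, in);
}
```

Moving another state out of the controller later only changes one entry of the table, which is the point of laying the mux out one input per state.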

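10.5.5 Implementing the LCOM class

This part is boilerplate prescribed by LCOM, in the spirit of the classical eight-part essay (八股文): it has to be written according to the rules, or the LCOM system will not recognize it. This restricts the software engineer's freedom, but it also brings great benefits for the engineering management of the software. It is exactly this boilerplate that keeps the overall software architecture stable. The HDL4SE system has now grown to more than thirty thousand lines of code, and the design has changed several times along the way, yet the architecture has not become disorderly, let alone shown signs of collapse. LCOM has served as the skeleton here: a program that does not follow the LCOM format will not get through the compiler, which technically enforces the consistency and stability of the architecture.

10.5.5.1 The basic part

LCOM's conventions require a piece of connecting code as the foundation of the object's implementation. If the port design above counts as the "opening" (起) of the LCOM eight-part essay, this part can be regarded as the "development" (承):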

/*
* blockwrite.c
  Change log:
	202106221411: rxh, initial version
*/
#include "stdlib.h" 
#include "stdio.h"
#include "string.h"
#include "object.h"
#include "dlist.h"
#include "bignumber.h"
#include "hdl4secell.h"
#include "terris.h"

/*
00, input           wClk,
01, input   [3:0]   bCtrlState,
02,	output          wCtrlStateComplete,
03,	output  [5:0]   bBWReadAddr,
04,	input   [63:0]  bBWReadData,
05,	output          wBWWrite,
06,	output  [5:0]   bBWWriteAddr,
07,	output  [63:0]  bBWWriteData,
08,	input   [63:0]  bCurBlock,
09,	input   [15:0]  bCurBlockPos
*/

/* wClk is not counted */
#define INPUTPORTCOUNT 4
typedef struct _sTerrisBlockWrite {
	OBJECT_HEADER
	INTERFACE_DECLARE(IHDL4SEUnit)
	HDL4SEUNIT_VARDECLARE
	DLIST_VARDECLARE
	
	IHDL4SEModule** parent;
	char* name;

	IBigNumber**  inputdata;
	IHDL4SEUnit** input_unit[INPUTPORTCOUNT];
	int           input_index[INPUTPORTCOUNT];
	
	int index;
	int readindex; /* models the read address register, one cycle behind index */
	int readindex_1;
}sTerrisBlockWrite;

OBJECT_FUNCDECLARE(blockwrite, CLSID_TERRIS_BLOCKWRITE);
HDL4SEUNIT_FUNCDECLARE(blockwrite, CLSID_TERRIS_BLOCKWRITE, sTerrisBlockWrite);
DLIST_FUNCIMPL(blockwrite, CLSID_TERRIS_BLOCKWRITE, sTerrisBlockWrite);

OBJECT_FUNCIMPL(blockwrite, sTerrisBlockWrite, CLSID_TERRIS_BLOCKWRITE);


QUERYINTERFACE_BEGIN(blockwrite, CLSID_TERRIS_BLOCKWRITE)
QUERYINTERFACE_ITEM(IID_HDL4SEUNIT, IHDL4SEUnit, sTerrisBlockWrite)
QUERYINTERFACE_ITEM(IID_DLIST, IDList, sTerrisBlockWrite)
QUERYINTERFACE_END

static const char* blockwriteModuleInfo()
{
	return "1.0.0-20210622.1411 Terris BlockWrite module";
}

static int blockwriteCreate(const PARAMITEM* pParams, int paramcount, HOBJECT* pObject)
{
	sTerrisBlockWrite* pobj;
	int i;
	pobj = (sTerrisBlockWrite*)malloc(sizeof(sTerrisBlockWrite));
	if (pobj == NULL)
		return -1;
	*pObject = 0;
	HDL4SEUNIT_VARINIT(pobj, CLSID_TERRIS_BLOCKWRITE);
	INTERFACE_INIT(IHDL4SEUnit, pobj, blockwrite, hdl4se_unit);
	DLIST_VARINIT(pobj, blockwrite);
	
	pobj->name = NULL;
	pobj->parent = NULL;
	
	for (i = 0;i< INPUTPORTCOUNT;i++)
		pobj->input_unit[i] = NULL;

	pobj->inputdata = bigintegerCreate(64);

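	pobj->index = 0;
	/* start the state variables at the idle value used by ClkTick, so that
	   GetValue never reads uninitialized memory before the first Setup */
	pobj->readindex = -1;
	pobj->readindex_1 = -1;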

	for (i = 0; i < paramcount; i++) {
		if (pParams[i].name == PARAMID_HDL4SE_UNIT_NAME) {
			if (pobj->name != NULL)
				free(pobj->name);
			pobj->name = strdup(pParams[i].pvalue);
		} 
		else if (pParams[i].name == PARAMID_HDL4SE_UNIT_PARENT) {
			pobj->parent = (IHDL4SEModule **)pParams[i].pvalue;
		}
	}

	/* return the created object */
	OBJECT_RETURN_GEN(blockwrite, pobj, pObject, CLSID_TERRIS_BLOCKWRITE);
	return EIID_OK;
}

static void blockwriteDestroy(HOBJECT object)
{
	sTerrisBlockWrite* pobj;
	int i;
	pobj = (sTerrisBlockWrite*)objectThis(object);
	if (pobj->name != NULL)
		free(pobj->name);
	for (i = 0; i < INPUTPORTCOUNT; i++)
		objectRelease(pobj->input_unit[i]);
	objectRelease(pobj->inputdata);

	memset(pobj, 0, sizeof(sTerrisBlockWrite));
	free(pobj);
}

static int blockwriteValid(HOBJECT object)
{
	sTerrisBlockWrite* pobj;
	pobj = (sTerrisBlockWrite*)objectThis(object);
	return 1;
}

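For every input port (except wClk) we keep a unit object and an index. When the input is later connected to another module from outside, these record the connection.

10.5.5.2 Managing connection information at module instantiation

As a Verilog module, the object is connected to other modules, nets, ports and so on when it is instantiated. The approach in HDL4SE is that during model construction each module's input port actively connects to its upstream driver; inside a module, correspondingly, an output port actively connects to its internal driver. Connecting is done by calling the connect function defined in the IHDL4SEUnit interface, which we implement as follows: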

static int blockwrite_hdl4se_unit_Connect(HOBJECT object, int index, HOBJECT from, int fromindex)
{
#define CONNECTPORT(ind, innerind) \
	if (index == ind) { \
		if (0 == objectQueryInterface(from, IID_HDL4SEUNIT, (void**)&unit)) { \
			objectRelease(pobj->input_unit[innerind]); \
			pobj->input_unit[innerind] = unit; \
			pobj->input_index[innerind] = fromindex; \
		} \
	}

	sTerrisBlockWrite* pobj;
	IHDL4SEUnit** unit = NULL;
	pobj = (sTerrisBlockWrite*)objectThis(object);
	CONNECTPORT( 1, 0); /* bCtrlState */
	CONNECTPORT( 4, 1); /* bBWReadData */
	CONNECTPORT( 8, 2); /* bCurBlock */
	CONNECTPORT( 9, 3); /* bCurBlockPos */
	return 0;
}

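This function records the port connections made when the module is instantiated during model construction. The actual calls are normally generated by the compiler and might look like this:

/*
Corresponding Verilog instantiation: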
blockwrite blockwriter(wClk,
            bCtrlState, wBWCtrlStateComplete,
            bBWReadAddr, bBWReadData,
            wBWWrite, bBWWriteAddr, bBWWriteData,
            bCurBlock, bCurBlockPos
            );
*/
modules[  3] = hdl4seCreateUnit2(module, "b0d75037-0831-49e5-bbd0-f6b5e07cbb51", "", "blockwriter");
	objectCall3(modules[3], Connect, 0, unit, 0);
	objectCall3(modules[3], Connect, 1, nets[13], 0);
	objectCall3(nets[27], Connect, 0, modules[3], 2);
	objectCall3(nets[23], Connect, 0, modules[3], 3);
	objectCall3(modules[3], Connect, 4, nets[25], 0);
	objectCall3(nets[28], Connect, 0, modules[3], 5);
	objectCall3(nets[24], Connect, 0, modules[3], 6);
	objectCall3(nets[26], Connect, 0, modules[3], 7);
	objectCall3(modules[3], Connect, 8, nets[18], 0);
	objectCall3(modules[3], Connect, 9, nets[19], 0);

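As can be seen, this code creates an instance of the blockwrite module. The compiler finds that blockwrite is an LCOM object, so it calls hdl4seCreateUnit2 to locate and create the C module by its CLSID, and then connects each input port to the object named in the instantiation. Inside the module, the Connect function records this connection table. For objects attached to output ports, the compiler calls their Connect functions to attach them to this module's output ports. In this way the connection information of the instantiation is preserved in full.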
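Stripped of the LCOM machinery, the bookkeeping done by Connect amounts to a small table from input ports to upstream drivers. The sketch below is a simplified stand-in, not the HDL4SE implementation: PortTable and port_connect are hypothetical, reference counting (objectQueryInterface, objectRelease) is left out, and unlike the real function it returns the inner slot so the mapping is visible. The port numbers 1, 4, 8 and 9 are the ones used by the CONNECTPORT calls in blockwrite.

```c
#include <stddef.h>

#define INPUT_COUNT 4

/* Hypothetical, simplified stand-in for the per-port bookkeeping:
   which upstream object drives each input, and through which of
   its ports. */
typedef struct {
	const void *unit[INPUT_COUNT];
	int         index[INPUT_COUNT];
} PortTable;

/* External port numbers of blockwrite's inputs (wClk excluded), in
   inner-slot order: bCtrlState, bBWReadData, bCurBlock, bCurBlockPos. */
static const int input_ports[INPUT_COUNT] = { 1, 4, 8, 9 };

/* Record a connection. Returns the inner slot used, or -1 when the
   port is not an input tracked by this module. */
static int port_connect(PortTable *t, int port, const void *from, int fromindex)
{
	int i;
	for (i = 0; i < INPUT_COUNT; i++) {
		if (input_ports[i] == port) {
			t->unit[i] = from;
			t->index[i] = fromindex;
			return i;
		}
	}
	return -1;
}
```

Output ports such as port 2 (wCtrlStateComplete) are not stored here: the downstream object records that link in its own table when its Connect is called.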

10.5.5.3 Sampling of the output ports by downstream modules at run time

This is done through the module's GetValue function, implemented as follows:

static unsigned long long blockwrite_hdl4se_unit_GetWriteData(sTerrisBlockWrite* pobj, int blockx, int blocky)
{
	int i;
	int y;
	unsigned long long line, curblock, curblockline;
	objectCall3(pobj->input_unit[1], GetValue, pobj->input_index[1], 64, pobj->inputdata);
	objectCall1(pobj->inputdata, GetUint64, &line);
	objectCall3(pobj->input_unit[2], GetValue, pobj->input_index[2], 64, pobj->inputdata);
	objectCall1(pobj->inputdata, GetUint64, &curblock);
	curblockline = (curblock >> ((3-pobj->readindex_1) * 16)) & 0xffff;
	if (blockx < 3)
		curblockline >>= ((3 - blockx) * 4);
	else
		curblockline <<= (blockx - 3) * 4;
	return line | curblockline;
}

static int blockwrite_hdl4se_unit_GetValue(HOBJECT object, int index, int width, IBigNumber ** value)
{
	sTerrisBlockWrite* pobj;
	pobj = (sTerrisBlockWrite*)objectThis(object);
	unsigned int blockpos;
	unsigned int blockx, blocky;
	objectCall3(pobj->input_unit[3], GetValue, pobj->input_index[3], 16, pobj->inputdata);
	objectCall1(pobj->inputdata, GetUint32, &blockpos);
	blockx = blockpos & 0xff;
	blocky = blockpos >> 8;
	if (index == 2) { /* wCtrlStateComplete */
		objectCall1(value, AssignUint32, (pobj->readindex_1 >= 4)?1:0);
	}
	else if (index == 3) {/* bBWReadAddr */
		objectCall1(value, AssignUint32, pobj->readindex + blocky - 2);
	}
	else if (index == 5) {/* wBWWrite */
		objectCall1(value, AssignUint32, ( (pobj->readindex_1 >= 0) && (pobj->readindex_1 <= 3) ) ? 1 : 0);
	}
	else if (index == 6) {/* bBWWriteAddr */
		objectCall1(value, AssignUint32, pobj->readindex_1 + blocky - 2);
	}
	else if (index == 7) {/* bBWWriteData */
		objectCall1(value, AssignUint64, blockwrite_hdl4se_unit_GetWriteData(pobj, blockx, blocky));
	}
	return 0;
}

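During model construction a downstream module calls its own Connect function to attach its input port to an output port of this module. At run time, whenever the downstream module needs the value of that input, it calls this module's GetValue function. GetValue distinguishes the ports by port number. While producing its output, this module can in turn request input values from its upstream modules, and this is what drives the computation through the whole network of modules.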
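The value returned for bBWWriteData is the "modify" step of the read-modify-write. As a minimal sketch, the function below repeats the shift logic of blockwrite_hdl4se_unit_GetWriteData without the LCOM calls. The name merge_block_row and its parameters are hypothetical; row plays the role of readindex_1 and is assumed to be in the range 0 to 3.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the listing: take row `row` (0..3)
   of a 4x4 block packed as four 16-bit rows, align it to column
   blockx (4 bits per cell), and OR it into the 64-bit board line. */
static uint64_t merge_block_row(uint64_t line, uint64_t block, int row, unsigned blockx)
{
	uint64_t bits = (block >> ((3 - row) * 16)) & 0xffffu;
	if (blockx < 3)
		bits >>= (3 - blockx) * 4;   /* block hangs over the left edge */
	else
		bits <<= (blockx - 3) * 4;   /* move right by whole 4-bit cells */
	return line | bits;
}
```

Because the row is ORed into the line that was just read, cells already on the board are preserved, which is exactly what a RAM without a write mask needs.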

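10.5.5.4 Handling the clock signal

For a combinational module, every output is derived from the inputs, so the module needs no stored values: in GetValue it simply calls the upstream modules' GetValue to obtain its inputs, processes them, and returns the result downstream. For a sequential module, however, the outputs depend on internal state. To make the result independent of evaluation order, internal state is updated in two steps. First, the ClkTick function of the IHDL4SEUnit interface computes the new state and stores it in a temporary variable. Since every value obtained through GetValue at this point is still the state of the previous cycle, the result of this step does not depend on the order in which the modules are evaluated. After all modules have run ClkTick and stored their new state in their temporaries, each module runs Setup, which copies the temporary state produced by ClkTick into the state variables.
This means that when GetValue needs internal state to produce an output, it must use the state variables and never the temporaries. This module has two internal state variables; its ClkTick and Setup functions are defined as follows: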

static int blockwrite_hdl4se_unit_ClkTick(HOBJECT object)
{
	sTerrisBlockWrite* pobj;
	pobj = (sTerrisBlockWrite*)objectThis(object);
	unsigned int ctrlstate;
	objectCall3(pobj->input_unit[0], GetValue, pobj->input_index[0], 32, pobj->inputdata);
	objectCall1(pobj->inputdata, GetUint32, &ctrlstate);
	if (ctrlstate == ST_BLOCKWRITE) {
		pobj->index++;
	}
	else {
		pobj->index = -1;
	}
	return 0;
}

static int blockwrite_hdl4se_unit_Setup(HOBJECT object)
{
	sTerrisBlockWrite* pobj;
	pobj = (sTerrisBlockWrite*)objectThis(object);
	pobj->readindex_1 = pobj->readindex;
	pobj->readindex = pobj->index;
	return 0;
}

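Here index is the temporary, while readindex and readindex_1 are the state variables. GetValue above must not reference index; it may only use readindex or readindex_1. The reason is that the calling order of ClkTick and GetValue cannot be guaranteed. index is computed in ClkTick, so if GetValue exposed it, some modules would see the value from before the update and others the value after it, depending on the order of the calls. The result would then depend on call order and could be wrong.
Note also that ClkTick must not modify the state variables, and Setup must not call GetValue to fetch upstream values (an upstream module may already have run its Setup by then, which would make the value order-dependent again). These rules must be followed when programming against IHDL4SEUnit.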
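The two-phase rule can be tried out on a toy model. The sketch below is not HDL4SE code; Reg, reg_clk_tick, reg_setup and simulate_cycle are hypothetical names. Two registers each copy the other on every clock, which only works if both read the old value of the other, so it is a compact check that the ClkTick/Setup split makes the result independent of visiting order.

```c
/* Hypothetical two-register model: each register copies the other
   on every clock, so after one cycle the two values are swapped. */
typedef struct Reg {
	int value;             /* state variable: what GetValue exposes */
	int next;              /* temporary: written only in ClkTick    */
	const struct Reg *src; /* upstream register                     */
} Reg;

static int  reg_get_value(const Reg *r) { return r->value; }
static void reg_clk_tick(Reg *r)        { r->next = reg_get_value(r->src); }
static void reg_setup(Reg *r)           { r->value = r->next; }

/* One simulated clock cycle: all ClkTick calls first, then all Setup
   calls. `reverse` changes the visiting order; the outcome must not
   depend on it. */
static void simulate_cycle(Reg *a, Reg *b, int reverse)
{
	Reg *order[2];
	int i;
	order[0] = reverse ? b : a;
	order[1] = reverse ? a : b;
	for (i = 0; i < 2; i++) reg_clk_tick(order[i]);
	for (i = 0; i < 2; i++) reg_setup(order[i]);
}
```

If reg_clk_tick wrote straight into value instead of next, the second register visited would read the already updated value and both registers would end up equal, which is the order dependence the rules above are meant to prevent.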

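10.5.5.5 Linking with the main program

With these "turn" (转) parts of the eight-part essay done, we register the module in the main program, compile the relevant Verilog files into target code (which here is also C code), and link everything with the other modules. An application in which C and Verilog work together is then complete. This part can be regarded as the "conclusion" (合) of the LCOM eight-part essay: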

IHDL4SEUnit** hdl4seCreate_main(IHDL4SEModule** parent, const char* instanceparam, const char* name);
extern int (*A_u_t_o_registor_terrisctrl)();
extern int (*A_u_t_o_registor_terrisctrl1)();
extern int (*A_u_t_o_registor_flushtodisp)();
extern int (*A_u_t_o_registor_blockwrite)();

int main(int argc, char* argv[])
{
	int i;
	int width;
	int count, unitcount;
	IHDL4SEUnit** sim_unit;
	IHDL4SEWaveOutput** vcdfile;
	A_u_t_o_registor_terrisctrl();
	A_u_t_o_registor_terrisctrl1();
	A_u_t_o_registor_flushtodisp();
	A_u_t_o_registor_blockwrite();
	sim = hdl4sesimCreateSimulator();
	topmodule = hdl4seCreate_main(NULL, "", "main");
	......

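The functions whose names start with A_u_t_o were originally designed for VxWorks. Under VxWorks the module symbol table is accessible, so after a module is loaded into memory, the symbols beginning with A_u_t_o_registor_ can be looked up and called one by one to register the LCOM objects in the module. On other systems, such as a Windows DLL, the objects can be registered in DllMain, and Linux shared objects (.so files) offer a similar mechanism. Static libraries, on the other hand, have no good way of doing this, so the registration functions have to be called by hand in the main program. The basic cell library is registered in one go the first time any basic cell is created: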

static int hdl4secell_registed = 0;

extern OFUNCPTR A_u_t_o_registor_hdl4se_ram1p;
extern OFUNCPTR A_u_t_o_registor_hdl4se_ram2p;
extern OFUNCPTR A_u_t_o_registor_hdl4se_reg;
extern OFUNCPTR A_u_t_o_registor_hdl4se_unop;
extern OFUNCPTR A_u_t_o_registor_hdl4se_binop;
extern OFUNCPTR A_u_t_o_registor_hdl4se_wire;
extern OFUNCPTR A_u_t_o_registor_hdl4se_const;
extern OFUNCPTR A_u_t_o_registor_hdl4se_module;
extern OFUNCPTR A_u_t_o_registor_hdl4se_bind2;
extern OFUNCPTR A_u_t_o_registor_hdl4se_bind3;
extern OFUNCPTR A_u_t_o_registor_hdl4se_bind4;
extern OFUNCPTR A_u_t_o_registor_hdl4se_split1;
extern OFUNCPTR A_u_t_o_registor_hdl4se_split2;
extern OFUNCPTR A_u_t_o_registor_hdl4se_split4;
extern OFUNCPTR A_u_t_o_registor_hdl4se_mux2;
extern OFUNCPTR A_u_t_o_registor_hdl4se_mux4;
extern OFUNCPTR A_u_t_o_registor_hdl4se_mux8;
extern OFUNCPTR A_u_t_o_registor_hdl4se_mux16;

IHDL4SEUnit** hdl4seCreateUnit(IHDL4SEModule** parent, IIDTYPE clsid, const char* instanceparam, const char* name)
{
	PARAMITEM param[3];
	IHDL4SEUnit** result = NULL;
	param[0].name = PARAMID_HDL4SE_UNIT_INSTANCE_PARAMETERS;
	param[0].pvalue = (void *)instanceparam;
	param[1].name = PARAMID_HDL4SE_UNIT_NAME;
	param[1].pvalue = (void *)name;
	param[2].name = PARAMID_HDL4SE_UNIT_PARENT;
	param[2].pvalue = parent;
	if (hdl4secell_registed == 0) {
		A_u_t_o_registor_hdl4se_ram1p();
		A_u_t_o_registor_hdl4se_ram2p();
		A_u_t_o_registor_hdl4se_reg();
		A_u_t_o_registor_hdl4se_unop();
		A_u_t_o_registor_hdl4se_binop();
		A_u_t_o_registor_hdl4se_wire();
		A_u_t_o_registor_hdl4se_const();
		A_u_t_o_registor_hdl4se_module();
		A_u_t_o_registor_hdl4se_bind2();
		A_u_t_o_registor_hdl4se_bind3();
		A_u_t_o_registor_hdl4se_bind4();
		A_u_t_o_registor_hdl4se_split1();
		A_u_t_o_registor_hdl4se_split2();
		A_u_t_o_registor_hdl4se_split4();
		A_u_t_o_registor_hdl4se_mux2();
		A_u_t_o_registor_hdl4se_mux4();
		A_u_t_o_registor_hdl4se_mux8();
		A_u_t_o_registor_hdl4se_mux16();
		hdl4secell_registed = 1;
	}

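Note that when a module exists in both a Verilog version and a C version, the choice is controlled by the attribute list at the start of the Verilog module. If HDL4SE="LCOM" is commented out, the Verilog version is used. When the compiler finds HDL4SE="LCOM" together with a CLSID attribute, it generates code that calls the C module; otherwise it compiles the Verilog body of the module as the implementation.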
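The selection rule just described fits in a few lines of C. This is only a sketch of the decision, not the compiler's actual code: ModuleAttrs and use_lcom_implementation are hypothetical, and representing a missing or commented-out attribute as a NULL pointer is an assumption made for the example.

```c
#include <string.h>

/* Hypothetical attribute record for one Verilog module; a NULL
   pointer means the attribute is absent (for example commented out). */
typedef struct {
	const char *hdl4se; /* value of HDL4SE=..., or NULL */
	const char *clsid;  /* value of CLSID=..., or NULL  */
} ModuleAttrs;

/* Returns 1 when the module should be bound to its C (LCOM)
   implementation, 0 when the Verilog body should be compiled. */
static int use_lcom_implementation(const ModuleAttrs *a)
{
	return a->hdl4se != NULL
	    && strcmp(a->hdl4se, "LCOM") == 0
	    && a->clsid != NULL;
}
```

With both attributes present the function returns 1; dropping either of them falls back to the Verilog body.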

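10.6 Summary

Following the steps in 10.5, the modules of the Tetris game were implemented one by one. Along the way, the interface of the main controller was adjusted in order to move functionality out into external modules. Each module now has both a Verilog version and a C version. The code has been uploaded to git.
The process was somewhat forced, but it shows that the overall structure of the compilation system and the simulation control system is workable. Once the compilation system supports always blocks, along with the if and case statements inside them, writing Verilog code should become easier. A remaining problem is that the original BNF grammar in IEEE 1364-2005 does not encode operator precedence in expressions, so expressions are currently parsed with the wrong precedence.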

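[See also]
1. HDL4SE: Learning Verilog for Software Engineers (9)
2. HDL4SE: Learning Verilog for Software Engineers (8)
3. HDL4SE: Learning Verilog for Software Engineers (7)
4. HDL4SE: Learning Verilog for Software Engineers (6)
5. HDL4SE: Learning Verilog for Software Engineers (5)
6. HDL4SE: Learning Verilog for Software Engineers (4)
7. HDL4SE: Learning Verilog for Software Engineers (3)
8. HDL4SE: Learning Verilog for Software Engineers (2)
9. HDL4SE: Learning Verilog for Software Engineers (1)
10. LCOM: Lightweight Component Object Model
11. LCOM: Interfaces with Data
12. Tool download: bison 3.7 and flex 2.6.4 for 64-bit Windows
13. git: verilog-parser open-source project
14. git: HDL4SE project
15. git: LCOM project
16. git: GLFW project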
