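Background: We have written the 3x3 convolution IP core and added the HLS pragmas to it. The program runs by calling this 3x3 convolution IP core, and the HLS pragmas map it onto a hardware structure. We now need to run HLS on the IP core code.
Goal: run HLS on the convolution IP core.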
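Contents
1. Writing testconvBench
1.1 Building and running the program with cmake on Linux
1.2 Pitfalls and bugs
1.3 Writing the testbench
Convolution size
Convolution and result comparison
2. C simulation
Workflow for fixing bugs
3. Several bugs and their fixes
3.1 The reg format problem
3.2 The DRAM interface problem
3.3 The DATAFLOW error
3.4 Debugging N_PE
4. Locating the bugs
4.1 processInputChannel
function instantiate
WBRAM
Loop 'L_CH_OUT' in 'processAll_channelOut'
OBRAM does not generate RTL ports
4.2 HLS console output for the whole IP core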
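The original program depends on OpenCV and calls the convolution too many times, so it cannot serve as the HLS testbench. We need a simple testbench that first confirms the IP core is correct and usable.
Create a folder HLS_test with a subfolder src that holds the source files, then create a CMakeLists.txt file in HLS_test: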
cmake_minimum_required(VERSION 2.8)
project(main)
set(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(QMAKE_CXXFLAGS "-std=c++11")
AUX_SOURCE_DIRECTORY(./src DIR_SRCS)
add_executable(test_convBench ${DIR_SRCS})
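The first line sets the minimum CMake version, and project names the project. The two set lines only define variables: CUDA_USE_STATIC_CUDA_RUNTIME matters only to CMake's CUDA module, which this project does not use, and QMAKE_CXXFLAGS is a qmake variable that CMake ignores, so it does not actually select C++11 (in CMake that would go through CMAKE_CXX_FLAGS). AUX_SOURCE_DIRECTORY collects the source files under ./src into DIR_SRCS, and add_executable builds the executable test_convBench from them.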
xxr@gpu-SYS-7048GR-TR:~/Desktop/xxr2/HLS_test$ cmake .
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xxr/Desktop/xxr2/HLS_test
xxr@gpu-SYS-7048GR-TR:~/Desktop/xxr2/HLS_test$ make
Scanning dependencies of target test_convBench
[ 25%] Building CXX object CMakeFiles/test_convBench.dir/src/pBox.cpp.o
[ 50%] Building CXX object CMakeFiles/test_convBench.dir/src/fpgaAcc.cpp.o
[ 75%] Building CXX object CMakeFiles/test_convBench.dir/src/test_convBench.cpp.o
[100%] Linking CXX executable test_convBench
[100%] Built target test_convBench
xxr@gpu-SYS-7048GR-TR:~/Desktop/xxr2/HLS_test$ ./test_convBench
test SUCCESS!
void convolution_3x3(const Weight *weightIn, const pBox *pboxIn, pBox *outpBox)
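First we fix the form of the convolution. We use the configuration in the table below, with randomly generated input and weights, and run the 3*3 convolution on both the PS side and the IP core side to check that the outputs agree.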
InputSize | KernelSize | Stride | Padding | OutputSize
25*25*64 | 3*3*64 | 2 | Valid (no padding) | 12*12*64
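The output size in the table follows from the usual formula for a strided convolution. A minimal sketch, where the helper name conv_output_size is hypothetical and not part of the original code:

```cpp
// Output width/height of a convolution along one spatial dimension.
// pad is the padding added on each side; "Valid" padding means pad = 0.
// Integer division rounds down, which drops any incomplete last window.
int conv_output_size(int inputSize, int kernelSize, int stride, int pad) {
    return (inputSize + 2 * pad - kernelSize) / stride + 1;
}
```

For the table row above, conv_output_size(25, 3, 2, 0) gives (25 - 3) / 2 + 1 = 12.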
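Two related points:
Cause of the bug:
The lesson from the bug above is that the test layer must not be too large, otherwise OBRAM runs out of space.
Running and checking showed that the stride handling is wrong: conv3*3 and convolution agree when stride=1 but disagree when stride=2. The flow is long and hard to trace, so this is postponed until intermediate values can be printed. For now the stride is set to 1.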
InputSize | KernelSize | Stride | Padding | OutputSize
24*24*64 | 3*3*64 | 1 | Valid (no padding) | 22*22*64
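When tracking down a stride mismatch like the one described above, a small and obviously correct software reference can help. The following is only a sketch: the function name conv_reference, the channel-major (CHW) input layout and the [out][in][ky][kx] weight layout are assumptions, and the real pBox and Weight layouts may differ.

```cpp
#include <cstddef>
#include <vector>

// Naive valid (no padding) convolution used as a reference.
// Assumed layouts: input [inCh][inSize][inSize], weights [outCh][inCh][k][k],
// output [outCh][outSize][outSize].
std::vector<float> conv_reference(const std::vector<float>& in, int inSize, int inCh,
                                  const std::vector<float>& w, int k, int outCh,
                                  int stride) {
    const int outSize = (inSize - k) / stride + 1;
    std::vector<float> out(static_cast<std::size_t>(outCh) * outSize * outSize, 0.0f);
    for (int co = 0; co < outCh; ++co)
        for (int y = 0; y < outSize; ++y)
            for (int x = 0; x < outSize; ++x) {
                float acc = 0.0f;
                for (int ci = 0; ci < inCh; ++ci)
                    for (int ky = 0; ky < k; ++ky)
                        for (int kx = 0; kx < k; ++kx) {
                            // The stride scales the window origin, not the kernel offset.
                            const int iy = y * stride + ky;
                            const int ix = x * stride + kx;
                            acc += in[(ci * inSize + iy) * inSize + ix] *
                                   w[((co * inCh + ci) * k + ky) * k + kx];
                        }
                out[(co * outSize + y) * outSize + x] = acc;
            }
    return out;
}
```

Running it on a tiny input (for example 5*5*1 with stride 2) and printing the result next to the IP core output makes it easier to see at which output pixel the two start to diverge.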
//conv parameters
int inputSize=23; int inChannelNum=32;
int outputSize=21; int OutChannelNum=64;
int kernelSize=3; int Stride=1;
int Input_Pixels=inputSize*inputSize*inChannelNum;
int Output_Pixels=outputSize*outputSize*OutChannelNum;
int weightkernel_Pixels=9*inChannelNum*OutChannelNum;
//conv variable
Weight weightIn;
pBox featureIn;
pBox conv_PL_out;
pBox conv_PS_out;
//initialize conv weight variable
weightIn.out_ChannelNum=OutChannelNum;
weightIn.in_ChannelNum=inChannelNum;
weightIn.kernelSize=kernelSize;
weightIn.stride=Stride;
weightIn.leftPad=0;
weightIn.rightPad=0;
weightIn.pdata=(float *)malloc(sizeof(float)*weightkernel_Pixels);
//weightIn.pbias=(float *)malloc(sizeof(float)*OutChannelNum);
for (int i=0;i
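The buffer sizes are derived from the convolution parameters and the corresponding memory is allocated. The weights and the input features are filled with random values.
The IP core convolution is then compared with the software convolution to check that the results agree.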
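The initialization loop in the listing above is cut off in this copy. As an illustration only, a buffer can be filled with random values along the following lines; the helper name fill_random and the value range [-1, 1] are assumptions, not the original code.

```cpp
#include <cstdlib>

// Fill a float buffer with pseudo-random values in [-1, 1].
// Hypothetical helper; the original testbench loop is not shown in full.
void fill_random(float* data, int count) {
    for (int i = 0; i < count; ++i) {
        data[i] = 2.0f * static_cast<float>(std::rand()) /
                      static_cast<float>(RAND_MAX) - 1.0f;
    }
}
```

It would be called once per allocated buffer, for example fill_random(weightIn.pdata, weightkernel_Pixels).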
//conv in PS
convolution(&weightIn,&featureIn,&conv_PS_out);
//conv in PL
convolution_3x3(&weightIn,&featureIn,&conv_PL_out);
//compare in PS and PL
int error=0;
for(int i=0;i
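The comparison loop is also cut off in this copy. A minimal sketch of such a check is given below; the function name count_mismatches and the use of a tolerance are assumptions. A tolerance is a common choice because the PS and PL versions may accumulate floating-point sums in a different order.

```cpp
#include <cmath>

// Count output elements whose PS and PL results differ by more than tol.
// Hypothetical helper; the original comparison code is not shown in full.
int count_mismatches(const float* ps, const float* pl, int count, float tol) {
    int errors = 0;
    for (int i = 0; i < count; ++i) {
        if (std::fabs(ps[i] - pl[i]) > tol) {
            ++errors;
        }
    }
    return errors;
}
```

A return value of 0 would correspond to the "PS and PL conv match" message printed by the testbench.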
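Related post: FPGA实践教程(一)用HLS将c程序生成IPcore (FPGA practice tutorial, part 1: generating an IP core from a C program with HLS) https://blog.csdn.net/weixin_36474809/article/details/80597166
Once the IP core runs correctly in HLS_test on the server, the sources can be put directly into HLS for C simulation.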
INFO: [HLS 200-10] Setting target device to 'xc7z035ffg676-2'
INFO: [SIM 211-2] *************** CSIM start ***************
INFO: [SIM 211-4] CSIM will launch GCC as the compiler.
Compiling ../../../../src/test_convBench.cpp in debug mode
Compiling ../../../../src/fpgaAcc.cpp in debug mode
Generating csim.exe
Test Start SUCCESS!
Variable init SUCCESS!
Conv in PS SUCCESS!
33convConv in PL SUCCESS!
Compare DONE SUCCESS!
PS and PL conv match SUCCESS!
INFO: [SIM 211-1] CSim done with 0 errors.
3 errors generated.
ERROR: [HLS 200-70] Compilation errors found:
Pragma processor failed: In file included from src/fpgaAcc.cpp:1:
src/fpgaAcc.cpp:166:11: error: use of undeclared identifier 'reg'
float px=reg(input_ptr[load_pixel_offset+in_channel_pixel_offset]);
^
src/fpgaAcc.cpp:173:15: error: use of undeclared identifier 'reg'
float read = reg(weight_DRAM_ptr[weight_loc]);
^
src/fpgaAcc.cpp:208:12: error: use of undeclared identifier 'reg'
float px=reg(ImageCache::get_IBRAM_Pixel(IBRAM_line_offset,pixel_col_to_load,
^
3 errors generated.
Failed checking during preprocessing.
while executing
"source /home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/csynth.tcl"
invoked from within
"hls::main /home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/csynth.tcl"
("uplevel" body line 1)
invoked from within
"uplevel 1 hls::main {*}$args"
(procedure "hls_proc" line 5)
invoked from within
"hls_proc $argv"
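By convention .h files are C headers and .hpp files are C++ headers, so the .h file can be renamed to .hpp. The error remained after renaming.
The likely cause is the #ifndef __SYNTHESIS__ placed before the definition of reg. Vivado HLS defines __SYNTHESIS__ during synthesis, so code inside that guard is excluded and reg is undeclared at synthesis time. It is unclear why zynqNet adds this guard. We removed it and the error went away.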
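To illustrate the point, here is a sketch of a reg helper that is visible to both C simulation and synthesis because it is not wrapped in an #ifndef __SYNTHESIS__ guard. The body and the pragma are assumptions for illustration and are not copied from the original source.

```cpp
// Identity helper intended to mark a value that should pass through a
// separate stage. It must NOT sit inside "#ifndef __SYNTHESIS__", because
// that guard removes the code when the tool synthesizes the design.
template <class T>
T reg(T x) {
#pragma HLS inline off  // assumed pragma: keep the call as its own unit
    return x;
}
```

A regular C++ compiler ignores the HLS pragma, so in C simulation reg simply returns its argument, as in float px = reg(input_ptr[0]);.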
INFO: [XFORM 203-603] Inlining function 'MemoryController::writeBackOutputChannel' into 'convolution_3x3' (src/fpgaAcc.cpp:107).
INFO: [HLS 200-111] Finished Standard Transforms Time (s): cpu = 00:00:34 ; elapsed = 00:00:26 . Memory (MB): peak = 361.926 ; gain = 13.668 ; free physical = 607 ; free virtual = 33200
INFO: [HLS 200-10] Checking synthesizability ...
INFO: [XFORM 203-602] Inlining function 'ImageCache::writeNextChannelPixel_2_IBRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:336->src/fpgaAcc.cpp:327->src/fpgaAcc.cpp:84) automatically.
ERROR: [SYNCHK 200-11] src/fpgaAcc.cpp:259: Argument 'weightIn.pdata' of function 'convolution_3x3' (src/fpgaAcc.cpp:45) has an unsynthesizable type (possible cause(s): pointer to pointer or global pointer).
ERROR: [SYNCHK 200-61] src/fpgaAcc.cpp:174: unsupported memory access on variable 'weightIn.pdata' which is (or contains) an array with unknown size at compile time.
INFO: [SYNCHK 200-10] 2 error(s), 0 warning(s).
ERROR: [HLS 200-70] Synthesizability check failed.
command 'ap_source' returned error code
while executing
ERROR: [SYNCHK 200-11] src/fpgaAcc.cpp:259: Argument 'weightIn.pdata' of function 'convolution_3x3' (src/fpgaAcc.cpp:45) has an unsynthesizable type (possible cause(s): pointer to pointer or global pointer).
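weightIn.pdata has a type that HLS cannot synthesize, for example a pointer to a pointer or a global pointer. Here weightIn is passed as a pointer to a struct whose member pdata is itself a pointer.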
ERROR: [SYNCHK 200-61] src/fpgaAcc.cpp:174: unsupported memory access on variable 'weightIn.pdata' which is (or contains) an array with unknown size at compile time.
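weightIn.pdata is (or contains) an array whose size is unknown at compile time.
So we must add pragmas that tell HLS how to synthesize the interface.
See: MTCNN的FPGA实现(四)接口的HLS (FPGA implementation of MTCNN, part 4: HLS of the interface) https://blog.csdn.net/weixin_36474809/article/details/84940846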
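As a rough sketch of what such interface pragmas look like, the top function below takes flat float pointers instead of a struct with pointer members and maps them to an AXI master port. The function name conv_top_sketch, the depth values, the axilite bundle name and the placeholder body are all assumptions; only the bundle name memorybus is taken from the console log.

```cpp
// Hypothetical top function with flat pointer arguments.
// m_axi maps each pointer to an AXI master port for DRAM access;
// depth is only a size hint used for C/RTL co-simulation (assumed value).
// s_axilite exposes scalar arguments and block control as registers.
void conv_top_sketch(const float* weights, const float* input,
                     float* output, int num) {
#pragma HLS INTERFACE m_axi depth=1024 port=weights offset=slave bundle=memorybus
#pragma HLS INTERFACE m_axi depth=1024 port=input offset=slave bundle=memorybus
#pragma HLS INTERFACE m_axi depth=1024 port=output offset=slave bundle=memorybus
#pragma HLS INTERFACE s_axilite port=num bundle=axilite
#pragma HLS INTERFACE s_axilite port=return bundle=axilite
    for (int i = 0; i < num; ++i) {
        output[i] = input[i] * weights[i];  // placeholder body, not a convolution
    }
}
```

With this shape the tool sees plain arrays behind each port, which avoids the pointer-to-pointer argument reported by SYNCHK 200-11.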
WARNING: [XFORM 203-562] Loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'processAll_channelOu' has unknown bound because it has multiple exiting blocks.
WARNING: [XFORM 203-713] Function 'processInputChannel..1' (src/fpgaAcc.cpp:226:1) failed dataflow checking: A dataflow region cannot be instantiated from with a pipelined loop (src/fpgaAcc.cpp:226:1). Ignoring pipeline directive to allow the dataflow directive to take precedence. This behavior can be disabled by using 'config_compile -disable_dataflow_pipeline_check'.
Instruction does not dominate all uses!
%tmp_60 = add i32 %WeightsCache_inChan_1, %tmp_59
%memorybus_addr_rd_re = call i1 @_ssdm_op_ReadReq.m_axi.floatP(float* %memorybus_addr, i32 %tmp_60), !dbg !1031
Broken module found, compilation aborted!
Stack dump:
0. Running pass 'Function Pass Manager' on module '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/.autopilot/db/a.o.2.bc'.
1. Running pass 'Module Verifier' on function '@convolution_3x3'
/mnt/workspace/Xilinx/Vivado/2017.4/bin/loader: line 194: 15937 Aborted (core dumped) "$RDI_PROG" "$@"
Finished C synthesis.
Key errors:
WARNING: [XFORM 203-713] Function 'processInputChannel..1' (src/fpgaAcc.cpp:226:1) failed dataflow checking: A dataflow region cannot be instantiated from with a pipelined loop (src/fpgaAcc.cpp:226:1). Ignoring pipeline directive to allow the dataflow directive to take precedence. This behavior can be disabled by using 'config_compile -disable_dataflow_pipeline_check'.
Instruction does not dominate all uses!
%tmp_60 = add i32 %WeightsCache_inChan_1, %tmp_59
%memorybus_addr_rd_re = call i1 @_ssdm_op_ReadReq.m_axi.floatP(float* %memorybus_addr, i32 %tmp_60), !dbg !1031
Broken module found, compilation aborted!
Stack dump:
0. Running pass 'Function Pass Manager' on module '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/.autopilot/db/a.o.2.bc'.
1. Running pass 'Module Verifier' on function '@convolution_3x3'
/mnt/workspace/Xilinx/Vivado/2017.4/bin/loader: line 194: 15937 Aborted (core dumped) "$RDI_PROG" "$@"
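This suggests that many of the optimization directives may not have been applied: the console shows several messages of this kind, and it is unclear whether the directives took effect.
Judging from the later console output, they probably were applied, for two reasons: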
WARNING: [XFORM 203-631] Renaming function 'ProcessingElement::processAll_channelOut' to 'processAll_channelOu' (src/fpgaAcc.cpp:192:43)
INFO: [XFORM 203-811] Inferring bus burst read of variable length on port 'memorybus' (src/fpgaAcc.cpp:178:15).
WARNING: [XFORM 203-562] Loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'processAll_channelOu' has unknown bound because it has multiple exiting blocks.
Instruction does not dominate all uses!
%tmp_64 = add i32 %WeightsCache_inChan_1, %tmp_63
%memorybus_addr_rd_re = call i1 @_ssdm_op_ReadReq.m_axi.floatP(float* %memorybus_addr, i32 %tmp_64), !dbg !1031
Broken module found, compilation aborted!
Stack dump:
0. Running pass 'Function Pass Manager' on module '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/.autopilot/db/a.o.2.bc'.
1. Running pass 'Module Verifier' on function '@convolution_3x3'
/mnt/workspace/Xilinx/Vivado/2017.4/bin/loader: line 194: 35285 Aborted (core dumped) "$RDI_PROG" "$@"
Finished C synthesis.
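To check whether the unroll factor N_PE is actually applied in processAll_channelOut, we changed N_PE directly to 16; the same error remains.
Removing pipeline II=1 gives the same error.
The nested IP core is too large; it needs to be cut down and tested unit by unit. We set processInputChannel as the top function and got the following result: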
Starting C synthesis ...
/mnt/workspace/Xilinx/Vivado/2017.4/bin/vivado_hls /home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/csynth.tcl
INFO: [HLS 200-10] Running '/mnt/workspace/Xilinx/Vivado/2017.4/bin/unwrapped/lnx64.o/vivado_hls'
INFO: [HLS 200-10] For user 'osrc' on host 'osrc-virtual-machine' (Linux_x86_64 version 4.13.0-32-generic) on Wed Dec 12 10:37:09 CST 2018
INFO: [HLS 200-10] On os Ubuntu 16.04.3 LTS
INFO: [HLS 200-10] In directory '/home/osrc/Desktop/document/conv_Core/HLS_Conv'
INFO: [HLS 200-10] Opening project '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore'.
INFO: [HLS 200-10] Adding design file 'src/fpgaAcc.cpp' to the project
INFO: [HLS 200-10] Adding design file 'src/fpgaAcc.hpp' to the project
INFO: [HLS 200-10] Adding design file 'src/pBox.cpp' to the project
INFO: [HLS 200-10] Adding design file 'src/pBox.h' to the project
INFO: [HLS 200-10] Adding test bench file 'src/test_convBench.cpp' to the project
INFO: [HLS 200-10] Opening solution '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1'.
INFO: [SYN 201-201] Setting up clock 'default' with a period of 10ns.
INFO: [HLS 200-10] Setting target device to 'xc7z035ffg676-2'
INFO: [HLS 200-10] Analyzing design file 'src/pBox.cpp' ...
INFO: [HLS 200-10] Analyzing design file 'src/fpgaAcc.cpp' ...
INFO: [HLS 200-10] Validating synthesis directives ...
INFO: [HLS 200-111] Finished Checking Pragmas Time (s): cpu = 00:00:31 ; elapsed = 00:00:19 . Memory (MB): peak = 361.637 ; gain = 13.375 ; free physical = 395 ; free virtual = 32671
INFO: [HLS 200-111] Finished Linking Time (s): cpu = 00:00:33 ; elapsed = 00:00:21 . Memory (MB): peak = 361.637 ; gain = 13.375 ; free physical = 393 ; free virtual = 32671
INFO: [HLS 200-10] Starting code transformations ...
INFO: [XFORM 203-603] Inlining function 'ImageCache::calcu_IBRAM_row_offset' into 'ProcessingElement::loadPixel_buffer' (src/fpgaAcc.cpp:209).
INFO: [XFORM 203-603] Inlining function 'ImageCache::get_IBRAM_Pixel' into 'ProcessingElement::loadPixel_buffer' (src/fpgaAcc.cpp:213).
INFO: [XFORM 203-603] Inlining function 'ProcessingElement::loadPixel_buffer' into 'ProcessingElement::processInputChannel' (src/fpgaAcc.cpp:230).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::get_WBRAM_addr' into 'WeightsCache::get_9_weights_to_buffer' (src/fpgaAcc.cpp:307).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::get_9_weights_to_buffer' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:247).
INFO: [XFORM 203-603] Inlining function 'ProcessingElement::macc2d' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:249).
INFO: [XFORM 203-603] Inlining function 'OutputCache::setOutChannel' into 'OutputCache::accumulateChannel' (src/fpgaAcc.cpp:384).
INFO: [XFORM 203-603] Inlining function 'OutputCache::setOutChannel' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:252).
INFO: [XFORM 203-603] Inlining function 'OutputCache::getOutChannel' into 'OutputCache::accumulateChannel' (src/fpgaAcc.cpp:382).
INFO: [XFORM 203-603] Inlining function 'OutputCache::accumulateChannel' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:254).
WARNING: [XFORM 203-623] Cannot instantiate function 'ProcessingElement::processInputChannel'(src/fpgaAcc.cpp:225:1) for 'cur_ci' since none of the actual argument(s) of 'cur_ci' are constant or global.
INFO: [HLS 200-111] Finished Standard Transforms Time (s): cpu = 00:00:35 ; elapsed = 00:00:23 . Memory (MB): peak = 361.910 ; gain = 13.648 ; free physical = 383 ; free virtual = 32663
INFO: [HLS 200-10] Checking synthesizability ...
INFO: [HLS 200-111] Finished Checking Synthesizability Time (s): cpu = 00:00:35 ; elapsed = 00:00:23 . Memory (MB): peak = 361.910 ; gain = 13.648 ; free physical = 380 ; free virtual = 32661
INFO: [XFORM 203-502] Unrolling all sub-loops inside loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'ProcessingElement::processAll_channelOut' for pipelining.
INFO: [XFORM 203-501] Unrolling loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'ProcessingElement::processAll_channelOut' partially with a factor of 16.
INFO: [XFORM 203-501] Unrolling loop 'Loop-1.1' (src/fpgaAcc.cpp:308) in function 'ProcessingElement::processAll_channelOut' completely.
INFO: [XFORM 203-501] Unrolling loop 'L_MACC_multiply' (src/fpgaAcc.cpp:190) in function 'ProcessingElement::processAll_channelOut' completely.
INFO: [XFORM 203-501] Unrolling loop 'L_MACC_accumulate' (src/fpgaAcc.cpp:195) in function 'ProcessingElement::processAll_channelOut' completely.
INFO: [XFORM 203-101] Partitioning array 'pixel_buffer' in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'weights_local' (src/fpgaAcc.cpp:244) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM' in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'multresult' (src/fpgaAcc.cpp:187) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'OutputCache::OBRAM' in dimension 1 with a cyclic factor 8.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.0' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.1' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.2' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.3' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.4' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.5' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.6' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.7' in dimension 2 completely.
WARNING: [XFORM 203-623] Cannot instantiate function 'ProcessingElement::processInputChannel'(src/fpgaAcc.cpp:225:1) for 'cur_ci' since none of the actual argument(s) of 'cur_ci' are constant or global.
INFO: [HLS 200-111] Finished Pre-synthesis Time (s): cpu = 00:00:37 ; elapsed = 00:00:25 . Memory (MB): peak = 489.633 ; gain = 141.371 ; free physical = 353 ; free virtual = 32635
WARNING: [XFORM 203-631] Renaming function 'ProcessingElement::processAll_channelOut' to 'processAll_channelOu' (src/fpgaAcc.cpp:241:43)
WARNING: [XFORM 203-562] Loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'processAll_channelOu' has unknown bound because it has multiple exiting blocks.
INFO: [HLS 200-111] Finished Architecture Synthesis Time (s): cpu = 00:00:38 ; elapsed = 00:00:27 . Memory (MB): peak = 489.633 ; gain = 141.371 ; free physical = 349 ; free virtual = 32632
INFO: [HLS 200-10] Starting hardware synthesis ...
INFO: [HLS 200-10] Synthesizing 'ProcessingElement::processInputChannel' ...
WARNING: [SYN 201-103] Top function name 'ProcessingElement::processInputChannel' is not a legal RTL name.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-103] Top function name 'ProcessingElement::processInputChannel' is not a legal RTL name and is changed to 'ProcessingElement_processInputChannel'; this may result in automatic C/RTL co-simulation failure.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'processAll_channelOu'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-61] Pipelining loop 'L_CH_OUT'.
WARNING: [SCHED 204-69] Unable to schedule 'store' operation (src/fpgaAcc.cpp:395->src/fpgaAcc.cpp:384->src/fpgaAcc.cpp:254) of variable 'new_ch', src/fpgaAcc.cpp:383->src/fpgaAcc.cpp:254 on array 'OBRAM_0' due to limited memory ports. Please consider using a memory core with more ports or partitioning the array 'OBRAM_0'.
INFO: [SCHED 204-61] Pipelining result : Target II = 1, Final II = 2, Depth = 10.
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 27.98 seconds; current allocated memory: 89.244 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 0.59 seconds; current allocated memory: 90.872 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'ProcessingElement_processInputChannel'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111] Elapsed time: 0.4 seconds; current allocated memory: 90.982 MB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111] Elapsed time: 0.08 seconds; current allocated memory: 91.045 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'processAll_channelOu'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_0' to 'processAll_channebkb' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_1' to 'processAll_channecud' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_2' to 'processAll_channedEe' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_3' to 'processAll_channeeOg' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_4' to 'processAll_channefYi' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_5' to 'processAll_channeg8j' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_6' to 'processAll_channehbi' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'processAll_channelOu_OBRAM_7' to 'processAll_channeibs' due to the length limit 20
INFO: [SYN 201-210] Renamed object name 'ProcessingElement_processInputChannel_fadd_32ns_32ns_32_4_full_dsp_1' to 'ProcessingElementjbC' due to the length limit 20
INFO: [RTGEN 206-100] Generating core module 'ProcessingElementjbC': 8 instance(s).
INFO: [RTGEN 206-100] Finished creating RTL model for 'processAll_channelOu'.
INFO: [HLS 200-111] Elapsed time: 0.92 seconds; current allocated memory: 93.369 MB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-10] -- Generating RTL for module 'ProcessingElement_processInputChannel'
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [RTGEN 206-500] Setting interface mode on port 'ProcessingElement_processInputChannel/cur_row_times_stride' to 'ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'ProcessingElement_processInputChannel/cur_col_times_stride' to 'ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'ProcessingElement_processInputChannel/cur_ci' to 'ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on port 'ProcessingElement_processInputChannel/out_channelNum' to 'ap_none'.
INFO: [RTGEN 206-500] Setting interface mode on function 'ProcessingElement_processInputChannel' to 'ap_ctrl_hs'.
WARNING: [RTGEN 206-101] Global array 'OBRAM_0' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_1' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_2' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_3' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_4' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_5' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_6' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_7' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Port 'ProcessingElement_processInputChannel/cur_row_times_stride' has no fanin or fanout and is left dangling.
Please use C simulation to confirm this function argument can be read from or written to.
WARNING: [RTGEN 206-101] Port 'ProcessingElement_processInputChannel/cur_col_times_stride' has no fanin or fanout and is left dangling.
Please use C simulation to confirm this function argument can be read from or written to.
INFO: [RTGEN 206-100] Finished creating RTL model for 'ProcessingElement_processInputChannel'.
INFO: [HLS 200-111] Elapsed time: 1.04 seconds; current allocated memory: 97.566 MB.
INFO: [RTMG 210-278] Implementing memory 'processAll_channebkb_ram (RAM_T2P_BRAM)' using block RAMs with power-on initialization.
INFO: [HLS 200-111] Finished generating all RTL models Time (s): cpu = 00:00:42 ; elapsed = 00:00:32 . Memory (MB): peak = 489.633 ; gain = 141.371 ; free physical = 320 ; free virtual = 32614
INFO: [SYSC 207-301] Generating SystemC RTL for ProcessingElement_processInputChannel.
INFO: [VHDL 208-304] Generating VHDL RTL for ProcessingElement_processInputChannel.
INFO: [VLOG 209-307] Generating Verilog RTL for ProcessingElement_processInputChannel.
INFO: [HLS 200-112] Total elapsed time: 32.18 seconds; peak allocated memory: 97.566 MB.
Finished C synthesis.
Points to note:
WARNING: [XFORM 203-623] Cannot instantiate function 'ProcessingElement::processInputChannel'(src/fpgaAcc.cpp:225:1) for 'cur_ci' since none of the actual argument(s) of 'cur_ci' are constant or global.
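This warning appears twice, but zynqNet also produces it when only the input-channel function is synthesized.
It does not appear when the whole IP core is synthesized.
The WBRAM warnings are the same as in zynqNet.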
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
Already at the array partitioning step WBRAM differs from zynqNet: it has one fewer index.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.7' in dimension 2 completely.
The corresponding zynqNet INFO line is:
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.15.2' in dimension 2 completely.
Yet the same warning appears later; possibly the name of WBRAM changes after partitioning.
WARNING: [SYN 201-303] Cannot apply memory assignment of 'RAM_S2P_BRAM' (src/fpgaAcc.cpp:305->src/fpgaAcc.cpp:247): 'WBRAM_0_0' does not exist or is optimized away.
MTCNN shows one warning that zynqNet does not:
WARNING: [XFORM 203-562] Loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'processAll_channelOu' has unknown bound because it has multiple exiting blocks.
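One possible source of this warning is partially unrolling a loop whose bound is only known at run time, since each unrolled copy then needs its own exit check. Whether that is the cause here has not been verified. The sketch below shows one way to restructure such a loop: the outer loop steps in blocks of N_PE and the inner loop has a fixed trip count with a guard. The function name, the N_PE value and the multiply-accumulate body are assumptions for illustration.

```cpp
const int N_PE = 8;  // assumed unroll factor for this sketch

// Accumulate px * w[co] into out[co] for every output channel.
// The inner loop has a compile-time bound, so it can be fully unrolled
// and has a single exit; the guard handles a ch_out that is not a
// multiple of N_PE.
void process_all_channel_out_sketch(float px, const float* w,
                                    float* out, int ch_out) {
L_CH_OUT_BLOCK:
    for (int base = 0; base < ch_out; base += N_PE) {
#pragma HLS PIPELINE II=1
    L_PE:
        for (int pe = 0; pe < N_PE; ++pe) {
#pragma HLS UNROLL
            const int co = base + pe;
            if (co < ch_out) {
                out[co] += px * w[co];
            }
        }
    }
}
```

The trip count of the outer loop still depends on ch_out, so latency reports remain data dependent, but the unrolled body itself no longer contains several exit points.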
Possibly because the OBRAM partitioning does not match its size, the console also shows these OBRAM warnings, which zynqNet does not:
WARNING: [RTGEN 206-101] Global array 'OBRAM_0' will not be exposed as RTL port.
WARNING: [RTGEN 206-101] Global array 'OBRAM_1' will not be exposed as RTL port. ...
Starting C synthesis ...
/mnt/workspace/Xilinx/Vivado/2017.4/bin/vivado_hls /home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/csynth.tcl
INFO: [HLS 200-10] Running '/mnt/workspace/Xilinx/Vivado/2017.4/bin/unwrapped/lnx64.o/vivado_hls'
INFO: [HLS 200-10] For user 'osrc' on host 'osrc-virtual-machine' (Linux_x86_64 version 4.13.0-32-generic) on Tue Dec 11 18:46:57 CST 2018
INFO: [HLS 200-10] On os Ubuntu 16.04.3 LTS
INFO: [HLS 200-10] In directory '/home/osrc/Desktop/document/conv_Core/HLS_Conv'
INFO: [HLS 200-10] Opening project '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore'.
INFO: [HLS 200-10] Adding design file 'src/fpgaAcc.cpp' to the project
INFO: [HLS 200-10] Adding design file 'src/fpgaAcc.hpp' to the project
INFO: [HLS 200-10] Adding design file 'src/pBox.cpp' to the project
INFO: [HLS 200-10] Adding design file 'src/pBox.h' to the project
INFO: [HLS 200-10] Adding test bench file 'src/test_convBench.cpp' to the project
INFO: [HLS 200-10] Opening solution '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1'.
INFO: [SYN 201-201] Setting up clock 'default' with a period of 10ns.
INFO: [HLS 200-10] Setting target device to 'xc7z035ffg676-2'
INFO: [HLS 200-10] Analyzing design file 'src/pBox.cpp' ...
INFO: [HLS 200-10] Analyzing design file 'src/fpgaAcc.cpp' ...
INFO: [HLS 200-10] Validating synthesis directives ...
INFO: [HLS 200-111] Finished Checking Pragmas Time (s): cpu = 00:00:31 ; elapsed = 00:00:20 . Memory (MB): peak = 361.641 ; gain = 13.375 ; free physical = 320 ; free virtual = 32652
INFO: [HLS 200-111] Finished Linking Time (s): cpu = 00:00:33 ; elapsed = 00:00:21 . Memory (MB): peak = 361.641 ; gain = 13.375 ; free physical = 318 ; free virtual = 32651
INFO: [HLS 200-10] Starting code transformations ...
INFO: [XFORM 203-603] Inlining function 'MemoryController::setLayerConfig' into 'convolution_3x3' (src/fpgaAcc.cpp:77).
INFO: [XFORM 203-603] Inlining function 'ImageCache::setLayerConfig' into 'convolution_3x3' (src/fpgaAcc.cpp:78).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::setLayerConfig' into 'convolution_3x3' (src/fpgaAcc.cpp:79).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::get_WBRAM_addr' into 'WeightsCache::get_9_weights_to_buffer' (src/fpgaAcc.cpp:307).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::get_WBRAM_addr' into 'WeightsCache::load_WBRAM_from_DRAM' (src/fpgaAcc.cpp:284).
INFO: [XFORM 203-603] Inlining function 'MemoryController::load_weight_2_reg' into 'WeightsCache::load_WBRAM_from_DRAM' (src/fpgaAcc.cpp:291).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::load_WBRAM_from_DRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:83).
INFO: [XFORM 203-603] Inlining function 'MemoryController::setPixelLoadRowOffset' into 'convolution_3x3' (src/fpgaAcc.cpp:94).
INFO: [XFORM 203-603] Inlining function 'MemoryController::setPixelLoadRowOffset' into 'convolution_3x3' (src/fpgaAcc.cpp:87).
INFO: [XFORM 203-603] Inlining function 'MemoryController::setPixelLoadRowOffset' into 'convolution_3x3' (src/fpgaAcc.cpp:85).
INFO: [XFORM 203-603] Inlining function 'MemoryController::setPixelLoadOffset' into 'ImageCache::loadRowDRAM_2_IBRAM' (src/fpgaAcc.cpp:330).
INFO: [XFORM 203-603] Inlining function 'MemoryController::loadInputChannelPixel' into 'ImageCache::loadPixelDRAM_2_IBRAM' (src/fpgaAcc.cpp:339).
INFO: [XFORM 203-603] Inlining function 'ImageCache::loadPixelDRAM_2_IBRAM' into 'ImageCache::loadRowDRAM_2_IBRAM' (src/fpgaAcc.cpp:331).
INFO: [XFORM 203-603] Inlining function 'ImageCache::loadRowDRAM_2_IBRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:95).
INFO: [XFORM 203-603] Inlining function 'ImageCache::loadRowDRAM_2_IBRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:88).
INFO: [XFORM 203-603] Inlining function 'ImageCache::loadRowDRAM_2_IBRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:86).
INFO: [XFORM 203-603] Inlining function 'MemoryController::setPixelOutOffset' into 'convolution_3x3' (src/fpgaAcc.cpp:99).
INFO: [XFORM 203-603] Inlining function 'ImageCache::calcu_IBRAM_row_offset' into 'ProcessingElement::loadPixel_buffer' (src/fpgaAcc.cpp:209).
INFO: [XFORM 203-603] Inlining function 'ImageCache::get_IBRAM_Pixel' into 'ProcessingElement::loadPixel_buffer' (src/fpgaAcc.cpp:213).
INFO: [XFORM 203-603] Inlining function 'ProcessingElement::loadPixel_buffer' into 'ProcessingElement::processInputChannel' (src/fpgaAcc.cpp:230).
INFO: [XFORM 203-603] Inlining function 'WeightsCache::get_9_weights_to_buffer' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:247).
INFO: [XFORM 203-603] Inlining function 'ProcessingElement::macc2d' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:249).
INFO: [XFORM 203-603] Inlining function 'OutputCache::setOutChannel' into 'OutputCache::accumulateChannel' (src/fpgaAcc.cpp:384).
INFO: [XFORM 203-603] Inlining function 'OutputCache::setOutChannel' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:252).
INFO: [XFORM 203-603] Inlining function 'OutputCache::getOutChannel' into 'OutputCache::accumulateChannel' (src/fpgaAcc.cpp:382).
INFO: [XFORM 203-603] Inlining function 'OutputCache::accumulateChannel' into 'ProcessingElement::processAll_channelOut' (src/fpgaAcc.cpp:254).
INFO: [XFORM 203-603] Inlining function 'MemoryController::writeBackOutputChannel' into 'convolution_3x3' (src/fpgaAcc.cpp:109).
INFO: [HLS 200-111] Finished Standard Transforms Time (s): cpu = 00:00:34 ; elapsed = 00:00:23 . Memory (MB): peak = 361.930 ; gain = 13.664 ; free physical = 307 ; free virtual = 32642
INFO: [HLS 200-10] Checking synthesizability ...
INFO: [XFORM 203-602] Inlining function 'ImageCache::writeNextChannelPixel_2_IBRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:340->src/fpgaAcc.cpp:331->src/fpgaAcc.cpp:86) automatically.
INFO: [HLS 200-111] Finished Checking Synthesizability Time (s): cpu = 00:00:35 ; elapsed = 00:00:23 . Memory (MB): peak = 361.930 ; gain = 13.664 ; free physical = 303 ; free virtual = 32639
INFO: [XFORM 203-502] Unrolling all sub-loops inside loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'ProcessingElement::processAll_channelOut' for pipelining.
INFO: [XFORM 203-501] Unrolling loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'ProcessingElement::processAll_channelOut' partially with a factor of 8.
INFO: [XFORM 203-501] Unrolling loop 'Loop-1.1' (src/fpgaAcc.cpp:308) in function 'ProcessingElement::processAll_channelOut' completely.
INFO: [XFORM 203-501] Unrolling loop 'L_MACC_multiply' (src/fpgaAcc.cpp:190) in function 'ProcessingElement::processAll_channelOut' completely.
INFO: [XFORM 203-501] Unrolling loop 'L_MACC_accumulate' (src/fpgaAcc.cpp:195) in function 'ProcessingElement::processAll_channelOut' completely.
INFO: [XFORM 203-101] Partitioning array 'pixel_buffer' (src/fpgaAcc.cpp:228) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'weights_local' (src/fpgaAcc.cpp:244) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM' in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'multresult' (src/fpgaAcc.cpp:187) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'OutputCache::OBRAM' in dimension 1 with a cyclic factor 8.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.0' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.1' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.2' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.3' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.4' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.5' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.6' in dimension 2 completely.
INFO: [XFORM 203-101] Partitioning array 'WeightsCache::WBRAM.7' in dimension 2 completely.
INFO: [XFORM 203-602] Inlining function 'ImageCache::writeNextChannelPixel_2_IBRAM' into 'convolution_3x3' (src/fpgaAcc.cpp:340->src/fpgaAcc.cpp:331->src/fpgaAcc.cpp:86) automatically.
INFO: [XFORM 203-622] Instantiating function 'ProcessingElement::processInputChannel'(src/fpgaAcc.cpp:221) to 'ProcessingElement::processInputChannel.0' at call site (src/fpgaAcc.cpp:103) by setting 'cur_ci' to 'cur_channel_in'.
INFO: [XFORM 203-721] Changing loop 'Loop_load_pixel_2_PE_row_loop_proc' (src/fpgaAcc.cpp:207) to a process function for dataflow in function 'ProcessingElement::processInputChannel.0'.
INFO: [XFORM 203-712] Applying dataflow to function 'ProcessingElement::processInputChannel.0' (src/fpgaAcc.cpp:224:1), detected/extracted 2 process function(s):
'ProcessingElement::processInputChannel.0_Loop_load_pixel_2_PE_row_loop_proc5'
'ProcessingElement::processAll_channelOut'.
INFO: [HLS 200-111] Finished Pre-synthesis Time (s): cpu = 00:00:37 ; elapsed = 00:00:26 . Memory (MB): peak = 489.637 ; gain = 141.371 ; free physical = 275 ; free virtual = 32614
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'Loop-1.1' (src/fpgaAcc.cpp:283:18) in function 'convolution_3x3' :
the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'Loop-1' (src/fpgaAcc.cpp:280:18) in function 'convolution_3x3' :
the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'L_DRAM_PRELOADROW_X' (src/fpgaAcc.cpp:329:77) in function 'convolution_3x3' :
the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'L_DRAM_PRELOADROW_X' (src/fpgaAcc.cpp:329:77) in function 'convolution_3x3' :
the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'L_DRAM_PRELOADROW_X' (src/fpgaAcc.cpp:329:77) in function 'convolution_3x3' :
the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'Loop-4.1' (src/fpgaAcc.cpp:93:3) in function 'convolution_3x3' :
the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'row_loop' (src/fpgaAcc.cpp:91:85) in function 'convolution_3x3' :
more than one sub loop.
WARNING: [XFORM 203-631] Renaming function 'ProcessingElement::processInputChannel.0_Loop_load_pixel_2_PE_row_loop_proc5' to 'processInputChannel.' (src/fpgaAcc.cpp:207:3)
WARNING: [XFORM 203-631] Renaming function 'ProcessingElement::processInputChannel.0' to 'processInputChannel..1' (src/fpgaAcc.cpp:226:1)
WARNING: [XFORM 203-631] Renaming function 'ProcessingElement::processAll_channelOut' to 'processAll_channelOu' (src/fpgaAcc.cpp:192:43)
INFO: [XFORM 203-811] Inferring bus burst read of variable length on port 'memorybus' (src/fpgaAcc.cpp:178:15).
WARNING: [XFORM 203-562] Loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'processAll_channelOu' has unknown bound because it has multiple exiting blocks.
WARNING: [XFORM 203-713] Function 'processInputChannel..1' (src/fpgaAcc.cpp:226:1) failed dataflow checking: A dataflow region cannot be instantiated from with a pipelined loop (src/fpgaAcc.cpp:226:1). Ignoring pipeline directive to allow the dataflow directive to take precedence. This behavior can be disabled by using 'config_compile -disable_dataflow_pipeline_check'.
Instruction does not dominate all uses!
%tmp_60 = add i32 %WeightsCache_inChan_1, %tmp_59
%memorybus_addr_rd_re = call i1 @_ssdm_op_ReadReq.m_axi.floatP(float* %memorybus_addr, i32 %tmp_60), !dbg !1031
Broken module found, compilation aborted!
Stack dump:
0. Running pass 'Function Pass Manager' on module '/home/osrc/Desktop/document/conv_Core/HLS_Conv/conv3x3_IPcore/solution1/.autopilot/db/a.o.2.bc'.
1. Running pass 'Module Verifier' on function '@convolution_3x3'
/mnt/workspace/Xilinx/Vivado/2017.4/bin/loader: line 194: 15937 Aborted (core dumped) "$RDI_PROG" "$@"
Finished C synthesis.
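C synthesis aborts with "Broken module found, compilation aborted!". The two most important problems reported before the abort are:
an unknown loop bound, and a DATAFLOW region that cannot be applied.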
WARNING: [XFORM 203-562] Loop 'L_CH_OUT' (src/fpgaAcc.cpp:241) in function 'processAll_channelOu' has unknown bound because it has multiple exiting blocks.
WARNING: [XFORM 203-713] Function 'processInputChannel..1' (src/fpgaAcc.cpp:226:1) failed dataflow checking: A dataflow region cannot be instantiated from with a pipelined loop (src/fpgaAcc.cpp:226:1). Ignoring pipeline directive to allow the dataflow directive to take precedence. This behavior can be disabled by using 'config_compile -disable_dataflow_pipeline_check'.
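These bugs still need to be debugged. The preliminary diagnosis is that they come from the unrolling of the for loop in processAll_channelOut and from a mismatch between that unrolling and the way OBRAM is partitioned.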
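As a starting point for that debugging, here is a minimal sketch of one common cause of the "unknown bound because it has multiple exiting blocks" warning: an early exit (break or return) inside the loop body. Whether the real L_CH_OUT loop in fpgaAcc.cpp contains such an exit is an assumption. The function names, MAX_CH_OUT, and the array layout below are hypothetical and only illustrate the pattern. The second variant keeps a fixed trip count that is a multiple of the unroll factor (8, the same factor the log reports for the cyclic partitioning of OBRAM) and replaces the early exit with a guard.

```cpp
#include <cassert>

// Hypothetical sizes for illustration; not taken from fpgaAcc.cpp.
const int N_PE = 8;         // same value as the unroll factor 8 reported in the log
const int MAX_CH_OUT = 32;  // assumed compile-time bound on output channels

// A trip count that is a multiple of the unroll factor avoids a partial
// last iteration after unrolling.
static_assert(MAX_CH_OUT % N_PE == 0, "MAX_CH_OUT must be a multiple of N_PE");

// Variant A: the early `break` gives the loop a second exit. This is the
// kind of structure that can trigger the "multiple exiting blocks" warning.
void accumulate_with_break(const float *prod, float *obram, int ch_out) {
L_CH_OUT_A:
    for (int co = 0; co < MAX_CH_OUT; co++) {
#pragma HLS UNROLL factor=8
        if (co >= ch_out) break;
        obram[co] += prod[co];
    }
}

// Variant B: single exit and a constant bound. Channels beyond ch_out are
// skipped by a guard instead of leaving the loop early.
void accumulate_guarded(const float *prod, float *obram, int ch_out) {
L_CH_OUT_B:
    for (int co = 0; co < MAX_CH_OUT; co++) {
#pragma HLS UNROLL factor=8
        if (co < ch_out) {
            obram[co] += prod[co];
        }
    }
}

// Software check that both variants compute the same result, so the
// rewrite can be validated in C simulation before re-running synthesis.
void check_variants_agree(int ch_out) {
    float prod[MAX_CH_OUT], a[MAX_CH_OUT], b[MAX_CH_OUT];
    for (int i = 0; i < MAX_CH_OUT; i++) {
        prod[i] = static_cast<float>(i);
        a[i] = 0.5f;
        b[i] = 0.5f;
    }
    accumulate_with_break(prod, a, ch_out);
    accumulate_guarded(prod, b, ch_out);
    for (int i = 0; i < MAX_CH_OUT; i++) {
        assert(a[i] == b[i]);
    }
}
```

Calling check_variants_agree(20) from a testbench confirms that the guarded loop is functionally identical to the original form. A plain C++ compiler ignores the HLS pragmas, so the check runs outside Vivado HLS as well. Whether this rewrite also removes the "Broken module found" abort has to be verified by re-running C synthesis.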