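Table of Contents
1. Background
2. High-Level Design
2.1 Mathematical overview:
2.2 General overview:
3. Hardware Design
3.1 Input image
3.2 VGA/camera
3.3 First convolution layer
3.4 Pooling layers
3.5 Second convolution layer
3.6 Partial sums
3.7 First fully connected layer
3.8 Second fully connected layer
4. Software Design
5. System Design
6. Testing
7. Hardware Bugs and Issues
8. Results
9. Resource Usage
10. Usability
11. Conclusions
12. Intellectual Property Considerations
13. Improvements and Future Work
14. Verilog Code and C Code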
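Neural networks are machine learning models loosely based on the neural networks of the brain. A series of nodes is arranged in "layers" that are connected to each other through operations and weights. This model has proven successful in image classification tasks, which today have many applications, from self-driving cars to facial recognition. A standard CNN can have floating-point weights and feature maps, which require significant memory resources and computing power to implement the necessary multipliers and related logic.
Binarized neural networks use binarized feature maps and weights, which greatly reduces the storage and computational resources required and makes it possible to synthesize them in hardware on resource-constrained systems such as FPGAs. The network we implemented is based on a software model written in Python using the Tensorflow machine learning library. The Python code was provided by Ritchie Zhao, a PhD student at Cornell University. The Verilog code implements in hardware the individual layers and functions used to build the software model. The system is designed to classify digits; the model was trained on a subset of the MNIST dataset and achieved a test accuracy of about 40%. This could be improved by using non-binarized feature maps and implementing additional functions such as batch normalization.
The Verilog model performs the inference task but does not train the weights used in the computation. Instead, the weights are generated by the Python implementation and hardcoded in the Verilog model. When a neural network is used for classification, training the weights is time-consuming and is not done in real time. We therefore chose to focus the model on the classification task and to use pretrained weights for the computation. We originally planned to use the HPS to pass in the weights used by the FPGA; however, this consumed too many logic units and the design did not fit on the device.
The math involved in computing the different output feature maps is mostly limited to multiplication and addition. Because the weights in our design are binary values, the multiplications can be replaced with ternary operators that decide whether a value must be added or subtracted after being "multiplied" by 1 or -1 (a weight of 0 is treated as -1). This greatly reduces the number of DSP blocks needed to implement the design. Convolution is performed by "sliding" a filter over the input feature map: overlapping entries are multiplied together and summed to form the value at the corresponding output index. Binarization is done by determining the sign of the value being binarized and assigning the output to -1 or 1 accordingly. While true binarization converts the output to 1 or 0 rather than 1 or -1, the computations this network requires make conversion to 1 or -1 more efficient. For the rest of this report, binarization refers to converting a number to 1 or -1, not 1 or 0. Pooling examines a given set of values for its maximum and assigns that maximum to the output. The figures below illustrate all of these processes.
Figure 1: Convolution example
Figure 2: Pooling example
Figure 3: Binarization example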
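As a minimal sketch of the arithmetic described in the mathematical overview, the Python snippet below replaces each multiplication by ±1 with a conditional add or subtract and then binarizes the sum. The names `signed_term` and `binarize` are illustrative and are not taken from the project code; mapping a sum of exactly 0 to -1 follows the convention this report states for the first convolution layer.

```python
def signed_term(value, weight_bit):
    """Return value times +1 or -1 without multiplying.

    A weight bit of 1 stands for +1 and a weight bit of 0 stands for -1.
    """
    return value if weight_bit == 1 else -value


def binarize(total):
    """Map a sum to +1 or -1; a sum of exactly 0 maps to -1."""
    return 1 if total > 0 else -1


values = [1, -1, 1, 1]        # example feature-map entries
weight_bits = [1, 0, 0, 1]    # example 1-bit weights

total = sum(signed_term(v, w) for v, w in zip(values, weight_bits))
print(total, binarize(total))  # prints "2 1"
```

In hardware the same selection is a multiplexer in front of an adder, which is why no DSP multiplier is needed.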
2.2 General overview:
The binarized neural network consists of two convolution layers, two pooling layers, and two fully connected layers. The input image is a 7 x 7 black-and-white image with 2-bit entries. The image is padded with -1s on the bottom and right to create an 8 x 8 image, which is fed into the network. The first convolution layer convolves the input image with 16 3 x 3 filters to produce 16 8 x 8 output maps, which are binarized to contain only 1s and -1s. These 16 maps are then pooled to form 16 4 x 4 output maps, which are fed into the second convolution layer. The second convolution layer contains 512 3 x 3 filters. Each of the 16 maps is convolved with 32 unique filters (16 x 32 = 512), and the results are summed across the 16 inputs to produce 32 4 x 4 output feature maps. These are then binarized and pooled into 2 x 2 output maps, which are passed to the fully connected layers. The first fully connected layer flattens the 32 incoming 2 x 2 feature maps into an array of 128 entries. This array is then matrix-multiplied with a 128 x 32 weight array to produce an output array of size 32. That output array is binarized and multiplied by the 32 x 10 weight matrix in the final fully connected layer to produce an array of 10 entries. Each entry in this array corresponds to the likelihood that the input image is an image of the digit matching that index. For example, entry 0 represents the likelihood that the input image is a 0; if entry 0 holds the largest value in the array, the BNN infers that the input is the digit 0.
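The layer sizes in this overview can be checked with a few lines of arithmetic. The Python sketch below only tracks shapes as (channels, height, width); the stage names are labels chosen here for illustration, not identifiers from the design.

```python
# Track (channels, height, width) through the network described above.
side = 7 + 1                               # 7 x 7 image padded to 8 x 8
shapes = {"input": (1, side, side)}
shapes["conv1"] = (16, side, side)         # padded 3 x 3 convolution keeps 8 x 8
side //= 2
shapes["pool1"] = (16, side, side)         # 2 x 2 pooling halves each dimension
shapes["conv2"] = (32, side, side)         # 16 inputs x 32 filters, summed over inputs
side //= 2
shapes["pool2"] = (32, side, side)
flat_len = 32 * side * side                # flattened input to the first fully connected layer
conv2_filters = 16 * 32                    # filters stored for the second convolution layer
shapes["fc1"] = (32,)
shapes["fc2"] = (10,)

for name, shape in shapes.items():
    print(name, shape)
print("flattened length:", flat_len, "conv2 filters:", conv2_filters)
```

The two derived numbers, 128 and 512, match the flattened array length and the second-layer filter count quoted in the text.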
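All feature maps and weight arrays are stored in registers, and the convolutions and matrix multiplications are implemented with ternary operators. Using DSP blocks instead would have left the design short of multipliers. The 2-bit feature map entries and 1-bit weight arrays keep storage requirements minimal, eliminating the need for memory units such as M10K blocks. All weights for every layer are hardcoded in Verilog. We originally planned to feed in the weights from the HPS through PIO ports; however, this required more ALMs than are available on the FPGA.
Ten input images from the MNIST test set, one for each of the ten digits, are hardcoded in Verilog on the FPGA. The FPGA receives an input-select signal from the HPS, which picks one of these images as the input to the binarized convolutional network to produce a digit prediction. The input images from the MNIST test set are average-pooled into 7 x 7 matrices of 1-bit grayscale values. We use 2 bits per entry because the input is binarized to 1 or -1: 2'b01 represents a black pixel and 2'b11 a white pixel. Then, before feeding the image into the first convolution layer, we pad the bottom row and right column with -1s to form an 8 x 8 matrix. This makes the matrix size even, which is easier to work with in later layers.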
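The padding step and the 2-bit encoding can be sketched as follows. This is an illustrative Python model, not the Verilog used on the board; `pad_bottom_right` and `to_two_bit` are hypothetical helper names.

```python
def pad_bottom_right(image, fill=-1):
    """Append one column on the right and one row at the bottom, both set to `fill`."""
    padded = [list(row) + [fill] for row in image]
    padded.append([fill] * len(padded[0]))
    return padded


def to_two_bit(value):
    """Two's-complement 2-bit pattern of +1 or -1: +1 -> 0b01, -1 -> 0b11."""
    return value & 0b11


image_7x7 = [[1] * 7 for _ in range(7)]   # placeholder image, every entry +1
image_8x8 = pad_bottom_right(image_7x7)

print(len(image_8x8), len(image_8x8[0]))                 # prints "8 8"
print(format(to_two_bit(1), "02b"), format(to_two_bit(-1), "02b"))  # prints "01 11"
```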
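Our original plan was to use an NTSC camera to capture live images of handwritten digits as input and to classify them in real time. We started from Bruce's video code on the Avalon Bus Master to HPS page, which stores the video input in on-chip SRAM through the Video_In_Subsystem module in Qsys and has a bus master copy pixels from SRAM to dual-port SDRAM, from which the VGA controller module displays the data on the VGA screen. We used that code and the Qsys video subsystem modules. We were able to convert 8-bit RGB color to 2-bit grayscale, as shown in the figure below, use the Video_In_Clipper and Video_In_Scaler Qsys modules to clip the input size from 320x240 to 224x224, and then use pooling on the HPS to create a 7x7 image. This plan later proved infeasible because we ran out of ALMs on the FPGA, most of which went to building the actual binarized neural network. We therefore chose to hardcode some existing input images from the MNIST dataset on the FPGA and to send a select signal to choose among them.
Figure 4: 224x224 2-bit grayscale to 7x7 1-bit grayscale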
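One way the 224x224 to 7x7 reduction could be done is sketched below in Python. The report does not say which pooling or threshold the HPS code used for the camera path, so the 32 x 32 average pooling and the midpoint threshold of 1.5 are assumptions made only for this illustration, as is the function name.

```python
def average_pool_to_bits(image, out_size=7, threshold=1.5):
    """Average square blocks of 2-bit pixels (0..3) and threshold each mean to one bit.

    Both the averaging and the threshold value are assumptions of this sketch.
    """
    block = len(image) // out_size          # 224 // 7 = 32 pixels per block side
    result = []
    for by in range(out_size):
        row = []
        for bx in range(out_size):
            total = sum(
                image[by * block + y][bx * block + x]
                for y in range(block)
                for x in range(block)
            )
            mean = total / (block * block)
            row.append(1 if mean > threshold else 0)
        result.append(row)
    return result


# Example frame: dark everywhere except a bright top-left block.
frame = [[0] * 224 for _ in range(224)]
for y in range(32):
    for x in range(32):
        frame[y][x] = 3

small = average_pool_to_bits(frame)
print(small[0])  # prints "[1, 0, 0, 0, 0, 0, 0]"
```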
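The first convolution layer uses 16 3 x 3 filters, each with 1-bit entries. The input image is an 8 x 8 matrix with 2-bit entries (each holding 1 or -1) and is convolved with each filter to produce 16 output feature maps of size 8 x 8. The edges of the input image are zero-padded, making it a 10 x 10 matrix; convolving this with a 3 x 3 filter yields an 8 x 8 matrix.
The convolution is implemented with a ternary operator that checks whether a filter bit is 1 or 0 and accordingly adds the value in the input fmap to, or subtracts it from, a temporary sum. To save space, we use 1-bit weights (1 or 0) with the ternary operator rather than 2-bit weights representing 1 and -1. The temporary sums are stored in a temporary feature output. This is repeated for every entry in the output feature map and happens in parallel for each of the 16 3 x 3 filters. Once all temporary sums have been computed, their sign bits are used to assign +1 or -1 to the corresponding entries in the output feature map: if the temporary sum is greater than 0 we assign +1; otherwise we assign -1. Note that this implementation assigns -1 to a temporary sum of 0. The layer is implemented with two combinational always blocks, one that performs the padding and one that computes the convolution. Each block contains nested for loops that allow all temporary sums to be computed in parallel. In the main body of the code, a generate loop instantiates 16 of these convolution units so that all 16 output feature maps are computed in parallel.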
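The behavior of one convolution unit can be modeled in a few lines of Python. This is a software sketch of the computation described above, not a translation of the Verilog; the function name is hypothetical, and the filter is applied without flipping, matching the sliding-window description in the mathematical overview.

```python
def conv3x3_binarized(image, filter_bits):
    """Zero-pad a square +1/-1 image, apply one 3 x 3 filter of weight bits, binarize.

    A filter bit of 1 adds the pixel to the running sum, a bit of 0 subtracts it.
    Sums greater than 0 become +1; everything else, including 0, becomes -1.
    """
    n = len(image)
    padded = [[0] * (n + 2)]
    padded += [[0] + list(row) + [0] for row in image]
    padded += [[0] * (n + 2)]

    output = []
    for r in range(n):
        out_row = []
        for c in range(n):
            total = 0
            for i in range(3):
                for j in range(3):
                    pixel = padded[r + i][c + j]
                    total += pixel if filter_bits[i][j] == 1 else -pixel
            out_row.append(1 if total > 0 else -1)
        output.append(out_row)
    return output


background = [[-1] * 8 for _ in range(8)]          # image with no stroke pixels
all_plus = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]       # every weight is +1
all_minus = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]      # every weight is -1

print(conv3x3_binarized(background, all_plus)[0])   # prints eight -1 values
print(conv3x3_binarized(background, all_minus)[0])  # prints eight 1 values
```

On the FPGA the two inner loops are unrolled into parallel adders, and 16 copies of the unit run side by side, one per filter.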
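The network has two max-pooling layers, one after each convolution layer. Each pooling layer shrinks the output feature maps by a factor of two in each dimension. The first pooling layer turns the 8 x 8 feature maps into 4 x 4 maps, and the second turns the 4 x 4 feature maps into 2 x 2 maps. This is done by taking the maximum of each square of four values and writing that single value to the output feature map in place of all four, reducing the size. Both layers use for loops to generate hardware that processes all elements of the input feature maps simultaneously.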
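A minimal Python sketch of the 2 x 2 max pooling described above is shown here; the function name is illustrative and the example map is made up for the demonstration.

```python
def max_pool_2x2(fmap):
    """Replace each non-overlapping 2 x 2 square with its maximum value."""
    n = len(fmap)
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, n, 2)
        ]
        for r in range(0, n, 2)
    ]


example = [
    [-1, -1,  1, -1],
    [-1, -1, -1, -1],
    [-1, -1, -1, -1],
    [ 1, -1, -1, -1],
]
print(max_pool_2x2(example))  # prints "[[-1, 1], [1, -1]]"
```

Because the entries are only +1 or -1, each output is +1 exactly when at least one of its four inputs is +1.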
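The second convolution layer is implemented in much the same way as the first. Two combinational always blocks pad the image and compute the temporary sums of the convolution, which are then stored in the output feature maps. Unlike the first convolution block, the output here is not binarized immediately, because the partial sums must be computed first. Convolving each of the 16 feature maps with 32 unique filters creates 32 output feature maps per input feature map. These outputs are then summed across the 16 input maps and binarized to create the 32 final output maps. In the main body of the code, nested for loops inside a generate block implement all of the convolutions in parallel.
The partial-sum layer receives the 16 x 32 feature maps of size 4 x 4 computed by the second convolution layer and, for each of the 32 output maps, adds together the maps corresponding to each of the 16 input feature maps. The partial sums are computed using 32 accumulating temporary-sum arrays of size 4 x 4. A state machine first initializes all values in these arrays to 0 in the first state, then in the next state iterates over the 16 rows of the 16 x 32 x 4 x 4 array passed into the layer. Nested for loops compute the 32 x 4 x 4 partial sums in parallel; after 16 clock cycles in this state the partial sums are complete and the state machine moves to the next state. There, the partial sums are binarized and assigned to the 32 x 4 x 4 output feature map, which is passed to the second pooling layer.
Figure 5: Partial sums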
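The three states of the partial-sum layer can be mirrored in software as below. This Python sketch is a behavioral model under the dimensions given in the text (16 inputs, 32 outputs, 4 x 4 maps); the function name and the test data are hypothetical.

```python
def partial_sum_binarize(conv_out):
    """Sum conv_out[i][o] over the input index i, then binarize each entry.

    conv_out is indexed as [input map][output map][row][column].
    """
    n_in = len(conv_out)
    n_out = len(conv_out[0])
    size = len(conv_out[0][0])

    # State 0: clear the accumulators.
    acc = [[[0] * size for _ in range(size)] for _ in range(n_out)]

    # State 1: one input map per iteration (one clock cycle each in hardware).
    for i in range(n_in):
        for o in range(n_out):
            for r in range(size):
                for c in range(size):
                    acc[o][r][c] += conv_out[i][o][r][c]

    # State 2: binarize; a sum of 0 maps to -1.
    return [[[1 if v > 0 else -1 for v in row] for row in fmap] for fmap in acc]


def example_value(i, o):
    """Made-up data: output 2 alternates sign over inputs, others are constant."""
    if o == 2:
        return 1 if i % 2 == 0 else -1
    return 1 if o % 2 == 0 else -1


conv_out = [
    [[[example_value(i, o)] * 4 for _ in range(4)] for o in range(32)]
    for i in range(16)
]
result = partial_sum_binarize(conv_out)
print(result[0][0][0], result[1][0][0], result[2][0][0])  # prints "1 -1 -1"
```

Only the loop over `i` has to be sequential; the three inner loops correspond to the hardware that runs in parallel within a cycle.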
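The first fully connected layer receives the 32 x 2 x 2 matrix output by the second pooling layer and flattens it into a one-dimensional array of length 128. This is multiplied by a 128 x 32 matrix to form an array of length 32. This layer is also implemented with a state machine and a temporary-sum array of length 32. In the first state, the temporary sums are all initialized to 0. In the next state, a ternary operator checks whether the value in the weight matrix is a 1 or a 0, and the value stored at the corresponding index of the flattened feature map is added to or subtracted from the temporary sum, respectively. This repeats for 128 iterations, the number of rows in the two-dimensional weight array. A for loop performs 32 of these operations in parallel. In the state after this, the temporary sums are binarized to form the layer's 32-entry output.
Figure 6: First fully connected layer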
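A behavioral Python sketch of this layer follows. The row-major flattening order is an assumption (the report does not state the order), and the function names are illustrative; the `binarize` flag lets the same function stand in for a layer that keeps the raw sums.

```python
def flatten(feature_maps):
    """Flatten [map][row][column] into one list (row-major order is an assumption)."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def fully_connected(inputs, weight_bits, binarize=True):
    """Accumulate one weight row per iteration, adding or subtracting the input.

    weight_bits[k][j] == 1 adds inputs[k] to sum j; 0 subtracts it.
    """
    n_out = len(weight_bits[0])
    sums = [0] * n_out                         # first state: clear the sums
    for value, row in zip(inputs, weight_bits):
        for j in range(n_out):                 # done in parallel in hardware
            sums[j] += value if row[j] == 1 else -value
    if binarize:
        return [1 if s > 0 else -1 for s in sums]
    return sums


maps = [[[1, 1], [1, 1]] for _ in range(32)]   # 32 maps of 2 x 2, all +1
flat = flatten(maps)
weights = [[1, 0] * 16 for _ in range(128)]    # made-up 128 x 32 weight bits
out = fully_connected(flat, weights)
print(len(flat), out[:4])  # prints "128 [1, -1, 1, -1]"
```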
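The second fully connected layer has the same structure as the first. It takes the length-32 array from the previous layer and multiplies it by a 32 x 10 weight matrix using the same state machine structure described above. The output is an array of size 10 with 8-bit entries; the values are not binarized, so that they carry more information about the digit classification.
Figure 7: Second fully connected layer
Figure 8: Model summary
The final output of the binarized neural network is an array of length 10. The value at a given index of this array corresponds to the likelihood that the processed image shows the digit equal to that index. For example, if the value at index 0 is the smallest in the array, the processed image is least likely to be a 0. Likewise, if the value at index 5 is the largest in the array, the BNN infers that the image is most likely the digit 5. We pass these 10 final output values from the FPGA to the HPS through an 8-bit-wide PIO port. The HPS then processes the 10 final outputs and converts them to a probability scale to determine the three most likely classifications of the image. The HPS output on the serial console is shown in Figure 9. To compute the probabilities, we first add up all positive final output values to obtain the sum of the positive inference indices. The probability of digit n is then the final output value at index n divided by that sum.
Figure 9: HPS serial console output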
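The probability calculation described above can be sketched as follows. The project's HPS program is written in C; this Python version is only an illustration of the arithmetic, the function name is hypothetical, and returning an empty list when no output is positive is an assumption added to avoid dividing by zero.

```python
def top_three(final_out):
    """Return the three most likely digits as (digit, probability) pairs.

    Each probability is the output value divided by the sum of all positive outputs.
    """
    positive_sum = sum(v for v in final_out if v > 0)
    if positive_sum == 0:
        return []                      # assumption: nothing to rank
    scored = [(v / positive_sum, digit) for digit, v in enumerate(final_out)]
    scored.sort(reverse=True)
    return [(digit, p) for p, digit in scored[:3]]


# Made-up final outputs for illustration; positive entries sum to 20.
final_out = [-4, 10, 4, -6, 0, 6, -2, -2, -8, -10]
for digit, p in top_three(final_out):
    print(digit, round(p, 2))
```

With these example values the ranking is digit 1, then 5, then 2. Negative outputs produce negative scores under this formula, so they can never appear ahead of a positive one.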
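The figure below shows the Qsys implementation of our design. The PIO ports are connected from the HPS to the lightweight AXI master bus and exported to the FPGA fabric at different memory addresses. pio_switch is the output signal we use to select which of the input images hardcoded on the FPGA becomes the new input to the BNN. Once pio_switch has been set and output to the FPGA, the HPS toggles pio_start from low to high to restart the BNN digit-recognition computation. pio_end is set low when the BNN restarts, and the FPGA sets it high only when the BNN has finished computing the final output array. By recording the time at reset and the time at which pio_end goes high, we can compute the BNN computation time as the difference between them, which we found to be about 4-5 us.
After the FPGA completes the computation, three PIO ports (the clock signal pio_hps_image_clk, the data signal pio_out_data, and the chip-select signal pio_out_cs) are used to transfer the 10 final outputs from the FPGA to the HPS one at a time. The chip-select line is normally held low to reset the index. While chip select is high, the entry of the final output array at the current index is loaded onto the data signal on each rising edge of the clock signal, after which the index is incremented. To begin receiving the final outputs, the HPS pulls chip select high, toggles the clock signal, and then reads and stores the value on the data port, which gives it the entry at index 0 of the final output array. This process is then repeated 9 more times to receive the remaining entries.
Figure 10: Qsys PIO ports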
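The readout handshake can be modeled in Python as below. `ReadoutPort` imitates the FPGA-side always block in the Verilog listing (index cleared on a clock edge while chip select is low, data loaded and index incremented on a clock edge while it is high); `read_all` plays the role of the HPS. Both names are hypothetical, and real PIO register access is replaced by a method call.

```python
class ReadoutPort:
    """Software stand-in for the FPGA side of the readout interface."""

    def __init__(self, final_out):
        self.final_out = list(final_out)
        self.index = 0
        self.data = 0

    def rising_edge(self, chip_select):
        """Model one rising edge of the readout clock."""
        if chip_select:
            if self.index < len(self.final_out):
                self.data = self.final_out[self.index]
                self.index += 1
        else:
            self.index = 0


def read_all(port, count=10):
    """HPS side: reset the index, then clock out `count` values with CS high."""
    port.rising_edge(chip_select=0)     # clock once with CS low to clear the index
    values = []
    for _ in range(count):
        port.rising_edge(chip_select=1)
        values.append(port.data)        # read the data port after each edge
    return values


port = ReadoutPort(range(10, 20))       # made-up final outputs 10..19
print(read_all(port))
```

Calling `read_all` a second time returns the same ten values, because the first clock edge with chip select low rewinds the index.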
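We tested the initial iterations of our design in Modelsim, using unit tests to make sure each of our modules worked as expected. We instantiated each module, passed in known input values, and simulated the result to verify that the outputs were as expected. Once we had done this for all the layers involved, we instantiated all the layers and connected them to one another. We then set all weight values and the input image to known values and monitored the data flow through the entire network.
Figure 11: Modelsim output
Once our design simulated correctly, we moved it onto the FPGA and used the LEDs and PIO ports to view the output of each layer, to make sure the design behaved in hardware as it did in simulation. Because Modelsim only simulates parallel execution, we had to repeat all of our tests on the FPGA to verify that our layers actually worked as expected. Some of the bugs we found were parallel implementations of operations that are inherently sequential, such as accumulating sums, which produced inaccurate computations on the FPGA. These simulated correctly in Modelsim because execution in software is actually sequential, which is not the case once a real circuit is generated.
When debugging on the FPGA, we tested each layer's implementation by mapping its output to the LEDs or by sending it to the HPS through PIO ports and printing it on the serial console. The values computed in hardware were compared against the Python software model to verify that each layer behaved as expected. While the most effective way to debug the model was to pass output values through PIO ports and print them on the serial console, we eventually ran out of adaptive logic modules (ALMs) on the FPGA. At that point we had to switch to mapping outputs to the on-board LEDs to verify that the computed values were accurate.
Although we initially hoped to implement the design fully in parallel, some elements of the system made this infeasible. Certain components of the network, such as the partial-sum module, need multiple cycles to operate correctly. For this module, 16 additions must be performed one after another to compute the accumulated sum; these 16 operations cannot be done in parallel and therefore take several clock cycles. The other problem we ran into was repeatedly running out of ALMs on the board when connecting PIO ports to pass data between the FPGA and HPS and when mapping FPGA outputs to the on-board LEDs. Adding a port or an LED mapping sometimes caused a large jump in the ALM resources needed to implement the design, so that the design no longer fit on the board. We resolved these issues by finding workarounds that use fewer ALMs; for example, instead of passing the weights in from the HPS, we hardcoded them in the Verilog file. Since the weights never change during classification, this has no effect on functionality.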
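The figures below show the LCD display from our final demo. The three highest computed probabilities are displayed, along with the 8 x 8 input image passed to the network. The complete binarized neural network performs the computations needed to classify an image accurately. The output of each layer was compared with the output of the corresponding software implementation to verify that the expected computation was being performed. The expected accuracy of the software model is 33%; since the hardware model mimics the computations of the software model, the expected accuracy of the hardware classifier can also be taken to be 33%.
Figure 12: Display output: digit 1
Figure 13: Display output: digit 5
The computation speed of the FPGA model was measured by passing a done signal, indicating that the computation had finished, back to the HPS and measuring the time between the HPS sending the start signal to the FPGA and the FPGA sending the done signal back to the HPS. The FPGA BNN computation time was found to be about 0.004 ms, or 4 us. By comparison, the Python implementation of the same BNN running on a PC takes about 44 us. This time was derived from the duration of running the Tensorflow eval function on y_conv: y_conv.eval(feed_dict=test_dict), where y_conv is the last tensor layer of the BNN. Processing 1 input in a single batch took about 64.4 ms, and processing 180 inputs took about 72.4 ms. Because the processing time of the CNN is the sum of the time to load the weights and the time to compute, we roughly estimate the per-input computation time from the difference: (72.4 ms - 64.4 ms) / 180 inputs = 44 us per input. Note that we ran the Python code on a quad-core PC. The timing measurements on the PC are unstable, and various factors cause them to vary.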
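The per-input estimate above is simple arithmetic on the reported measurements, reproduced here as a short Python check. The ratio at the end is derived only from the two figures quoted in the text and inherits their uncertainty; the variable names are chosen for this sketch.

```python
t_one_input_ms = 64.4      # reported time for 1 input
t_batch_ms = 72.4          # reported time for 180 inputs
batch_size = 180
fpga_us = 4.0              # reported FPGA computation time

# Attribute the extra time of the larger batch to computation only.
per_input_us = (t_batch_ms - t_one_input_ms) / batch_size * 1000.0
ratio = per_input_us / fpga_us

print(round(per_input_us, 1), "us per input on the PC")
print(round(ratio, 1), "times the FPGA computation time")
```

This gives roughly 44 us per input, about eleven times the 4 us measured on the FPGA.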
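9. Resource Usage
The table below summarizes some of the resources used by the final implementation of our design. As can be seen, the BNN uses only a small fraction of the total memory available on the FPGA, and the use of ternary operators minimizes the need for multipliers/DSP blocks. The most heavily used resource is ALMs, but when the PIO ports used to transfer output data to the HPS, communicate the start and end signals, and so on are excluded, more than half of the ALMs on the board remain available. These results confirm the low resource requirements of BNNs.
Figure 14: Resource usage summary
The current design is not very flexible, because input images must be hardcoded into the Verilog code before they can be processed. Since the weights are hardcoded as well, any change to them also requires modifying and recompiling the code. The design could be made more configurable by transferring the weights from the HPS to the FPGA through PIO ports or SRAM memory; however, in our current implementation, introducing either of these causes the design not to fit on the FPGA. While digit classification is not in itself a very broadly applicable task, image classification has many uses today. The speedup offered by a hardware classifier makes it better suited to real-time classification tasks where time is the main constraint.
For the most part, our implementation met our expectations. We had initially hoped for higher accuracy; it was not until late in development that we noticed a bug in the Python implementation. Fixing this bug was essential to making the Python design truly binary, but it also lowered the accuracy by about 0.4 (from about 0.8 to 0.4). The network hardware could be modified to accommodate the changes needed to improve accuracy, but implementing them would have taken us past our deadline. We therefore chose to go ahead with the lower-accuracy model.
One feature we wanted to include in our model was a camera interface that would allow images to be captured, pooled, and fed to the BNN in real time. Although we have the Verilog and HPS code needed for such a system, incorporating it into the design pushes the total number of ALMs required beyond what is available on the board: before adding these changes our design used about 28,000 ALMs, and after adding them the count jumped to about 38,000.
The network we implemented is based on a framework implemented by PhD student Ritchie Zhao. The provided code is also partly based on a class assignment from an advanced course at Cornell University. While there are no patent or trademark issues, there are also no patent opportunities, because the software design our hardware is based on is not our own. Our FPGA code was built using some of the resources available on the ECE 5760 course webpage; for example, the code we used to interface with the VGA display comes from an example program on the course website. Apart from consulting online resources on syntax and operations, we did not use any other code from the public domain. To our knowledge, our design raises no legal considerations.
If we were to redo this project, one thing we might change is the design of the network, so that it supports binary weights with non-binary output feature maps from each layer, since this could improve accuracy. However, although our current implementation uses very few registers, it uses a large percentage of the available ALMs, so such an implementation might not be feasible. Another potential change would be to alter the size of the network. Currently the first convolution layer has 16 output feature maps, and the second convolution layer and first fully connected layer each have 32 outputs. These numbers could be reduced to 8, 16, and 16, respectively. While this might reduce accuracy, the smaller size would let the design fit on the board without taking up a large share of the available resources.
Further improvements to the model could include extending classification to images from other datasets, such as CIFAR10, rather than just digits. Neural networks for such images are more complex than the one we implemented and generally need more memory and computational resources. Since we are already pushing the limits of the FPGA's computational resources with this network, we would probably need a larger board to implement anything more complex.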
//verilog
// synthesis VERILOG_INPUT_VERSION SYSTEMVERILOG_2005
module DE1_SoC_Computer (
// FPGA Pins
// Clock pins
CLOCK_50,
CLOCK2_50,
CLOCK3_50,
CLOCK4_50,
// ADC
ADC_CS_N,
ADC_DIN,
ADC_DOUT,
ADC_SCLK,
// Audio
AUD_ADCDAT,
AUD_ADCLRCK,
AUD_BCLK,
AUD_DACDAT,
AUD_DACLRCK,
AUD_XCK,
// SDRAM
DRAM_ADDR,
DRAM_BA,
DRAM_CAS_N,
DRAM_CKE,
DRAM_CLK,
DRAM_CS_N,
DRAM_DQ,
DRAM_LDQM,
DRAM_RAS_N,
DRAM_UDQM,
DRAM_WE_N,
// I2C Bus for Configuration of the Audio and Video-In Chips
FPGA_I2C_SCLK,
FPGA_I2C_SDAT,
// 40-Pin Headers
GPIO_0,
GPIO_1,
// Seven Segment Displays
HEX0,
HEX1,
HEX2,
HEX3,
HEX4,
HEX5,
// IR
IRDA_RXD,
IRDA_TXD,
// Pushbuttons
KEY,
// LEDs
LEDR,
// PS2 Ports
PS2_CLK,
PS2_DAT,
PS2_CLK2,
PS2_DAT2,
// Slider Switches
SW,
// Video-In
TD_CLK27,
TD_DATA,
TD_HS,
TD_RESET_N,
TD_VS,
// VGA
VGA_B,
VGA_BLANK_N,
VGA_CLK,
VGA_G,
VGA_HS,
VGA_R,
VGA_SYNC_N,
VGA_VS,
// HPS Pins
// DDR3 SDRAM
HPS_DDR3_ADDR,
HPS_DDR3_BA,
HPS_DDR3_CAS_N,
HPS_DDR3_CKE,
HPS_DDR3_CK_N,
HPS_DDR3_CK_P,
HPS_DDR3_CS_N,
HPS_DDR3_DM,
HPS_DDR3_DQ,
HPS_DDR3_DQS_N,
HPS_DDR3_DQS_P,
HPS_DDR3_ODT,
HPS_DDR3_RAS_N,
HPS_DDR3_RESET_N,
HPS_DDR3_RZQ,
HPS_DDR3_WE_N,
// Ethernet
HPS_ENET_GTX_CLK,
HPS_ENET_INT_N,
HPS_ENET_MDC,
HPS_ENET_MDIO,
HPS_ENET_RX_CLK,
HPS_ENET_RX_DATA,
HPS_ENET_RX_DV,
HPS_ENET_TX_DATA,
HPS_ENET_TX_EN,
// Flash
HPS_FLASH_DATA,
HPS_FLASH_DCLK,
HPS_FLASH_NCSO,
// Accelerometer
HPS_GSENSOR_INT,
// General Purpose I/O
HPS_GPIO,
// I2C
HPS_I2C_CONTROL,
HPS_I2C1_SCLK,
HPS_I2C1_SDAT,
HPS_I2C2_SCLK,
HPS_I2C2_SDAT,
// Pushbutton
HPS_KEY,
// LED
HPS_LED,
// SD Card
HPS_SD_CLK,
HPS_SD_CMD,
HPS_SD_DATA,
// SPI
HPS_SPIM_CLK,
HPS_SPIM_MISO,
HPS_SPIM_MOSI,
HPS_SPIM_SS,
// UART
HPS_UART_RX,
HPS_UART_TX,
// USB
HPS_CONV_USB_N,
HPS_USB_CLKOUT,
HPS_USB_DATA,
HPS_USB_DIR,
HPS_USB_NXT,
HPS_USB_STP
);
//=======================================================
// PARAMETER declarations
//=======================================================
//=======================================================
// PORT declarations
//=======================================================
// FPGA Pins
// Clock pins
input CLOCK_50;
input CLOCK2_50;
input CLOCK3_50;
input CLOCK4_50;
// ADC
inout ADC_CS_N;
output ADC_DIN;
input ADC_DOUT;
output ADC_SCLK;
// Audio
input AUD_ADCDAT;
inout AUD_ADCLRCK;
inout AUD_BCLK;
output AUD_DACDAT;
inout AUD_DACLRCK;
output AUD_XCK;
// SDRAM
output [12: 0] DRAM_ADDR;
output [ 1: 0] DRAM_BA;
output DRAM_CAS_N;
output DRAM_CKE;
output DRAM_CLK;
output DRAM_CS_N;
inout [15: 0] DRAM_DQ;
output DRAM_LDQM;
output DRAM_RAS_N;
output DRAM_UDQM;
output DRAM_WE_N;
// I2C Bus for Configuration of the Audio and Video-In Chips
output FPGA_I2C_SCLK;
inout FPGA_I2C_SDAT;
// 40-pin headers
inout [35: 0] GPIO_0;
inout [35: 0] GPIO_1;
// Seven Segment Displays
output [ 6: 0] HEX0;
output [ 6: 0] HEX1;
output [ 6: 0] HEX2;
output [ 6: 0] HEX3;
output [ 6: 0] HEX4;
output [ 6: 0] HEX5;
// IR
input IRDA_RXD;
output IRDA_TXD;
// Pushbuttons
input [ 3: 0] KEY;
// LEDs
output [ 9: 0] LEDR;
// PS2 Ports
inout PS2_CLK;
inout PS2_DAT;
inout PS2_CLK2;
inout PS2_DAT2;
// Slider Switches
input [ 9: 0] SW;
// Video-In
input TD_CLK27;
input [ 7: 0] TD_DATA;
input TD_HS;
output TD_RESET_N;
input TD_VS;
// VGA
output [ 7: 0] VGA_B;
output VGA_BLANK_N;
output VGA_CLK;
output [ 7: 0] VGA_G;
output VGA_HS;
output [ 7: 0] VGA_R;
output VGA_SYNC_N;
output VGA_VS;
// HPS Pins
// DDR3 SDRAM
output [14: 0] HPS_DDR3_ADDR;
output [ 2: 0] HPS_DDR3_BA;
output HPS_DDR3_CAS_N;
output HPS_DDR3_CKE;
output HPS_DDR3_CK_N;
output HPS_DDR3_CK_P;
output HPS_DDR3_CS_N;
output [ 3: 0] HPS_DDR3_DM;
inout [31: 0] HPS_DDR3_DQ;
inout [ 3: 0] HPS_DDR3_DQS_N;
inout [ 3: 0] HPS_DDR3_DQS_P;
output HPS_DDR3_ODT;
output HPS_DDR3_RAS_N;
output HPS_DDR3_RESET_N;
input HPS_DDR3_RZQ;
output HPS_DDR3_WE_N;
// Ethernet
output HPS_ENET_GTX_CLK;
inout HPS_ENET_INT_N;
output HPS_ENET_MDC;
inout HPS_ENET_MDIO;
input HPS_ENET_RX_CLK;
input [ 3: 0] HPS_ENET_RX_DATA;
input HPS_ENET_RX_DV;
output [ 3: 0] HPS_ENET_TX_DATA;
output HPS_ENET_TX_EN;
// Flash
inout [ 3: 0] HPS_FLASH_DATA;
output HPS_FLASH_DCLK;
output HPS_FLASH_NCSO;
// Accelerometer
inout HPS_GSENSOR_INT;
// General Purpose I/O
inout [ 1: 0] HPS_GPIO;
// I2C
inout HPS_I2C_CONTROL;
inout HPS_I2C1_SCLK;
inout HPS_I2C1_SDAT;
inout HPS_I2C2_SCLK;
inout HPS_I2C2_SDAT;
// Pushbutton
inout HPS_KEY;
// LED
inout HPS_LED;
// SD Card
output HPS_SD_CLK;
inout HPS_SD_CMD;
inout [ 3: 0] HPS_SD_DATA;
// SPI
output HPS_SPIM_CLK;
input HPS_SPIM_MISO;
output HPS_SPIM_MOSI;
inout HPS_SPIM_SS;
// UART
input HPS_UART_RX;
output HPS_UART_TX;
// USB
inout HPS_CONV_USB_N;
input HPS_USB_CLKOUT;
inout [ 7: 0] HPS_USB_DATA;
input HPS_USB_DIR;
input HPS_USB_NXT;
output HPS_USB_STP;
//=======================================================
// REG/WIRE declarations
//=======================================================
//wire [15: 0] hex3_hex0;
//wire [15: 0] hex5_hex4;
//assign HEX0 = ~hex3_hex0[ 6: 0]; // hex3_hex0[ 6: 0];
//assign HEX1 = ~hex3_hex0[14: 8];
//assign HEX2 = ~hex3_hex0[22:16];
//assign HEX3 = ~hex3_hex0[30:24];
//assign HEX4 = 7'b1111111;
//assign HEX5 = 7'b1111111;
//assign HEX0 = test[6:0]; // hex3_hex0[ 6: 0];
//HexDigit Digit0(HEX0, final_out[1][7:4]);//hex3_hex0[3:0]);
//HexDigit Digit1(HEX1, final_out[1][3:0]);
//HexDigit Digit2(HEX2, hex3_hex0[11:8]);
//HexDigit Digit3(HEX3, hex3_hex0[15:12]);
// MAY need to cycle this switch on power-up to get video
assign TD_RESET_N = SW[1];
// get some signals exposed
// connect bus master signals to i/o for probes
//assign GPIO_0[0] = TD_HS ;
//assign GPIO_0[1] = TD_VS ;
//assign GPIO_0[2] = TD_DATA[6] ;
//assign GPIO_0[3] = TD_CLK27 ;
//assign GPIO_0[4] = TD_RESET_N ;
//=======================================================
// Bus controller for AVALON bus-master
//=======================================================
wire [31:0] vga_bus_addr, video_in_bus_addr ; // Avalon addresses
reg [31:0] bus_addr ;
wire [31:0] vga_out_base_address = 32'h0000_0000 ; // Avalon address
wire [31:0] video_in_base_address = 32'h0800_0000 ; // Avalon address
reg [3:0] bus_byte_enable ; // four bit byte read/write mask
reg bus_read ; // high when requesting data
reg bus_write ; // high when writing data
reg [31:0] bus_write_data ; // data to send to Avalon bus
wire bus_ack ; // Avalon bus raises this when done
wire [31:0] bus_read_data ; // data from Avalon bus
reg [31:0] timer ;
reg [3:0] state ;
reg last_vs, wait_one;
reg [19:0] vs_count ;
reg last_hs, wait_one_hs ;
reg [19:0] hs_count ;
// pixel address is
logic [9:0] vga_x_cood, vga_y_cood, video_in_x_cood, video_in_y_cood ;
reg [7:0] current_pixel_color1, current_pixel_color2 ;
// compute address
// 640 x 480, ceil(log2 640) = 10
assign vga_bus_addr = vga_out_base_address + {22'b0,video_in_x_cood + vga_x_cood} +
({22'b0,video_in_y_cood + vga_y_cood}<<10) ;
//video in: 320 by 240, x:0-319, y:0-239
// 320 x 240, ceil(log2 320) = 9
//video in: 224 by 224, x:0-223, y:0-223
// 224 x 224, ceil(log2 224) = 8
assign video_in_bus_addr = video_in_base_address + {22'b0,video_in_x_cood} +
({22'b0,video_in_y_cood}<<8) ;
logic [7:0] greyscale8;
logic [1:0] greyscale;
//765 432 10
assign greyscale = (bus_read_data[6:5]>>1) + (bus_read_data[3:2]>>1);
assign greyscale8 = {
{2{1'b0, greyscale}}, greyscale};
logic [9:0] vga_x_cood_2, vga_y_cood_2;
logic [31:0] vga_bus_addr_2;
assign vga_bus_addr_2 = vga_out_base_address + {22'b0,video_in_x_cood + vga_x_cood_2} +
({22'b0,video_in_y_cood + vga_y_cood_2}<<10) ;
//logic [1:0] image_array [320][240];
always @(posedge CLOCK2_50) begin //CLOCK_50
// reset state machine and read/write controls
if (~KEY[0]) begin
state <= 0 ;
bus_read <= 0 ; // set to one for a read operation from the bus
bus_write <= 0 ; // set to one for a write operation to the bus
// base address of upper-left corner of the screen
vga_x_cood <= 10'd0 ;
vga_y_cood <= 10'd0 ;
vga_x_cood_2 <= 10'd230;
vga_y_cood_2 <= 0 ;
video_in_x_cood <= 0 ;
video_in_y_cood <= 0 ;
bus_byte_enable <= 4'b0001;
timer <= 0;
end
else begin
timer <= timer + 1;
end
// write to the bus-master
// and put in a small delay to avoid bus hogging
// timer delay can be set to 2**n-1, so 3, 7, 15, 31
// bigger numbers mean slower frame update to VGA
if (state==0 && SW[0] && (timer & 3)==0 ) begin //
state <= 1;
// read all the pixels in the video input
video_in_x_cood <= video_in_x_cood + 10'd1 ;
if (video_in_x_cood > 10'd223) begin
video_in_x_cood <= 0 ;
video_in_y_cood <= video_in_y_cood + 10'd1 ;
if (video_in_y_cood > 10'd223) begin
video_in_y_cood <= 10'd0 ;
end
end
// one byte data
bus_byte_enable <= 4'b0001;
// read first pixel
bus_addr <= video_in_bus_addr ;
// signal the bus that a read is requested
bus_read <= 1'b1 ;
end
// finish the read
// You MUST do this check
if (state==1 && bus_ack==1) begin
state <= 2 ; //state <= 2 ;
bus_read <= 1'b0;
if (!SW[2]) begin
current_pixel_color1 <= bus_read_data ;
end
else begin
current_pixel_color1 <= 2;
end
end
// write a pixel to VGA memory
if (state==2) begin
state <= 9 ;
bus_write <= 1'b1;
bus_addr <= vga_bus_addr ;
bus_write_data <= current_pixel_color1 ;
//image_array[video_in_x_cood][video_in_y_cood] <= greyscale;
bus_byte_enable <= 4'b0001;
end
// and finish write
if (state==9 && bus_ack==1) begin
state <= 0 ;
bus_write <= 1'b0;
end
end
//===============================
logic pio_start;
logic pio_end;
logic [2:0] pio_switch;
//different input images corresponding to different numbers
always @ (*) begin
if (pio_switch==3'd1) begin
//LEDR[7:0] = final_out[3];
//1. idx 171 is 0
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1, 1, 1,-1,-1},
'{-1,-1, 1, 1, 1, 1,-1,-1},
'{-1, 1, 1,-1,-1, 1,-1,-1},
'{-1, 1, 1, 1, 1,-1,-1,-1},
'{-1,-1, 1, 1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (pio_switch==3'd2) begin
//LEDR[7:0] = final_out[3];
//2. idx 1 is a 1
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (pio_switch==3'd3) begin
//LEDR[7:0] = final_out[3];
//3. idx 39 is a 2
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1, 1, 1,-1,-1,-1},
'{-1, 1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (pio_switch==3'd4) begin
//LEDR[7:0] = final_out[3];
//4. idx 85 is 4
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1,-1,-1,-1,-1,-1},
'{-1, 1, 1,-1, 1, 1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1, 1, 1,-1,-1,-1},
'{-1,-1,-1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (pio_switch==3'd5) begin
//LEDR[7:0] = final_out[3];
//5. idx 119 is 5
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1,-1,-1,-1,-1},
'{-1,-1, 1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1, 1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (pio_switch==3'd6) begin
//LEDR[7:0] = final_out[3];
//6. idx 137 is 7
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else begin
//LEDR[7:0] = final_out[3];
input_image=
'{
'{-1,-1,-1,-1,-1,-1,1,-1},
'{-1,-1,-1,-1,1,1,-1,-1},
'{-1,-1,-1,-1,1,-1,-1,-1},
'{-1,-1,-1,1,-1,-1,-1,-1},
'{-1,-1,1,1,-1,-1,-1,-1},
'{-1,-1,1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
/*
if (SW[9]) begin
LEDR[7:0] = final_out[3];
//1. idx 171 is 0
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1, 1, 1,-1,-1},
'{-1,-1, 1, 1, 1, 1,-1,-1},
'{-1, 1, 1,-1,-1, 1,-1,-1},
'{-1, 1, 1, 1, 1,-1,-1,-1},
'{-1,-1, 1, 1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[8]) begin
LEDR[7:0] = final_out[3];
//2. idx 1 is a 1
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1, 1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[7]) begin
LEDR[7:0] = final_out[3];
//3. idx 39 is a 2
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1, 1, 1,-1,-1,-1},
'{-1, 1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[6]) begin
LEDR[7:0] = final_out[3];
//4. idx 85 is 4
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1,-1,-1,-1,-1,-1},
'{-1, 1, 1,-1, 1, 1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1, 1, 1,-1,-1,-1},
'{-1,-1,-1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[5]) begin
LEDR[7:0] = final_out[3];
//5. idx 119 is 5
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1,-1,-1,-1,-1},
'{-1,-1, 1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1, 1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[4]) begin
LEDR[7:0] = final_out[3];
//6. idx 137 is 7
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1, 1, 1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1, 1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[3]) begin
LEDR[7:0] = final_out[3];
input_image=
'{
'{-1,-1,-1,-1,-1,-1,1,-1},
'{-1,-1,-1,-1,1,1,-1,-1},
'{-1,-1,-1,-1,1,-1,-1,-1},
'{-1,-1,-1,1,-1,-1,-1,-1},
'{-1,-1,1,1,-1,-1,-1,-1},
'{-1,-1,1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[2]) begin
LEDR[7:0] = final_out[2];
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,1},
'{-1,-1,-1,-1,1,1,-1,-1},
'{-1,-1,-1,-1,1,-1,-1,-1},
'{-1,-1,-1,1,-1,-1,-1,-1},
'{-1,-1,1,1,-1,-1,-1,-1},
'{-1,-1,1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else if (SW[1]) begin
LEDR[7:0] = final_out[1];
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,1,1,-1,1},
'{-1,-1,-1,-1,1,-1,-1,-1},
'{-1,-1,-1,1,-1,-1,-1,-1},
'{-1,-1,1,1,-1,-1,-1,-1},
'{-1,-1,1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
else begin
LEDR[7:0] = 8'b0; //final_out[0];
input_image=
'{
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,1,1,-1,-1},
'{-1,-1,-1,-1,1,-1,-1,-1},
'{-1,-1,-1,1,-1,-1,-1,-1},
'{-1,-1,1,1,-1,-1,-1,-1},
'{-1,-1,1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1},
'{-1,-1,-1,-1,-1,-1,-1,-1}
};
end
*/
end
//===========================================================================================================
// read outputs to HPS
//===========================================================================================================
logic signed [7:0] pio_out_data;
logic pio_out_cs;
integer out_count;
always @ (posedge pio_hps_image_clk) begin
if (pio_out_cs) begin
if (out_count<10) begin
pio_out_data <= final_out[out_count];
out_count <= out_count + 1;
end
end
else out_count <= 0;
end
//===========================================================================================================
// weight initialization
//===========================================================================================================
//first conv filters - 16 3x3s : logic filter [16][3][3];
always @ (*) begin
filter =
'{ '{ '{1,1,0},'{1,0,0},'{0,1,0} },
'{ '{1,0,1},'{0,0,0},'{1,0,0} },
'{ '{0,1,1},'{1,1,1},'{1,0,1} },
'{ '{1,0,1},'{1,1,1},'{1,1,0} },
'{ '{0,0,0},'{1,0,1},'{1,1,1} },
'{ '{1,0,0},'{1,1,1},'{1,0,1} },
'{ '{0,1,1},'{1,1,0},'{0,1,1} },
'{ '{0,0,1},'{1,0,1},'{1,1,0} },
'{ '{0,1,1},'{1,1,1},'{0,0,0} },
'{ '{0,1,0},'{0,1,1},'{0,0,0} },
'{ '{1,1,1},'{0,1,1},'{1,1,1} },
'{ '{1,1,0},'{0,1,1},'{1,0,1} },
'{ '{0,0,1},'{0,1,0},'{0,1,0} },
'{ '{0,1,1},'{1,0,0},'{0,0,0} },
'{ '{0,0,0},'{1,0,0},'{0,0,0} },
'{ '{1,1,1},'{1,1,1},'{1,1,1} } };
end
//second conv filters - 16*32 3x3s in 16 row 32 column format: logic filters_conv2 [16][32][3][3];
always @ (*) begin
filters_conv2 =
'{
'{ '{ '{1,0,0},'{0,1,1},'{0,0,1} },
'{ '{1,0,1},'{1,1,1},'{1,1,1} },
'{ '{1,1,1},'{1,1,0},'{0,0,0} },
'{ '{1,1,0},'{1,0,1},'{1,0,0} },
'{ '{0,0,1},'{1,1,0},'{1,1,0} },
'{ '{1,1,0},'{0,0,1},'{0,0,1} },
'{ '{1,1,0},'{1,1,1},'{1,1,1} },
'{ '{0,1,1},'{1,0,1},'{0,0,0} },
'{ '{0,0,1},'{1,0,1},'{1,0,0} },
'{ '{0,0,1},'{1,0,1},'{1,1,0} },
'{ '{1,0,1},'{1,1,1},'{1,1,1} },
'{ '{0,0,1},'{1,0,1},'{0,1,0} },
'{ '{0,0,1},'{0,0,0},'{0,0,0} },
'{ '{1,0,0},'{1,0,1},'{1,0,0} },
'{ '{1,1,0},'{0,1,0},'{1,1,0} },
'{ '{1,1,0},'{0,1,0},'{0,1,0} },
'{ '{0,1,1},'{0,0,0},'{1,0,0} },
'{ '{1,1,0},'{1,1,1},'{0,1,0} },
'{ '{0,0,1},'{1,0,0},'{1,1,1} },
'{ '{1,1,0},'{0,1,0},'{0,1,0} },
'{ '{1,1,0},'{0,1,0},'{1,1,1} },
'{ '{0,1,0},'{1,1,1},'{0,0,0} },
'{ '{1,0,1},'{0,1,0},'{1,0,1} },
'{ '{1,0,1},'{0,1,1},'{0,0,1} },
'{ '{0,0,1},'{1,0,0},'{1,0,0} },
'{ '{0,1,0},'{0,1,1},'{0,0,1} },
'{ '{0,1,1},'{0,1,1},'{0,1,1} },
'{ '{0,0,1},'{1,0,1},'{1,1,1} },
'{ '{0,0,0},'{1,1,0},'{1,0,0} },
'{ '{0,0,0},'{0,0,1},'{0,1,1} },
'{ '{0,1,1},'{0,0,0},'{1,1,1} },
'{ '{1,0,1},'{1,1,0},'{1,0,1} } },
'{ '{ '{0,0,0},'{0,1,1},'{0,0,0} },
'{ '{1,0,1},'{1,0,1},'{0,1,1} },
'{ '{1,0,1},'{0,1,1},'{0,1,1} },
'{ '{0,1,1},'{1,0,1},'{1,0,1} },
'{ '{0,1,1},'{1,1,0},'{0,0,0} },
'{ '{1,1,0},'{1,0,1},'{1,1,1} },
'{ '{0,0,1},'{1,0,1},'{0,1,1} },
'{ '{1,0,0},'{1,1,0},'{1,1,0} },
'{ '{1,0,0},'{0,0,1},'{1,1,1} },
'{ '{0,0,0},'{1,1,0},'{0,0,1} },
'{ '{1,0,1},'{1,0,0},'{1,1,1} },
'{ '{0,1,1},'{0,0,1},'{1,1,0} },
'{ '{1,1,0},'{1,1,1},'{1,0,0} },
'{ '{1,0,1},'{0,0,0},'{0,0,1} },
'{ '{1,0,1},'{1,1,1},'{1,1,1} },
'{ '{1,1,0},'{0,1,1},'{1,1,1} },
'{ '{1,0,0},'{0,0,1},'{1,0,0} },
'{ '{1,0,1},'{0,1,1},'{0,1,0} },
'{ '{1,1,0},'{1,1,0},'{1,0,1} },
'{ '{1,1,1},'{1,0,1},'{1,1,0} },
'{ '{0,0,0},'{0,1,1},'{1,1,1} },
'{ '{0,0,1},'{1,1,0},'{1,1,0} },
'{ '{1,0,1},'{1,0,1},'{1,0,1} },
'{ '{1,0,1},'{1,1,0},'{1,1,0} },
'{ '{1,0,1},'{1,1,1},'{0,1,1} },
'{ '{1,1,0},'{0,1,1},'{1,1,1} },
'{ '{1,1,1},'{1,1,0},'{0,1,0} },
'{ '{1,1,0},'{1,1,1},'{1,1,0} },
'{ '{0,1,0},'{1,0,1},'{1,1,1} },
'{ '{1,1,1},'{0,1,0},'{0,0,1} },
'{ '{0,1,0},'{1,1,1},'{1,1,1} },
'{ '{1,1,0},'{0,1,1},'{0,1,0} } },
'{ '{ '{0,1,1},'{0,0,0},'{1,1,0} },
'{ '{1,1,0},'{1,0,0},'{1,0,0} },
'{ '{0,0,0},'{1,1,1},'{1,0,0} },
'{ '{0,1,1},'{1,0,1},'{0,0,1} },
'{ '{0,0,1},'{1,0,0},'{0,0,0} },
'{ '{1,0,1},'{1,0,1},'{1,1,0} },
'{ '{0,0,1},'{1,0,1},'{0,1,0} },
'{ '{1,1,0},'{0,0,1},'{0,1,1} },
'{ '{1,1,1},'{1,1,0},'{1,0,0} },
'{ '{1,1,1},'{0,1,0},'{1,1,1} },
'{ '{0,0,1},'{1,1,1},'{0,1,1} },
'{ '{1,0,1},'{0,0,0},'{0,0,1} },
'{ '{1,0,0},'{1,1,0},'{1,1,1} },
'{ '{0,0,1},'{1,1,1},'{1,0,0} },
'{ '{0,0,0},'{1,1,1},'{0,1,0} },
'{ '{1,1,1},'{0,0,0},'{1,0,1} },
'{ '{0,1,1},'{1,0,1},'{0,0,0} },
'{ '{0,1,1},'{0,0,1},'{0,0,0} },
'{ '{0,0,0},'{1,1,1},'{0,1,1} },
'{ '{0,1,0},'{1,0,1},'{1,1,1} },
'{ '{1,1,1},'{1,0,0},'{0,0,1} },
'{ '{0,0,0},'{0,0,1},'{0,1,1} },
'{ '{0,1,0},'{0,0,0},'{1,1,1} },
'{ '{0,0,1},'{0,0,0},'{1,1,1} },
'{ '{0,1,0},'{1,0,1},'{1,0,0} },
'{ '{1,1,1},'{0,1,0},'{1,1,1} },
'{ '{0,0,1},'{0,1,0},'{1,0,1} },
'{ '{0,1,1},'{1,1,0},'{1,0,1} },
'{ '{1,0,1},'{0,1,0},'{1,0,1} },
'{ '{1,0,1},'{0,0,0},'{1,1,0} },
'{ '{1,1,0},'{1,0,1},'{0,0,0} },
'{ '{0,0,0},'{0,1,0},'{0,1,0} } },
'{ '{ '{0,1,0},'{0,0,0},'{1,0,1} },
'{ '{0,1,1},'{0,0,0},'{0,1,1} },
'{ '{0,1,1},'{0,0,1},'{1,0,1} },
'{ '{0,0,1},'{1,1,1},'{1,0,0} },
'{ '{0,1,1},'{0,1,1},'{0,0,1} },
'{ '{1,0,0},'{1,0,1},'{1,1,1} },
'{ '{1,0,1},'{1,1,1},'{0,1,0} },
'{ '{1,1,1},'{0,0,0},'{0,0,0} },
'{ '{0,1,0},'{1,1,1},'{1,0,1} },
'{ '{1,1,1},'{0,0,0},'{0,0,0} },
'{ '{0,1,1},'{0,0,0},'{1,0,1} },
'{ '{1,0,1},'{1,0,0},'{0,0,0} },
'{ '{0,1,1},'{1,1,1},'{1,1,0} },
'{ '{1,1,1},'{1,0,1},'{0,1,1} },
'{ '{1,1,0},'{0,0,1},'{1,1,0} },
'{ '{1,0,0},'{0,1,1},'{1,1,0} },
'{ '{0,1,0},'{1,0,1},'{0,1,1} },
'{ '{0,1,1},'{1,1,1},'{1,1,0} },
'{ '{1,1,1},'{1,0,1},'{1,1,1} },
'{ '{1,1,0},'{1,1,1},'{0,1,0} },
'{ '{0,0,0},'{1,1,0},'{1,1,0} },
'{ '{1,1,1},'{0,0,1},'{1,0,0} },
'{ '{1,1,0},'{1,0,0},'{1,0,0} },
'{ '{1,1,1},'{1,0,1},'{0,0,0} },
'{ '{1,1,0},'{1,1,1},'{1,1,1} },
'{ '{0,0,0},'{1,1,1},'{1,0,1} },
'{ '{1,0,1},'{1,1,0},'{1,0,1} },
'{ '{1,1,1},'{0,1,1},'{1,1,1} },
'{ '{0,1,1},'{1,0,1},'{1,1,1} },
'{ '{0,0,0},'{0,0,1},'{1,0,0} },
'{ '{1,0,1},'{0,1,1},'{0,0,0} },
'{ '{1,1,1},'{0,1,1},'{0,0,1} } },
'{ '{ '{0,1,0},'{1,1,0},'{1,1,1} },
'{ '{1,0,0},'{0,1,1},'{0,0,0} },
'{ '{0,1,0},'{0,0,0},'{1,0,0} },
'{ '{0,0,1},'{1,1,1},'{0,1,1} },
'{ '{0,0,1},'{1,1,0},'{1,1,1} },
'{ '{0,1,0},'{1,1,0},'{0,1,0} },
'{ '{0,1,1},'{1,1,0},'{0,0,1} },
'{ '{0,1,1},'{0,0,0},'{1,0,0} },
'{ '{1,1,0},'{1,1,1},'{0,1,0} },
'{ '{1,1,1},'{1,0,1},'{1,1,1} },
'{ '{0,1,1},'{0,1,1},'{1,0,1} },
'{ '{1,0,0},'{0,1,0},'{1,1,1} },
'{ '{1,0,0},'{0,0,1},'{1,0,0} },
'{ '{0,1,1},'{1,0,0},'{1,1,0} },
'{ '{1,1,1},'{0,1,1},'{0,0,0} },
'{ '{1,0,0},'{1,0,1},'{1,1,0} },
'{ '{1,1,1},'{0,0,1},'{0,0,1} },
'{ '{0,0,0},'{1,0,0},'{0,1,0} },
'{ '{0,0,1},'{1,0,1},'{1,0,0} },
'{ '{0,1,0},'{1,0,1},'{0,1,1} },
'{ '{1,0,1},'{0,1,1},'{1,1,1} },
'{ '{0,1,1},'{1,1,1},'{0,1,0} },
'{ '{0,1,0},'{1,1,1},'{0,1,1} },
'{ '{0,1,1},'{0,0,1},'{1,1,1} },
'{ '{0,1,1},'{0,1,0},'{0,1,0} },
'{ '{1,0,0},'{1,0,0},'{0,1,0} },
'{ '{0,1,1},'{0,1,1},'{1,0,1} },
'{ '{1,1,1},'{1,0,0},'{1,1,0} },
'{ '{0,1,0},'{1,0,1},'{0,1,0} },
'{ '{1,1,0},'{1,1,1},'{1,1,1} },
'{ '{0,0,0},'{0,0,0},'{0,1,1} },
'{ '{0,0,0},'{0,0,1},'{0,1,0} } },
'{ '{ '{1,0,0},'{0,0,0},'{0,1,1} },
'{ '{0,1,1},'{1,0,0},'{1,1,0} },
'{ '{0,1,0},'{0,1,0},'{1,1,1} },
'{ '{1,1,1},'{1,0,1},'{1,1,1} },
'{ '{1,1,1},'{0,1,0},'{0,1,0} },
'{ '{0,1,0},'{1,0,1},'{0,0,0} },
'{ '{0,0,0},'{0,1,1},'{0,0,0} },
'{ '{0,1,1},'{1,1,1},'{0,1,0} },
'{ '{1,0,0},'{1,0,1},'{0,1,1} },
'{ '{1,1,1},'{1,0,0},'{0,0,1} },
'{ '{0,0,0},'{1,0,0},'{1,0,1} },
'{ '{1,0,1},'{1,1,0},'{0,0,1} },
'{ '{1,1,0},'{1,0,0},'{1,0,0} },
'{ '{0,0,0},'{0,0,0},'{0,1,1} },
'{ '{0,0,1},'{1,1,0},'{0,1,1} },
'{ '{0,1,1},'{0,1,0},'{0,1,0} },
'{ '{0,1,0},'{0,0,1},'{0,1,0} },
'{ '{1,1,0},'{0,0,0},'{1,0,1} },
'{ '{1,1,1},'{1,1,1},'{1,1,0} },
'{ '{0,1,1},'{1,0,1},'{1,0,0} },
'{ '{0,1,1},'{0,1,1},'{1,1,1} },
'{ '{1,0,0},'{0,0,0},'{1,0,0} },
'{ '{1,1,1},'{0,0,1},'{1,0,0} },
'{ '{1,1,1},'{0,0,1},'{1,1,0} },
'{ '{1,0,0},'{1,1,1},'{0,0,0} },
'{ '{1,0,0},'{0,0,1},'{0,1,0} },
'{ '{1,1,1},'{0,0,1},'{0,0,1} },
'{ '{1,1,1},'{0,1,1},'{1,0,0} },
'{ '{0,0,1},'{1,0,1},'{0,0,1} },
'{ '{0,0,1},'{0,0,0},'{1,0,0} },
'{ '{1,0,1},'{0,1,0},'{0,0,0} },
'{ '{1,0,1},'{1,1,1},'{1,0,1} } },
'{ '{ '{0,0,1},'{1,1,0},'{1,1,1} },
'{ '{1,0,1},'{0,0,1},'{1,1,1} },
'{ '{0,1,1},'{1,1,1},'{0,0,0} },
'{ '{0,0,1},'{0,0,0},'{0,1,0} },
'{ '{0,1,0},'{1,0,0},'{0,1,0} },
'{ '{0,1,1},'{0,0,1},'{0,0,0} },
'{ '{0,1,1},'{0,0,0},'{0,0,1} },
'{ '{0,1,1},'{1,1,1},'{0,1,1} },
'{ '{1,0,0},'{1,1,1},'{0,0,1} },
'{ '{0,1,0},'{1,0,1},'{0,1,0} },
'{ '{1,0,1},'{0,1,0},'{0,1,0} },
'{ '{0,0,1},'{1,1,1},'{0,0,1} },
'{ '{1,0,0},'{0,0,1},'{1,1,0} },
'{ '{0,1,0},'{1,1,1},'{1,0,1} },
'{ '{0,1,0},'{1,0,1},'{1,0,0} },
'{ '{0,0,0},'{0,1,0},'{1,0,1} },
'{ '{1,1,1},'{1,0,1},'{1,0,0} },
'{ '{1,1,1},'{1,0,1},'{0,0,1} },
'{ '{0,1,0},'{1,1,1},'{1,1,0} },
'{ '{0,1,1},'{0,0,1},'{1,1,0} },
'{ '{0,1,0},'{1,1,1},'{1,1,1} },
'{ '{0,0,0},'{1,1,0},'{1,0,0} },
'{ '{1,0,1},'{1,0,1},'{1,0,0} },
'{ '{1,1,0},'{1,0,0},'{1,0,0} },
'{ '{1,1,1},'{0,1,0},'{0,1,1} },
'{ '{1,1,1},'{0,0,1},'{0,0,0} },
'{ '{1,0,1},'{1,0,1},'{0,0,0} },
'{ '{0,1,0},'{0,1,1},'{1,1,1} },
'{ '{0,1,0},'{0,0,0},'{0,1,0} },
'{ '{0,0,1},'{0,0,0},'{0,0,1} },
'{ '{1,0,1},'{1,0,0},'{1,0,1} },
'{ '{1,0,1},'{0,0,1},'{1,0,1} } },
'{ '{ '{0,0,1},'{0,0,0},'{1,1,0} },
'{ '{0,1,0},'{0,1,1},'{0,1,0} },
'{ '{1,1,0},'{1,1,0},'{0,0,0} },
'{ '{1,1,1},'{1,1,0},'{0,0,1} },
'{ '{0,0,1},'{0,0,0},'{1,1,1} },
'{ '{0,1,0},'{1,1,0},'{1,0,0} },
'{ '{0,1,0},'{1,0,0},'{0,0,0} },
'{ '{0,1,0},'{1,0,0},'{1,0,0} },
'{ '{0,1,1},'{1,0,0},'{1,1,1} },
'{ '{1,1,0},'{1,1,1},'{1,0,1} },
'{ '{0,1,0},'{0,1,0},'{1,0,1} },
'{ '{1,1,1},'{1,0,0},'{1,0,0} },
'{ '{1,1,0},'{0,1,1},'{1,0,1} },
'{ '{0,0,1},'{1,1,1},'{0,1,1} },
'{ '{0,0,1},'{0,1,0},'{1,1,0} },
'{ '{0,0,0},'{0,0,1},'{0,1,1} },
'{ '{0,1,1},'{1,0,1},'{1,0,0} },
'{ '{1,0,1},'{0,1,0},'{1,1,0} },
'{ '{1,0,1},'{0,1,0},'{1,0,0} },
'{ '{1,1,0},'{0,1,1},'{0,0,0} },
'{ '{1,1,0},'{1,0,1},'{1,1,1} },
'{ '{0,1,0},'{1,0,0},'{1,1,0} },
'{ '{1,0,0},'{1,1,0},'{1,1,1} },
'{ '{0,0,0},'{0,1,1},'{1,0,0} },
'{ '{0,1,1},'{0,1,1},'{0,1,1} },
'{ '{1,0,0},'{1,1,1},'{1,0,1} },
'{ '{0,0,0},'{0,0,0},'{0,0,1} },
'{ '{1,1,1},'{1,0,1},'{0,0,1} },
'{ '{0,1,1},'{0,1,0},'{1,1,0} },
'{ '{1,1,1},'{1,1,1},'{1,1,0} },
'{ '{1,1,0},'{1,0,1},'{0,0,1} },
'{ '{1,1,1},'{1,1,1},'{0,0,1} } },
'{ '{ '{0,1,0},'{0,0,0},'{0,0,0} },
'{ '{1,0,1},'{1,0,1},'{1,1,0} },
'{ '{1,1,0},'{1,1,1},'{0,0,0} },
'{ '{1,0,1},'{0,1,0},'{1,0,0} },
'{ '{1,0,1},'{1,1,0},'{0,0,0} },
'{ '{1,0,0},'{0,0,1},'{0,0,1} },
'{ '{0,1,1},'{0,0,1},'{0,1,1} },
'{ '{0,1,1},'{1,0,0},'{0,1,1} },
'{ '{1,0,1},'{0,0,0},'{0,0,0} },
'{ '{0,0,1},'{1,1,1},'{0,0,0} },
'{ '{0,1,0},'{0,0,0},'{0,1,0} },
'{ '{0,0,1},'{1,1,0},'{1,0,0} },
'{ '{1,1,0},'{0,0,1},'{0,0,1} },
'{ '{0,1,0},'{0,0,1},'{1,0,1} },
'{ '{1,0,0},'{1,0,0},'{0,1,1} },
'{ '{0,0,1},'{0,1,1},'{0,1,1} },
'{ '{1,1,1},'{0,0,0},'{1,1,0} },
'{ '{1,1,1},'{1,1,1},'{1,1,0} },
'{ '{1,0,0},'{1,1,1},'{0,0,0} },
'{ '{1,1,1},'{1,0,0},'{0,0,1} },
'{ '{0,0,1},'{1,0,0},'{1,1,1} },
'{ '{1,0,0},'{1,0,0},'{0,0,1} },
'{ '{0,0,1},'{1,0,0},'{0,1,0} },
'{ '{1,1,0},'{0,0,0},'{1,1,0} },
'{ '{0,0,0},'{1,1,1},'{0,1,1} },
'{ '{0,0,1},'{1,0,1},'{1,0,0} },
'{ '{0,0,1},'{1,1,0},'{1,1,0} },
'{ '{0,1,1},'{0,0,0},'{0,1,1} },
'{ '{1,0,0},'{0,0,1},'{0,1,1} },
'{ '{0,0,1},'{1,0,1},'{1,1,1} },
'{ '{0,1,0},'{0,0,0},'{0,0,1} },
'{ '{0,1,0},'{0,0,0},'{1,0,1} } },
'{ '{ '{1,0,0},'{1,0,0},'{0,0,1} },
'{ '{0,0,1},'{1,0,0},'{0,1,0} },
'{ '{1,1,1},'{0,1,1},'{0,0,1} },
'{ '{0,1,0},'{0,1,1},'{1,1,1} },
'{ '{1,0,1},'{1,1,1},'{0,1,1} },
'{ '{0,0,0},'{0,1,1},'{1,0,1} },
'{ '{1,1,0},'{1,1,1},'{0,0,0} },
'{ '{1,1,0},'{1,1,1},'{0,1,0} },
'{ '{1,0,0},'{1,0,0},'{0,0,0} },
'{ '{1,0,1},'{0,0,0},'{0,1,0} },
'{ '{0,0,1},'{1,1,0},'{1,1,1} },
'{ '{0,0,1},'{1,0,0},'{0,0,1} },
'{ '{1,0,1},'{1,1,1},'{1,0,0} },
'{ '{0,1,1},'{0,1,0},'{1,1,0} },
'{ '{0,1,1},'{0,0,1},'{1,0,0} },
'{ '{1,0,0},'{0,0,1},'{1,1,1} },
'{ '{0,1,1},'{1,1,1},'{1,1,1} },
'{ '{0,0,0},'{1,0,1},'{0,0,1} },
'{ '{1,0,0},'{1,1,1},'{0,0,0} },
'{ '{1,0,0},'{1,1,0},'{0,0,1} },
'{ '{1,0,1},'{1,0,0},'{0,0,0} },
'{ '{0,1,1},'{0,1,0},'{0,0,1} },
'{ '{1,0,1},'{1,1,1},'{1,0,1} },
'{ '{0,1,1},'{1,0,1},'{1,0,1} },
'{ '{1,1,0},'{1,0,1},'{0,1,1} },
'{ '{0,1,0},'{0,0,1},'{1,1,0} },
'{ '{0,0,1},'{1,1,1},'{0,0,1} },
'{ '{1,0,0},'{1,1,0},'{0,0,0} },
'{ '{0,0,1},'{0,1,1},'{0,1,1} },
'{ '{1,1,0},'{1,1,0},'{0,1,1} },
'{ '{0,1,1},'{1,0,1},'{0,0,1} },
'{ '{1,0,1},'{0,1,1},'{1,1,0} } },
'{ '{ '{1,0,1},'{1,1,1},'{1,1,1} },
'{ '{1,1,1},'{0,0,1},'{0,0,1} },
'{ '{0,0,0},'{1,1,1},'{0,0,1} },
'{ '{1,1,0},'{1,0,1},'{1,1,1} },
'{ '{1,0,1},'{0,0,1},'{0,1,1} },
'{ '{0,1,0},'{1,1,0},'{1,0,0} },
'{ '{0,1,1},'{0,1,0},'{1,1,1} },
'{ '{1,0,1},'{1,0,0},'{1,0,0} },
'{ '{1,1,1},'{1,0,0},'{1,1,0} },
'{ '{0,0,1},'{1,0,1},'{0,0,1} },
'{ '{1,0,1},'{0,1,1},'{1,1,0} },
'{ '{1,1,1},'{1,0,1},'{1,1,1} },
'{ '{1,0,1},'{1,0,1},'{1,1,0} },
'{ '{1,1,1},'{1,0,0},'{1,0,0} },
'{ '{0,1,0},'{0,0,1},'{0,0,0} },
'{ '{0,1,0},'{0,0,0},'{0,1,1} },
'{ '{1,0,0},'{1,1,1},'{0,0,1} },
'{ '{1,1,0},'{0,1,1},'{0,0,0} },
'{ '{0,0,0},'{1,1,0},'{1,0,1} },
'{ '{0,0,0},'{0,0,1},'{0,1,0} },
'{ '{1,1,0},'{0,0,0},'{0,0,0} },
'{ '{1,0,0},'{1,0,1},'{0,0,0} },
'{ '{0,1,0},'{1,1,1},'{0,0,1} },
'{ '{0,0,0},'{1,1,1},'{1,0,1} },
'{ '{1,0,0},'{0,0,1},'{0,0,1} },
'{ '{0,0,1},'{1,1,0},'{0,1,0} },
'{ '{0,0,0},'{1,0,1},'{1,0,1} },
'{ '{1,1,0},'{0,0,0},'{1,0,1} },
'{ '{1,1,1},'{1,1,1},'{1,1,1} },
'{ '{1,1,0},'{0,0,1},'{0,0,0} },
'{ '{0,1,0},'{0,0,1},'{0,1,1} },
'{ '{1,1,1},'{1,1,0},'{1,0,0} } },
'{ '{ '{1,1,0},'{1,1,0},'{1,1,0} },
'{ '{0,1,0},'{1,0,1},'{1,0,1} },
'{ '{0,1,0},'{1,1,0},'{0,0,0} },
'{ '{0,1,0},'{1,1,0},'{0,1,0} },
'{ '{0,1,1},'{0,0,1},'{1,0,0} },
'{ '{1,1,1},'{0,0,0},'{0,1,0} },
'{ '{0,0,1},'{0,0,1},'{1,0,0} },
'{ '{1,1,1},'{1,0,1},'{1,1,1} },
'{ '{1,1,0},'{1,0,1},'{0,0,0} },
'{ '{1,0,1},'{0,1,0},'{0,0,1} },
'{ '{1,0,1},'{0,1,0},'{0,1,1} },
'{ '{1,0,1},'{0,0,0},'{1,0,0} },
'{ '{0,0,1},'{0,0,1},'{1,0,0} },
'{ '{1,1,1},'{1,1,1},'{0,1,0} },
'{ '{0,1,1},'{1,0,0},'{0,0,1} },
'{ '{0,1,1},'{0,1,1},'{0,1,0} },
'{ '{1,1,1},'{0,0,0},'{1,0,1} },
'{ '{0,1,0},'{0,0,1},'{0,0,1} },
'{ '{1,1,1},'{0,0,1},'{1,1,0} },
'{ '{0,1,0},'{0,0,0},'{1,1,0} },
'{ '{1,0,0},'{1,0,0},'{1,0,1} },
'{ '{0,0,0},'{0,1,0},'{0,1,0} },
'{ '{0,1,0},'{0,1,1},'{0,1,0} },
'{ '{1,1,0},'{0,1,0},'{0,0,0} },
'{ '{1,0,0},'{1,1,0},'{0,0,1} },
'{ '{1,1,0},'{0,0,1},'{1,1,1} },
'{ '{1,0,0},'{1,1,1},'{1,0,1} },
'{ '{1,0,1},'{0,0,1},'{1,0,1} },
'{ '{0,0,1},'{1,0,1},'{1,0,0} },
'{ '{1,0,0},'{0,0,0},'{1,1,1} },
'{ '{0,0,0},'{0,0,1},'{1,1,0} },
'{ '{1,1,1},'{1,1,1},'{0,0,0} } },
'{ '{ '{1,0,0},'{0,0,1},'{0,0,0} },
'{ '{1,1,0},'{0,1,0},'{1,0,0} },
'{ '{1,0,0},'{1,0,0},'{1,1,0} },
'{ '{0,1,1},'{1,1,1},'{0,1,0} },
'{ '{1,0,0},'{1,0,0},'{0,0,0} },
'{ '{0,1,1},'{0,1,1},'{1,1,1} },
'{ '{1,1,1},'{0,1,0},'{1,1,1} },
'{ '{1,0,1},'{0,1,1},'{0,1,1} },
'{ '{1,1,0},'{0,1,1},'{1,1,0} },
'{ '{1,0,1},'{0,1,0},'{0,1,0} },
'{ '{0,1,1},'{0,1,0},'{1,1,1} },
'{ '{1,0,0},'{1,1,1},'{1,0,0} },
'{ '{0,1,0},'{0,1,0},'{0,1,1} },
'{ '{1,0,0},'{1,0,1},'{1,1,1} },
'{ '{0,1,1},'{0,1,0},'{1,1,0} },
'{ '{1,1,1},'{1,1,1},'{1,1,1} },
'{ '{0,0,0},'{1,0,1},'{0,0,1} },
'{ '{1,1,1},'{1,0,1},'{0,1,0} },
'{ '{1,0,1},'{1,1,1},'{1,1,0} },
'{ '{1,0,0},'{0,0,0},'{1,1,1} },
'{ '{1,1,0},'{0,0,0},'{1,0,1} },
'{ '{0,1,1},'{1,1,1},'{1,0,1} },
'{ '{1,1,0},'{1,0,0},'{0,0,1} },
'{ '{0,0,1},'{0,0,1},'{1,1,1} },
'{ '{0,1,0},'{0,1,1},'{1,0,0} },
'{ '{1,1,1},'{1,1,0},'{1,1,0} },
'{ '{0,1,1},'{0,0,0},'{0,1,1} },
'{ '{0,0,0},'{1,0,0},'{0,1,1} },
'{ '{0,1,1},'{1,0,1},'{0,1,1} },
'{ '{0,0,1},'{1,0,0},'{1,0,0} },
'{ '{1,0,1},'{0,0,1},'{0,1,1} },
'{ '{0,0,1},'{1,1,0},'{1,0,0} } },
'{ '{ '{0,0,1},'{0,1,0},'{0,0,1} },
'{ '{0,1,0},'{0,0,0},'{1,0,1} },
'{ '{0,1,1},'{0,0,1},'{1,0,0} },
'{ '{0,0,0},'{1,1,0},'{0,0,0} },
'{ '{0,1,0},'{1,1,1},'{1,0,0} },
'{ '{0,1,0},'{0,1,0},'{0,0,1} },
'{ '{0,0,1},'{0,0,0},'{0,1,0} },
'{ '{0,0,0},'{1,1,1},'{1,1,1} },
'{ '{1,0,0},'{0,1,1},'{0,1,0} },
'{ '{1,1,0},'{0,1,0},'{0,1,0} },
'{ '{0,0,1},'{1,0,1},'{1,1,0} },
'{ '{1,0,1},'{1,1,0},'{0,1,1} },
'{ '{1,0,0},'{1,0,1},'{1,1,1} },
'{ '{0,1,1},'{1,1,1},'{1,0,0} },
'{ '{0,1,1},'{1,1,0},'{0,1,0} },
'{ '{0,1,1},'{0,0,0},'{0,0,1} },
'{ '{0,1,0},'{1,1,0},'{0,1,1} },
'{ '{1,0,1},'{0,1,1},'{1,0,0} },
'{ '{0,0,1},'{1,1,1},'{1,1,1} },
'{ '{0,1,0},'{1,1,0},'{0,0,0} },
'{ '{1,1,1},'{1,1,1},'{1,0,0} },
'{ '{0,0,1},'{1,1,0},'{0,1,0} },
'{ '{1,0,1},'{1,0,0},'{0,1,1} },
'{ '{1,1,0},'{1,0,1},'{0,1,0} },
'{ '{0,1,0},'{0,0,1},'{1,0,1} },
'{ '{1,1,1},'{1,1,1},'{0,0,0} },
'{ '{1,1,1},'{0,0,0},'{1,1,1} },
'{ '{1,0,0},'{1,1,1},'{1,1,0} },
'{ '{1,0,1},'{1,0,1},'{0,0,0} },
'{ '{1,0,0},'{0,0,0},'{1,1,0} },
'{ '{0,1,1},'{1,1,0},'{1,1,0} },
'{ '{1,0,0},'{1,0,1},'{1,1,1} } },
'{ '{ '{1,0,0},'{1,1,1},'{1,1,1} },
'{ '{0,1,0},'{1,0,0},'{0,0,1} },
'{ '{0,1,1},'{0,0,0},'{1,1,1} },
'{ '{1,1,1},'{0,0,0},'{0,1,1} },
'{ '{0,0,0},'{0,1,1},'{0,0,1} },
'{ '{1,1,0},'{1,1,1},'{1,1,1} },
'{ '{0,0,0},'{1,1,0},'{1,1,1} },
'{ '{1,1,1},'{0,0,1},'{1,0,0} },
'{ '{1,0,0},'{1,1,1},'{1,0,1} },
'{ '{0,1,0},'{1,0,0},'{1,1,0} },
'{ '{1,1,0},'{1,1,0},'{1,0,1} },
'{ '{0,1,0},'{0,0,1},'{0,1,0} },
'{ '{1,1,1},'{1,1,1},'{0,1,1} },
'{ '{0,0,1},'{1,1,0},'{0,1,1} },
'{ '{0,0,0},'{0,0,0},'{1,1,1} },
'{ '{1,1,1},'{1,1,1},'{0,0,0} },
'{ '{1,1,0},'{1,0,1},'{1,0,0} },
'{ '{0,0,1},'{1,1,0},'{0,1,1} },
'{ '{1,1,0},'{1,0,0},'{1,0,0} },
'{ '{1,0,1},'{1,0,1},'{1,1,0} },
'{ '{0,1,1},'{1,1,0},'{0,0,1} },
'{ '{0,1,1},'{1,0,0},'{0,1,0} },
'{ '{0,1,0},'{1,0,0},'{1,0,0} },
'{ '{0,1,0},'{0,0,1},'{0,0,1} },
'{ '{0,0,0},'{0,0,1},'{1,1,0} },
'{ '{1,1,0},'{1,0,1},'{0,0,0} },
'{ '{1,0,1},'{1,1,0},'{1,0,1} },
'{ '{1,0,0},'{1,0,0},'{0,1,0} },
'{ '{0,1,0},'{1,0,0},'{0,1,1} },
'{ '{1,0,1},'{0,0,1},'{1,1,1} },
'{ '{0,0,0},'{1,1,1},'{0,0,1} },
'{ '{0,1,1},'{1,1,0},'{1,0,0} } },
'{ '{ '{1,0,0},'{0,1,1},'{0,1,1} },
'{ '{0,0,0},'{1,0,0},'{1,0,1} },
'{ '{1,0,1},'{0,1,0},'{0,1,1} },
'{ '{0,1,1},'{0,1,0},'{0,0,1} },
'{ '{0,1,0},'{0,0,1},'{1,1,1} },
'{ '{1,0,1},'{0,1,0},'{0,0,0} },
'{ '{0,1,0},'{1,0,1},'{1,1,1} },
'{ '{0,1,0},'{1,1,0},'{0,0,0} },
'{ '{1,1,1},'{1,1,0},'{0,1,1} },
'{ '{0,1,1},'{1,1,0},'{0,1,1} },
'{ '{1,1,0},'{0,1,0},'{1,1,1} },
'{ '{1,1,1},'{0,1,1},'{0,0,1} },
'{ '{0,1,1},'{0,0,1},'{0,0,1} },
'{ '{1,0,0},'{0,0,0},'{1,1,1} },
'{ '{1,0,1},'{0,0,1},'{1,1,1} },
'{ '{0,0,1},'{0,1,1},'{0,0,1} },
'{ '{1,0,0},'{0,1,0},'{1,0,1} },
'{ '{0,1,1},'{0,1,0},'{0,0,0} },
'{ '{0,1,1},'{1,0,0},'{1,0,1} },
'{ '{0,0,0},'{1,0,0},'{0,0,1} },
'{ '{1,0,1},'{1,0,0},'{0,1,0} },
'{ '{1,1,1},'{1,1,0},'{1,1,1} },
'{ '{1,1,0},'{0,1,1},'{0,0,1} },
'{ '{0,1,1},'{1,1,1},'{1,0,1} },
'{ '{0,1,1},'{1,0,0},'{0,1,1} },
'{ '{0,1,0},'{1,0,1},'{1,0,0} },
'{ '{1,1,1},'{1,0,0},'{0,1,1} },
'{ '{1,1,1},'{1,0,0},'{1,0,1} },
'{ '{0,1,1},'{1,1,1},'{1,1,0} },
'{ '{0,0,1},'{0,0,1},'{0,0,1} },
'{ '{0,0,1},'{1,0,1},'{1,1,0} },
'{ '{0,1,0},'{1,1,1},'{0,1,0} } }
};
end
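//weights for the first fully connected layer - 128 rows x 32 columns (wa is declared further down as: logic wa [128][32];)
//each entry is one binary weight; a stored 0 is treated as a weight of -1
integer h, q;
//always_comb is evaluated once at time zero, so the constant table is also loaded in simulation
//(an always @(*) block whose right-hand side contains no signals is never triggered there)
always_comb begin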
wa =
'{
'{0,1,0,0,0,1,1,1,1,1,1,0,0,0,1,0,1,1,0,1,0,1,1,0,1,0,0,1,1,0,1,1},
'{0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,1,1,1,0,1,1},
'{0,0,1,1,1,0,0,1,0,1,1,0,0,1,0,1,0,1,0,1,1,0,1,1,0,1,0,1,1,0,1,1},
'{1,0,1,1,0,1,1,1,0,0,0,0,1,0,1,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,0},
'{0,0,0,1,0,1,1,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,1,1,0,1,0,1,0,0,1},
'{0,1,0,1,0,0,1,1,1,0,1,1,0,0,1,1,0,1,0,1,0,1,1,0,0,1,1,1,0,1,1,1},
'{1,1,1,0,1,0,0,1,1,0,1,1,1,1,1,1,1,0,1,0,1,0,1,0,1,0,0,0,0,0,0,0},
'{1,1,0,0,0,0,1,0,0,1,1,0,1,1,1,0,0,0,1,0,0,0,1,1,1,0,1,1,1,0,1,1},
'{0,1,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,1,0},
'{1,0,0,1,1,1,0,0,0,0,1,1,0,1,1,0,0,1,0,1,0,1,0,1,0,1,0,1,1,1,1,1},
'{0,1,1,0,0,1,0,1,1,0,0,0,0,1,0,1,0,1,0,1,1,1,1,0,0,1,1,1,0,1,0,0},
'{0,0,0,1,1,1,0,1,0,0,1,0,0,1,1,0,1,0,1,0,0,1,1,1,1,0,0,0,1,0,0,0},
'{0,1,0,0,0,0,0,1,1,1,1,0,1,1,0,1,0,0,0,0,1,0,1,0,0,0,1,1,0,0,0,0},
'{1,0,1,0,0,0,0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,0,1,1,0,1,0,0,1,1,1,0},
'{1,0,1,0,1,1,1,0,0,0,1,0,1,1,1,1,0,1,1,0,0,1,1,1,0,0,0,0,1,1,0,1},
'{1,0,1,1,1,1,0,1,1,0,0,1,0,1,0,0,1,0,0,0,0,0,1,1,0,1,0,1,1,1,0,0},
'{0,1,0,0,0,1,0,0,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,0,0,0,0},
'{0,0,0,0,1,1,0,0,1,0,1,1,1,0,1,0,1,0,0,1,1,1,1,0,1,1,1,0,1,1,1,1},
'{0,1,0,0,0,1,1,0,1,0,1,1,1,0,0,1,0,1,1,1,1,1,0,0,0,1,1,1,0,1,1,0},
'{0,0,0,1,1,1,1,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,1,1,1,1,0,0,1,1,0,0},
'{0,1,1,1,1,1,0,0,0,0,0,1,0,0,1,0,1,0,1,1,1,1,0,0,0,0,0,0,1,1,1,1},
'{0,1,0,0,0,0,1,1,0,1,0,1,0,1,0,0,1,0,0,1,1,0,0,1,1,0,0,0,0,1,0,1},
'{1,0,1,0,0,1,0,1,0,0,0,1,1,0,1,0,1,1,1,1,0,0,0,0,1,0,1,1,0,0,1,1},
'{1,1,0,0,1,1,1,0,1,0,0,1,0,1,1,0,1,1,0,0,0,1,1,0,1,1,0,0,1,1,1,1},
'{1,0,0,0,0,0,0,0,1,1,0,0,0,1,0,0,1,0,0,1,1,1,0,0,1,1,1,1,0,0,1,1},
'{0,1,1,0,1,1,0,0,1,1,1,1,1,1,1,1,0,0,0,1,0,1,0,0,0,1,0,1,0,1,0,1},
'{1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,0,1,1,0,1,1,1,0,1,1,1,0,0,1,0,1,1},
'{1,1,1,0,1,0,1,1,0,1,1,1,0,0,0,1,0,0,0,0,0,1,0,1,0,1,0,0,1,1,1,1},
'{1,1,1,0,1,1,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,1,0,1,0,0},
'{1,0,0,1,0,0,0,1,0,0,0,1,0,1,1,0,1,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1},
'{0,1,1,1,1,0,0,0,1,0,0,1,1,1,0,0,1,0,0,1,0,0,0,1,0,1,0,0,1,1,0,1},
'{0,0,1,0,0,1,1,0,1,1,0,0,1,1,1,1,0,1,0,0,0,0,0,1,0,1,0,1,1,1,0,1},
'{1,0,0,0,0,0,1,1,1,1,0,0,1,0,0,0,0,1,1,1,0,1,1,0,1,0,1,1,1,1,0,1},
'{0,0,0,1,0,0,1,0,1,0,0,0,0,1,0,1,0,0,1,1,1,0,1,1,0,0,1,0,0,0,1,1},
'{1,0,1,1,0,0,1,0,0,1,1,1,1,0,0,1,1,1,0,1,1,1,1,0,1,1,1,0,1,0,0,1},
'{1,0,0,0,0,1,0,0,0,1,0,1,1,1,1,0,0,0,1,1,0,0,0,0,1,0,1,0,1,0,0,1},
'{0,0,0,1,1,0,1,0,1,1,0,0,0,1,0,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,0,0},
'{0,1,1,0,1,0,0,0,0,1,1,0,1,0,0,1,1,1,1,0,0,1,1,1,0,0,1,0,0,1,0,1},
'{1,0,0,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,0,1,0,1,1,1,1},
'{0,0,0,1,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,1,1,0,0,1,1,0,1,1,0,1,1,0},
'{1,0,1,1,0,1,0,0,0,0,0,1,0,1,1,1,0,0,1,0,1,1,1,1,1,1,1,0,1,1,0,1},
'{1,0,0,1,0,1,0,0,0,1,1,1,0,1,1,1,1,0,0,0,0,1,1,0,1,1,1,0,1,1,0,1},
'{1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,1,1,1,0,0,1,1,1,0,0,0,1,1},
'{0,0,0,0,0,0,0,1,0,1,0,1,1,0,0,0,1,0,0,1,0,0,1,0,0,0,1,1,0,0,1,0},
'{0,0,0,1,0,0,0,1,1,0,1,1,1,0,0,0,1,0,0,0,1,1,0,1,0,1,1,0,1,1,0,0},
'{0,1,1,0,0,0,1,1,0,1,1,0,1,1,0,1,0,1,0,0,1,1,1,0,1,1,0,1,1,0,1,1},
'{1,0,0,1,0,1,0,0,1,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,1,1,0,1,1,0},
'{0,0,0,1,0,0,0,0,0,0,1,1,1,0,0,0,1,1,0,1,1,1,0,1,1,1,1,0,1,0,0,1},
'{0,0,0,0,0,0,0,1,0,0,1,1,1,1,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,0,1,0},
'{0,1,1,0,0,1,1,1,0,1,0,0,0,1,0,1,1,0,1,0,1,1,1,1,1,1,1,0,1,0,1,0},
'{0,1,1,0,1,1,1,0,1,1,0,0,1,0,0,0,1,0,0,0,1,0,1,1,1,1,1,0,1,0,1,1},
'{1,1,1,1,0,1,1,0,0,1,0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,1,0,1,0,0,0},
'{0,0,0,1,1,1,1,1,1,0,0,1,0,1,0,1,0,0,1,0,0,0,1,0,0,1,1,1,1,1,1,0},
'{1,1,0,1,0,1,1,1,0,0,0,0,0,1,0,1,1,0,0,1,0,1,0,1,1,0,1,1,0,0,1,1},
'{0,0,0,1,0,1,1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,1,0,1,0,1,0,0,0,1,1},
'{0,1,0,0,1,0,0,1,1,0,1,0,1,1,1,0,1,1,0,1,1,0,0,1,1,0,0,1,0,0,1,1},
'{1,0,1,0,1,0,1,0,0,0,1,0,0,0,1,0,0,1,0,1,1,0,1,0,0,0,0,1,1,1,1,1},
'{1,1,0,0,0,0,0,0,1,1,0,1,1,1,0,0,0,1,1,1,1,0,1,0,1,0,1,1,0,0,0,0},
'{1,0,1,0,1,0,1,1,1,0,1,1,1,0,0,0,1,0,0,1,1,0,0,0,0,0,1,0,1,1,0,1},
'{0,1,1,1,0,0,1,0,1,1,0,0,0,1,1,1,1,1,0,0,1,0,1,1,0,1,0,1,1,1,1,0},
'{1,1,1,1,1,1,1,1,0,1,1,0,1,0,0,1,0,1,1,1,1,1,0,0,1,0,1,0,0,1,1,0},
'{1,1,1,0,1,1,1,0,1,1,1,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,1,0,0,1,1},
'{0,1,1,1,1,0,1,0,1,1,1,0,1,0,1,1,1,1,0,0,1,1,1,0,0,0,1,0,1,0,1,1},
'{0,0,0,1,1,1,1,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,1,0,0,1,1,0,0,1,1,1},
'{0,1,0,0,0,0,1,1,0,1,0,0,1,1,0,1,0,0,1,1,0,1,0,1,1,1,1,0,1,1,1,1},
'{0,1,1,1,0,0,0,1,1,0,1,0,1,0,1,1,1,1,1,1,0,0,1,0,0,0,1,1,0,0,1,0},
'{1,1,1,0,0,0,1,0,1,0,1,0,1,1,1,0,0,1,1,1,0,1,1,1,1,1,1,1,1,0,1,1},
'{1,0,0,1,1,1,0,1,0,0,0,1,1,1,1,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,0},
'{1,0,1,0,0,1,1,0,0,1,0,1,0,1,0,0,1,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1},
'{0,1,0,1,0,1,1,0,1,0,0,1,1,0,0,0,1,0,0,1,1,0,1,1,0,1,0,0,1,1,1,0},
'{0,1,1,1,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,1,0,0,1,1,0,1,1},
'{1,1,1,1,1,1,0,1,0,1,1,0,0,1,0,1,1,0,1,0,0,0,1,0,0,0,0,1,0,1,1,0},
'{0,0,1,1,0,0,0,0,1,0,1,0,1,0,1,1,0,1,1,0,1,0,0,1,0,1,0,1,0,1,0,1},
'{1,0,1,1,1,0,0,1,0,1,1,0,0,0,1,1,0,1,1,1,1,0,1,0,1,0,0,1,0,1,1,0},
'{1,0,1,0,1,0,0,1,0,0,0,1,1,1,1,1,0,0,1,1,1,1,0,1,0,1,0,1,1,0,1,1},
'{0,0,1,0,1,0,0,1,1,0,0,1,0,0,0,1,0,1,0,0,1,1,1,0,0,1,1,1,1,0,0,0},
'{1,0,1,1,1,0,0,1,0,1,1,1,1,0,1,0,0,1,0,1,1,1,1,0,1,1,1,1,1,0,0,1},
'{0,0,1,0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,1,0,1,1,1,1,1,0,1,0,0,1},
'{0,1,1,0,0,1,0,0,1,0,0,1,0,0,1,1,1,0,0,0,1,0,1,1,0,0,0,0,1,1,1,0},
'{0,0,0,0,1,1,1,0,1,1,0,1,1,1,1,1,1,1,1,0,1,0,1,1,0,1,1,1,0,1,1,0},
'{1,1,0,1,1,0,1,1,0,1,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,1,1},
'{1,1,0,0,0,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,0,0,0,1,0,1,0,1,0,0},
'{1,1,0,0,1,1,0,0,0,1,1,0,0,1,0,0,1,1,0,0,1,0,1,1,1,0,1,1,0,1,1,0},
'{0,0,1,1,0,0,0,1,1,0,0,1,0,1,0,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1,0,1},
'{0,1,1,1,1,1,0,1,1,1,0,0,0,0,0,1,1,1,0,0,0,1,0,1,1,0,1,0,0,0,1,0},
'{0,0,1,0,0,1,0,1,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0},
'{1,1,1,0,1,0,0,1,0,1,1,0,1,1,1,0,0,0,0,1,0,0,1,0,0,1,1,1,1,1,1,1},
'{1,1,1,1,1,1,0,0,1,0,0,1,1,0,0,0,1,0,1,1,0,0,0,1,1,1,0,1,1,0,0,1},
'{1,0,0,1,1,0,0,0,0,1,1,0,0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,1,0,0,0,1},
'{0,1,0,1,0,1,0,0,1,0,0,1,0,1,1,1,1,1,1,1,1,1,1,0,0,1,0,0,1,0,0,1},
'{0,1,1,0,1,0,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,0,1,0,1,1,1,0,0,1,0},
'{0,0,1,0,1,0,1,0,0,0,0,1,1,0,0,1,1,0,0,1,0,0,0,0,0,1,0,1,0,0,0,1},
'{1,1,1,0,1,1,1,1,0,1,1,1,1,1,1,0,1,1,0,0,1,0,1,0,1,0,0,1,0,0,1,0},
'{0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,1,1,0,0,1,1,1,1,0,1,0,1,0,0,0,1},
'{1,1,0,0,0,0,0,1,0,1,0,0,0,1,0,1,0,1,0,0,0,0,1,1,1,0,0,1,1,0,1,0},
'{1,1,1,1,0,0,1,1,0,1,1,0,0,1,0,1,0,1,0,1,0,1,1,0,1,1,1,1,0,0,0,1},
'{1,1,1,1,0,1,0,0,1,1,0,1,1,1,1,1,0,0,1,1,0,0,0,1,0,1,1,1,0,1,1,1},
'{0,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,1,0,0,1,1,1,0,0,1,0,0,0,0},
'{1,1,0,0,1,1,1,1,1,0,1,0,1,1,1,1,1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0},
'{1,0,0,0,1,1,1,0,1,0,0,1,1,1,1,1,1,0,1,1,0,1,1,1,0,1,0,0,0,0,1,1},
'{1,1,1,1,0,1,0,0,1,1,0,1,0,1,0,1,1,0,0,1,1,0,1,1,1,0,1,1,0,0,0,1},
'{0,0,1,1,1,1,0,0,1,0,1,1,0,1,1,1,0,1,1,1,1,1,0,0,1,1,1,0,1,0,1,0},
'{0,0,0,0,1,1,1,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,0,1,1,0,1,0,1,0,0,0},
'{0,0,1,1,1,0,1,1,1,1,0,1,1,0,1,1,0,1,0,0,1,1,1,1,1,0,0,1,1,0,0,0},
'{1,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,1,0,0,0,1,0,0,0,1,1,1,1,0,1,0},
'{0,0,1,0,1,1,0,1,1,1,1,1,1,0,0,1,0,0,0,1,0,1,0,1,1,1,1,1,0,0,1,1},
'{0,1,0,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,1,0,1,1,1,1,0,1,1,1,0,1,1},
'{1,1,0,1,0,0,0,1,0,1,0,1,0,0,1,1,1,1,0,1,0,1,0,0,0,0,0,1,0,1,0,1},
'{0,1,0,0,1,0,1,1,0,1,0,1,1,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1,1,1,0},
'{1,0,1,1,0,0,1,1,0,1,1,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0},
'{0,0,0,0,0,1,1,1,1,0,1,0,0,0,1,1,1,0,1,0,1,1,1,0,1,1,1,0,0,0,0,0},
'{0,1,1,1,1,1,1,1,0,0,1,1,1,0,0,1,0,1,0,1,1,0,0,0,1,1,0,0,1,1,1,1},
'{1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,0,1,0,0,1,1,0,1,0,0,0,0,1,0,1},
'{0,0,1,0,0,1,1,1,1,1,0,1,1,0,0,1,1,0,0,1,0,0,1,1,0,1,1,0,1,0,0,1},
'{1,1,1,1,1,0,1,0,1,0,1,0,1,1,1,1,0,1,1,1,1,1,1,0,0,1,1,1,0,1,1,0},
'{1,1,1,0,0,1,0,1,0,0,1,1,1,0,0,0,0,0,0,1,0,0,0,1,1,1,1,0,1,0,0,0},
'{0,1,1,1,0,0,0,0,0,1,1,1,0,0,1,1,0,0,0,1,1,1,0,1,1,0,0,0,0,0,1,1},
'{1,1,0,1,1,0,1,1,0,1,0,1,1,0,0,1,0,0,1,1,1,0,0,1,0,1,0,1,0,1,0,1},
'{1,1,0,0,1,1,0,1,1,0,1,1,0,0,1,0,0,0,1,0,0,1,1,0,1,1,1,0,1,0,1,1},
'{0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,1,1,0,1,0,1,1,0,1,0,0,0,1},
'{1,0,0,1,1,0,0,0,1,1,1,1,0,0,1,1,0,1,1,1,0,1,0,1,0,0,1,1,0,0,1,1},
'{1,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,0,1,0,1,0,1,0,1,0,1,0,0,0,1,1,1},
'{1,0,1,0,0,1,1,0,0,0,1,0,1,1,0,1,0,1,0,1,1,0,0,0,1,1,1,1,0,1,0,1},
'{1,0,0,1,1,0,0,0,1,0,1,0,1,1,1,0,1,1,0,1,1,0,0,0,0,1,0,0,1,1,0,0},
'{0,0,1,0,0,1,1,1,0,1,1,1,1,0,0,1,1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1},
'{1,0,0,0,1,1,0,0,1,0,0,1,1,1,1,1,1,1,0,0,1,0,1,1,1,1,0,0,1,0,0,1},
'{1,1,1,0,1,1,1,1,1,0,1,1,0,0,1,0,1,1,0,0,1,0,1,1,1,1,0,0,0,1,1,0},
'{1,0,1,1,1,0,1,0,1,1,1,0,0,1,1,1,0,1,1,1,1,1,1,1,0,1,0,1,0,1,1,1}
};
end
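//weights for the final mapping layer - 32 rows x 10 columns, one column per digit 0-9 (wa1 is declared further down as: logic wa1 [32][10];)
integer r, k;
//always_comb rather than always @(*) so that the constant table is also evaluated at time zero in simulation
always_comb begin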
wa1 =
'{
'{1,0,1,1,0,1,1,0,1,1},
'{1,0,0,0,1,1,1,0,0,0},
'{1,0,1,1,1,1,0,0,1,0},
'{0,0,0,1,1,0,1,0,0,1},
'{1,0,0,0,0,1,1,1,0,1},
'{0,1,0,1,0,1,0,1,1,0},
'{1,1,0,0,1,0,1,1,0,1},
'{0,1,0,0,1,0,0,1,1,1},
'{1,1,1,0,1,0,1,0,1,1},
'{0,0,0,0,1,0,1,0,1,1},
'{1,0,1,1,0,1,1,1,1,0},
'{0,1,0,0,1,1,1,1,1,1},
'{1,1,1,1,0,0,0,0,1,0},
'{0,1,0,1,0,0,1,1,1,0},
'{1,0,0,1,0,1,0,0,1,1},
'{0,0,1,1,1,1,0,1,0,1},
'{0,1,0,0,0,1,0,1,1,1},
'{1,0,1,1,1,0,1,0,1,1},
'{1,0,0,0,0,1,1,1,0,0},
'{0,1,1,0,1,1,0,1,1,1},
'{1,1,0,1,0,0,0,0,0,1},
'{0,1,0,1,0,0,1,1,1,1},
'{1,1,0,0,0,1,1,0,1,1},
'{0,1,1,1,0,0,1,0,0,0},
'{0,1,0,0,1,0,1,1,0,1},
'{1,1,0,1,1,1,0,1,1,0},
'{0,1,1,0,1,0,1,1,0,0},
'{1,0,1,0,1,1,1,0,1,1},
'{0,1,1,1,0,1,0,1,0,1},
'{1,0,0,1,1,0,0,0,0,0},
'{1,0,1,0,0,1,0,1,1,1},
'{1,0,1,0,1,1,0,0,1,0}
};
end
//===========================================================================================================
// 1. first convolutional layer
//===========================================================================================================
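logic signed [1:0] input_image [8][8]; //8x8 input image (7x7 image padded on the bottom and right), 2-bit signed pixels
logic filter [16][3][3]; //16 binary 3x3 filters (a stored 0 is treated as -1)
logic signed [1:0] out_map [16][8][8]; //16 binarized 8x8 output fmaps (+1 = 2'b01, -1 = 2'b11)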
genvar m;
generate
for (m=0; m<16; m++ ) begin: conv1
conv_1 one (.fmap(input_image), .filter(filter[m]), .partial_sums(out_map[m]), .clk_50(CLOCK_50)); //, .start(start_conv1), .finish(finish_conv1[m])
end
endgenerate
//pooling to produce 4x4 from 8x8
logic signed [1:0] pool_conv1 [16][4][4];
pool1 first (.pool_conv1(pool_conv1), .out_map(out_map), .clk_50(CLOCK_50)); //, .start(finish_conv1[1]), .finish(finish_pool1)
//===========================================================================================================
// 2. second convolutional layer
//===========================================================================================================
//4D array containing all the 16*32 3x3 filters used in the convolution
logic filters_conv2 [16][32][3][3]; //Input filter
//4D array which contains the 16*32 4x4 partial sums generated by convolving 16 input fmaps with 32 filters each
logic signed [4:0] partials_conv2 [16][32][4][4]; //Output partial from conv2
logic finish_ps;
logic finish_pool2;
genvar n, o;
generate
//convolve each of the 16 input fmaps with its own set of 32 filters, giving 16 partial sums for each of the 32 output fmaps
for (n=0; n<16; n++) begin: conv2
for (o=0; o<32; o++) begin: conv2_inner
conv_2 second (.fmap(pool_conv1[n]), .filter(filters_conv2[n][o]), .partial_sums(partials_conv2[n][o]), .clk_50(CLOCK_50)
);
end
end
endgenerate
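//3D array containing the 32 4x4 output fmaps generated at this layer
//(declared before the instance that drives it, so the port connection cannot create an implicit 1-bit net)
logic signed [1:0] outmap_conv2d [32][4][4];
//calculate partial sums: combine partials_conv2 [16][32][4][4] into the 32 output fmaps
partial_sums conv_layer2 (.outmap_conv2d(outmap_conv2d), .partials_conv2(partials_conv2), .clk_50 (CLOCK_50), .start(pio_start), .finish(finish_ps)); //KEY[3]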
//3D array containing the 32 2x2 output fmaps generated at this layer
logic signed [1:0] pool_conv2 [32][2][2];
//pooling to produce 2x2 from 4x4
pool2 second (.pool_conv2(pool_conv2), .outmap_conv2d(outmap_conv2d), .clk_50(CLOCK_50), .start(finish_ps), .finish(finish_pool2)); //finish_ps
//===========================================================================================================
// fully connected layer
//===========================================================================================================
logic finish_fc1;
logic start_fc2;
//128x32 array of binary weights
logic wa [128][32];
//1x32 array output from this layer
logic signed [1:0] fc_out [32];
logic signed [8:0] temp [32];
//feed in output from last pooling layer
fc1 full_1 (.fmap(pool_conv2), .wa(wa), .clk_50(CLOCK_50), .start(finish_pool2), .finish(finish_fc1), .fc_out(fc_out));
//===========================================================================================================
// final map layer
//===========================================================================================================
//32x10 binary weight array
logic wa1 [32][10];
logic finish_fc2;
//1x10 8 bit output from layer
logic signed [7:0] final_out [10];
//feed in output from fully connected layer
ten_map last (.fmap(fc_out), .wa1(wa1), .final_out(final_out), .clk_50(CLOCK_50), .start(finish_fc1), .finish(finish_fc2));
assign pio_end = finish_fc2;
//=======================================================
// Structural coding
//=======================================================
Computer_System The_System (
// FPGA Side
// PIO ports
//.pio_fpga_data_external_connection_export (pio_fpga_data),
.pio_hps_image_data_external_connection_export (pio_hps_image_data),
.pio_hps_image_clk_external_connection_export (pio_hps_image_clk),
.pio_hps_image_cs_external_connection_export (pio_hps_image_cs),
.pio_out_data_external_connection_export (pio_out_data),
.pio_out_cs_external_connection_export (pio_out_cs),
.pio_start_external_connection_export (pio_start),
.pio_end_external_connection_export (pio_end),
.pio_switch_external_connection_export (pio_switch),
//.pio_x_external_connection_export(pio_x),
//.pio_y_external_connection_export(pio_y),
// Global signals
.system_pll_ref_clk_clk (CLOCK_50),
.system_pll_ref_reset_reset (1'b0),
// SRAM shared block with HPS
.onchip_sram_0_s1_address (sram_address),
.onchip_sram_0_s1_clken (sram_clken),
.onchip_sram_0_s1_chipselect (sram_chipselect),
.onchip_sram_0_s1_write (sram_write),
.onchip_sram_0_s1_readdata (sram_readdata),
.onchip_sram_0_s1_writedata (sram_writedata),
// AV Config
.av_config_SCLK (FPGA_I2C_SCLK),
.av_config_SDAT (FPGA_I2C_SDAT),
// Audio Subsystem
// .audio_pll_ref_clk_clk (CLOCK3_50),
// .audio_pll_ref_reset_reset (1'b0),
// .audio_clk_clk (AUD_XCK),
// .audio_ADCDAT (AUD_ADCDAT),
// .audio_ADCLRCK (AUD_ADCLRCK),
// .audio_BCLK (AUD_BCLK),
// .audio_DACDAT (AUD_DACDAT),
// .audio_DACLRCK (AUD_DACLRCK),
// Slider Switches
//.slider_switches_export (SW),
// Pushbuttons (~KEY[3:0]),
//.pushbuttons_export (~KEY[3:0]),
// Expansion JP1
//.expansion_jp1_export ({GPIO_0[35:19], GPIO_0[17], GPIO_0[15:3], GPIO_0[1]}),
// Expansion JP2
//.expansion_jp2_export ({GPIO_1[35:19], GPIO_1[17], GPIO_1[15:3], GPIO_1[1]}),
// LEDs
//.leds_export (LEDR),
// Seven Segs
//.hex3_hex0_export (hex3_hex0),
//.hex5_hex4_export (hex5_hex4),
// PS2 Ports
//.ps2_port_CLK (PS2_CLK),
//.ps2_port_DAT (PS2_DAT),
//.ps2_port_dual_CLK (PS2_CLK2),
//.ps2_port_dual_DAT (PS2_DAT2),
// IrDA
//.irda_RXD (IRDA_RXD),
//.irda_TXD (IRDA_TXD),
// VGA Subsystem
.vga_pll_ref_clk_clk (CLOCK2_50),
.vga_pll_ref_reset_reset (1'b0),
.vga_CLK (VGA_CLK),
.vga_BLANK (VGA_BLANK_N),
.vga_SYNC (VGA_SYNC_N),
.vga_HS (VGA_HS),
.vga_VS (VGA_VS),
.vga_R (VGA_R),
.vga_G (VGA_G),
.vga_B (VGA_B),
// Video In Subsystem
.video_in_TD_CLK27 (TD_CLK27),
.video_in_TD_DATA (TD_DATA),
.video_in_TD_HS (TD_HS),
.video_in_TD_VS (TD_VS),
.video_in_clk27_reset (),
.video_in_TD_RESET (),
.video_in_overflow_flag (),
.ebab_video_in_external_interface_address (bus_addr), //
.ebab_video_in_external_interface_byte_enable (bus_byte_enable), // .byte_enable
.ebab_video_in_external_interface_read (bus_read), // .read
.ebab_video_in_external_interface_write (bus_write), // .write
.ebab_video_in_external_interface_write_data (bus_write_data), //.write_data
.ebab_video_in_external_interface_acknowledge (bus_ack), // .acknowledge
.ebab_video_in_external_interface_read_data (bus_read_data),
// clock bridge for EBAb_video_in_external_interface_acknowledge
.clock_bridge_0_in_clk_clk (CLOCK_50),
// SDRAM
.sdram_clk_clk (DRAM_CLK),
.sdram_addr (DRAM_ADDR),
.sdram_ba (DRAM_BA),
.sdram_cas_n (DRAM_CAS_N),
.sdram_cke (DRAM_CKE),
.sdram_cs_n (DRAM_CS_N),
.sdram_dq (DRAM_DQ),
.sdram_dqm ({DRAM_UDQM,DRAM_LDQM}),
.sdram_ras_n (DRAM_RAS_N),
.sdram_we_n (DRAM_WE_N),
// HPS Side
// DDR3 SDRAM
.memory_mem_a (HPS_DDR3_ADDR),
.memory_mem_ba (HPS_DDR3_BA),
.memory_mem_ck (HPS_DDR3_CK_P),
.memory_mem_ck_n (HPS_DDR3_CK_N),
.memory_mem_cke (HPS_DDR3_CKE),
.memory_mem_cs_n (HPS_DDR3_CS_N),
.memory_mem_ras_n (HPS_DDR3_RAS_N),
.memory_mem_cas_n (HPS_DDR3_CAS_N),
.memory_mem_we_n (HPS_DDR3_WE_N),
.memory_mem_reset_n (HPS_DDR3_RESET_N),
.memory_mem_dq (HPS_DDR3_DQ),
.memory_mem_dqs (HPS_DDR3_DQS_P),
.memory_mem_dqs_n (HPS_DDR3_DQS_N),
.memory_mem_odt (HPS_DDR3_ODT),
.memory_mem_dm (HPS_DDR3_DM),
.memory_oct_rzqin (HPS_DDR3_RZQ),
// Ethernet
.hps_io_hps_io_gpio_inst_GPIO35 (HPS_ENET_INT_N),
.hps_io_hps_io_emac1_inst_TX_CLK (HPS_ENET_GTX_CLK),
.hps_io_hps_io_emac1_inst_TXD0 (HPS_ENET_TX_DATA[0]),
.hps_io_hps_io_emac1_inst_TXD1 (HPS_ENET_TX_DATA[1]),
.hps_io_hps_io_emac1_inst_TXD2 (HPS_ENET_TX_DATA[2]),
.hps_io_hps_io_emac1_inst_TXD3 (HPS_ENET_TX_DATA[3]),
.hps_io_hps_io_emac1_inst_RXD0 (HPS_ENET_RX_DATA[0]),
.hps_io_hps_io_emac1_inst_MDIO (HPS_ENET_MDIO),
.hps_io_hps_io_emac1_inst_MDC (HPS_ENET_MDC),
.hps_io_hps_io_emac1_inst_RX_CTL (HPS_ENET_RX_DV),
.hps_io_hps_io_emac1_inst_TX_CTL (HPS_ENET_TX_EN),
.hps_io_hps_io_emac1_inst_RX_CLK (HPS_ENET_RX_CLK),
.hps_io_hps_io_emac1_inst_RXD1 (HPS_ENET_RX_DATA[1]),
.hps_io_hps_io_emac1_inst_RXD2 (HPS_ENET_RX_DATA[2]),
.hps_io_hps_io_emac1_inst_RXD3 (HPS_ENET_RX_DATA[3]),
// Flash
.hps_io_hps_io_qspi_inst_IO0 (HPS_FLASH_DATA[0]),
.hps_io_hps_io_qspi_inst_IO1 (HPS_FLASH_DATA[1]),
.hps_io_hps_io_qspi_inst_IO2 (HPS_FLASH_DATA[2]),
.hps_io_hps_io_qspi_inst_IO3 (HPS_FLASH_DATA[3]),
.hps_io_hps_io_qspi_inst_SS0 (HPS_FLASH_NCSO),
.hps_io_hps_io_qspi_inst_CLK (HPS_FLASH_DCLK),
// Accelerometer
.hps_io_hps_io_gpio_inst_GPIO61 (HPS_GSENSOR_INT),
//.adc_sclk (ADC_SCLK),
//.adc_cs_n (ADC_CS_N),
//.adc_dout (ADC_DOUT),
//.adc_din (ADC_DIN),
// General Purpose I/O
.hps_io_hps_io_gpio_inst_GPIO40 (HPS_GPIO[0]),
.hps_io_hps_io_gpio_inst_GPIO41 (HPS_GPIO[1]),
// I2C
.hps_io_hps_io_gpio_inst_GPIO48 (HPS_I2C_CONTROL),
.hps_io_hps_io_i2c0_inst_SDA (HPS_I2C1_SDAT),
.hps_io_hps_io_i2c0_inst_SCL (HPS_I2C1_SCLK),
.hps_io_hps_io_i2c1_inst_SDA (HPS_I2C2_SDAT),
.hps_io_hps_io_i2c1_inst_SCL (HPS_I2C2_SCLK),
// Pushbutton
.hps_io_hps_io_gpio_inst_GPIO54 (HPS_KEY),
// LED
.hps_io_hps_io_gpio_inst_GPIO53 (HPS_LED),
// SD Card
.hps_io_hps_io_sdio_inst_CMD (HPS_SD_CMD),
.hps_io_hps_io_sdio_inst_D0 (HPS_SD_DATA[0]),
.hps_io_hps_io_sdio_inst_D1 (HPS_SD_DATA[1]),
.hps_io_hps_io_sdio_inst_CLK (HPS_SD_CLK),
.hps_io_hps_io_sdio_inst_D2 (HPS_SD_DATA[2]),
.hps_io_hps_io_sdio_inst_D3 (HPS_SD_DATA[3]),
// SPI
.hps_io_hps_io_spim1_inst_CLK (HPS_SPIM_CLK),
.hps_io_hps_io_spim1_inst_MOSI (HPS_SPIM_MOSI),
.hps_io_hps_io_spim1_inst_MISO (HPS_SPIM_MISO),
.hps_io_hps_io_spim1_inst_SS0 (HPS_SPIM_SS),
// UART
.hps_io_hps_io_uart0_inst_RX (HPS_UART_RX),
.hps_io_hps_io_uart0_inst_TX (HPS_UART_TX),
// USB
.hps_io_hps_io_gpio_inst_GPIO09 (HPS_CONV_USB_N),
.hps_io_hps_io_usb1_inst_D0 (HPS_USB_DATA[0]),
.hps_io_hps_io_usb1_inst_D1 (HPS_USB_DATA[1]),
.hps_io_hps_io_usb1_inst_D2 (HPS_USB_DATA[2]),
.hps_io_hps_io_usb1_inst_D3 (HPS_USB_DATA[3]),
.hps_io_hps_io_usb1_inst_D4 (HPS_USB_DATA[4]),
.hps_io_hps_io_usb1_inst_D5 (HPS_USB_DATA[5]),
.hps_io_hps_io_usb1_inst_D6 (HPS_USB_DATA[6]),
.hps_io_hps_io_usb1_inst_D7 (HPS_USB_DATA[7]),
.hps_io_hps_io_usb1_inst_CLK (HPS_USB_CLKOUT),
.hps_io_hps_io_usb1_inst_STP (HPS_USB_STP),
.hps_io_hps_io_usb1_inst_DIR (HPS_USB_DIR),
.hps_io_hps_io_usb1_inst_NXT (HPS_USB_NXT)
);
endmodule
//===========================================================================================================
// helper modules
//===========================================================================================================
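// conv_1: convolves one 8x8 input image with one binary 3x3 filter and binarizes the result.
// The logic is purely combinational; clk_50 is kept as a port but is not used inside this module.
module conv_1 (fmap, filter, partial_sums, clk_50); //, start, finish
input clk_50;
input signed [1:0] fmap[8][8]; //Input Image - 2 bit 8x8
input filter [3][3]; //Input Filter - 1 bit 3x3 (1 means +1, 0 means -1)
logic signed [1:0] fmap_padded[10][10]; // Padded input image to be 10x10
logic signed [4:0] temp_sum[8][8]; //signed sum of the 9 products for each output pixel
output logic signed [1:0] partial_sums[8][8]; //binarized output map (+1 = 2'b01, -1 = 2'b11)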
//pad to 10 by 10 with a border of zeros to maintain size after convolving
always @(*) begin
for (int i = 0; i<10; i++) begin //row
for (int j =0; j<10; j++) begin //column
if ((i==0) || (i==9) || (j==0) || (j==9)) fmap_padded[i][j] = 2'b00; //border pixel: 2'b00 is 0 (a -1 would be 2'b11)
else fmap_padded[i][j] = fmap[i-1][j-1]; //interior: copy the input pixel
end
end
end
always @(*) begin
for (int k = 1; k<9; k++) begin //row
for (int l=1; l<9; l++) begin //column
//get 3 by 3 surrounded by pixel at fmap_padded[k][l], multiply by filter and add
temp_sum[k-1][l-1] =
((filter[0][0] ? fmap_padded[k-1][l-1] : -fmap_padded[k-1][l-1]) //top-left
+ (filter[0][1] ? fmap_padded[k-1][l] : -fmap_padded[k-1][l])) //top-middle
+ ((filter[0][2] ? fmap_padded[k-1][l+1] : -fmap_padded[k-1][l+1]) //top-right
+ (filter[1][0] ? fmap_padded[k][l-1] : -fmap_padded[k][l-1])) //middle-left
+ ((filter[1][1] ? fmap_padded[k][l] : -fmap_padded[k][l]) //middle-middle
+ (filter[1][2] ? fmap_padded[k][l+1] : -fmap_padded[k][l+1])) //middle-right
+ ((filter[2][0] ? fmap_padded[k+1][l-1] : -fmap_padded[k+1][l-1]) //bottom-left
+ (filter[2][1] ? fmap_padded[k+1][l] : -fmap_padded[k+1][l])) //bottom-middle
+ (filter[2][2] ? fmap_padded[k+1][l+1] : -fmap_padded[k+1][l+1]); //bottom-right
//store temp sum in partial sum matrix
if (temp_sum[k-1][l-1] == 0) partial_sums[k-1][l-1] = 2'b11;
else partial_sums[k-1][l-1] = (temp_sum[k-1][l-1]>>>4) ? 2'b11 : 2'b01; //load in 1 or -1
end
end
end
endmodule
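For checking conv_1 against software, a C reference model can help. The following is a minimal sketch, not the authors' code: the function name `conv1_ref` and the use of `int8_t` values in place of the 2-bit signed signals are assumptions. It mirrors the zero border, the "filter bit 1 means +1, filter bit 0 means -1" rule, and the binarization in which a sum of 0 maps to -1.

```c
#include <stdint.h>

/* Software reference for one conv_1 filter (hypothetical helper):
 * 8x8 input, 3x3 one-bit filter, zero border, output binarized to +1/-1. */
void conv1_ref(int8_t fmap[8][8], uint8_t filter[3][3], int8_t out[8][8])
{
    for (int r = 0; r < 8; r++) {
        for (int c = 0; c < 8; c++) {
            int sum = 0;
            for (int fr = 0; fr < 3; fr++) {
                for (int fc = 0; fc < 3; fc++) {
                    int rr = r + fr - 1;
                    int cc = c + fc - 1;
                    /* pixels outside the image contribute 0 (zero padding) */
                    if (rr < 0 || rr > 7 || cc < 0 || cc > 7) continue;
                    /* filter bit 1 acts as weight +1, bit 0 as weight -1 */
                    sum += filter[fr][fc] ? fmap[rr][cc] : -fmap[rr][cc];
                }
            }
            /* a sum of exactly 0 is binarized to -1, as in the Verilog */
            out[r][c] = (sum > 0) ? 1 : -1;
        }
    }
}
```

Calling this once per filter and comparing the result with the simulated `partial_sums` output is one way to spot indexing mistakes.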
module pool1 (pool_conv1, out_map, clk_50); //, start, finish
input clk_50;
input signed [1:0] out_map [16][8][8];
output logic signed [1:0] pool_conv1 [16][4][4];
integer h;
//max pooling - check each 2x2 window for any ones -- if yes, max = 1, if no, max = -1, since out_map is binarized to 1/-1
always @(*) begin //posedge clk_50
//if (start) begin
for (h=0; h<16; h++) begin
pool_conv1 [h][0][0] <= ((out_map[h][0][0]&out_map[h][0][1]&out_map[h][1][0]&out_map[h][1][1])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][0][1] <= ((out_map[h][0][2]&out_map[h][0][3]&out_map[h][1][2]&out_map[h][1][3])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][1][0] <= ((out_map[h][2][0]&out_map[h][2][1]&out_map[h][3][0]&out_map[h][3][1])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][1][1] <= ((out_map[h][2][2]&out_map[h][2][3]&out_map[h][3][2]&out_map[h][3][3])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][0][2] <= ((out_map[h][0][4]&out_map[h][0][5]&out_map[h][1][4]&out_map[h][1][5])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][0][3] <= ((out_map[h][0][6]&out_map[h][0][7]&out_map[h][1][6]&out_map[h][1][7])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][1][2] <= ((out_map[h][2][4]&out_map[h][2][5]&out_map[h][3][4]&out_map[h][3][5])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][1][3] <= ((out_map[h][2][6]&out_map[h][2][7]&out_map[h][3][6]&out_map[h][3][7])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][2][0] <= ((out_map[h][4][0]&out_map[h][4][1]&out_map[h][5][0]&out_map[h][5][1])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][2][1] <= ((out_map[h][4][2]&out_map[h][4][3]&out_map[h][5][2]&out_map[h][5][3])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][3][0] <= ((out_map[h][6][0]&out_map[h][6][1]&out_map[h][7][0]&out_map[h][7][1])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][3][1] <= ((out_map[h][6][2]&out_map[h][6][3]&out_map[h][7][2]&out_map[h][7][3])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][2][2] <= ((out_map[h][4][4]&out_map[h][4][5]&out_map[h][5][4]&out_map[h][5][5])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][2][3] <= ((out_map[h][4][6]&out_map[h][4][7]&out_map[h][5][6]&out_map[h][5][7])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][3][2] <= ((out_map[h][6][4]&out_map[h][6][5]&out_map[h][7][4]&out_map[h][7][5])==2'b01) ? 2'b01 : 2'b11;
pool_conv1 [h][3][3] <= ((out_map[h][6][6]&out_map[h][6][7]&out_map[h][7][6]&out_map[h][7][7])==2'b01) ? 2'b01 : 2'b11;
end
//finish <= 1;
//end
end
endmodule
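The pooling modules rely on a bit trick: with +1 encoded as 2'b01 and -1 as 2'b11, the bitwise AND of four values equals 2'b01 exactly when at least one of them is +1. The C sketch below illustrates this under the assumption that inputs only ever take those two codes; the names `POS1`, `NEG1`, `max4_pm1` and `pool2x2_ref` are hypothetical.

```c
#include <stdint.h>

/* 2-bit two's-complement codes used by the hardware: +1 = 0b01, -1 = 0b11 */
enum { POS1 = 0x1, NEG1 = 0x3 };

/* Max of four +/-1 codes: the low bit of the AND is always 1, and the high
 * bit is cleared as soon as one input is +1, so the AND is 0b01 in that case. */
uint8_t max4_pm1(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((a & b & c & d) == POS1) ? POS1 : NEG1;
}

/* 2x2 max pooling with stride 2 over one 8x8 map, giving a 4x4 map */
void pool2x2_ref(uint8_t in[8][8], uint8_t out[4][4])
{
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            out[r][c] = max4_pm1(in[2 * r][2 * c],     in[2 * r][2 * c + 1],
                                 in[2 * r + 1][2 * c], in[2 * r + 1][2 * c + 1]);
        }
    }
}
```

The loop form computes the same sixteen windows that pool1 writes out one by one.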
module conv_2 (fmap, filter, partial_sums, clk_50); //, start, finish
input clk_50;
input signed [1:0] fmap [4][4]; //Input Image - 2 bit 4x4
input filter [3][3]; //Input Filter - 1 bit 3x3
logic signed [1:0] fmap_padded [6][6]; //Input feature map zero-padded to 6x6
//logic signed [4:0] temp_sum [4][4];
output logic signed [4:0] partial_sums [4][4];
//pad to 6 by 6 to maintain size after convolving
always @(*) begin
for (int i=0; i<6; i++) begin //row
for (int j=0; j<6; j++) begin //column
if ((i==0) || (i==5) || (j==0) || (j==5)) fmap_padded[i][j] = 2'd0; //pad border with 0
else fmap_padded[i][j] = fmap[i-1][j-1];
end
end
end
always @(*) begin
for (int k = 1; k<5; k++) begin //row
for (int l=1; l<5; l++) begin //column
partial_sums[k-1][l-1] =
((filter[0][0] ? fmap_padded[k-1][l-1] : -fmap_padded[k-1][l-1]) //top-left
+ (filter[0][1] ? fmap_padded[k-1][l] : -fmap_padded[k-1][l])) //top-middle
+ ((filter[0][2] ? fmap_padded[k-1][l+1] : -fmap_padded[k-1][l+1]) //top-right
+ (filter[1][0] ? fmap_padded[k][l-1] : -fmap_padded[k][l-1])) //middle-left
+ ((filter[1][1] ? fmap_padded[k][l] : -fmap_padded[k][l]) //middle-middle
+ (filter[1][2] ? fmap_padded[k][l+1] : -fmap_padded[k][l+1])) //middle-right
+ ((filter[2][0] ? fmap_padded[k+1][l-1] : -fmap_padded[k+1][l-1]) //bottom-left
+ (filter[2][1] ? fmap_padded[k+1][l] : -fmap_padded[k+1][l])) //bottom-middle
+ (filter[2][2] ? fmap_padded[k+1][l+1] : -fmap_padded[k+1][l+1]); //bottom-right
end
end
end
endmodule
module partial_sums (outmap_conv2d, partials_conv2, clk_50, start, finish);
input clk_50;
//sum each set of 16 partial sums and binarize the result to generate the final 32 output fmaps
input logic signed [4:0] partials_conv2 [16][32][4][4]; //Input Range from -9 to 9, so 5 bit
output logic signed [1:0] outmap_conv2d [32][4][4];
logic signed [9:0] temp_sum[32][4][4]; //
integer a, b, c, d, e, f, g, h, i;
input logic start;
output logic finish;
logic [2:0] state;
initial begin
state = 3'b0;
finish = 0;
b = 0;
i = 0;
end
always @ (posedge clk_50) begin
if (start) begin
if (state == 3'b0) begin
for (a=0; a<32; a++) begin //output feature maps
//reset temp_sum to 0
for (g=0; g<4; g++) begin //iterate over every entry of the 4x4 map
for (h=0; h<4; h++) begin
temp_sum[a][g][h] <= 10'd0;
end
end
end
state <= 3'd1;
end
if (state == 3'd1) begin
for (a=0; a<32; a++) begin //output feature maps
for (c=0; c<4; c++) begin //iterate through all 4x4s and sum partial sums
for (d=0; d<4; d++) begin
temp_sum[a][c][d] <= temp_sum[a][c][d] + partials_conv2[b][a][c][d];
end
end
end
b<= b+1; //iterate 16 times
if (b==15) state <= 3'd2;
end
if (state == 3'd2) begin
for (a=0; a<32; a++) begin //output feature maps
for (e=0; e<4; e++) begin //transfer sign bit from temporary sums to output fmaps
for (f=0; f<4; f++) begin
if (temp_sum[a][e][f] == 0) outmap_conv2d[a][e][f] <= 2'b11;
else outmap_conv2d[a][e][f] <= ((temp_sum[a][e][f])>>>8) ? 2'b11 : 2'b01; //store 1 or -1 based on sign bit
end
end
end
state <= 3'd3;
end
if (state == 3'd3) begin
state <= 3'd3;
finish <= 1;
end
end
else begin
state <= 3'd0;
finish <= 0;
b<=0;
i<=0;
end
end
endmodule
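The partial_sums state machine can be summarized in software as: clear the accumulators, add the 16 partial maps one per clock, then keep only the sign. The C sketch below is an assumption-labelled reference (the name `partial_sums_ref` and the `int8_t` storage are not from the original). Each partial sum lies in -9..9, so the accumulated value lies in -144..144, which fits the 10-bit signed `temp_sum`.

```c
#include <stdint.h>

/* Reference for the second-layer accumulation (hypothetical helper):
 * sum 16 partial maps per output map, then binarize to +1/-1. */
void partial_sums_ref(int8_t partials[16][32][4][4], int8_t out[32][4][4])
{
    for (int a = 0; a < 32; a++) {            /* output feature map */
        for (int r = 0; r < 4; r++) {
            for (int c = 0; c < 4; c++) {
                int acc = 0;
                for (int b = 0; b < 16; b++)  /* one partial map per input map */
                    acc += partials[b][a][r][c];
                /* a sum of 0 is stored as -1, matching the Verilog */
                out[a][r][c] = (acc > 0) ? 1 : -1;
            }
        }
    }
}
```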
module pool2 (pool_conv2, outmap_conv2d, clk_50, start, finish);
input clk_50;
input logic signed [1:0] outmap_conv2d [32][4][4];
output logic signed [1:0] pool_conv2 [32][2][2];
integer g;
input logic start;
output logic finish;
initial begin
finish <= 0;
end
//max pooling - check each 2x2 window for any ones -- if yes, max = 1, if no, max = -1, since outmap_conv2d is binarized to 1/-1
always @(posedge clk_50) begin
if (start) begin
if (g<32) begin
pool_conv2 [g][0][0] <= ((outmap_conv2d[g][0][0]&outmap_conv2d[g][0][1]&outmap_conv2d[g][1][0]&outmap_conv2d[g][1][1])==2'b01) ? 2'b01 : 2'b11;
pool_conv2 [g][0][1] <= ((outmap_conv2d[g][0][2]&outmap_conv2d[g][0][3]&outmap_conv2d[g][1][2]&outmap_conv2d[g][1][3])==2'b01) ? 2'b01 : 2'b11;
pool_conv2 [g][1][0] <= ((outmap_conv2d[g][2][0]&outmap_conv2d[g][2][1]&outmap_conv2d[g][3][0]&outmap_conv2d[g][3][1])==2'b01) ? 2'b01 : 2'b11;
pool_conv2 [g][1][1] <= ((outmap_conv2d[g][2][2]&outmap_conv2d[g][2][3]&outmap_conv2d[g][3][2]&outmap_conv2d[g][3][3])==2'b01) ? 2'b01 : 2'b11;
g <= g+ 1;
end
else begin
finish <= 1;
end
end
else begin
finish <= 0;
g<=0;
end
end
endmodule
module fc1 (fmap, wa, clk_50,start, finish, fc_out);
input logic signed [1:0] fmap [32][2][2]; //output from last pooling layer
input logic wa[128][32]; //input weights array
output logic signed [1:0] fc_out [32];
input logic clk_50;
integer i, j, k, l;
logic signed [1:0] fmap_flat [128];
logic [7:0] count;
logic [2:0] state;
input logic start;
output logic finish;
initial begin
state <= 3'b0;
finish <= 0;
end
logic signed [8:0] temp [32];
//flatten the 32 x 2 x 2 feature maps into a 128-entry array
always @(*)begin
for (i=0; i<32; i++) begin
fmap_flat[i] = fmap[i][0][0];
fmap_flat[i+32] = fmap[i][0][1];
fmap_flat[i+64] = fmap[i][1][0];
fmap_flat[i+96] = fmap[i][1][1];
end
end
always @ (posedge clk_50) begin
if (start) begin
if (state == 3'b0) begin
for (i=0; i<32; i++) begin
temp[i] <= 0;
j <= 0;
end
state <= 3'd1;
end
//calculate cumulative sum
if (state == 3'd1) begin
for (k=0;k<32; k++) begin
temp[k] <= temp[k] + (wa[j][k] ? fmap_flat[j] : -fmap_flat[j] );
end
if (j==127) state <= 3'd2; //iterate 128 times
else j <= j+1;
end
if (state == 3'd2) begin
for (l=0; l<32; l++) begin
if ( temp[l] == 0) fc_out[l] <= 2'b11;
else fc_out[l] <= (temp[l]>>>8) ? 2'b11 : 2'b01; //binarize
end
state <= 3'd3;
end
if (state == 3'd3) begin
state <= 3'd3;
finish <= 1;
end
end
else begin
state <= 3'b0;
finish <= 0;
j <= 0;
i<= 0;
end
end
endmodule
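The flattening order in fc1 is easy to get wrong when porting weights, because entries are grouped by pixel position rather than by feature map. The following C sketch reproduces that order and the accumulate-then-binarize step. It is a reference under assumptions: `fc1_ref` is a hypothetical name and `int8_t`/`uint8_t` stand in for the 2-bit and 1-bit signals.

```c
#include <stdint.h>

/* Reference for the first fully connected layer (hypothetical helper). */
void fc1_ref(int8_t fmap[32][2][2], uint8_t wa[128][32], int8_t fc_out[32])
{
    int8_t flat[128];

    /* same order as the Verilog: [0][0] -> i, [0][1] -> i+32,
     * [1][0] -> i+64, [1][1] -> i+96 */
    for (int i = 0; i < 32; i++)
        for (int r = 0; r < 2; r++)
            for (int c = 0; c < 2; c++)
                flat[i + 32 * (2 * r + c)] = fmap[i][r][c];

    for (int k = 0; k < 32; k++) {
        int acc = 0;
        for (int j = 0; j < 128; j++)
            acc += wa[j][k] ? flat[j] : -flat[j]; /* weight bit 0 acts as -1 */
        fc_out[k] = (acc > 0) ? 1 : -1;           /* a sum of 0 becomes -1 */
    }
}
```

With ±1 inputs the accumulator stays within -128..128, which is why a 9-bit signed `temp` is sufficient in the hardware.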
module ten_map (fmap, wa1, final_out, clk_50, start, finish);
input logic clk_50;
input logic signed [1:0] fmap [32];
input logic wa1 [32][10];
output logic signed [7:0] final_out [10];
input logic start;
output logic finish;
//multiply matrices 1x32 x 32x10 = 1x10
integer j, k;
logic [2:0] state;
initial begin
finish <=0;
state <= 3'b0;
end
logic signed [7:0] temp[10];
always @ (posedge clk_50) begin
if (start) begin
if (state == 3'b0) begin
for (k=0;k<10; k++) begin
temp[k] <=0;
j <= 0;
end
if (k==10) state <= 3'd1;
end
//calculate cumulative sum
if (state == 3'd1) begin
for (k=0;k<10; k++) begin
temp[k] <= temp[k] + (wa1[j][k] ? fmap[j] : - fmap[j]);
end
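if (j==32) begin //temp already holds the sum over j = 0..31; the extra accumulate scheduled in this cycle is never used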
state <= 3'd2;
for (k=0;k<10; k++) begin
final_out[k] <= temp[k];
end
end
j <= j+1;
end
if (state == 3'd2) begin
finish <= 1;
end
end
else begin
finish <=0;
state <= 3'b0;
j<=0;
end
end
endmodule
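The last layer does not binarize: it returns ten signed scores, and the largest one decides the digit. A minimal C sketch of that step follows, assuming the hypothetical names `ten_map_ref` and `argmax10`. Ties go to the lowest index, which matches the strict `>` comparison used later in print_output.

```c
#include <stdint.h>

/* Reference for the final fully connected layer (hypothetical helper):
 * 32 binarized inputs times a 32x10 one-bit weight matrix. */
void ten_map_ref(const int8_t fmap[32], uint8_t wa1[32][10], int final_out[10])
{
    for (int k = 0; k < 10; k++) {
        int acc = 0;
        for (int j = 0; j < 32; j++)
            acc += wa1[j][k] ? fmap[j] : -fmap[j]; /* weight bit 0 acts as -1 */
        final_out[k] = acc;                        /* range -32..32, kept signed */
    }
}

/* Index of the largest score; the first maximum wins on ties. */
int argmax10(const int v[10])
{
    int best = 0;
    for (int k = 1; k < 10; k++)
        if (v[k] > v[best]) best = k;
    return best;
}
```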
//HPS
///
/// 640x480 version!
/// test VGA with hardware video input copy to VGA
///
//gcc v1.c -o v1
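// The header names were stripped from this listing; the system headers
// below are inferred from the calls used in this file.
#include <stdio.h>      // printf, fopen, getc
#include <stdlib.h>     // exit
#include <unistd.h>     // close
#include <fcntl.h>      // open, O_RDWR, O_SYNC
#include <sys/types.h>  // key_t
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>   // mmap
#include <sys/time.h>   // gettimeofday, struct timeval
#include <time.h>       // struct timespec
#include "address_map_arm_brl4.h"
#include <pthread.h>    // pthread_create
#include <string.h>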
/* function prototypes */
void VGA_text (int, int, char *);
void VGA_text_clear();
void VGA_box (int, int, int, int, short);
void VGA_line(int, int, int, int, short) ;
void VGA_disc (int, int, int, short);
int VGA_read_pixel(int, int) ;
int video_in_read_pixel(int, int);
void draw_delay(void) ;
// the lightweight bus base
void *h2p_lw_virtual_base;
volatile unsigned int *h2p_lw_video_in_control_addr=NULL;
volatile unsigned int *h2p_lw_video_in_resolution_addr=NULL;
//volatile unsigned int *h2p_lw_video_in_control_addr=NULL;
//volatile unsigned int *h2p_lw_video_in_control_addr=NULL;
volatile unsigned int *h2p_lw_video_edge_control_addr=NULL;
// pixel buffer
volatile unsigned int * vga_pixel_ptr = NULL ;
void *vga_pixel_virtual_base;
// video input buffer
volatile unsigned int * video_in_ptr = NULL ;
void *video_in_virtual_base;
// character buffer
volatile unsigned int * vga_char_ptr = NULL ;
void *vga_char_virtual_base;
// /dev/mem file id
int fd;
// shared memory
key_t mem_key=0xf0;
int shared_mem_id;
int *shared_ptr;
int shared_time;
int shared_note;
char shared_str[64];
// pixel macro
#define VGA_PIXEL(x,y,color) do{\
char *pixel_ptr ;\
pixel_ptr = (char *)vga_pixel_ptr + ((y)<<10) + (x) ;\
*(char *)pixel_ptr = (color);\
} while(0)
#define VIDEO_IN_PIXEL(x,y,color) do{\
char *pixel_ptr ;\
pixel_ptr = (char *)video_in_ptr + ((y)<<9) + (x) ;\
*(char *)pixel_ptr = (color);\
} while(0)
// measure time
struct timeval t1, t2;
double elapsedTime;
struct timespec delay_time ;
#define HPS_IMAGE_DATA_BASE 0x00000070
#define HPS_IMAGE_CLK_BASE 0x00000090
#define HPS_IMAGE_CS_BASE 0x00000080
#define OUT_DATA_BASE 0x00000120
#define OUT_CS_BASE 0x00000130
#define PIO_START_BASE 0x00000140
#define PIO_END_BASE 0x00000150
#define PIO_SWITCH_BASE 0x00000160
//1. Function to Read Input Image from File-------------------------------
// and Send it to FPGA from HPS
volatile signed int * hps_image_data = NULL ;
volatile unsigned int * hps_image_clk = NULL ;
volatile unsigned int * hps_image_cs = NULL ;
volatile unsigned int * pio_start = NULL ;
volatile unsigned int * pio_end = NULL ;
volatile unsigned int * pio_switch = NULL ;
int image_matrix[8*8];
void toggle_image_clk (void){
*hps_image_clk = 0;
*hps_image_clk = 1;
//sleep(1);
}
void load_input(void){
//Initialize things in FPGA
*hps_image_cs = 0;
toggle_image_clk();
*hps_image_cs = 1; //Set CS high
//Open input_data.txt file to read output
FILE *myFile;
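int c; // int rather than char, so EOF can be told apart from data bytes
int counter = 0;
int nFile;
myFile = fopen("input_data.txt", "r");
//myFile = fopen("input_test.txt", "r");
if (myFile == NULL) {
printf("Failed to open Input Image File \n");
exit(1);
}
printf("Input Image File opened successfully \n");
// stop at end of file, or once all 8*8 entries of image_matrix are filled
while ((counter < 8*8) && ((c = getc(myFile)) != EOF)){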
if ((c == '1')){
*hps_image_data = 1;
toggle_image_clk();
image_matrix[counter] = 1;
counter++;
//printf("number#%d: %d \n", counter, 1);
//if (counter%8 == 0) printf("\n");
}
else if ((c == '0')){
*hps_image_data = -1;
toggle_image_clk();
image_matrix[counter] = 0;
counter++;
//printf("number#%d: %d \n", counter, -1);
//if (counter%8 == 0) printf("\n");
}
}
*hps_image_cs = 0;
// Print output
printf("%d input pixels successfully loaded \n", counter);
int i, offset;
for (i = 0; i < 8; i++){
offset = i * 8;
printf("%d %d %d %d %d %d %d %d \n", image_matrix[offset]
, image_matrix[offset+1], image_matrix[offset+2], image_matrix[offset+3]
, image_matrix[offset+4], image_matrix[offset+5], image_matrix[offset+6]
, image_matrix[offset+7]);
}
fclose(myFile);
}
volatile signed int * out_data = NULL ;
volatile unsigned int * out_cs = NULL ;
signed int final_out [10];
void read_output(void){
//Initialize things in FPGA
*out_cs = 0;
toggle_image_clk();
//Set CS high
*out_cs = 1;
int i;
for (i = 0; i < 10; i++){
toggle_image_clk();
final_out[i] = (signed int) (*out_data);
}
*out_cs = 0;
}
int global_maxIdx, global_maxIdx2, global_maxIdx3;
float global_probability, global_probability2, global_probability3;
//Print Output -------------------------------
void print_output(void){
printf("Negative Output \n");
printf("%d %d %d %d %d %d %d %d %d %d\n", final_out[0], final_out[1],
final_out[2], final_out[3], final_out[4], final_out[5],
final_out[6], final_out[7], final_out[8], final_out[9]);
printf("Convert to Positive and Probability Computation \n");
int i, sum_magnitude = 0, maxValue = -9999, maxIdx = 0;
for (i = 0; i < 10; i++){
if (final_out[i] > 127) final_out[i] = final_out[i] - 256;
if (final_out[i] > 0) sum_magnitude += final_out[i];
//else sum_magnitude -= final_out[i];
// Extract max values in the list
if (final_out[i] > maxValue) {
maxValue = final_out[i];
maxIdx = i;
}
}
printf("%d %d %d %d %d %d %d %d %d %d\n", final_out[0], final_out[1],
final_out[2], final_out[3], final_out[4], final_out[5],
final_out[6], final_out[7], final_out[8], final_out[9]);
float probability, probability2, probability3;
probability = (float) maxValue/(float)sum_magnitude;
printf("Probability that it is #%d is %.3f\n", maxIdx, probability);
int maxValue2 = -9999, maxIdx2 = 0;
for (i = 0; i < 10; i++){
// Extract second max values in the list
if ((final_out[i] > maxValue2) && (i != maxIdx)) {
maxValue2 = final_out[i];
maxIdx2 = i;
}
}
int maxValue3 = -9999, maxIdx3 = 0;
for (i = 0; i < 10; i++){
// Extract third max values in the list
if ((final_out[i] > maxValue3) && (i != maxIdx) && (i != maxIdx2)) {
maxValue3 = final_out[i];
maxIdx3 = i;
}
}
probability2 = (float) maxValue2/(float)sum_magnitude;
probability3 = (float) maxValue3/(float)sum_magnitude;
printf("Probability that it is #%d is %.3f\n", maxIdx2, probability2);
printf("Probability that it is #%d is %.3f\n", maxIdx3, probability3);
global_maxIdx = maxIdx;
global_maxIdx2 = maxIdx2;
global_maxIdx3 = maxIdx3;
global_probability = probability;
global_probability2 = probability2;
global_probability3 = probability3;
}
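print_output treats each value read from the PIO as an 8-bit two's-complement number by subtracting 256 from anything above 127. The sketch below isolates that conversion in a helper. The name `sext8` and the extra masking of the upper bits are assumptions added for illustration; the original code does not mask.

```c
#include <stdint.h>

/* Interpret the low 8 bits of a PIO word as a signed 8-bit value
 * (hypothetical helper; the mask is an added safety assumption). */
int sext8(unsigned int raw)
{
    int v = (int)(raw & 0xFFu);      /* keep only the 8 data bits */
    return (v > 127) ? v - 256 : v;  /* 128..255 map to -128..-1 */
}
```

Applied to the values read in read_output, this would give scores in the -32..32 range produced by ten_map.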
int control = 1;
double elapsedTime;
void *scan_thread(void * t){
int input;
float input_value;
int input_dt;
struct timeval t1, t2;
while(1){
printf("Note: \n");
printf("Enter 0 to read and display output \n");
printf("Enter 1 to 6 output prebuilt module \n");
//printf("Enter 3 to restart drum on HPS \n");
printf(">");
scanf("%d", &input);
while (input < 0 || input > 6){
printf("Enter a number from 0 to 6 \n");
printf(">");
scanf("%d", &input);
}
if (input == 0){ //Enter 0 to read and display output
read_output();
print_output();
}
else if ((input >= 1)&&(input <= 6)){ //Enter 1-6: write the selection to pio_switch, start compute, then read and display output
*pio_switch = input;
control = input;
*pio_start = 0;
*pio_start = 1; //Start compute
gettimeofday(&t1, NULL);
while (*pio_end != 1); //Wait until compute finish
gettimeofday(&t2, NULL);
elapsedTime = (t2.tv_sec - t1.tv_sec) * 1000.0; // sec to ms
elapsedTime += (t2.tv_usec - t1.tv_usec) / 1000.0; // us to ms
printf ("Compute time is %.3f ms\n", elapsedTime);
read_output();
print_output();
VGA_text_clear();
}
else if (input == 7){
}
printf("--------Done---------\n\n");
}
}
int main(void)
{
printf("Hello \n");
delay_time.tv_nsec = 10 ;
delay_time.tv_sec = 0 ;
// Declare volatile pointers to I/O registers (volatile
// means that IO load and store instructions will be used
// to access these pointer locations,
// instead of regular memory loads and stores)
// === need to mmap: =======================
// FPGA_CHAR_BASE
// FPGA_ONCHIP_BASE
// HW_REGS_BASE
// === get FPGA addresses ==================
// Open /dev/mem
if( ( fd = open( "/dev/mem", ( O_RDWR | O_SYNC ) ) ) == -1 ) {
printf( "ERROR: could not open \"/dev/mem\"...\n" );
return( 1 );
}
// get virtual addr that maps to physical
h2p_lw_virtual_base = mmap( NULL, HW_REGS_SPAN, ( PROT_READ | PROT_WRITE ), MAP_SHARED, fd, HW_REGS_BASE );
if( h2p_lw_virtual_base == MAP_FAILED ) {
printf( "ERROR: mmap1() failed...\n" );
close( fd );
return(1);
}/*
h2p_lw_video_in_control_addr=(volatile unsigned int *)(h2p_lw_virtual_base+VIDEO_IN_BASE+0x0c);
h2p_lw_video_in_resolution_addr=(volatile unsigned int *)(h2p_lw_virtual_base+VIDEO_IN_BASE+0x08);
*(h2p_lw_video_in_control_addr) = 0x04 ; // turn on video capture
*(h2p_lw_video_in_resolution_addr) = 0x00f00140 ; // high 240 low 320
h2p_lw_video_edge_control_addr=(volatile unsigned int *)(h2p_lw_virtual_base+VIDEO_IN_BASE+0x10);
*h2p_lw_video_edge_control_addr = 0x01 ; // 1 means edges
*h2p_lw_video_edge_control_addr = 0x00 ; // 1 means edges
*/
//New PIO
hps_image_data = (signed int*)(h2p_lw_virtual_base + HPS_IMAGE_DATA_BASE);
hps_image_clk = (unsigned int*)(h2p_lw_virtual_base + HPS_IMAGE_CLK_BASE);
hps_image_cs = (unsigned int*)(h2p_lw_virtual_base + HPS_IMAGE_CS_BASE);
out_data = (signed int*)(h2p_lw_virtual_base + OUT_DATA_BASE);
out_cs = (unsigned int*)(h2p_lw_virtual_base + OUT_CS_BASE);
pio_start = (unsigned int*)(h2p_lw_virtual_base + PIO_START_BASE);
pio_end = (unsigned int*)(h2p_lw_virtual_base + PIO_END_BASE);
pio_switch = (unsigned int*)(h2p_lw_virtual_base + PIO_SWITCH_BASE);
// === get VGA char addr =====================
// get virtual addr that maps to physical
vga_char_virtual_base = mmap( NULL, FPGA_CHAR_SPAN, ( PROT_READ | PROT_WRITE ), MAP_SHARED, fd, FPGA_CHAR_BASE );
if( vga_char_virtual_base == MAP_FAILED ) {
printf( "ERROR: mmap2() failed...\n" );
close( fd );
return(1);
}
// Get the address that maps to the character
vga_char_ptr =(unsigned int *)(vga_char_virtual_base);
// === get VGA pixel addr ====================
// get virtual addr that maps to physical
// SDRAM
vga_pixel_virtual_base = mmap( NULL, FPGA_ONCHIP_SPAN, ( PROT_READ | PROT_WRITE ), MAP_SHARED, fd, SDRAM_BASE); //SDRAM_BASE
if( vga_pixel_virtual_base == MAP_FAILED ) {
printf( "ERROR: mmap3() failed...\n" );
close( fd );
return(1);
}
// Get the address that maps to the FPGA pixel buffer
vga_pixel_ptr =(unsigned int *)(vga_pixel_virtual_base);
// === get video input =======================
// on-chip RAM
video_in_virtual_base = mmap( NULL, FPGA_ONCHIP_SPAN, ( PROT_READ | PROT_WRITE ), MAP_SHARED, fd, FPGA_ONCHIP_BASE);
if( video_in_virtual_base == MAP_FAILED ) {
printf( "ERROR: mmap4() failed...\n" );
close( fd );
return(1);
}
// format the pointer
video_in_ptr =(unsigned int *)(video_in_virtual_base);
// ===========================================
/* create a message to be displayed on the VGA
and LCD displays */
char text_top_row[40] = "DE1-SoC ARM/FPGA\0";
char text_top_row_2[40] = "BNN Inference on FPGA";
char text_top_row_1[40] = "By: Vidya and Xitang";
char text_bottom_row[40] = "Cornell ece5760 - Bruce Land :D\0";
char num_string[20], time_string[50] ;
// a pixel from the video
int pixel_color;
// video input index
int i,j;
// clear the screen
VGA_box (0, 0, 639, 479, 0x03);
// clear the text
VGA_text_clear();
VGA_text (1, 56, text_top_row);
VGA_text (1, 57, text_bottom_row);
// start timer
//gettimeofday(&t1, NULL);
// Load input
load_input();
read_output();
print_output();
pthread_t threads;
// thread attribute used here to allow JOIN
pthread_attr_t attr;
/* Initialize mutex and condition variable objects */
pthread_mutex_t run_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_init(&run_mutex, NULL);
/* For portability, explicitly create threads in a joinable state */
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_create(&threads, &attr, scan_thread, NULL);
int pixel_read;
struct timeval t3, t4;
gettimeofday(&t3, NULL);
int pixel_color_matrix[224][224];
int sum_color_matrix[8][8];
// 0 is black, 255 is white
for (i=0; i<224; i++) {
for (j=0; j<224; j++) {
pixel_color_matrix[i][j] = 0;
}
}
//rand()%2 to get 0 or 1
int temp = 0;
for (i=0; i<8; i++) {
for (j=0; j<8; j++) {
sum_color_matrix[i][j] = image_matrix[temp++];//j; //Initialize as white 3, Greyscale 0-3, 0 is black
}
}
int x_offset, y_offset, grey_color, x_offset_end, y_offset_end;
int start_idx_i, end_idx_i, start_idx_j, end_idx_j;
//images to display on monitor
int matrix1[8][8] = {
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1, 1, 1,-1,-1},
{-1,-1, 1, 1, 1, 1,-1,-1},
{-1, 1, 1,-1,-1, 1,-1,-1},
{-1, 1, 1, 1, 1,-1,-1,-1},
{-1,-1, 1, 1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
int matrix2[8][8] = {
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1, 1,-1,-1,-1,-1},
{-1,-1,-1, 1,-1,-1,-1,-1},
{-1,-1,-1, 1,-1,-1,-1,-1},
{-1,-1,-1, 1,-1,-1,-1,-1},
{-1,-1,-1, 1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
int matrix3[8][8] = {
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1, 1, 1, 1,-1,-1,-1},
{-1,-1,-1,-1, 1,-1,-1,-1},
{-1,-1,-1,-1, 1,-1,-1,-1},
{-1,-1,-1, 1, 1,-1,-1,-1},
{-1, 1, 1, 1, 1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
int matrix4[8][8] = {
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1, 1,-1,-1,-1,-1,-1},
{-1, 1, 1,-1, 1, 1,-1,-1},
{-1,-1, 1, 1, 1,-1,-1,-1},
{-1,-1,-1, 1, 1,-1,-1,-1},
{-1,-1,-1, 1, 1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
int matrix5[8][8] = {
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1, 1, 1,-1,-1,-1,-1},
{-1,-1, 1,-1,-1,-1,-1,-1},
{-1,-1, 1, 1, 1,-1,-1,-1},
{-1,-1,-1,-1,-1, 1,-1,-1},
{-1,-1, 1, 1, 1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
int matrix6[8][8] = {
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1, 1, 1, 1,-1,-1,-1},
{-1,-1,-1,-1, 1,-1,-1,-1},
{-1,-1,-1,-1, 1,-1,-1,-1},
{-1,-1,-1,-1, 1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
int matrix7[8][8] = {
{-1,-1,-1,-1,-1,-1,1,-1},
{-1,-1,-1,-1,1,1,-1,-1},
{-1,-1,-1,-1,1,-1,-1,-1},
{-1,-1,-1,1,-1,-1,-1,-1},
{-1,-1,1,1,-1,-1,-1,-1},
{-1,-1,1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1},
{-1,-1,-1,-1,-1,-1,-1,-1}
};
while(1)
{
gettimeofday(&t1, NULL);
// note that this version of VGA_disc
// has THROTTLED pixel write
// software copy test.
// in production, hardware does the copy
// put a few pixel in input buffer
//VIDEO_IN_PIXEL(160,120,0xff);
//VIDEO_IN_PIXEL(0,0,0xff);
//VIDEO_IN_PIXEL(319,239,0xff);
//VIDEO_IN_PIXEL(300,200,0xff);
// Video input is 224 by 224
// read/write video input -- copy to VGA display
// Copy over every 2s
gettimeofday(&t4, NULL);
if ((t4.tv_sec - t3.tv_sec) > 2){
//VGA_disc((rand()&0x3ff), (rand()&0x1ff), rand()&0x3f, rand()