This section describes an actual case in which a particular customer tried to resize their memory map.
The application is a four-channel CIF MPEG4 Simple Profile (or H.264 Baseline Profile) Digital Video Recorder (DVR) based on the DM6446 with 64MB of DDR2. Four channels of CIF video are encoded and one channel of CIF video is decoded. The MPEG4 (or H.264) and audio encoders and decoders conform to xDM. This discussion focuses on the video codecs, so four CIF encoding instances and one CIF decoding instance are created by calling the VISA APIs. This example is based on Codec Engine 1.02 and DSPLINK 1.30.08.02. Below is the system block diagram:
The final 64MB memory map looks like:
0x80000000 .. 0x83200000-1 (0-50MB; size 50MB): Linux: booted with MEM = 50M
0x83200000 .. 0x83A00000-1 (50-58MB; size 8MB): CMEM: shared ARM/DSP I/O buffers
0x83A00000 .. 0x83C00000-1 (58-60MB; size 2MB): DDRALGHEAP: codec dynamic memory
0x83C00000 .. 0x83E00000-1 (60-62MB; size 2MB): DDR: code, stack, system data
0x83E00000 .. 0x83F00000-1 (62-63MB; size 1MB): DSPLINKMEM: memory for DSPLINK
0x83F00000 .. 0x83F00080-1 (63-63MB; size 128B): RESET_VECTOR: reset vectors
0x83F00080 .. 0x84000000-1 (63-64MB; size 1MB): Unused memory
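A map like this is easy to get wrong by one digit, so it can help to check it mechanically. The following is a minimal sketch, not part of the original procedure: it re-enters the addresses from the map above and verifies that the segments are contiguous and cover exactly 64MB. The helper name `check_map` is hypothetical.

```python
# Sanity check of the 64MB memory map listed above.
# Addresses are copied from the map; 'end' is exclusive.
MB = 1024 * 1024
DDR_BASE = 0x80000000
DDR_SIZE = 64 * MB

segments = [
    ("Linux",        0x80000000, 0x83200000),
    ("CMEM",         0x83200000, 0x83A00000),
    ("DDRALGHEAP",   0x83A00000, 0x83C00000),
    ("DDR",          0x83C00000, 0x83E00000),
    ("DSPLINKMEM",   0x83E00000, 0x83F00000),
    ("RESET_VECTOR", 0x83F00000, 0x83F00080),
    ("Unused",       0x83F00080, 0x84000000),
]


def check_map(segs, base, size):
    """Return a list of problems; an empty list means the map is
    contiguous, has no empty segments, and covers [base, base + size)."""
    problems = []
    expected = base
    for name, start, end in segs:
        if start != expected:
            problems.append(f"{name}: starts at {start:#x}, expected {expected:#x}")
        if end <= start:
            problems.append(f"{name}: empty or negative size")
        expected = end
    if expected != base + size:
        problems.append(f"map ends at {expected:#x}, expected {base + size:#x}")
    return problems


problems = check_map(segments, DDR_BASE, DDR_SIZE)
sizes = {name: end - start for name, start, end in segments}

for name, nbytes in sizes.items():
    print(f"{name:<13} {nbytes:>10} B")
print("map OK" if not problems else "\n".join(problems))
```

Any gap, overlap, or segment running past the end of DDR2 shows up as an entry in `problems`.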
How did we arrive at this memory map? First of all, the 1MB DSPLINKMEM is the default size in DSPLINK 1.30.08.02. The important step is to size CMEM, DDRALGHEAP and DDR correctly, so that both the DSP software and the Linux OS have enough space.
The diagram below illustrates the data flow of the application. The VPFE puts the video input data (the to-be-encoded data) in CMEM. The encoder running on the DSP writes the encoded data to CMEM, and the encoded data is finally stored on the hard disk. For decoding, the to-be-decoded data is first copied from the hard disk to CMEM. The decoder then decompresses the data and writes the result into the decoded data buffers. The VPBE output resolution is CIF. Sometimes we use the resizer of the VPFE peripheral on the DM6446 to get D1 resolution, so we also need to allocate a buffer for the resizer results in CMEM.
The to-be-encoded data buffers take ((352 * 288) * 4 * 2B) * 3 = 2433024B: 352 * 288 is the CIF resolution, 4 is the number of channels, one pixel in YUV4:2:2 needs 2 bytes, and three buffers of (352 * 288) * 4 * 2B are allocated for the encoder algorithm. The decoded data buffers are the same size: 2433024B. Because we encode four CIF channels (together roughly one D1 frame of pixels), the size of the encoded data buffer can be estimated as 50% of a D1 frame in YUV4:2:0, (720 * 576 * 3 / 2) / 2 = 303.75KB, or from the standard MPEG4 compression ratio. Here we allocate 256KB (262144B) for the encoded data buffer, somewhat less than 303.75KB; this value was chosen based on experience. Accordingly, we configure three 256KB buffers (786432B) for the to-be-decoded data. The resizer result buffer needs 720 * 576 * 2B = 829440B in YUV4:2:2. So, the insmod cmemk command looks like:
insmod cmemk.ko phys_start=0x83200000 phys_end=0x83A00000 pools=1x262144,2x2433024,1x829440,1x786432
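The buffer arithmetic behind the `pools=` argument can be reproduced with a short script. This is a sketch under stated assumptions: the variable names are illustrative, and the 4096-byte page rounding is an assumption about how CMEM pads each buffer, included only to make the fit check conservative.

```python
# Reproduce the CMEM pool sizes used in the insmod command above.
CIF_W, CIF_H = 352, 288
D1_W, D1_H = 720, 576
CHANNELS = 4
BYTES_PER_PIXEL_422 = 2      # YUV 4:2:2
PAGE = 4096                  # assumed page size for rounding

# Three raw buffers, each holding 4 CIF channels in YUV 4:2:2.
raw_set = CIF_W * CIF_H * CHANNELS * BYTES_PER_PIXEL_422 * 3
encoded = 256 * 1024                         # chosen from experience
to_decode = 3 * encoded                      # three 256KB buffers
resizer = D1_W * D1_H * BYTES_PER_PIXEL_422  # one D1 frame, YUV 4:2:2

# (count, size): encoded output, raw in + decoded out, resizer, to-be-decoded
pools = [(1, encoded), (2, raw_set), (1, resizer), (1, to_decode)]

# The pools argument must not contain spaces.
pools_arg = ",".join(f"{count}x{size}" for count, size in pools)


def round_up(size, page=PAGE):
    """Round size up to a whole number of pages."""
    return (size + page - 1) // page * page


total = sum(count * size for count, size in pools)
total_rounded = sum(count * round_up(size) for count, size in pools)
cmem_size = 0x83A00000 - 0x83200000

print("pools=" + pools_arg)
print(f"requested {total} B, page-rounded {total_rounded} B, CMEM {cmem_size} B")
print("fits" if total_rounded <= cmem_size else "does NOT fit")
```

The pools request is noticeably smaller than the 8MB CMEM region, which leaves headroom if a buffer size has to grow later.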
DDRALGHEAP is the memory from which the codecs' dynamic memory requests are served. Both the encoder and the decoder process data in YUV4:2:0. One CIF frame in YUV4:2:0 takes 352 * 288 * 3 / 2 B (one pixel in YUV4:2:0 needs 3/2 bytes). The encoder and decoder algorithms each need the current frame and the previous frame, so compressing or decompressing one CIF channel requires 352 * 288 * 3 / 2 * 2 B of memory per instance. Because four CIF channels are encoded and one CIF channel is decoded, the encoder needs 352 * 288 * 3 / 2 * 2 * 4 B (about 1.16MB) of dynamic memory and the decoder needs 352 * 288 * 3 / 2 * 2 * 1 B (about 297KB). The total is about 1.45MB, so a 2MB DDRALGHEAP is allocated in this example.
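The DDRALGHEAP estimate can be checked the same way. This sketch only restates the frame-buffer arithmetic from the paragraph above; it is not a measurement of what the codecs actually request, and the names are illustrative.

```python
# Estimate codec dynamic memory (DDRALGHEAP) from frame-buffer needs.
MB = 1024 * 1024
CIF_W, CIF_H = 352, 288

frame_420 = CIF_W * CIF_H * 3 // 2   # one CIF frame in YUV 4:2:0
per_instance = 2 * frame_420         # current frame + previous frame

ENC_INSTANCES = 4
DEC_INSTANCES = 1

enc_bytes = ENC_INSTANCES * per_instance
dec_bytes = DEC_INSTANCES * per_instance
needed = enc_bytes + dec_bytes

heap_size = 2 * MB                   # DDRALGHEAP size chosen in this example
headroom = heap_size - needed

print(f"encoder: {enc_bytes} B ({enc_bytes / MB:.2f} MB)")
print(f"decoder: {dec_bytes} B ({dec_bytes / 1024:.0f} KB)")
print(f"total:   {needed} B ({needed / MB:.2f} MB), headroom {headroom} B")
```

The headroom covers whatever the algorithms allocate beyond the two frame buffers per instance, which this estimate does not model.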
DDR is the DSP-side segment that holds all the system code, data, stack and heaps, plus the code and static data of the codecs. Even for the most complex video codecs, the code size is no more than a few hundred KB. We can use the script sectti.pl to determine the size of the sections placed in DDR:
ofd6x -x codec_server.x64P | perl c:\temp\cg_xml\ofd\sectti.pl > codec_server.x64P.sectti.csv
The script generates a report file, from which the code and data total about 416KB. So the 2MB DDR segment is enough for this application.
REPORT FOR FILE: codec_server.x64P

Name       : Size (dec)  Size (hex)  Type   Load Addr   Run Addr
MPEG4ENC   :      23840  0x00005d20  CODE   0x83c71000  0x83c71000
MPEG4DEC   :      10784  0x00002a20  CODE   0x83c82000  0x83c82000
.bss       :        910  0x0000038e  UDATA  0x83c88000  0x83c88000
.hwi_vec   :        512  0x00000200  CODE   0x83c70c00  0x83c70c00
.far       :     204920  0x00032078  UDATA  0x83c00000  0x83c00000
.bios      :      22912  0x00005980  CODE   0x83c76d20  0x83c76d20
.text      :     123136  0x0001e100  CODE   0x83c52080  0x83c52080
.cinit     :       8196  0x00002004  DATA   0x83c84a20  0x83c84a20
.sysinit   :       1792  0x00000700  CODE   0x83c70180  0x83c70180
.const     :      21288  0x00005328  DATA   0x83c7c6a0  0x83c7c6a0
.stack     :       4096  0x00001000  UDATA  0x83c86a28  0x83c86a28

Totals by section type (about 416KB)
Uninitialized Data :     212958  0x00033fde
Initialized Data   :      30080  0x00007580
Code               :     182976  0x0002cac0
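To confirm the "about 416KB" figure, the three totals from the report can be summed and compared with the 2MB DDR segment. This is a small sketch with the totals typed in by hand; it does not parse the CSV that sectti.pl writes.

```python
# Sum the per-type totals from the sectti.pl report and compare with DDR.
totals_by_type = {
    "Uninitialized Data": 212958,
    "Initialized Data": 30080,
    "Code": 182976,
}

footprint = sum(totals_by_type.values())
ddr_size = 0x83E00000 - 0x83C00000   # DDR segment from the memory map

print(f"footprint: {footprint} B ({footprint / 1024:.0f} KB)")
print(f"DDR:       {ddr_size} B")
print(f"used:      {100 * footprint / ddr_size:.1f}% of DDR")
```

The footprint is only a fraction of the segment, which is why 2MB is considered sufficient here.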
We computed the amount of memory left for Linux by calculating the DSP needs first and subtracting them from the total amount of memory available. We know our production system has only 64MB of memory. Given that we need 1MB for "DSPLINKMEM", 2MB for "DDR", 2MB for "DDRALGHEAP", 1MB for "RESET_VECTOR" plus unused memory, and 8MB for CMEM, that gives a total of 14MB for the DSP and shared buffers, leaving 50MB for Linux.
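The subtraction can be written out as a short budget calculation. In this sketch the dictionary keys are just labels, and the `mem=` string is the Linux kernel boot argument that corresponds to "booted with MEM = 50M" in the memory map.

```python
# Derive the Linux memory size from the DSP-side and shared-buffer needs.
MB = 1024 * 1024
DDR_BASE = 0x80000000
TOTAL_MB = 64

reserved_mb = {
    "CMEM": 8,
    "DDRALGHEAP": 2,
    "DDR": 2,
    "DSPLINKMEM": 1,
    "RESET_VECTOR + unused": 1,
}

reserved_total_mb = sum(reserved_mb.values())
linux_mb = TOTAL_MB - reserved_total_mb

# Linux occupies the bottom of DDR2, so CMEM starts right above it.
bootarg = f"mem={linux_mb}M"
cmem_start = DDR_BASE + linux_mb * MB
cmem_end = cmem_start + reserved_mb["CMEM"] * MB

print(f"reserved for DSP and shared buffers: {reserved_total_mb} MB")
print(f"Linux boot argument: {bootarg}")
print(f"cmemk phys_start={cmem_start:#010x} phys_end={cmem_end:#010x}")
```

Changing any entry in `reserved_mb` shifts both the boot argument and the CMEM addresses, so the three settings should be updated together.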
Memory map configuration for a DaVinci-based system can be performed systematically once the user has designed a memory map that suits the amount of memory available. For the procedure to go smoothly, remember to: