[DSP Development] [Parallel Computing - CUDA Development] TI OpenCL v01.01.xx

TI OpenCL™ Runtime Documentation Contents:

  • Introduction
  • OpenCL 1.1 Reference Material
  • Compilation
    • Compile Host OpenCL Applications
    • Compiling OpenCL C Programs
      • Create an OpenCL program from source, with embedded source
      • Create an OpenCL program from source, with source in a file
      • Create an OpenCL program from binary, with binary in a file
      • Create an OpenCL program from binary, with embedded binary
    • Caching on-line compilation results
    • The TI off-line OpenCL C compiler: clocl
  • Memory Usage
    • Device Memory
      • Caching
    • How DDR3 is Partitioned for Linux System and OpenCL
      • 66AK2x
      • AM57
      • Changing DDR3 Partition for OpenCL
    • Alternate Host malloc/free Extension for Zero Copy OpenCL Kernels
    • The OpenCL Memory Model
    • OpenCL Buffers
      • Global Buffers
      • Local Buffers
      • Sub-Buffers
    • Buffer Read/Write vs. Map/Unmap
    • Discovering OpenCL Memory Sizes and Limits
    • Cache Operations
    • Large OpenCL buffers and Memory Beyond the 32-bit DSP Address Space
      • Large Buffer Use Cases
    • User Defined DSP Heap Extension
      • User Defined DSP Heap Built-in Functions
      • Allocation of the Underlying Memory for User Defined DSP Heaps
      • Putting it all Together
  • Execution Model
    • Terminology
    • Device Discovery
    • Understanding Kernels, Work-groups and Work-items
      • Enqueueing a Kernel
      • Mapping the OpenCL C work-item Built-in Functions
      • OpenCL C Kernel Code
      • NDRangeKernel Execution on DSP Devices
  • Extensions
    • Calling Standard C Code From OpenCL C Code
    • Calling Standard C code with OpenMP from OpenCL C code
      • OpenMP dispatch from OpenCL
    • C66x standard C compiler intrinsic functions
    • OpenCL C code using printf
    • DMA Control Using EdmaMgr Functions
      • Single Transfer EdmaMgr APIs
      • Multiple Transfer EdmaMgr APIs
    • Using Extended Memory on the 66AK2x device
    • Fast Global buffers in on-chip MSMC memory
    • OpenCL C Builtin Function Extensions
    • Cache Operations
  • Environment Variables
  • Optimization Tips
    • Optimization Techniques for Host Code
      • Use Off-line, Embedded Compilation Model
      • Avoid the read/write Buffer model on shared memory SoC platforms
      • Use MSMC Buffers Whenever Possible
      • Dispatch Appropriate Compute Loads
      • Prefer Kernels with 1 work-item per work-group
    • Optimization Techniques for Device (DSP) Code
      • Prefer Kernels with 1 work-item per work-group
      • Use Local Buffers
      • Use async_work_group_copy and async_work_group_strided_copy
      • Avoid DSP writes directly to DDR
      • Use the reqd_work_group_size attribute on kernels
      • Use the TI OpenCL extension that allows Standard C code to be called from OpenCL C code
      • Avoid OpenCL C Barriers
      • Use the most efficient data type on the DSP
      • Do Not Use Large Vector Types
      • Consecutive memory accesses
      • Prefer the CPU style of writing OpenCL code over the GPU style
    • Typical Steps to Optimize Device Code
    • Optimizing 3x3 Gaussian smoothing filter
      • Overview of Gaussian Filter
      • Natural C Code
      • Optimizing for DSP
      • Performance Improvement
    • Performance Data
  • Examples
    • Building and Running
    • Example Descriptions
      • platforms example
      • simple example
      • mandelbrot, mandelbrot_native examples
      • ccode example
      • matmpy example
      • offline example
      • vecadd_openmp example
      • vecadd_openmp_t example
      • vecadd example
      • vecadd_mpax example
      • vecadd_mpax_openmp example
      • dsplib_fft example
      • ooo, ooo_map examples
      • null example
      • sgemm example
      • dgemm example
      • edmamgr example
      • dspheap example
    • Float compute example
      • Host Code (main.cpp)
      • OpenCL C kernel code (dsp_compute.cl)
      • Sample Output
    • Monte Carlo example
      • Algorithm for Gaussian Random Number Generation
      • Executing the code
      • Sample Output
  • Debug
    • Debug with printf
      • Host side OpenCL application code
      • DSP side OpenCL kernel code
    • Debug with gdb
      • Host side gdb
      • DSP side debug with host side client gdbc6x
    • Debug with CCS
      • Connect emulator to EVM and CCS
      • Debug DSP side code with CCS
    • Debug with dsptop
  • Profiling
    • Host Side Profiling
    • DSP Side Profiling
  • OpenCL on TI-RTOS
    • Overview
      • OpenCL on RTOS Package
      • Running Examples Shipped with OpenCL Package
    • Basic OpenCL RTOS Application Development
      • Building Application on Linux
      • Building Application on Windows
      • Creating an OpenCL RTOS Application
      • Limited Customization: Participating DSP Core(s)
      • Differences from OpenCL Linux (Host running Linux)
    • Advanced OpenCL RTOS Application Development
  • Frequently Asked Questions
    • How do I get support for TI OpenCL products?
    • Which TI OpenCL Version is Installed?
    • Using Python OpenCL with the TI OpenCL implementation
    • Guidelines for porting Stand-alone DSP applications to OpenCL
      • Heap Memory Management
      • Stack Usage
      • Boot Routine Dependencies
      • Linker Command Files
    • OpenCL Interoperability with Host OpenMP
    • MCSDK-HPC to OpenCL Component Version Map
    • Does TI’s OpenCL support images and samplers?
    • Why does the OpenCL ICD installed on my platform not find the TI OpenCL implementation?
    • Why do I get messages about /var/lock/opencl when running OpenCL applications?
    • Why do I get DLOAD error messages when running OpenCL applications?
    • How do I limit log file sizes on EVM’s temporary file storage (tmpfs)?
      • 66AK2* EVMs
      • AM57* EVMs
  • Readme
    • OpenCL v01.01.09.x Readme
      • Platforms supported
      • Release Notes
      • Compiler Versions
    • OpenCL v01.01.08.x Readme
      • Platforms supported
      • Release Notes
      • Compiler Versions
    • OpenCL v01.01.07.x Readme
      • Platforms supported
      • Release Notes
      • Compiler Versions
  • Disclaimer
  • Important Notice