An example of using CULA, a third-party CUDA library

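Put simply, CULA is the CUDA version of LAPACK. It implements most of LAPACK's functionality, and its function names and parameters closely mirror LAPACK's. If you don't know LAPACK, look it up on Wikipedia!

A free version of the library is currently available for download. It provides only 6 functions, all in single precision, listed below: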

Type | Description | Real | Complex
General | Solves a general system of linear equations AX=B. | SGESV | CGESV
 | Computes an LU factorization of a general matrix, using partial pivoting with row interchanges. | SGETRF | CGETRF
 | Computes a QR factorization of a general rectangular matrix. | SGEQRF | CGEQRF
 | Computes the least squares solution to an over-determined system of linear equations, AX=B, A^T X=B, or A^H X=B, or the minimum norm solution of an under-determined system, where A is a general rectangular matrix of full rank, using a QR or LQ factorization. | SGELS | CGELS
 | Solves the LSE (Constrained Linear Least Squares Problem) using the GRQ (Generalized RQ) factorization. | SGGLSE | CGGLSE
 | Computes the singular value decomposition (SVD) of a general rectangular matrix. | SGESVD | CGESVD

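The code below shows how to use one of these functions, SGESVD, which computes the singular value decomposition (SVD) of a matrix and is widely used. The SVD factors an arbitrary m x n matrix A into a product of three matrices, A = U*S*VT, where U is m x m, S is an m x n diagonal matrix whose diagonal holds the singular values sorted from largest to smallest, and VT (the transpose of V) is n x n.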
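To make the factorization concrete, here is a minimal sketch in plain C (no CULA calls) that rebuilds A from the three factors. The names `svd_reconstruct` and `CM` are hypothetical helpers introduced only for this illustration; the matrices are assumed to be stored column-major, as LAPACK-style libraries expect.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major access: element (row, col) of a matrix with leading dimension ld.
   Hypothetical helper macro, not part of CULA. */
#define CM(ptr, ld, row, col) ((ptr)[(size_t)(col) * (size_t)(ld) + (size_t)(row)])

/*
 * Rebuild the m x n matrix A from its SVD factors: A = U * diag(S) * VT.
 * U is m x m, S holds min(m, n) singular values, VT is n x n.
 * All matrices are column-major with leading dimensions ldu, ldvt, lda.
 */
void svd_reconstruct(int m, int n,
                     const float *U, int ldu,
                     const float *S,
                     const float *VT, int ldvt,
                     float *A, int lda)
{
    int k = (m < n) ? m : n;   /* number of singular values */
    assert(ldu >= m && ldvt >= n && lda >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float sum = 0.0f;
            /* Only the first k columns of U and rows of VT contribute,
               because the remaining diagonal entries of S are zero. */
            for (int p = 0; p < k; ++p) {
                sum += CM(U, ldu, i, p) * S[p] * CM(VT, ldvt, p, j);
            }
            CM(A, lda, i, j) = sum;
        }
    }
}
```

Comparing the rebuilt matrix with a saved copy of the original input is a simple way to sanity-check the output of an SVD routine.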

CULA's version of this function achieves a speedup of more than 20x over MATLAB's corresponding SVD function. Let's see how to use it!

Header files:

#include
#include
#include
#include
#include
#include
#include       // CULA library header
#include
#include
#include


/* Setup SVD Parameters */
    int LDA;
    int LDU;
    int LDVT;
    int i,j;
    int* dev=NULL;
    float* A = NULL;
    float* S = NULL;
    float* U = NULL;
    float* VT = NULL;
    char jobu = 'A';
    char jobvt = 'A';

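   /* Initialize parameters */

/* LDA, LDU, and LDVT each give the physical storage distance between two adjacent elements in the same row of a matrix. Because CULA and LAPACK both store matrices in column-major order, each leading dimension here is the number of elements in one column, i.e., the number of rows. */

    LDA = m;   // m: number of rows of A; n: number of columns of A
    LDU = m;
    LDVT = n;
    A = (float*)malloc(m*n*sizeof(float));
    S = (float*)malloc(imin(m,n)*sizeof(float));   // imin(a,b): smaller of two ints, assumed to be defined elsewhere
    U = (float*)malloc(LDU*m*sizeof(float));
    VT = (float*)malloc(LDVT*n*sizeof(float));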
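Since C arrays are naturally row-major while the library expects column-major data, the input usually has to be rearranged before it is placed in A. The following is a minimal sketch of that indexing; `cm_index` and `row_major_to_col_major` are hypothetical helper names, not CULA functions.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row, col) in a column-major array with leading dimension ld.
   Moving one column to the right skips ld elements. */
size_t cm_index(int row, int col, int ld)
{
    assert(row >= 0 && col >= 0 && row < ld);
    return (size_t)col * (size_t)ld + (size_t)row;
}

/* Copy a row-major m x n array (src) into a column-major buffer (dst)
   whose leading dimension is lda (lda >= m). */
void row_major_to_col_major(int m, int n, const float *src, float *dst, int lda)
{
    assert(lda >= m);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            dst[cm_index(i, j, lda)] = src[(size_t)i * (size_t)n + (size_t)j];
        }
    }
}
```

For a 2 x 3 matrix with rows (1 2 3) and (4 5 6) and lda = 2, the column-major buffer holds 1, 4, 2, 5, 3, 6.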


/* Initialize the CULA library */

/* Parameter description for culaSgesvd */

/*Parameters
• jobu
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix U:
= ‘A’: all M columns of U are returned in array U;
= ‘S’: the first min(m,n) columns of U (the left singular vectors) are returned in the array U;
= ‘O’: the first min(m,n) columns of U (the left singular vectors) are overwritten on the array A;
= ‘N’: no columns of U (no left singular vectors) are computed.
• jobvt
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix VT :
= ‘A’: all N rows of VT are returned in the array VT;
= ‘S’: the first min(m,n) rows of VT (the right singular vectors) are returned in the array VT;
= ‘O’: the first min(m,n) rows of VT (the right singular vectors) are overwritten on the array A;
= ‘N’: no rows of VT (no right singular vectors) are computed.
JOBVT and JOBU cannot both be ‘O’.
• m
– Type: int
– Direction: Input
The number of rows of the input matrix A. M >= 0.
• n
– Type: int
– Direction: Input
The number of columns of the input matrix A. N >= 0.
• a
– Type: S/D/C/Z Pointer
– Direction: Input/Output
– Dimension: (LDA,N)
On entry, the M-by-N matrix A.
On exit,
if JOBU = ‘O’, A is overwritten with the first min(m,n) columns of U (the left singular vectors, stored
columnwise);
if JOBVT = ‘O’, A is overwritten with the first min(m,n) rows of VT (the right singular vectors,
stored rowwise);
if JOBU != ‘O’ and JOBVT != ‘O’, the contents of A are destroyed.
• lda
– Type: int
– Direction: Input
The leading dimension of the array A. LDA >= max(1,M).
• s
– Type: S/D Pointer
– Direction: Output
– Dimension: (min(M,N))
The singular values of A, sorted so that S(i) >= S(i+1).
• u
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDU,UCOL)
(LDU,M) if JOBU = ‘A’ or (LDU,min(M,N)) if JOBU = ‘S’. If JOBU = ‘A’, U contains the M-by-M
orthogonal/unitary matrix U; if JOBU = ‘S’, U contains the first min(m,n) columns of U (the left
singular vectors, stored columnwise); if JOBU = ‘N’ or ‘O’, U is not referenced.
• ldu
– Type: int
– Direction: Input
The leading dimension of the array U. LDU >= 1; if JOBU = ‘S’ or ‘A’, LDU >= M.
• vt
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDVT,N)
If JOBVT = ‘A’, VT contains the N-by-N orthogonal/unitary matrix VT ; if JOBVT = ‘S’, VT contains
the first min(m,n) rows of VT (the right singular vectors, stored rowwise); if JOBVT = ‘N’ or ‘O’,
VT is not referenced.
• ldvt
– Type: int
– Direction: Input
The leading dimension of the array VT. LDVT >= 1; if JOBVT = ‘A’, LDVT >= N; if JOBVT = ‘S’,
LDVT >= min(M,N)

*/

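culaStatus status;  // status code returned by each CULA call; compare against culaNoError

status = culaSelectDevice(2);  // select the GPU (device index 2 here) that CULA will run on

status = culaInitialize();  // initialize the library

status = culaSgesvd(jobu, jobvt, m, n, A, LDA, S, U, LDU, VT, LDVT);  // run the computation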
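The buffer-size rules in the parameter description depend on jobu and jobvt, and they can be sketched as a small helper. This is only an illustration in plain C: `svd_sizes` and `svd_min_sizes` are hypothetical names, not part of CULA, and the helper returns the smallest sizes the description allows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: minimal sizes (in elements) of the output buffers. */
typedef struct {
    size_t s_len;   /* length of S */
    int    ldu;     /* smallest valid LDU */
    size_t u_len;   /* elements needed for U (0 if U is not referenced) */
    int    ldvt;    /* smallest valid LDVT */
    size_t vt_len;  /* elements needed for VT (0 if VT is not referenced) */
} svd_sizes;

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

svd_sizes svd_min_sizes(char jobu, char jobvt, int m, int n)
{
    svd_sizes r;
    int k = min_int(m, n);

    assert(m >= 0 && n >= 0);
    assert(!(jobu == 'O' && jobvt == 'O'));  /* both cannot be 'O' */

    r.s_len = (size_t)k;

    /* U is (LDU, M) for 'A', (LDU, min(M,N)) for 'S', not referenced for 'N' or 'O'. */
    if (jobu == 'A' || jobu == 'S') {
        r.ldu = max_int(1, m);
        r.u_len = (size_t)r.ldu * (size_t)(jobu == 'A' ? m : k);
    } else {
        r.ldu = 1;
        r.u_len = 0;
    }

    /* VT is (LDVT, N); LDVT >= N for 'A', LDVT >= min(M,N) for 'S'. */
    if (jobvt == 'A' || jobvt == 'S') {
        r.ldvt = max_int(1, jobvt == 'A' ? n : k);
        r.vt_len = (size_t)r.ldvt * (size_t)n;
    } else {
        r.ldvt = 1;
        r.vt_len = 0;
    }
    return r;
}
```

With jobu = 'A' and jobvt = 'A' on a 4 x 3 matrix, this gives 16 elements for U and 9 for VT, which matches the LDU*m and LDVT*n allocations used in the listing.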

 

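This example only demonstrates one function of the CULA library. If you need source code and sample programs, download the free version from the official CULA website, which includes a small SDK you can use as a reference: http://www.culatools.com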

 

   

 

From the "ITPUB Blog", link: http://blog.itpub.net/20259129/viewspace-662452/. If you repost this article, please credit the source; otherwise legal action may be taken.
