Spark MLlib Vector

First, don't forget the import:

import org.apache.spark.mllib.linalg.{Vectors, Vector}

Test data:

val Ar = (for (i <- 1 to 10) yield ((i + 1) * (i + 4)).toDouble).toArray
Ar: Array[Double] = Array(10.0, 18.0, 28.0, 40.0, 54.0, 70.0, 88.0, 108.0, 130.0, 154.0)
1. There are two ways to declare a dense vector:

Method 1:
def dense(values: Array[Double]): Vector
Creates a dense vector from a double array.
val vd1 = Vectors.dense(Ar)
vd1: org.apache.spark.mllib.linalg.Vector = [10.0,18.0,28.0,40.0,54.0,70.0,88.0,108.0,130.0,154.0]
Method 2:
def dense(firstValue: Double, otherValues: Double*): Vector
Creates a dense vector from its values.
val vd2: Vector = Vectors.dense(1.0, 0.0, 3.0)
vd2: org.apache.spark.mllib.linalg.Vector = [1.0,0.0,3.0]
2. There are three ways to declare a sparse vector:

Method 1:
def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
Creates a sparse vector providing its index array and value array.
val Vs1: Vector = Vectors.sparse(4, Array(0, 1, 2), Array(1.0, 2.0, 3.0))
Vs1: org.apache.spark.mllib.linalg.Vector = (4,[0,1,2],[1.0,2.0,3.0])
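The triple printed above, (size, indices, values), fully determines the vector. As a plain-Scala sketch (no Spark required; `toDenseArray` is an illustrative helper, not an MLlib method), the triple expands to a dense array like this:

```scala
// Illustrative helper (not part of MLlib): expand a (size, indices, values)
// triple into the dense array it represents.
def toDenseArray(size: Int, indices: Array[Int], values: Array[Double]): Array[Double] = {
  val dense = Array.fill(size)(0.0)                 // start with all zeros
  for ((i, v) <- indices.zip(values)) dense(i) = v  // place each stored value
  dense
}

toDenseArray(4, Array(0, 1, 2), Array(1.0, 2.0, 3.0))
// → Array(1.0, 2.0, 3.0, 0.0): the fourth slot stays 0.0,
// which is exactly what the sparse encoding leaves out.
```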
Method 2:
def sparse(size: Int, elements: Seq[(Int, Double)]): Vector
Creates a sparse vector using unordered (index, value) pairs.
val Vs2: Vector = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0), (1, 2.0)))
Vs2: org.apache.spark.mllib.linalg.Vector = (4,[0,1,2],[1.0,2.0,3.0])
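Note that the pairs were passed out of order, yet the result prints sorted by index. A plain-Scala sketch of that normalization step (just standard-library sorting, no Spark):

```scala
// (index, value) pairs may arrive in any order; the sparse vector
// always stores them sorted by index. Equivalent sorting in plain Scala:
val elems = Seq((0, 1.0), (2, 3.0), (1, 2.0))
val sorted = elems.sortBy(_._1)  // order by index
// sorted: List((0,1.0), (1,2.0), (2,3.0)) — matching (4,[0,1,2],[1.0,2.0,3.0])
```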
Method 3:
def sparse(size: Int, elements: Iterable[(Integer, Double)]): Vector
Creates a sparse vector using unordered (index, value) pairs in a Java-friendly way.
Example 1:
val Vs3_Ar = Vectors.sparse(Ar.length, Ar.zipWithIndex.map(e => (e._2, e._1)).filter(_._2 != 0.0))
Vs3_Ar: org.apache.spark.mllib.linalg.Vector = (10,[0,1,2,3,4,5,6,7,8,9],[10.0,18.0,28.0,40.0,54.0,70.0,88.0,108.0,130.0,154.0])
Example 2:
val zv: Vector = Vectors.zeros(10)
zv: org.apache.spark.mllib.linalg.Vector = [0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
val VzAr = zv.toArray
val Vs3_Vz = Vectors.sparse(VzAr.length, VzAr.zipWithIndex.map(e => (e._2, e._1)).filter(_._2 != 0.0))
Vs3_Vz: org.apache.spark.mllib.linalg.Vector = (10,[],[])
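The zipWithIndex/map/filter idiom used in both examples runs without Spark, so it can be sketched on its own (plain Scala; `nonZeroPairs` is an illustrative name): each element is paired with its index, the pair is swapped to (index, value), and zeros are dropped, which is why an all-zero input yields the empty sparse form (10,[],[]).

```scala
// Plain-Scala sketch of the pair-building idiom used above.
def nonZeroPairs(ar: Array[Double]): Array[(Int, Double)] =
  ar.zipWithIndex                 // (value, index) pairs
    .map(e => (e._2, e._1))       // swap to (index, value)
    .filter(_._2 != 0.0)          // keep only non-zero values

nonZeroPairs(Array(10.0, 0.0, 28.0))  // → Array((0,10.0), (2,28.0))
nonZeroPairs(Array.fill(10)(0.0))     // → empty: all zeros are filtered out
```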

