First, don't forget the import:
import org.apache.spark.mllib.linalg.{Vectors,Vector}
Test data:
val Ar = (for (i <- 1 to 10) yield ((i + 1) * (i + 4)).toDouble).toArray
1. There are two ways to declare a dense vector:
Method 1:
def dense(values: Array[Double]): Vector
Creates a dense vector from a double array.
val vd1 = Vectors.dense(Ar)
* vd1: org.apache.spark.mllib.linalg.Vector = [10.0,18.0,28.0,40.0,54.0,70.0,88.0,108.0,130.0,154.0]
Method 2:
def dense(firstValue: Double, otherValues: Double*): Vector
Creates a dense vector from its values.
val vd2: Vector = Vectors.dense(1.0, 0.0, 3.0)
* vd2: org.apache.spark.mllib.linalg.Vector = [1.0,0.0,3.0]
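A quick sketch (assuming spark-mllib is on the classpath) showing that the two dense constructors produce equal vectors, and how elements are read back:

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Both overloads build the same dense vector.
val fromArray: Vector = Vectors.dense(Array(1.0, 0.0, 3.0))
val fromVarargs: Vector = Vectors.dense(1.0, 0.0, 3.0)

// Elements are read positionally with apply(i); size is the dimension.
val third = fromArray(2)
```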
2. There are three ways to declare a sparse vector:
Method 1:
def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
Creates a sparse vector providing its index array and value array.
val Vs1: Vector = Vectors.sparse(4, Array(0, 1, 2), Array(1.0, 2.0, 3.0))
* Vs1: org.apache.spark.mllib.linalg.Vector = (4,[0,1,2],[1.0,2.0,3.0])
Method 2:
def sparse(size: Int, elements: Seq[(Int, Double)]): Vector
Creates a sparse vector using unordered (index, value) pairs.
val Vs2: Vector = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0), (1, 2.0)))
* Vs2: org.apache.spark.mllib.linalg.Vector = (4,[0,1,2],[1.0,2.0,3.0])
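As the output above shows, the pairs need not be in index order. A small sketch (assuming spark-mllib on the classpath) of this behavior, plus two useful facts: unset indices read back as 0.0, and toArray materializes the full dense view:

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// The (index, value) pairs may be given in any order; MLlib sorts them.
val vs: Vector = Vectors.sparse(4, Seq((2, 3.0), (0, 1.0), (1, 2.0)))

// Indices that were never set read back as 0.0.
val missing = vs(3)
val present = vs(2)

// toArray materializes the full dense representation.
val asDense = vs.toArray
```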
Method 3:
def sparse(size: Int, elements: Iterable[(Integer, Double)]): Vector
Creates a sparse vector using unordered (index, value) pairs in a Java-friendly way. (Note: when called from Scala with a Seq of (Int, Double), the Seq overload above is the one actually resolved; this Iterable overload is intended for Java callers.)
Example 1:
val Vs3_Ar = Vectors.sparse(Ar.length, Ar.zipWithIndex.map(e => (e._2, e._1)).filter(_._2 != 0.0))
* Ar: Array[Double] = Array(10.0, 18.0, 28.0, 40.0, 54.0, 70.0, 88.0, 108.0, 130.0, 154.0)
* Vs3_Ar: org.apache.spark.mllib.linalg.Vector = (10,[0,1,2,3,4,5,6,7,8,9],[10.0,18.0,28.0,40.0,54.0,70.0,88.0,108.0,130.0,154.0])
Example 2:
val zv: Vector = Vectors.zeros(10)
* zv: org.apache.spark.mllib.linalg.Vector = [0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
val VzAr = zv.toArray
val Vs3_Vz = Vectors.sparse(VzAr.length, VzAr.zipWithIndex.map(e => (e._2, e._1)).filter(_._2 != 0.0))
* Vs3_Vz: org.apache.spark.mllib.linalg.Vector = (10,[],[])
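The zeros vector above filters down to an empty sparse vector. A sketch (assuming spark-mllib on the classpath) with a mixed array makes the zero-dropping step easier to see: only the nonzero entries survive the filter, yet the dense view restores the zeros:

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// An array with explicit zeros mixed in.
val arr = Array(0.0, 5.0, 0.0, 7.0)

// Swap (value, index) -> (index, value) and drop zero entries,
// exactly as in the examples above.
val vs: Vector = Vectors.sparse(arr.length, arr.zipWithIndex.map(e => (e._2, e._1)).filter(_._2 != 0.0))

// The sparse vector stores only indices 1 and 3; toArray restores the zeros.
val restored = vs.toArray
```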