Network Embedding Algorithms — LINE / LANE / M-NMF

Structure of this article

  • LINE
  • M-NMF
  • LANE

What is Network Embedding?

LINE

  • [Information Network] An information network is defined as $G = (V, E)$, where $V$ is the set of vertices, each representing a data object, and $E$ is the set of edges between the vertices, each representing a relationship between two data objects. Each edge $e \in E$ is an ordered pair $e = (u, v)$ and is associated with a weight $w_{uv} > 0$, which indicates the strength of the relation. If $G$ is undirected, we have $(u, v) \equiv (v, u)$ and $w_{uv} \equiv w_{vu}$; if $G$ is directed, we have $(u, v) \neq (v, u)$ and $w_{uv} \neq w_{vu}$.

  • [First-order Proximity] The first-order proximity in a network is the local pairwise proximity between two vertices. For each pair of vertices linked by an edge $(u, v)$, the weight on that edge, $w_{uv}$, indicates the first-order proximity between $u$ and $v$. If no edge is observed between $u$ and $v$, their first-order proximity is 0. The first-order proximity usually implies the similarity of two nodes in a real-world network.

    LINE with First-order Proximity: The first-order proximity refers to the local pairwise proximity between the vertices in the network. For each undirected edge $(i, j)$, the joint probability between vertex $v_i$ and $v_j$ is defined as:
    $$p_{1}(v_{i}, v_{j}) = \frac{1}{1+\exp(-\vec{u}_{i}^{\,T} \cdot \vec{u}_{j})}$$
    where $\vec{u}_{i} \in R^{d}$ is the low-dimensional vector representation of vertex $v_{i}$. Its empirical probability can be defined as $\hat{p}_{1}(i,j) = \frac{w_{ij}}{W}$, where $W = \sum_{(i,j)\in E} w_{ij}$.

    To preserve the first-order proximity, we can minimize the following objective function:
    $$O_{1} = d(\hat{p}_{1}(\cdot,\cdot),\, p_{1}(\cdot,\cdot))$$
    where $d(\cdot,\cdot)$ is the distance between two distributions. We choose to minimize the KL-divergence of the two probability distributions. Replacing $d(\cdot,\cdot)$ with KL-divergence and omitting some constants, we have:
    $$O_{1} = -\sum_{(i,j)\in E} w_{ij}\log p_{1}(v_{i},v_{j})$$
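
    As a toy illustration (not from the original LINE code; the embeddings and edge weights below are made-up), the following Breeze sketch computes $p_1$ and the weighted objective $O_1$ for a small edge list:

    import breeze.linalg.DenseMatrix
    import breeze.stats.distributions.Rand

    object FirstOrderToy {
      def sigmoid(x: Double): Double = 1.0/(math.exp(-1.0*x)+1)

      def main(args: Array[String]) {
        //toy data: 4 vertices embedded in 2 dimensions, initialized at random
        val u = DenseMatrix.rand(4, 2, Rand.uniform)
        //undirected weighted edges (i, j, w_ij)
        val edges = Seq((0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0))

        //p_1(v_i, v_j) = sigmoid(u_i . u_j)
        def p1(i: Int, j: Int): Double = sigmoid(u(i, ::) * u(j, ::).t)

        //O_1 = - sum over edges of w_ij * log p_1(v_i, v_j)
        val O_1 = -edges.map{ case (i, j, w) => w * math.log(p1(i, j)) }.sum
        println("O_1 = " + O_1)
      }
    }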

  • [Second-order Proximity] The second-order proximity between a pair of vertices $(u, v)$ in a network is the similarity between their neighborhood network structures. Mathematically, let $p_{u} = (w_{u,1}, \dots, w_{u,|V|})$ denote the first-order proximity of $u$ with all the other vertices; then the second-order proximity between $u$ and $v$ is determined by the similarity between $p_{u}$ and $p_{v}$. If no vertex is linked from/to both $u$ and $v$, the second-order proximity between $u$ and $v$ is 0.

    The second-order proximity assumes that vertices sharing many connections to other vertices are similar to each other. In this case, each vertex is also treated as a specific "context", and vertices with similar distributions over the "contexts" are assumed to be similar.
    Therefore, each vertex plays two roles: the vertex itself and a specific "context" of other vertices. We introduce two vectors $\vec{u}_{i}$ and $\vec{u}_{i}^{\,\prime}$, where $\vec{u}_{i}$ is the representation of $v_{i}$ when it is treated as a vertex, while $\vec{u}_{i}^{\,\prime}$ is the representation of $v_{i}$ when it is treated as a specific "context". For each directed edge $(i, j)$, we first define the probability of "context" $v_{j}$ generated by vertex $v_{i}$ as:

    $$p_{2}(v_{j} \mid v_{i}) = \frac{\exp(\vec{u}_{j}^{\,\prime T} \cdot \vec{u}_{i})}{\sum_{k=1}^{|V|}\exp(\vec{u}_{k}^{\,\prime T} \cdot \vec{u}_{i})}$$
    where $|V|$ is the number of vertices or "contexts".

    The second-order proximity assumes that vertices with similar distributions over the contexts are similar to each other. To preserve the second-order proximity, we should make the conditional distribution of the contexts $p_{2}(\cdot \mid v_{i})$, specified by the low-dimensional representations, close to the empirical distribution $\hat{p}_{2}(\cdot \mid v_{i})$. Therefore, we minimize the following objective function:

    $$O_{2} = \sum_{i \in V} \lambda_{i}\, d(\hat{p}_{2}(\cdot \mid v_{i}),\, p_{2}(\cdot \mid v_{i}))$$

    where $d(\cdot,\cdot)$ is the distance between two distributions, and $\lambda_{i}$ represents the prestige of vertex $i$ in the network, which can be measured by its degree or estimated through other algorithms.

    The empirical distribution $\hat{p}_{2}(\cdot \mid v_{i})$ is defined as
    $$\hat{p}_{2}(v_{j} \mid v_{i}) = \frac{w_{ij}}{d_{i}}$$
    where $w_{ij}$ is the weight of the edge $(i, j)$ and $d_{i} = \sum_{k \in N(i)} w_{ik}$ is the out-degree of vertex $i$. Here we adopt KL-divergence as the distance function; setting $\lambda_{i} = d_{i}$ and omitting some constants, we have:

    $$O_{2} = -\sum_{(i,j)\in E} w_{ij}\log p_{2}(v_{j} \mid v_{i})$$

    By minimizing this objective $O_{2}$, we are able to represent every vertex $v_{i}$ with a $d$-dimensional vector $\vec{u}_{i}$.
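
    A toy sketch of the full-softmax form of $p_{2}(\cdot \mid v_{i})$ and the resulting $O_{2}$ (again with made-up embeddings and edges; the expensive softmax denominator here is exactly what the negative-sampling trick below avoids):

    import breeze.linalg.DenseMatrix
    import breeze.stats.distributions.Rand

    object SecondOrderToy {
      def main(args: Array[String]) {
        val n = 4
        //toy vertex representations U and context representations U'
        val U_vertex  = DenseMatrix.rand(n, 2, Rand.uniform)
        val U_context = DenseMatrix.rand(n, 2, Rand.uniform)
        //directed weighted edges (i, j, w_ij)
        val edges = Seq((0, 1, 1.0), (0, 2, 2.0), (1, 3, 1.0))

        //p_2(v_j | v_i) = exp(u_j'^T u_i) / sum_k exp(u_k'^T u_i)
        def p2(j: Int, i: Int): Double = {
          val denom = (0 until n).map(k => math.exp(U_context(k, ::) * U_vertex(i, ::).t)).sum
          math.exp(U_context(j, ::) * U_vertex(i, ::).t) / denom
        }

        //O_2 = - sum over edges of w_ij * log p_2(v_j | v_i)
        val O_2 = -edges.map{ case (i, j, w) => w * math.log(p2(j, i)) }.sum
        println("O_2 = " + O_2)
      }
    }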

  • [Large-scale Information Network Embedding] Given a large network $G = (V, E)$, the problem of large-scale information network embedding aims to represent each vertex $v \in V$ in a low-dimensional space $R^{d}$, i.e., to learn a function $f_{G}: V \rightarrow R^{d}$, where $|V| \gg d$. In the space $R^{d}$, both the first-order proximity and the second-order proximity between the vertices are preserved.

    We adopt the asynchronous stochastic gradient algorithm (ASGD) to optimize $O_{2}$. In each step, ASGD samples a mini-batch of edges and then updates the model parameters. If an edge $(i, j)$ is sampled, the gradient with respect to the embedding vector $\vec{u}_{i}$ of vertex $i$ is calculated as:

    $$\frac{\partial O_{2}}{\partial \vec{u}_{i}} = w_{ij} \cdot \frac{\partial \log p_{2}(v_{j} \mid v_{i})}{\partial \vec{u}_{i}}$$

    Optimizing this objective directly is computationally expensive, because calculating the conditional probability $p_{2}(\cdot \mid v_{i})$ requires a summation over the entire set of vertices. To address this problem, we adopt the approach of negative sampling.

    $$\arg\min_{U,U'} O_{2} = -\sum_{(i,j)\in E} w_{ij}\Big[\log \sigma(\vec{u}_{j}^{\,\prime T}\cdot \vec{u}_{i}) + \sum_{k=1}^{K}E_{v_{k}\sim P_{n}(v)}\big[\log \sigma(-\vec{u}_{k}^{\,\prime T}\cdot \vec{u}_{i})\big]\Big]$$

    where $\sigma(x) = 1/(1+e^{-x})$ is the sigmoid function, $K$ is the number of negative samples, and $P_{n}(v) \propto d_{v}^{3/4}$ is the noise distribution from which the negative samples $v_{k}$ are drawn.

$$\frac{\partial O_{2}}{\partial \vec{u}_{i}} = -w_{ij}\Big[\vec{u}_{j}^{\,\prime}\big(1-\sigma(\vec{u}_{j}^{\,\prime T}\cdot \vec{u}_{i})\big) - \sum_{k=1}^{K}\vec{u}_{k}^{\,\prime}\,\sigma(\vec{u}_{k}^{\,\prime T}\cdot \vec{u}_{i})\Big]$$

$$\frac{\partial O_{2}}{\partial \vec{u}_{j}^{\,\prime}} = -w_{ij}\,\vec{u}_{i}\big[1-\sigma(\vec{u}_{j}^{\,\prime T}\cdot \vec{u}_{i})\big]$$

$$\frac{\partial O_{2}}{\partial \vec{u}_{k}^{\,\prime}} = w_{ij}\,\vec{u}_{i}\,\sigma(\vec{u}_{k}^{\,\prime T}\cdot \vec{u}_{i})$$

Update the parameters $\vec{u}_{i}$, $\vec{u}_{j}^{\,\prime}$ and $\vec{u}_{k}^{\,\prime}$ with learning rate $\rho$:

$$\vec{u}_{i} \leftarrow \vec{u}_{i} - \rho \frac{\partial O_{2}}{\partial \vec{u}_{i}}$$

$$\vec{u}_{j}^{\,\prime} \leftarrow \vec{u}_{j}^{\,\prime} - \rho \frac{\partial O_{2}}{\partial \vec{u}_{j}^{\,\prime}}$$

$$\vec{u}_{k}^{\,\prime} \leftarrow \vec{u}_{k}^{\,\prime} - \rho \frac{\partial O_{2}}{\partial \vec{u}_{k}^{\,\prime}}$$

The above is the result of optimizing $O_{2}$; the obtained $U$ preserves the second-order proximity. The optimization of $O_{1}$ is similar to that of $O_{2}$, except that only the single variable $U$ needs to be updated: simply replace $\vec{u}_{j}^{\,\prime}$ with $\vec{u}_{j}$ in the formulas above.
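
Since the Spark code at the end of this article only implements the $O_{2}$ case, here is a minimal Breeze sketch (assumed, not from the original code) of a single negative-sampling update step for $O_{1}$: following the remark above, $\vec{u}_{j}^{\,\prime}$ is replaced by $\vec{u}_{j}$, so all gradients flow into the single embedding matrix U. The edge, weight, negative-sample ids and learning rate below are made-up values.

import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.stats.distributions.Rand

object FirstOrderSgdStep {
	def sigmoid(x: Double): Double = 1.0/(math.exp(-1.0*x)+1)

	def main(args: Array[String]) {
		val n = 5
		val Dim = 2
		val rho = 0.025                                  //learning rate
		val U = DenseMatrix.rand(n, Dim, Rand.uniform)   //single embedding matrix for O_1

		//one sampled edge (i, j) with weight w, plus K = 2 negative samples drawn from P_n(v)
		val (i, j, w) = (0, 1, 1.0)
		val negatives = Seq(2, 3)

		val u_i = U(i, ::).t.copy
		val u_j = U(j, ::).t.copy

		//positive term: same form as for O_2, with u_j in place of u_j'
		val gPos = -w * (1.0 - sigmoid(u_j dot u_i))
		var d_ui = u_j * gPos
		U(j, ::) := (u_j - (u_i * gPos) * rho).t         //update u_j

		//negative terms: each sampled u_k is pushed away from u_i
		for (k <- negatives) {
			val u_k = U(k, ::).t.copy
			val gNeg = w * sigmoid(u_k dot u_i)
			d_ui = d_ui + u_k * gNeg
			U(k, ::) := (u_k - (u_i * gNeg) * rho).t     //update u_k
		}
		U(i, ::) := (u_i - d_ui * rho).t                 //update u_i
		println(U)
	}
}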

M-NMF

The objective function is not convex, so we split the optimization into four subproblems and optimize them iteratively, which guarantees that each subproblem converges to a local minimum. In this objective, $S$ is the node similarity matrix, $M$ and $U$ are the basis and node representation matrices of its factorization, $H$ is the community membership indicator matrix, $C$ is the community representation matrix, and $B$ is the modularity matrix.

objective function:
$$\min_{M,U,H,C} \; \|S-MU^{T}\|_{F}^{2}+\alpha\|H-UC^{T}\|_{F}^{2}-\beta\, tr(H^{T}BH)$$

$$s.t.\;\; M\geq 0,\; U\geq 0,\; H\geq 0,\; C\geq 0,\; tr(H^{T}H)=n$$

M-subproblem: Updating M with the other parameters in the objective function fixed leads to a standard NMF formulation; the updating rule for M is:

$$M \leftarrow M \odot \frac{SU}{MU^{T}U}$$

U-subproblem: Updating U with the other parameters in the objective function fixed leads to a joint NMF problem; the updating rule is:

$$U \leftarrow U \odot \frac{S^{T}M+\alpha HC}{U(M^{T}M+\alpha C^{T}C)}$$

C-subproblem: Updating C with the other parameters in the objective function fixed also leads to a standard NMF formulation; the updating rule for C is:

$$C \leftarrow C \odot \frac{H^{T}U}{CU^{T}U}$$

H-subproblem: This update follows from the fixed-point equation that the solution must satisfy at convergence. Given an initial value of H, the successive updating rule for H is:

$$H \leftarrow H \odot \sqrt{\frac{-2\beta B_{1}H+\sqrt{\triangle}}{8\lambda HH^{T}H}}$$

where $\triangle = 2\beta(B_{1}H) \odot 2\beta(B_{1}H) + 16\lambda(HH^{T}H)\odot\big(2\beta AH+2\alpha UC^{T}+(4\lambda - 2\alpha)H\big)$, and the fraction and square roots are taken element-wise.
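
A toy Breeze sketch of the four multiplicative updates is given below. It is only a schematic illustration: S, A, B_1, the dimensions and the hyper-parameters are made-up placeholders (in M-NMF they come from the graph), $\odot$ becomes the element-wise operator *:* and the fractions become the element-wise division /:/; the constraint $tr(H^{T}H)=n$ is handled only implicitly through the large penalty $\lambda$.

import breeze.linalg.DenseMatrix
import breeze.numerics.sqrt
import breeze.stats.distributions.Rand

object MNMFToy {
	def main(args: Array[String]) {
		val n = 6        //number of vertices
		val m = 3        //embedding dimension
		val k = 2        //number of communities
		val alpha = 1.0
		val beta = 1.0
		val lambda = 1e9 //large penalty associated with the tr(H^T H) = n constraint

		//toy nonnegative inputs standing in for the similarity matrix S, adjacency A and the B_1 term
		val S  = DenseMatrix.rand(n, n, Rand.uniform)
		val A  = DenseMatrix.rand(n, n, Rand.uniform)
		val B1 = DenseMatrix.rand(n, n, Rand.uniform)

		var M = DenseMatrix.rand(n, m, Rand.uniform)
		var U = DenseMatrix.rand(n, m, Rand.uniform)
		var H = DenseMatrix.rand(n, k, Rand.uniform)
		var C = DenseMatrix.rand(k, m, Rand.uniform)

		for (iter <- 0 until 100) {
			//M-subproblem: M <- M .* (S U) ./ (M U^T U)
			M = M *:* ((S * U) /:/ (M * (U.t * U)))
			//U-subproblem: U <- U .* (S^T M + alpha H C) ./ (U (M^T M + alpha C^T C))
			U = U *:* ((S.t * M + (H * C) * alpha) /:/ (U * (M.t * M + (C.t * C) * alpha)))
			//C-subproblem: C <- C .* (H^T U) ./ (C U^T U)
			C = C *:* ((H.t * U) /:/ (C * (U.t * U)))
			//H-subproblem: H <- H .* sqrt( (-2*beta*B_1*H + sqrt(Delta)) ./ (8*lambda*H*H^T*H) )
			val B1H  = B1 * H
			val HHtH = H * (H.t * H)
			val delta = ((B1H * (2.0 * beta)) *:* (B1H * (2.0 * beta))) +
				((HHtH *:* ((A * H) * (2.0 * beta) + (U * C.t) * (2.0 * alpha) + H * (4.0 * lambda - 2.0 * alpha))) * (16.0 * lambda))
			H = H *:* sqrt((B1H * (-2.0 * beta) + sqrt(delta)) /:/ (HHtH * (8.0 * lambda)))
		}
		println(H)
	}
}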

LANE

For LANE, see another article on my blog:
[论文阅读——LANE-Label Informed Attributed Network Embedding原理即实现](https://www.jianshu.com/p/1abb24bb8a04)

Spark Implementation of LINE-O2

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import breeze.linalg._
import breeze.numerics._
import breeze.stats.distributions.Rand
import scala.math._

object LINE {

	//generate a list of `num` distinct random integers in [0, Range)
	def RandList(Range:Int,num:Int) : List[Int] = {
		var resultList:List[Int]=Nil
		while (resultList.length < num){
			val randomNum = (new util.Random).nextInt(Range)
			if(!resultList.exists(s => s==randomNum )){
				resultList=resultList:::List(randomNum)
			}
		}
		return resultList
	}

	def RandNumber(Range:Int) : Int = {
		val randomNum = (new util.Random).nextInt(Range)
		return randomNum
	}

	def Sigmoid(In:Double): Double = {
		var Out:Double = 1.0/(math.exp(-1.0*In)+1)
		return Out
	}

	def main(args: Array[String]) {
		if (args.length < 4) {
			System.err.println("Usage: LINE    ")
			System.exit(1)
		}

		//number of negative samples
		val NS = args(2).toInt
		println("Negative Sample: "+NS)

		//dimension of the embedding
		val Dim = args(3).toInt
		println("Embedding dimension: "+Dim)

		//Spark configuration and context
		val conf = new SparkConf().setAppName("LINE")
		val sc = new SparkContext(conf)

		//input adjacency matrix file
		val InputFile = sc.textFile(args(0),3)
		//input edge list (adjacency list) file
		val EgdeFile = sc.textFile(args(1),3)

		//number of lines in the input file (= number of vertices)
		val InputFileCount = InputFile.count().toInt
		println("InputFileCount(number of lines): "+InputFileCount)

		//edge sampling rate per iteration
		val sample_rate : Double = 0.1

		//size of the lookup table used for negative sampling
		val HashTableSize: Int = 50000
		println("HashTableSize: "+HashTableSize)


		//vertex and context embedding matrices for the LINE second-order proximity (O_2)
		var U_vertex = DenseMatrix.rand(InputFileCount, Dim, Rand.uniform)
		var U_context = DenseMatrix.rand(InputFileCount, Dim, Rand.uniform)

		//adjacency matrix RDD and edge set RDD
		val Adjacent = InputFile.map(line => line.split(",")).map(splitline => splitline.map(word => word.toDouble))
		val EgdeSet = EgdeFile.map(line => line.split(",")).map(splitline => splitline.map(word => word.toDouble))
		

		//collect() will not scale when the data grows large; optimization point 1
		val AdjacentCollect = Adjacent.collect()

		//number of rows and columns of the adjacency matrix
		val rows = AdjacentCollect.length
		val cols = AdjacentCollect(0).length

		//flatten the adjacency matrix into a one-dimensional array
		val flattenAdjacent = AdjacentCollect.flatten

		//convert the adjacency matrix to a Breeze DenseMatrix (Breeze is column-major, hence the transpose)
		val AdjacentMatrix = new DenseMatrix(cols,rows,flattenAdjacent).t
		
		//println(Adjacent.take(10).toList)
		// Adjacent.foreach{
		// 	rdd => println(rdd.toList)
		// }

		//RDD of the (weighted) degree of each vertex
		val VertexDegree = Adjacent.map(line => line.reduce((x,y) => x+y))

		//sum of the degrees of all vertices
		var SumOfDegree = VertexDegree.reduce((x,y)=>x+y)

		//var SumOfDegree = sc.accumulator(0)
		//VertexDegree.foreach(x => SumOfDegree += x)  

		//smooth the vertex probabilities with the 3/4 power, as in word2vec-style negative sampling
		val SmoothProbability = VertexDegree.map(degree => degree/SumOfDegree).map(math.pow(_,0.75))

		//cumulative distribution CumulativeProbability of SmoothProbability
		val p : Array[Double] = SmoothProbability.collect()
		val CumulativeProbability : Array[Double] = new Array[Double](InputFileCount)
		for(i <- 0 to InputFileCount-1) {
			var inner_sum : Double = 0.0
			for(j <- 0 to i){
				inner_sum = inner_sum + p(j)
			}
			CumulativeProbability(i) = inner_sum
		}

		//normalize the cumulative probabilities, multiply by HashTableSize and truncate, giving integers in [0, HashTableSize]
		val HashProbability : Array[Int] = new Array[Int](InputFileCount)
		//maximum of the cumulative probabilities
		var max_cpro = CumulativeProbability(InputFileCount-1)
		for(i <- 0  to InputFileCount-1)
		{
			HashProbability(i) = ((CumulativeProbability(i)/max_cpro)*HashTableSize).toInt
		}
		//lookup table mapping slots to vertex ids
		val HashTable : Array[Int] = new Array[Int](HashTableSize+1)

		//fill the lookup table: each slot stores a vertex id, so sampling a uniform slot samples a vertex proportionally to its smoothed degree
		for(i <- 0 to InputFileCount-1) {
			if (i==0) {
				var start : Int = 0
				var end : Int = HashProbability(0)
				for(j <- start to end) {
					HashTable(j) = i
				}
			}
			else {
				var start : Int = HashProbability(i-1)
				var end : Int = HashProbability(i)
				for(j <- start to end) {
					HashTable(j) = i
				}
			}
		}

		println("HashTable(HashTableSize):"+HashTable(HashTableSize))


		val sample_num = (sample_rate*InputFileCount).toInt
		println("sample_num "+sample_num)
		var O2_Array: Array[Double] = new Array[Double](100)
		for(iterator <- 0 to 99)
		{
			//println("the iterator is "+iterator)
			var learningrate = 0.1
			var O_2 = 0.0
			
			//takeSample(false, ...) samples the preset number of edges without replacement
			var sampling = EgdeSet.takeSample(false,sample_num)
			for(i <- 0 to sample_num-1)
			{
				var objective = 0.0
				//println("i is " + i)
				var row:Int = sampling(i)(0).toInt
				var col:Int = sampling(i)(1).toInt
				//println("row:"+row)
				//println("col:"+col)
				var u_j_context = U_context(col,::).t
				var u_j_context_t = U_context(col,::)
				var u_i_vertex = U_vertex(row,::).t
				
				var part1=(-1)*sampling(i)(2)*u_j_context*(1-Sigmoid((u_j_context_t*u_i_vertex).toDouble))
				//println("part1: "+part1)

				//draw NS random slots in the lookup table to pick negative samples
				var negativeSampleSum = DenseVector.zeros[Double](Dim)
				var RandomSet : List[Int] = RandList(HashTableSize,NS)
				//println("RandomSet is:"+RandomSet)
				for(j <- 0 to RandomSet.length-1){
					//println(RandomSet(j))
					var u_k_context = U_context(HashTable(RandomSet(j)),::).t
					var u_k_context_t = U_context(HashTable(RandomSet(j)),::)
					negativeSampleSum = negativeSampleSum + u_k_context*Sigmoid((u_k_context_t*u_i_vertex).toDouble)
				}
				//println("negativeSampleSum: "+negativeSampleSum)
				
				var part2 = sampling(i)(2)*negativeSampleSum
				//println("part2: "+part2)
				
				//gradient for u_i: -w_ij[(1-sigma_j)u_j' - sum_k sigma_k u_k'] = part1 + part2
				var d_O2_ui = part1+part2
				//println("d_O2_ui: "+d_O2_ui)

				//update u_i
				var tmp1 = u_i_vertex - learningrate*(d_O2_ui)
				//println(tmp1(0)+" "+tmp1(1))

				// println("previous U_context(row,::): "+U_context(row,::))
				for(k1 <- 0 to Dim-1){
					U_vertex(row,k1) = tmp1(k1)
				}
				//println("after U_context(row,::): "+U_context(row,::))

				var d_O2_uj_context = (-1)*sampling(i)(2)*u_i_vertex*(1-Sigmoid((u_j_context_t*u_i_vertex).toDouble))

				//update u_j' (the context vector of vertex j, i.e. row `col` of U_context)
				var tmp2 = u_j_context - learningrate*(d_O2_uj_context)
				for(k2 <- 0 to Dim-1){
					U_context(col,k2) = tmp2(k2)
				}


				//update u_k' for each negative sample
				var negative_cal = 0.0
				for(j <- 0 to RandomSet.length-1){
					
					var u_k_context = U_context(HashTable(RandomSet(j)),::).t
					var u_k_context_t = U_context(HashTable(RandomSet(j)),::)

					//these two lines accumulate the negative-sampling part of the objective: log sigma(-u_k'^T u_i) = log(1 - sigma(u_k'^T u_i))
					var sigmoid_uk_ui = Sigmoid((u_k_context_t*u_i_vertex).toDouble)
					negative_cal = negative_cal + math.log(1.0 - sigmoid_uk_ui)

					//gradient with respect to u_k'
					var d_O2_uk_context = sampling(i)(2)*u_i_vertex*sigmoid_uk_ui

					var tmp3 = u_k_context - learningrate*d_O2_uk_context
					for(k3 <- 0 to Dim-1){
						U_context(HashTable(RandomSet(j)),k3) = tmp3(k3)
					}
					
				}

				//accumulate the change in the objective value
				objective = (-1)*sampling(i)(2)*(math.log(Sigmoid((u_j_context_t*u_i_vertex).toDouble)) + negative_cal)
				O_2 = O_2 + objective
			}

			O2_Array(iterator) = O_2
				
		}
		
		
		val U2_HDFS = sc.parallelize(U_vertex.toArray,3)
		val O2_HDFS = sc.parallelize(O2_Array,3)

		//a(::, 2)	
		
		println("======================")
		//println(formZeroToOneRandomMatrix)
		//VertexDegree.saveAsTextFile("file:///usr/local/data/line")
		//IndexSmoothProbability.saveAsTextFile("file:///usr/local/data/line")
		//HashProbability.saveAsTextFile("file:///usr/local/data/line")
		U2_HDFS.saveAsTextFile("file:///usr/local/data/U2")
		O2_HDFS.saveAsTextFile("file:///usr/local/data/O2")
		println("======================")
		sc.stop()
	}
}
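
Assuming the above is packaged into a jar (the jar name and data paths below are placeholders), it can be submitted with the four arguments in the order the code reads them: the adjacency-matrix file, the edge-list file, the number of negative samples, and the embedding dimension, e.g.

spark-submit --class LINE line-embedding.jar /usr/local/data/adjacency.csv /usr/local/data/edges.csv 5 2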
