With the constraint $\left\|\mathbf{w}_{r}\right\|_{2}=1$, the projections of $\mathbf{h}$ and $\mathbf{t}$ onto the hyperplane with normal $\mathbf{w}_{r}$ are

$$
\begin{aligned}
\mathbf{h}_{\perp} &=\mathbf{h}-\mathbf{w}_{r}^{\top} \mathbf{h}\, \mathbf{w}_{r} \\
\mathbf{t}_{\perp} &=\mathbf{t}-\mathbf{w}_{r}^{\top} \mathbf{t}\, \mathbf{w}_{r}
\end{aligned}
$$

The score of a triple is defined as

$$
\begin{aligned}
f_{r}(\mathbf{h}, \mathbf{t}) &=\left\|\mathbf{h}_{\perp}+\mathbf{d}_{r}-\mathbf{t}_{\perp}\right\|_{2}^{2} \\
&=\left\|\left(\mathbf{h}-\mathbf{w}_{r}^{\top} \mathbf{h}\, \mathbf{w}_{r}\right)+\mathbf{d}_{r}-\left(\mathbf{t}-\mathbf{w}_{r}^{\top} \mathbf{t}\, \mathbf{w}_{r}\right)\right\|_{2}^{2}
\end{aligned}
$$
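The projection and score can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation; the helper names `dot`, `project`, and `score` and the toy vectors are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project(e, w_r):
    """Project e onto the hyperplane with unit normal w_r: e - (w_r^T e) w_r."""
    c = dot(w_r, e)
    return [ei - c * wi for ei, wi in zip(e, w_r)]


def score(h, t, w_r, d_r):
    """Squared L2 norm of h_perp + d_r - t_perp (lower means more plausible)."""
    h_p = project(h, w_r)
    t_p = project(t, w_r)
    return sum((hp + d - tp) ** 2 for hp, d, tp in zip(h_p, d_r, t_p))


# Toy 3-d example (assumed values): the hyperplane is the x-y plane,
# so its unit normal points along z and the translation d_r lies in the plane.
w_r = [0.0, 0.0, 1.0]
d_r = [1.0, 0.0, 0.0]
h = [0.0, 0.0, 5.0]
t = [1.0, 0.0, -2.0]

print(project(h, w_r))        # [0.0, 0.0, 0.0]
print(score(h, t, w_r, d_r))  # 0.0
```

Although $\mathbf{h}$ and $\mathbf{t}$ differ strongly along the normal direction, their projections differ exactly by $\mathbf{d}_{r}$, so the score is zero.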
Loss function: a margin-based loss function

$$
\mathcal{L}=\sum_{(h, r, t) \in \Delta} \sum_{\left(h^{\prime}, r^{\prime}, t^{\prime}\right) \in \Delta_{(h, r, t)}^{\prime}} \max\left(0,\ f_{r}(\mathbf{h}, \mathbf{t})+\gamma-f_{r^{\prime}}\left(\mathbf{h}^{\prime}, \mathbf{t}^{\prime}\right)\right)
$$
where $f_{r}(\mathbf{h}, \mathbf{t})$ is the score of a positive triple, $f_{r^{\prime}}(\mathbf{h}^{\prime}, \mathbf{t}^{\prime})$ is the score of a negative triple, and $\gamma$ is the margin. Minimizing the loss pushes the scores toward satisfying

$$
f_{r}(\mathbf{h}, \mathbf{t})+\gamma \leq f_{r^{\prime}}\left(\mathbf{h}^{\prime}, \mathbf{t}^{\prime}\right)
$$
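The hinge term contributes nothing once a negative triple scores at least $\gamma$ higher than its positive triple. A minimal sketch, assuming precomputed scores and one negative per positive (a simplification of the double sum); the function name `margin_loss` and the sample scores are hypothetical.

```python
def margin_loss(pos_scores, neg_scores, gamma=1.0):
    """Sum of max(0, f_pos + gamma - f_neg) over paired positive/negative scores."""
    return sum(max(0.0, fp + gamma - fn) for fp, fn in zip(pos_scores, neg_scores))


# Assumed scores: the first pair already satisfies the margin (0.2 + 1.0 <= 2.0),
# the second violates it by about 0.7 (1.5 + 1.0 - 1.8).
pos = [0.2, 1.5]
neg = [2.0, 1.8]
loss = margin_loss(pos, neg, gamma=1.0)
print(loss)  # approximately 0.7
```

Only the violating pair contributes to the loss, so only that pair would produce a gradient during training.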
To enforce the constraints

$$
\begin{array}{ll}
\forall e \in E,\ \|\mathbf{e}\|_{2} \leq 1 & \text{(bounds the scale of the entity embeddings)} \\
\forall r \in R,\ \left|\mathbf{w}_{r}^{\top} \mathbf{d}_{r}\right| / \left\|\mathbf{d}_{r}\right\|_{2} \leq \epsilon & \text{(keeps } \mathbf{w}_{r} \text{ and } \mathbf{d}_{r} \text{ approximately orthogonal)} \\
\forall r \in R,\ \left\|\mathbf{w}_{r}\right\|_{2}=1 & \text{(unit normal vector)}
\end{array}
$$

regularization terms are added to the objective:
$$
\begin{aligned}
\mathcal{L}= &\sum_{(h, r, t) \in \Delta} \sum_{\left(h^{\prime}, r^{\prime}, t^{\prime}\right) \in \Delta_{(h, r, t)}^{\prime}} \max\left(0,\ f_{r}(\mathbf{h}, \mathbf{t})+\gamma-f_{r^{\prime}}\left(\mathbf{h}^{\prime}, \mathbf{t}^{\prime}\right)\right) \\
&+C\left\{\sum_{e \in E} \max\left(0,\ \|\mathbf{e}\|_{2}^{2}-1\right)+\sum_{r \in R} \max\left(0,\ \frac{\left(\mathbf{w}_{r}^{\top} \mathbf{d}_{r}\right)^{2}}{\left\|\mathbf{d}_{r}\right\|_{2}^{2}}-\epsilon^{2}\right)\right\}
\end{aligned}
$$
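where $C$ is a hyperparameter that controls the weight of the regularization terms. The unit-norm constraint $\left\|\mathbf{w}_{r}\right\|_{2}=1$ has no penalty term; in the original TransH paper it is enforced instead by renormalizing each $\mathbf{w}_{r}$ before every mini-batch.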
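The two penalty terms can be computed directly from the embeddings. The sketch below is illustrative only: the function name `soft_constraint_penalty`, the default values of `C` and `eps`, and the toy embeddings are all assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_constraint_penalty(entities, relations, C=0.5, eps=1e-3):
    """C * (sum_e max(0, ||e||^2 - 1) + sum_r max(0, (w^T d)^2 / ||d||^2 - eps^2)).

    entities:  list of entity vectors e
    relations: list of (w_r, d_r) pairs
    """
    scale = sum(max(0.0, dot(e, e) - 1.0) for e in entities)
    ortho = sum(
        max(0.0, dot(w, d) ** 2 / dot(d, d) - eps ** 2) for w, d in relations
    )
    return C * (scale + ortho)


# Assumed toy embeddings: the second entity has squared norm 4 and is penalized by 3;
# the single relation has w_r orthogonal to d_r, so it adds nothing.
entities = [[0.5, 0.0], [2.0, 0.0]]
relations = [([0.0, 1.0], [1.0, 0.0])]
penalty = soft_constraint_penalty(entities, relations, C=0.5)
print(penalty)  # 1.5
```

This value would be added to the margin-based loss to form the full objective.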
Let $tph$ be the average number of tail entities per head entity and $hpt$ the average number of head entities per tail entity. Sampling then uses a Bernoulli distribution with parameter $\frac{tph}{tph+hpt}$, that is: