Details of the montgomery_reduce algorithm in curve25519-dalek

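The algorithm used by montgomery_reduce() in https://github.com/dalek-cryptography/curve25519-dalek/blob/master/src/backend/serial/u64/scalar.rs differs in its details from the standard Montgomery reduction algorithm, and further reduces the bit width that the intermediate values require.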

1. Standard montgomery_reduction algorithm and an example

INPUT: integers $m = (m_{n-1} \dots m_1 m_0)_b$ with $\gcd(m, b) = 1$, $R = b^n$, $m' = -m^{-1} \bmod b$, and $T = (t_{2n-1} \dots t_1 t_0)_b < mR$.
OUTPUT: $T R^{-1} \bmod m$.
1. $A \leftarrow T$. (Notation: $A = (a_{2n-1} \dots a_1 a_0)_b$)
2. For $i$ from 0 to $n-1$ do the following:
2.1 $u_i \leftarrow a_i m' \bmod b$.
2.2 $A \leftarrow A + u_i m b^i$.
3. $A \leftarrow A / b^n$.
4. If $A \geq m$ then $A \leftarrow A - m$.
5. Return($A$).

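Note that $m'$ is the negated inverse of $m$ modulo $b$, not modulo $R$, where $R = b^n$.

Example:
Let $m = 72639$, $b = 10$, $R = 10^5$, and $T = 7118368$.
Then $n = 5$, $m' = -m^{-1} \bmod 10 = 1$, $T \bmod m = 72385$, and $T R^{-1} \bmod m = 39796$.

The algorithm above then runs as follows: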

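| i | $u_i = a_i m' \bmod b$ | $u_i m b^i$ | $A$ |
|---|---|---|---|
| - | - | - | 7118368 |
| 0 | 8 | 581112 | 7699480 |
| 1 | 8 | 5811120 | 13510600 |
| 2 | 6 | 43583400 | 57094000 |
| 3 | 4 | 290556000 | 347650000 |
| 4 | 5 | 3631950000 | 3979600000 |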

$A / b^n \bmod m = 39796$,
which matches the expected value.
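
The steps above can be sketched in Rust as follows. This is a minimal illustration for small inputs, not the library's code: the function names `montgomery_reduce_standard` and `neg_inverse_mod` are hypothetical, the whole value $A$ is kept in a single `u128`, and $m'$ is found by brute force, which is only sensible for a tiny base such as 10.

```rust
/// Standard digit-by-digit Montgomery reduction in base `b` with `n` digits.
/// Returns T * R^{-1} mod m, where R = b^n.
/// Assumes gcd(m, b) = 1, T < m * R, and that every intermediate fits in a u128.
fn montgomery_reduce_standard(t: u128, m: u128, b: u128, n: u32, m_prime: u128) -> u128 {
    let mut a = t;
    let mut b_pow_i = 1u128; // b^i
    for _ in 0..n {
        let a_i = (a / b_pow_i) % b; // digit i of A
        let u_i = (a_i * m_prime) % b; // step 2.1
        a += u_i * m * b_pow_i; // step 2.2: makes digit i of A zero
        b_pow_i *= b;
    }
    a /= b_pow_i; // step 3: exact division by R = b^n
    if a >= m {
        a -= m; // step 4
    }
    a
}

/// m' = -m^{-1} mod b, found by trying every candidate (illustration only).
fn neg_inverse_mod(m: u128, b: u128) -> u128 {
    (1..b)
        .find(|&x| ((m % b) * x + 1) % b == 0)
        .expect("gcd(m, b) must be 1")
}
```

With the example values, `neg_inverse_mod(72639, 10)` gives 1 and `montgomery_reduce_standard(7118368, 72639, 10, 5, 1)` gives 39796. Note that `a` grows to 3979600000 before the division, which is the growth the next section avoids.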

2. Improved montgomery_reduction algorithm

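In the algorithm above, the number of digits needed to store $A$ grows in step 2.2.
The improvement relies on the following facts:
$m = (m_{n-1} \dots m_1 m_0)_b = m_{n-1} b^{n-1} + \dots + m_1 b^1 + m_0$
$T = (t_{2n-1} \dots t_1 t_0)_b = t_{2n-1} b^{2n-1} + \dots + t_1 b^1 + t_0$
$A = (a_{2n-1} \dots a_1 a_0)_b = a_{2n-1} b^{2n-1} + \dots + a_1 b^1 + a_0$
Since $y \cdot b^x \bmod b = 0$ for $x \geq 1$, steps 2.1, 2.2 and 3 can all be optimized.

Starting from the algorithm above, the improved version is: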
INPUT: integers $m = (m_{n-1} \dots m_1 m_0)_b$ with $\gcd(m, b) = 1$, $R = b^n$, $m' = -m^{-1} \bmod b$, and $T = (t_{2n-1} \dots t_1 t_0)_b < mR$.
OUTPUT: $T R^{-1} \bmod m$.
1. $A \leftarrow T$. (Notation: $A = (a_{2n-1} \dots a_1 a_0)_b$)
2. $carry \leftarrow 0$.
3. For $i$ from 0 to $n-1$ do the following:
3.1 $a_i \leftarrow carry + t_i + \sum_{k=0}^{i-1} u_k m_{i-k}$ (with $p = i-k$, $0 \leq p < n$).
3.2 $u_i \leftarrow a_i m' \bmod b$.
3.3 $carry \leftarrow (a_i + u_i m_0)/b$.
4. For $j$ from 0 to $n-1$ do the following:
4.1 $r_j \leftarrow carry + t_{j+n} + \sum_{k=j+1}^{n-1} u_k m_{j+n-k}$ (with $q = j+n-k$, $0 \leq q < n$).
4.2 $carry \leftarrow r_j / b$ (integer division).
4.3 $w_j \leftarrow r_j \bmod b$.
5. $A \leftarrow \sum_{j=0}^{n-1} w_j b^j$ (plus $carry \cdot b^n$ if the final carry is nonzero).
6. If $A \geq m$ then $A \leftarrow A - m$.
7. Return($A$).
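
Why steps 3.1 to 3.3 reproduce steps 2.1 and 2.2 of the standard algorithm can be sketched as follows. This is a reasoning aid in the notation above, under the assumption that $a_i$ here denotes the full column sum (which may exceed $b$) rather than a single digit. Just before iteration $i$, the standard algorithm holds $A = T + \sum_{k<i} u_k m b^k$ with digits $0, \dots, i-1$ already zero, so the contribution to column $i$, including the carry out of column $i-1$, is

$$a_i = carry + t_i + \sum_{k=0}^{i-1} u_k m_{i-k}$$

Because $u_i = a_i m' \bmod b$ and $m' m_0 \equiv -1 \pmod b$,

$$a_i + u_i m_0 \equiv a_i (1 + m' m_0) \equiv 0 \pmod b$$

so the division in step 3.3 is exact and yields the carry into column $i+1$. Only one column sum and one carry need to be kept at a time, instead of the whole growing value $A$.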

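For the same example as in section 1:
Let $m = 72639$, $b = 10$, $R = 10^5$, and $T = 7118368$.
Then $n = 5$, $m' = -m^{-1} \bmod 10 = 1$, $T \bmod m = 72385$, and $T R^{-1} \bmod m = 39796$.

Initial state: $m_0 = 9$, $carry = 0$, $b = 10$, $n = 5$, $T = 7118368 = (t_9 \dots t_1 t_0)_{10}$.
Step 3 of the algorithm then runs as follows: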

| i | $u_i = a_i m' \bmod b$ | $a_i = carry + t_i + \sum_{k=0}^{i-1} u_k m_{i-k}$ | $carry = (a_i + u_i m_0)/b$ |
|---|---|---|---|
| - | - | - | 0 |
| 0 | 8 | 8 | (8+8×9)/10 = 8 |
| 1 | 8 | 8+6+8×3 = 38 | (38+8×9)/10 = 11 |
| 2 | 6 | 11+3+8×6+8×3 = 86 | (86+6×9)/10 = 14 |
| 3 | 4 | 14+8+8×2+8×6+6×3 = 104 | (104+4×9)/10 = 14 |
| 4 | 5 | 14+1+8×7+8×2+6×6+4×3 = 135 | (135+5×9)/10 = 18 |

Step 4 of the algorithm then runs as follows:

| j | $r_j = carry + t_{j+n} + \sum_{k=j+1}^{n-1} u_k m_{j+n-k}$ | $carry = r_j / b$ | $w_j = r_j \bmod b$ |
|---|---|---|---|
| - | - | 18 | - |
| 0 | 18+1+8×7+6×2+4×6+5×3 = 126 | 12 | 6 |
| 1 | 12+7+6×7+4×2+5×6 = 99 | 9 | 9 |
| 2 | 9+0+4×7+5×2 = 47 | 4 | 7 |
| 3 | 4+0+5×7 = 39 | 3 | 9 |
| 4 | 3+0 = 3 | 0 | 3 |

Therefore $A = \sum_{j=0}^{n-1} w_j b^j = 39796$, which matches the expected value.
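
The column-wise procedure can be sketched in Rust as follows. This is an illustrative version for small bases, not the library's implementation: the names `montgomery_reduce_columns` and `from_digits` are hypothetical, digits are stored least significant first, and every column sum is assumed to fit in a `u64`. The sketch also folds any final carry into the result as an extra top digit.

```rust
/// Rebuild an integer from base-`b` digits stored least significant first.
fn from_digits(digits: &[u64], b: u64) -> u64 {
    digits.iter().rev().fold(0, |acc, &d| acc * b + d)
}

/// Column-wise Montgomery reduction: returns T * R^{-1} mod m with R = b^n.
/// `t` holds the 2n digits of T and `m` the n digits of m, least significant first.
fn montgomery_reduce_columns(t: &[u64], m: &[u64], b: u64, m_prime: u64) -> u64 {
    let n = m.len();
    assert_eq!(t.len(), 2 * n);
    let mut u = vec![0u64; n];
    let mut carry = 0u64;

    // Step 3: low half. Choose u_i so that each column sum becomes divisible by b.
    for i in 0..n {
        let mut a_i = carry + t[i];
        for k in 0..i {
            a_i += u[k] * m[i - k];
        }
        u[i] = (a_i * m_prime) % b;
        carry = (a_i + u[i] * m[0]) / b; // exact division
    }

    // Step 4: high half. The low digit of each column sum is an output digit w_j.
    let mut w = Vec::with_capacity(n + 1);
    for j in 0..n {
        let mut r_j = carry + t[j + n];
        for k in (j + 1)..n {
            r_j += u[k] * m[j + n - k];
        }
        w.push(r_j % b);
        carry = r_j / b;
    }
    w.push(carry); // a nonzero final carry is the top digit of the result

    // Steps 5 and 6: assemble A and subtract m once if needed.
    let m_value = from_digits(m, b);
    let mut a = from_digits(&w, b);
    if a >= m_value {
        a -= m_value;
    }
    a
}
```

For the example, `montgomery_reduce_columns(&[8, 6, 3, 8, 1, 1, 7, 0, 0, 0], &[9, 3, 6, 2, 7], 10, 1)` returns 39796, and the largest intermediate is the column sum 135 rather than a ten-digit value.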

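3. Details of the montgomery_reduce algorithm in curve25519-dalek

The montgomery_reduce() function in curve25519-dalek is implemented with the improved algorithm from section 2, with $b = 2^{52}$ and $n = 5$ (so $R = 2^{260}$).
The names correspond as follows:
T in the algorithm: limbs in the code
u in the algorithm: n in the code
w in the algorithm: r in the code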

    /// Compute `limbs/R` (mod l), where R is the Montgomery modulus 2^260
    #[inline(always)]
    pub (crate) fn montgomery_reduce(limbs: &[u128; 9]) -> Scalar52 {

        #[inline(always)]
        fn part1(sum: u128) -> (u128, u64) {
            let p = (sum as u64).wrapping_mul(constants::LFACTOR) & ((1u64 << 52) - 1);
            ((sum + m(p,constants::L[0])) >> 52, p)
        }

        #[inline(always)]
        fn part2(sum: u128) -> (u128, u64) {
            let w = (sum as u64) & ((1u64 << 52) - 1);
            (sum >> 52, w)
        }

        // note: l3 is zero, so its multiplies can be skipped
        let l = &constants::L;

        // the first half computes the Montgomery adjustment factor n, and begins adding n*l to make limbs divisible by R
        let (carry, n0) = part1(        limbs[0]);
        let (carry, n1) = part1(carry + limbs[1] + m(n0,l[1]));
        let (carry, n2) = part1(carry + limbs[2] + m(n0,l[2]) + m(n1,l[1]));
        let (carry, n3) = part1(carry + limbs[3]              + m(n1,l[2]) + m(n2,l[1]));
        let (carry, n4) = part1(carry + limbs[4] + m(n0,l[4])              + m(n2,l[2]) + m(n3,l[1]));

        // limbs is divisible by R now, so we can divide by R by simply storing the upper half as the result
        let (carry, r0) = part2(carry + limbs[5]              + m(n1,l[4])              + m(n3,l[2]) + m(n4,l[1]));
        let (carry, r1) = part2(carry + limbs[6]                           + m(n2,l[4])              + m(n4,l[2]));
        let (carry, r2) = part2(carry + limbs[7]                                        + m(n3,l[4])             );
        let (carry, r3) = part2(carry + limbs[8]                                                     + m(n4,l[4]));
        let         r4 = carry as u64;

        // result may be >= l, so attempt to subtract l
        Scalar52::sub(&Scalar52([r0,r1,r2,r3,r4]), l)
    }
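
In the excerpt above, `constants::LFACTOR` plays the role of $m'$ and `constants::L[0]` the role of $m_0$. The following sketch illustrates that relationship under stated assumptions: the value of `L0` is taken to be the lowest 52-bit limb of the group order l, the helper names `neg_inv_mod_2_52` and `part1_sketch` are hypothetical, and the factor is recomputed here rather than copied from the library.

```rust
const MASK52: u64 = (1u64 << 52) - 1;

// Assumption: lowest 52-bit limb of the group order l (the role of constants::L[0]).
const L0: u64 = 0x0002631a5cf5d3ed;

/// -a^{-1} mod 2^52 for odd `a`, via Newton (Hensel) iteration on 64-bit words.
fn neg_inv_mod_2_52(a: u64) -> u64 {
    assert!(a & 1 == 1, "a must be odd");
    let mut x = a; // a * a = 1 (mod 8) for odd a, so x is correct to 3 bits
    for _ in 0..5 {
        // each step doubles the number of correct low bits: 6, 12, 24, 48, 96
        x = x.wrapping_mul(2u64.wrapping_sub(a.wrapping_mul(x)));
    }
    x.wrapping_neg() & MASK52
}

/// Same shape as `part1` in the excerpt: returns (carry, p), where adding p * L0
/// clears the low 52 bits of `sum` so the shift discards nothing.
fn part1_sketch(sum: u128, lfactor: u64) -> (u128, u64) {
    let p = (sum as u64).wrapping_mul(lfactor) & MASK52;
    let cleared = sum + (p as u128) * (L0 as u128);
    debug_assert_eq!(cleared & (MASK52 as u128), 0);
    (cleared >> 52, p)
}
```

A factor computed this way satisfies `L0.wrapping_mul(lfactor) & MASK52 == MASK52`, that is, $L_0 \cdot m' \equiv -1 \pmod{2^{52}}$, which is the property step 3.2 relies on. Because every column sum is held in a `u128` and only the low 52 bits feed the multiplication, no intermediate wider than 128 bits is needed.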
