C++: Implementing a Randomized Algorithm with Schrage's Method, Explained in Detail


Tags: C++, randomized algorithms, Schrage

by 小威威

Before getting into Schrage's method (used here for random number generation), let's cover the basics of random numbers in C++.

1. Random numbers

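In C++, random numbers are produced by a deterministic algorithm, because a computer cannot generate truly random values by computation alone. The numbers C++ produces are therefore called pseudo-random numbers. If you collect enough of them, you will find that they share certain statistical regularities (these are not easy to spot by looking at the raw data, but they are there). So what we want to study is how such a generator is implemented.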

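Anyone who has written a Snake game (see the post "C: Snake") will remember that generating random numbers requires a seed. The same seed always produces the same sequence of numbers; different seeds generally produce different sequences. So we need a way to come up with different seeds.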
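The seed behaviour described above can be checked with a few lines of C++. This is only a sketch: the helper name first_outputs is made up for illustration, and std::minstd_rand0 is used simply because it is one of the standard engines in <random>.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Return the first n outputs of std::minstd_rand0 started from the given seed.
// (first_outputs is a hypothetical helper, not part of any library.)
std::vector<std::minstd_rand0::result_type>
first_outputs(std::minstd_rand0::result_type seed, std::size_t n) {
    std::minstd_rand0 engine(seed);
    std::vector<std::minstd_rand0::result_type> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) out.push_back(engine());
    return out;
}
```

Calling first_outputs(42, 5) twice returns two identical vectors, while first_outputs(43, 5) returns a different one.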

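I know two approaches. The first is the one used in the Snake game: take a time value from <time.h> as the seed. Note that clock() returns the processor time the program has used so far, measured in clock ticks (divide by CLOCKS_PER_SEC to get seconds), while time() returns calendar time with a resolution of one second. The catch is that if seeds are requested faster than the timer value changes (more than once per second with time()), every request gets the same seed and therefore the same sequence of random numbers. In short, when seeds are needed at very short intervals, <time.h> is not enough. The second approach, which is also the simpler one, is to derive each new seed from the current one. This is the linear congruential generator (LCG). It is described in detail below; the formulas are: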

// formula
seed = (a*seed) mod m;
// or this one:
seed = (a*seed + c) mod m;
// where a is the multiplier, c the increment, and m the modulus

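Generating random numbers takes two tools: generators (engines that produce the raw numbers) and distributions (which map those numbers onto the range and shape you want). The C++ header <random> provides both.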

Below is the introduction to <random> quoted from cplusplus.com.

———————————————<random>———————————————–
Random
This header introduces random number generation facilities.

This library allows to produce random numbers using combinations of generators and distributions:
Generators: Objects that generate uniformly distributed numbers.
Distributions: Objects that transform sequences of numbers generated by a generator into sequences of numbers that follow a specific random variable distribution, such as uniform, Normal or Binomial.

Distribution objects generate random numbers by means of their operator() member, which takes a generator object as argument:

std::default_random_engine generator;
std::uniform_int_distribution<int> distribution(1,6);
int dice_roll = distribution(generator);  // generates number in the range 1..6 

For repeated uses, both can be bound together:

auto dice = std::bind ( distribution, generator );
int wisdom = dice()+dice()+dice();

One line in the code above deserves an explanation:

int dice_roll = distribution(generator);

The expression distribution(generator) works through operator overloading: the distribution class overloads the () operator. cplusplus.com documents it as follows:

std::uniform_int_distribution::operator()

(1) 
template<class URNG>
result_type operator()(URNG& g);
(2) 
template<class URNG>
result_type operator()(URNG& g, const param_type& parm);

Generate random number
Returns a new random number that follows the distribution’s parameters associated to the object (version 1) or those specified by parm (version 2).

The generator object (g) supplies uniformly-distributed random integers through its operator() member function. The uniform_int_distribution object transforms the values obtained this way so that successive calls to this member function with the same arguments produce values that follow a uniform distribution within the appropriate range.

Parameters
g: A uniform random number generator object, used as the source of randomness.
URNG shall be a uniform random number generator type, such as one of the standard generator classes.
parm: An object representing the distribution’s parameters, obtained by a call to member function param. param_type is a member type.

Return Value
A new random number.
result_type is a member type, defined as an alias of the first class template parameter (IntType).

Example

// uniform_int_distribution::operator()
#include <iostream>
#include <chrono>
#include <random>

int main()
{
  // construct a trivial random generator engine from a time-based seed:
  unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
  std::default_random_engine generator (seed);

  std::uniform_int_distribution<int> distribution(1,10);

  std::cout << "some random numbers between 1 and 10: ";
  for (int i=0; i<10; ++i)
    std::cout << distribution(generator) << " ";

  std::cout << std::endl;

  return 0;
}

Possible output (it varies with the seed):

some random numbers between 1 and 10: 3 2 1 2 7 10 6 2 4 8

2. Applying Schrage's method

I mentioned two ways of producing a seed. Now I will describe the second one in detail, that is, deriving the next seed from the current one:

// formula
seed = (A*seed) mod M; (1)
// or this one:
seed = (A*seed + C) mod M; (2)
// where A is the multiplier, C the increment, and M the modulus

The first seed can come from the system time or from user input.

M determines how long the sequence can run before it repeats. For example, with M = 11, A = 7, seed = 1, formula (1) produces:
7, 5, 2, 3, 10, 4, 6, 9, 8, 1, 7, 5, 2, …

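The sequence repeats with period M-1 = 10. (A multiplicative generator reaches this maximum only when M is prime and A is a primitive root modulo M, as is the case here; other choices give shorter periods.) To get a longer sequence before repetition, M should be large. A common choice is 2^31-1 = 2147483647, the largest value of a 4-byte int, which is also prime. For portability one could define M as 2^B-1, with B depending on the width of int on the platform, but note that 2^B-1 is not prime for every B, so A has to be chosen to match.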
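The small example with M = 11, A = 7, seed = 1 can be reproduced with a short sketch. The function names lcg_next and lcg_cycle are invented for illustration, and plain int multiplication is used only because the numbers involved are tiny.

```cpp
#include <cstddef>
#include <vector>

// One step of the multiplicative generator: x -> (a * x) mod m.
// Plain int arithmetic is fine here only because the demo values are tiny.
int lcg_next(int x, int a, int m) { return (a * x) % m; }

// Collect outputs until the starting seed reappears, i.e. one full cycle.
// The loop is capped at m steps in case the seed never comes back.
std::vector<int> lcg_cycle(int seed, int a, int m) {
    std::vector<int> out;
    int x = seed;
    do {
        x = lcg_next(x, a, m);
        out.push_back(x);
    } while (x != seed && out.size() < static_cast<std::size_t>(m));
    return out;
}
```

With seed = 1, a = 7, m = 11, lcg_cycle returns the ten values 7, 5, 2, 3, 10, 4, 6, 9, 8, 1, so the period is 10 = M-1.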

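There is a problem, though: when A is large, the multiplication A*seed can overflow. For signed integers in C++ that is undefined behavior, and even where the result merely wraps around, the generated sequence no longer follows the formula. So we need a way to avoid the overflow.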
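To see how easily the product leaves the 32-bit range, here is a small check. It is a sketch only: product_overflows_int32 is a made-up helper, and it forms the product in 64 bits so that the check itself is safe.

```cpp
#include <cstdint>
#include <limits>

// True if a * x cannot be represented in a 32-bit signed integer.
// The product is computed in 64 bits, so this function never overflows.
bool product_overflows_int32(std::int32_t a, std::int32_t x) {
    const std::int64_t wide = static_cast<std::int64_t>(a) * x;
    return wide > std::numeric_limits<std::int32_t>::max() ||
           wide < std::numeric_limits<std::int32_t>::min();
}
```

For A = 16807 and seed = 2147483646 the function returns true, whereas for A = 7 and seed = 10 it returns false.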

Fortunately, such an algorithm already exists: Schrage's method. It rewrites the formula in an equivalent form whose intermediate results never overflow.

Here are the formulas after applying Schrage's method; the derivation comes after them:

// Q = M / A, R = M mod A (requires R < Q)
// x is written instead of seed for brevity; [x/Q] is integer division

// formula(1)
x = (A*x) mod M
   = A*(x mod Q) - R*[x/Q];
if (x >= 0) result_seed = x;
else result_seed = x + M;

// formula(2): apply formula(1) first, then add C modulo M (0 <= C < M)
x = (A*x + C) mod M
   = (result_seed + C) mod M;
if (result_seed < M - C) result_seed = result_seed + C;
else result_seed = result_seed - (M - C);

The derivation is as follows:
[Figure 1: derivation of Schrage's method]
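For reference, the standard argument can be written out in a few lines. This is a sketch using Q = ⌊M/A⌋ and R = M mod A as defined above, with the assumptions 0 ≤ x < M and R < Q; it is not a transcription of the figure.

```latex
\begin{aligned}
M &= AQ + R, \qquad x = Q\left\lfloor \tfrac{x}{Q} \right\rfloor + (x \bmod Q) \\
Ax &= AQ\left\lfloor \tfrac{x}{Q} \right\rfloor + A\,(x \bmod Q)
    = (M - R)\left\lfloor \tfrac{x}{Q} \right\rfloor + A\,(x \bmod Q) \\
   &= M\left\lfloor \tfrac{x}{Q} \right\rfloor
      + \Bigl[ A\,(x \bmod Q) - R\left\lfloor \tfrac{x}{Q} \right\rfloor \Bigr] \\
Ax &\equiv A\,(x \bmod Q) - R\left\lfloor \tfrac{x}{Q} \right\rfloor \pmod{M} \\
0 &\le A\,(x \bmod Q) \le A\,(Q - 1) < M, \qquad
0 \le R\left\lfloor \tfrac{x}{Q} \right\rfloor \le Q\left\lfloor \tfrac{x}{Q} \right\rfloor \le x < M
\end{aligned}
```

Both terms lie in [0, M), so neither overflows and their difference lies strictly between -M and M. Adding M when the difference is negative therefore gives (A*x) mod M.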

In this way the formula is rewritten in a form that neatly avoids the overflow.
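As a minimal sketch of formula (1), the function below uses nothing wider than 32-bit integers. The names schrage_mulmod and wide_mulmod are made up for illustration; the sketch assumes 0 < a < m, 0 <= x < m and m % a < m / a, and it is not the reference solution shown later in this post.

```cpp
#include <cassert>
#include <cstdint>

// Compute (a * x) mod m using only 32-bit intermediates (Schrage's method).
// Assumes 0 < a < m, 0 <= x < m, and m % a < m / a.
std::int32_t schrage_mulmod(std::int32_t a, std::int32_t x, std::int32_t m) {
    const std::int32_t q = m / a;
    const std::int32_t r = m % a;
    assert(r < q);  // the method is only valid under this precondition
    // Both products stay below m, so their difference lies in (-m, m).
    const std::int32_t t = a * (x % q) - r * (x / q);
    return t >= 0 ? t : t + m;
}

// Reference result using a 64-bit product, for comparison only.
std::int32_t wide_mulmod(std::int32_t a, std::int32_t x, std::int32_t m) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * x) % m);
}
```

With a = 16807 and m = 2147483647, schrage_mulmod agrees with wide_mulmod even for x = 2147483646, where the direct 32-bit product would overflow.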

3. Example (Author: 林楚庭)

Now we are ready to write a program that generates pseudo-random numbers.
As we all know, the random numbers that C++ produces are pseudo-random:
when the seed is unchanged, every run of the program produces
the same sequence of random numbers. Now you are invited to
challenge yourself and write the program; it will take you some time.

For details: http://www.cplusplus.com/reference/random/

First, you need to know the idea behind the method:
1. Linear congruential: (a * x + c) % m, a > 0, m > 0, m % a < m / a.
This formula is a linear function used to generate random numbers.
a - multiplier | x - seed | c - increment | m - modulus
Note: every random number engine needs a seed.
2. Calculation of (a * x + c) % m.
This calculation must avoid integer overflow, which means that when x is
very big, like 2147483646, it should still return the right answer.
The algorithm is well known and you should find it yourself.

Then, you should learn something about 'mod_my' and 'linear_congruential_engine_my':
1. class mod_my is a model for linear_congruential_engine_my; it implements the
formula "(a * x + c) % m" in calc();
2. class linear_congruential_engine_my is a generator that sets the seed and produces
random numbers through its public member 'mod_temp'.

———- mod_my ———-

int m, a, c; // The three parameters of the formula.

mod_my(int _m, int _a, int _c); // Constructor, initializes the three parameters.

int calc(int x); // Calculator: takes x as the seed, computes the next number and returns it.

———- linear_congruential_engine ———-

int multiplier, increment, modulus; // Correspond to a, c, m.
// Initialized to 16807, 1, 2147483647 by default.

unsigned int default_seed_my, seed_my; // Initialized to 1u by default.

mod_my mod_temp; // The model for this engine.

linear_congruential_engine_my(); // Default constructor.

linear_congruential_engine_my(int _m, int _a, int _c, int _s);

void seed(unsigned int); // Set the seed.

int min(); // Return the lower bound of the output range.

int max(); // Return the upper bound of the output range.

void discard(unsigned long long); // Advance the generator by x steps
// (x is the input parameter), discarding the numbers it generates.

int operator()(); // Overload the '()' operator.

//main.cpp

#include <iostream>
#include "random_my.h"
using namespace std;
using namespace RAND_GEN;

void test_calc() {
    mod_my mod_1(9223372036854775807, 16807, 1);
    if (mod_1.calc(9223372036854775) != 7443261233741790514 ||
        mod_1.calc(922337203685477580) != 6456360425798331301 ||
        mod_1.calc(9223372036852222220) != 9223371993936639099 ||
        mod_1.calc(922337203685473330) != 6456360425726901551 ||
        mod_1.calc(9223372022254775806) != 9223126654654759001)
        cout << "Your calc() is wrong.\n";
    else cout << "Pass all tests for calc().\n";
}

void test_engin() {
    linear_congruential_engine_my lce;
    int count = 1000;
    int num[1001] = {0};
    while (count--) num[lce()%5]++;
    if (num[0] != 216 || num[1] != 190 ||
        num[2] != 203 || num[3] != 216 ||
        num[4] != 175) {
        cout << "Your engin class is wrong in generator.\n";
        return;
    } else if (lce.min() != (lce.increment == 0u ? 1u : 0u)) {
        cout << "Your engin class is wrong in min().\n";
        return;
    } else if (lce.max() != (lce.modulus - 1u)) {
        cout << "Your engin class is wrong in max().\n";
        return;
    }
    else cout << "Pass all tests for class engin.\n";
}

void hard_test() {
    long long m, a, c, n, num[5] = {0};
    unsigned long long s;
    unsigned long long discard_n;
    cin >> m >> a >> c >> s;
    mod_my mod_1(m, a, c);
    for (int i = 0; i < 5; i++) {
        cin >> n;
        cout << "(MOD CALC) ";
        cout << n << ": " << mod_1.calc(n) << endl;
    }
    linear_congruential_engine_my lce(m, a, c, s);
    cin >> discard_n;
    lce.discard(discard_n);
    cin >> n;
    while (n--) num[lce()%5]++;
    for (int i = 0; i < 5; i++) {
        cout << "(ENGIN) ";
        cout << i << ": " << num[i] << endl;
    }
}

int main() {
    int t;
    cin >> t;
    if (t == 0) {
        test_calc();
        test_engin();
    } else {
        hard_test();
    }
    return 0;
}

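To keep classmates from copying the code into their own submissions and ending up with a plagiarism case, for now I only give the reference solution here. I will add my own code after the hard deadline. The reference implementation is fairly involved!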
// random_my.h

#ifndef RANDOM_MY_H
#define RANDOM_MY_H

namespace RAND_GEN {
class mod_my {
  public:
    long long m, a, c;
    mod_my(long long _m, long long _a, long long _c) : m(_m), a(_a), c(_c) {}

    // General case for x = (ax + c) mod m -- use Schrage's algorithm
    // to avoid integer overflow.
    // ax mod m can be rewritten as:
    // a(x mod q) - r(x / q)     if >= 0
    // a(x mod q) - r(x / q) + m otherwise
    // where: q = m / a , r = m mod a
    // c is then added modulo m without forming a value above m.
    //
    // Preconditions: a > 0, m > 0.
    //
    // Note: only works correctly for m % a < m / a.
    long long calc(long long x) {
        if (a == 1) {
            x %= m;
        } else {
            long long q = m / a;
            long long r = m % a;
            long long t1 = a * (x % q);
            long long t2 = r * (x / q);
            if (t1 >= t2) x = t1 - t2;
            else x = m - t2 + t1;
        }
        if (c != 0) {
            const long long d = m - x;
            if (d > c) x += c;
            else x = c - d;
        }
        return x;
    }
};

class linear_congruential_engine_my {
  public:
    long long multiplier, increment, modulus;
    unsigned long long default_seed_my, seed_my;
    mod_my mod_temp;

    linear_congruential_engine_my()
    : multiplier(16807), increment(1), modulus(2147483647)
    , default_seed_my(1u), mod_temp(modulus, multiplier, increment)
    { seed(default_seed_my); }

    linear_congruential_engine_my(long long _m, long long _a,
    long long _c, long long _s)
    : multiplier(_a), increment(_c), modulus(_m)
    , default_seed_my(_s), mod_temp(modulus, multiplier, increment)
    { seed(default_seed_my); }

    void seed(unsigned long long s)
    { seed_my = s; }

    long long min()
    { return  increment == 0u ? 1u : 0u; }

    long long max()
    { return modulus - 1u; }

    void discard(unsigned long long z)
    { for (; z != 0ULL; --z) (*this)(); }

    long long operator()() {
        seed_my = mod_temp.calc(seed_my);
        return seed_my;
    }
};

}  // namespace RAND_GEN

#endif

Everything above is my own understanding. Criticism and suggestions are welcome; let's discuss!
