A Variable-Radix Number System and Its Application (A Hash Function for Permutations)

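The number systems we normally use are "fixed-radix": every digit carries at the same base p. For example, a base-p number K can be written as
    K = a0*p^0 + a1*p^1 + a2*p^2 + ... + an*p^n (where 0 <= ai <= p-1),
and any natural number can be represented this way.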

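Most readers will be familiar with fixed-radix notation and with converting between bases, but far fewer have come across variable-radix numbers. Here I introduce one particular variable-radix number system that can be used to build a hash function for permutations. The hash function is perfect in both senses: it never produces a collision, and it uses every slot of the table, no more and no fewer. This kind of permutation hash is also called permutation numbering. Let us look at the number system first.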

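Consider the following variable-radix number: digit 1 carries at 2, digit 2 carries at 3, ..., digit n carries at n+1. It is written as
    K = a1*1! + a2*2! + a3*3! + ... + an*n! (where 0 <= ai <= i),
and it can also be extended to the following form (a0 is always 0 by definition) so that it lines up with the base-p representation:
    K = a0*0! + a1*1! + a2*2! + a3*3! + ... + an*n! (where 0 <= ai <= i).
(From here on, "variable-radix number" always means this particular system, written in the first form. It is also commonly known as the factorial number system.)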

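First, let us check that carrying works correctly. Suppose digit i of a variable-radix number K has reached ai = i+1 and therefore has to carry. Since ai*i! = (i+1)*i! = 1*(i+1)!, this is exactly a carry of 1 into the next higher digit. So the system carries correctly and is a valid way of counting.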
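The carry rule can be made concrete with a small counter. The following is only a sketch under my own assumptions: the function name IncrementVariableRadix and the low-digit-first vector layout are illustrative choices, not part of the code later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: adds 1 to a variable-radix number in place.
// digits[i] stores a(i+1), the digit with weight (i+1)!, so it ranges over 0..i+1.
// Returns false if the number overflowed and wrapped around to all zeros.
bool IncrementVariableRadix(std::vector<int> &digits)
{
    for (std::size_t i = 0; i < digits.size(); ++i) {
        assert(digits[i] >= 0 && static_cast<std::size_t>(digits[i]) <= i + 1);
        if (static_cast<std::size_t>(digits[i]) < i + 1) {
            ++digits[i];  // still within range, no carry needed
            return true;
        }
        // a(i+1) would become i+2; since (i+2)*(i+1)! = 1*(i+2)!,
        // reset it to 0 and carry 1 into the next higher digit.
        digits[i] = 0;
    }
    return false;
}
```

Starting from the two-digit number (0, 0), written low digit first, successive calls produce (1, 0), (0, 1), (1, 1), (0, 2) and (1, 2), that is, the values 1 to 5; the next call wraps back to (0, 0) and returns false, because two digits can only represent 3! = 6 values.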

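Next, we examine the properties of an n-digit variable-radix number K:
(1) When every digit ai equals i, K takes its maximum value
    MAX[K] = 1*1! + 2*2! + 3*3! + ... + n*n!
           = 1! + 1*1! + 2*2! + 3*3! + ... + n*n! - 1
           = (1+1)*1! + 2*2! + 3*3! + ... + n*n! - 1
           = 2! + 2*2! + 3*3! + ... + n*n! - 1
           = ...
           = (n+1)!-1
    Hence the maximum value of an n-digit variable-radix number K is (n+1)!-1.
(2) When every digit ai is 0, K takes its minimum value 0.
Therefore, together with the carry rule above, n-digit variable-radix numbers can represent every natural number from 0 to (n+1)!-1, which is (n+1)! values in total.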
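The identity 1*1! + 2*2! + ... + n*n! = (n+1)!-1 is easy to check numerically. The sketch below does so for small n; the helper names Factorial and MaxVariableRadixValue are my own, and the bound on n is an assumption chosen so that everything fits even in a 32-bit size_t.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns n! for 0 <= n <= 12 (12! still fits in 32 bits).
std::size_t Factorial(int n)
{
    assert(n >= 0 && n <= 12);
    std::size_t result = 1;
    for (int i = 2; i <= n; ++i)
        result *= i;
    return result;
}

// Hypothetical helper: returns 1*1! + 2*2! + ... + n*n!,
// the value of the n-digit number whose every digit ai equals i.
std::size_t MaxVariableRadixValue(int n)
{
    assert(n >= 0 && n <= 11);
    std::size_t sum = 0;
    std::size_t factorial = 1;
    for (int i = 1; i <= n; ++i) {
        factorial *= i;        // factorial now holds i!
        sum += i * factorial;  // add the largest allowed digit times its weight
    }
    return sum;
}
```

For n = 3, for instance, the sum is 1 + 4 + 18 = 23, which equals 4! - 1.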

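In some state-space search algorithms we need to decide quickly whether a state has already been seen, and a hash function is the usual tool. One special class of state spaces is generated by permutations, the N-puzzle being a typical example. The permutations of n elements give n! different arrangements, or states. Below we discuss how to use the variable-radix numbers introduced here to build a hash function tailored to permutations.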

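From a numerical point of view, permutations and variable-radix numbers both involve factorials. If we could use the n! consecutive variable-radix numbers from 0 to n!-1 to represent all permutations of n elements, we would have turned permutations into numbers completely, established a one-to-one correspondence between permutations and natural numbers, and thereby obtained a perfect hash function. Can this be done? The answer is yes, as the following discussion shows.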

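Suppose we have n+1 distinct elements b0, b1, b2, b3, ..., bn, ordered so that b0 < b1 < b2 < ... < bn. They have (n+1)! different permutations. For any permutation c0, c1, c2, ..., cn, let di (0 <= di <= i) be the number of inversions that the element ci (1 <= i <= n) forms with the i elements before it, that is, the number of earlier elements greater than ci. This gives an inversion sequence d1, d2, ..., dn (0 <= di <= i). These are exactly the digits of the n-digit variable-radix number described above, so we use the n-digit variable-radix number M to represent the permutation:
   M = d1*1! + d2*2! + ... + dn*n!
For example, for the permutation c, a, b of the elements a < b < c we get d1 = 1 (c precedes a) and d2 = 1 (c precedes b), so M = 1*1! + 1*2! = 3.
Every permutation can therefore be written as an n-digit variable-radix number in this way. Next, we examine whether n-digit variable-radix numbers can be put into one-to-one correspondence with the permutations of n+1 elements.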

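n-digit variable-radix numbers can represent (n+1)! different values, n+1 elements have exactly (n+1)! different permutations, and each permutation can already be written as an n-digit variable-radix number. So it only remains to prove that any two different permutations produce two different variable-radix numbers. Once that is done, we can conclude:
★ Theorem 1: every permutation of n+1 elements corresponds to a distinct n-digit variable-radix number.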

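Take any two different permutations p0, p1, p2, ..., pn (permutation P) and q0, q1, q2, ..., qn (permutation Q). Scanning from the back, find the first position at which they differ, and call the elements there pi and qi (0 < i <= n). Because P and Q agree at every position after i, positions 0 through i hold the same set of elements in both permutations. In particular, qi appears before pi in P, and pi appears before qi in Q.
(1) If qi > pi, then:
suppose an element x before qi in Q forms an inversion with qi, that is, x > qi. Then x > pi as well (because x > qi and qi > pi), so x is not pi and must also appear before pi in P, where it forms an inversion with pi. Hence the inversion count of pi is at least that of qi. In addition, qi itself appears before pi in P and forms an inversion with pi, and qi is not one of the elements x counted above, so the inversion count of pi is strictly greater than that of qi.
(2) By the same argument, if pi > qi, then the inversion count of qi is greater than that of pi.
By (1) and (2), the variable-radix numbers of P and Q differ at least in digit i, so any two different permutations have different variable-radix numbers. This proves Theorem 1.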
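Theorem 1 can also be checked by brute force for small sizes. The sketch below is my own illustration, not the author's code: the names InversionNumber and MapsOneToOne are hypothetical, elements are the integers 0..n-1, and n counts elements (so it plays the role of n+1 in the theorem).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: M = d1*1! + d2*2! + ..., where di is the number of
// elements before position i that are greater than perm[i].
std::size_t InversionNumber(const std::vector<int> &perm)
{
    std::size_t result = 0;
    std::size_t factorial = 1;
    for (std::size_t i = 1; i < perm.size(); ++i) {
        factorial *= i;  // factorial now holds i!
        std::size_t count = 0;
        for (std::size_t k = 0; k < i; ++k) {
            if (perm[k] > perm[i])
                ++count;
        }
        result += count * factorial;
    }
    return result;
}

// Returns true if the permutations of 0..n-1 map onto 0..n!-1 without collision.
bool MapsOneToOne(int n)
{
    assert(n >= 1 && n <= 10);  // keep the table small
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);  // sorted start: 0, 1, ..., n-1
    std::size_t total = 1;
    for (int i = 2; i <= n; ++i)
        total *= i;  // total == n!
    std::vector<bool> seen(total, false);
    do {
        const std::size_t m = InversionNumber(perm);
        if (m >= total || seen[m])
            return false;  // out of range or collision
        seen[m] = true;
    } while (std::next_permutation(perm.begin(), perm.end()));
    return true;  // n! permutations filled n! slots with no collision
}
```

Because std::next_permutation visits all n! permutations when started from the sorted order, a run with no collision also means that every slot from 0 to n!-1 was used.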

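The algorithm for computing the variable-radix number of a permutation of n elements is roughly as follows (time complexity O(n^2)):
#include <cstddef>  // size_t

template <typename T>
size_t PermutationToNumber(const T permutation[], int n)
{
    // n must not be too large, or the result overflows (if size_t is 32 bits, n <= 12)
    size_t result = 0;
    size_t factorial = 1;
    for (int j = 1; j < n; ++j) {
        // count the elements before position j that are greater than permutation[j]
        int count = 0;
        for (int k = 0; k < j; ++k) {
            if (permutation[k] > permutation[j])
                ++count;
        }
        // after this update, factorial holds j!
        factorial *= j;
        result += count * factorial;
    }

    return result;
}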

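Notes:
(1) Because n! grows very quickly (12! is already 479001600), the method is generally only usable for small n.
(2) With an algorithm for computing the variable-radix number of a permutation, we can use an array of size n! to store the state of every permutation, indexed by the permutation's variable-radix number, which gives fast state lookup. If we only need to mark whether a state has been seen, a single bit per state is enough.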

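Finally, here is a complete piece of code that demonstrates using variable-radix numbers to number permutations (or to hash them, whichever name you prefer).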


ImmutableArray.h

/**
 * ImmutableArray.h
 * @Author   Tu Yongce <yongce (at) 126 (dot) com>
 * @Created  2008-10-7
 * @Modified 2008-10-7
 * @Version  0.1
 */

#ifndef IMMUTABLE_ARRAY_H_INCLUDED
#define IMMUTABLE_ARRAY_H_INCLUDED

#include <vector>
#include <cassert>
#include <cstddef>

/*
 * Write-once array: once an element's value has been set, it cannot be modified
 */
template <typename T>
class ImmutableArray
{
public:
    typedef T ValueType;

private:
    ValueType m_placeHolder;
    std::vector<ValueType> m_data;

public:
    /*
     * Constructs an ImmutableArray object
     * @param n: number of array elements
     * @param placeHolder: placeholder held by an element before its value is set
     */
    ImmutableArray(size_t n, ValueType placeHolder): m_placeHolder(placeHolder), m_data(n, placeHolder)
    {
    }

    /*
     * Stores a value at the given position of the array
     * @param index: array index of the storage position, must be in the range [0, n)
     * @param value: value to store, must differ from the placeHolder passed to the constructor
     * @return: false if the position already holds a value (the store is abandoned); otherwise true
     */
    bool Put(size_t index, ValueType value)
    {
        assert(index < m_data.size());
        assert(value != m_placeHolder);

        if (m_data[index] != m_placeHolder)
            return false;
        m_data[index] = value;
        return true;
    }

    /*
     * Reads the value at the given position of the array
     * @param index: array index of the storage position, must be in the range [0, n)
     * @return: the stored value if the position already holds one;
     *          otherwise the placeHolder passed to the constructor
     */
    ValueType Get(size_t index) const
    {
        return m_data[index];
    }

    /*
     * Queries whether the given position is empty (no value stored yet)
     * @param index: array index of the storage position, must be in the range [0, n)
     * @return: false if the position holds a value; otherwise true
     */
    bool Empty(size_t index) const
    {
        return m_data[index] == m_placeHolder;
    }

    /*
     * Returns the number of elements (the value of the constructor parameter n)
     */
    size_t Size() const
    {
        return m_data.size();
    }
};

/*
 * Specialization of the ImmutableArray class template for bool
 * @note: packs the storage into bits, which saves a significant amount of memory
 */
template<>
class ImmutableArray<bool>
{
public:
    typedef bool ValueType;

private:
    typedef unsigned char uint8_t;

    size_t m_size;
    bool m_placeHolder;
    std::vector<uint8_t> m_data;

public:
    ImmutableArray(size_t n, bool placeHolder)
        : m_size(n), m_placeHolder(placeHolder), m_data((n + 7) / 8, (placeHolder ? 0xFF : 0x00))
    {
    }

    bool Put(size_t index, bool value)
    {
        assert(index < m_size);
        assert(value != m_placeHolder);

        bool tag = (m_data[index / 8] & (uint8_t(0x01) << (index % 8))) != 0;
        if (tag != m_placeHolder)
            return false;
        // flip the given bit (0 <-> 1)
        m_data[index / 8] ^= (uint8_t(0x01) << (index % 8));
        return true;
    }

    bool Get(size_t index) const
    {
        return (m_data[index / 8] & (uint8_t(0x01) << (index % 8))) != 0;
    }

    bool Empty(size_t index) const
    {
        bool tag = (m_data[index / 8] & (uint8_t(0x01) << (index % 8))) != 0;
        return tag == m_placeHolder;
    }

    size_t Size() const
    {
        return m_size;
    }
};

#endif // IMMUTABLE_ARRAY_H_INCLUDED

 

ImmutableArray_example.cpp

/**
 * ImmutableArray_example.cpp
 * @Author   Tu Yongce <yongce (at) 126 (dot) com>
 * @Created  2008-10-7
 * @Modified 2008-10-7
 * @Version  0.1
 */

#include <iostream>
#include "ImmutableArray.h"
#include "Assure.h"

using namespace std;

ANONYMOUS_NAMESPACE_START

class UnitTest
{
private:
    ostream &m_log;

public:
    UnitTest(ostream &log): m_log(log)
    {
        m_log << "TestImmutableArray Start...\n";
        DoTest1();
        DoTest2();
        DoTest3();
        m_log << "TestImmutableArray End\n\n";
    }

private:
    void DoTest1()
    {
        m_log << "DoTest1 Start...\n";
        try {
            const size_t NUM = 0x100000;  // 2^20
            ImmutableArray<bool> arr(NUM, false);
            Assure(m_log, arr.Size() == NUM);

            for (size_t i = 0; i < NUM; ++i) {
                Assure(m_log, arr.Empty(i));
                Assure(m_log, arr.Get(i) == false);
                Assure(m_log, arr.Put(i, true));
                Assure(m_log, !arr.Empty(i));
                Assure(m_log, arr.Get(i) == true);
                Assure(m_log, !arr.Put(i, true));
            }
        }
        catch (AssureException) {
        }
        m_log << "DoTest1 End\n";
    }

    void DoTest2()
    {
        m_log << "DoTest2 Start...\n";
        try {
            const size_t NUM = 0x100000;  // 2^20
            ImmutableArray<bool> arr(NUM, true);
            Assure(m_log, arr.Size() == NUM);

            for (size_t i = 0; i < NUM; ++i) {
                Assure(m_log, arr.Empty(i));
                Assure(m_log, arr.Get(i) == true);
                Assure(m_log, arr.Put(i, false));
                Assure(m_log, !arr.Empty(i));
                Assure(m_log, arr.Get(i) == false);
                Assure(m_log, !arr.Put(i, false));
            }
        }
        catch (AssureException) {
        }
        m_log << "DoTest2 End\n";
    }

    void DoTest3()
    {
        m_log << "DoTest3 Start...\n";
        try {
            const size_t NUM = 0x100000;  // 2^20
            ImmutableArray<int> arr(NUM, -1);
            Assure(m_log, arr.Size() == NUM);

            for (size_t i = 0; i < NUM; ++i) {
                Assure(m_log, arr.Empty(i));
                Assure(m_log, arr.Get(i) == -1);
                Assure(m_log, arr.Put(i, i));
                Assure(m_log, !arr.Empty(i));
                Assure(m_log, arr.Get(i) == i);
                Assure(m_log, !arr.Put(i, i));
            }
        }
        catch (AssureException) {
        }
        m_log << "DoTest3 End\n";
    }
};

#ifdef SYMBOL_DO_TEST
UnitTest obj(std::clog);  // do test
#endif // SYMBOL_DO_TEST

ANONYMOUS_NAMESPACE_END

 

Assure.h

/**
 * Assure.h
 * @Author   Tu Yongce <yongce (at) 126 (dot) com>
 * @Created  2008-1-1
 * @Modified 2008-1-1
 * @Version  0.1
 */

#ifndef ASSURE_H_INCLUDED
#define ASSURE_H_INCLUDED

#include <ostream>
#include <exception>

class AssureException: public std::exception
{
};

#define Assure(os, x) (void)((!!(x)) || (ShowFailedMessage(os, #x, __FILE__, __LINE__), 0))

inline void ShowFailedMessage(std::ostream &os, const char* expr, const char *file, size_t line)
{
    os << "Failed: " << expr << ", file \"" << file << "\", line " << line << '\n';
    throw AssureException();
}

#define ANONYMOUS_NAMESPACE_START namespace {
#define ANONYMOUS_NAMESPACE_END }

#endif // ASSURE_H_INCLUDED

 

 

PermutationMap.h

/**
 * PermutationMap.h
 * @Author   Tu Yongce <yongce (at) 126 (dot) com>
 * @Created  2008-10-7
 * @Modified 2008-10-7
 * @Version  0.1
 */

#ifndef PERMUTATION_MAP_H_INCLUDED
#define PERMUTATION_MAP_H_INCLUDED

#include <vector>
#include <stdexcept>
#include <cassert>
#include "ImmutableArray.h"

template <typename T>
class PermutationMap
{
public:
    typedef T ValueType;

private:
    int m_num;
    std::vector<size_t> m_factorials;
    ImmutableArray<ValueType> m_data;

public:
    // n <= 12, 12! = 479001600
    PermutationMap(int n, ValueType initValue): m_num(n), m_factorials(n, 0), m_data(CaclFactorial(), initValue)
    {
    }

    template <typename ElemType>
    bool Put(const ElemType permutation[], ValueType value)
    {
        return m_data.Put(PermutationToNumber(permutation), value);
    }

    template <typename ElemType>
    bool Put(const std::vector<ElemType> &permutation, ValueType value)
    {
        assert(permutation.size() == m_num);
        return Put(&permutation[0], value);
    }

    template <typename ElemType>
    ValueType Get(const ElemType permutation[]) const
    {
        return m_data.Get(PermutationToNumber(permutation));
    }

    template <typename ElemType>
    ValueType Get(const std::vector<ElemType> &permutation) const
    {
        assert(permutation.size() == m_num);
        return Get(&permutation[0]);
    }

    size_t Size() const
    {
        return m_factorials[m_num - 1];
    }

private:
    size_t CaclFactorial()
    {
        m_factorials[0] = 1;
        for (int i = 2; i <= m_num; ++i) {
            if (m_factorials[i - 2] * i / i != m_factorials[i - 2])
                throw std::overflow_error("overflow in PermutationMap::CaclFactorial");
            m_factorials[i - 1] = m_factorials[i - 2] * i;
        }
        return m_factorials[m_num - 1];
    }

    template <typename ElemType>
    size_t PermutationToNumber(const ElemType permutation[]) const
    {
        size_t result = 0;
        for (int i = 1; i < m_num; ++i) {
            int count = 0;
            for (int k = 0; k < i; ++k) {
                if (permutation[k] > permutation[i])
                    ++count;
            }
            result += count * m_factorials[i - 1];
        }

        return result;
    }
};

#endif // PERMUTATION_MAP_H_INCLUDED

 

PermutationMap_example.cpp

/**
 * PermutationMap_example.cpp
 * @Author   Tu Yongce <yongce (at) 126 (dot) com>
 * @Created  2008-10-7
 * @Modified 2008-10-7
 * @Version  0.1
 */

#include <iostream>
#include <vector>
#include <algorithm>
#include "PermutationMap.h"
#include "Assure.h"

using namespace std;

ANONYMOUS_NAMESPACE_START

class UnitTest
{
private:
    ostream &m_log;

public:
    UnitTest(ostream &log): m_log(log)
    {
        m_log << "TestPermutationMap Start...\n";
        DoTest1();
        DoTest2();
        DoTest3();
        DoTest4();  // was defined but never called in the original listing
        m_log << "TestPermutationMap End\n\n";
    }

private:
    void DoTest1()
    {
        m_log << "DoTest1 Start...\n";
        try {
            PermutationMap<bool> permMap(9, false);
            char perm[9] = {'1', '2', '3', '4', '5', '6', '7', '8', '9'};

            Assure(m_log, permMap.Get(perm) == false);
            Assure(m_log, permMap.Put(perm, true));
            Assure(m_log, permMap.Get(perm) == true);
            Assure(m_log, !permMap.Put(perm, true));

            int count = 1;
            while (next_permutation(perm, perm + 9)) {
                Assure(m_log, permMap.Get(perm) == false);
                Assure(m_log, permMap.Put(perm, true));
                Assure(m_log, permMap.Get(perm) == true);
                Assure(m_log, !permMap.Put(perm, true));
                ++count;
            }
            Assure(m_log, count == permMap.Size());
        }
        catch (AssureException) {
        }
        m_log << "DoTest1 End\n";
    }

    void DoTest2()
    {
        m_log << "DoTest2 Start...\n";
        try {
            PermutationMap<bool> permMap(9, true);
            char perm[9] = {'1', '2', '3', '4', '5', '6', '7', '8', '9'};

            Assure(m_log, permMap.Get(perm) == true);
            Assure(m_log, permMap.Put(perm, false));
            Assure(m_log, permMap.Get(perm) == false);
            Assure(m_log, !permMap.Put(perm, false));

            int count = 1;
            while (next_permutation(perm, perm + 9)) {
                Assure(m_log, permMap.Get(perm) == true);
                Assure(m_log, permMap.Put(perm, false));
                Assure(m_log, permMap.Get(perm) == false);
                Assure(m_log, !permMap.Put(perm, false));
                ++count;
            }
            Assure(m_log, count == permMap.Size());
        }
        catch (AssureException) {
        }
        m_log << "DoTest2 End\n";
    }

    void DoTest3()
    {
        m_log << "DoTest3 Start...\n";
        try {
            PermutationMap<int> permMap(9, 0);
            char perm[9] = {'1', '2', '3', '4', '5', '6', '7', '8', '9'};

            int count = 1;
            Assure(m_log, permMap.Get(perm) == 0);
            Assure(m_log, permMap.Put(perm, count));
            Assure(m_log, permMap.Get(perm) == count);
            Assure(m_log, !permMap.Put(perm, count));

            while (next_permutation(perm, perm + 9)) {
                ++count;
                Assure(m_log, permMap.Get(perm) == 0);
                Assure(m_log, permMap.Put(perm, count));
                Assure(m_log, permMap.Get(perm) == count);
                Assure(m_log, !permMap.Put(perm, count));
            }
            Assure(m_log, count == permMap.Size());
        }
        catch (AssureException) {
        }
        m_log << "DoTest3 End\n";
    }

    void DoTest4()
    {
        m_log << "DoTest4 Start...\n";
        try {
            PermutationMap<int> permMap(9, 0);
            char data[9] = {'1', '2', '3', '4', '5', '6', '7', '8', '9'};
            vector<char> perm(data, data + 9);

            int count = 1;
            Assure(m_log, permMap.Get(perm) == 0);
            Assure(m_log, permMap.Put(perm, count));
            Assure(m_log, permMap.Get(perm) == count);
            Assure(m_log, !permMap.Put(perm, count));

            while (next_permutation(perm.begin(), perm.end())) {
                ++count;
                Assure(m_log, permMap.Get(perm) == 0);
                Assure(m_log, permMap.Put(perm, count));
                Assure(m_log, permMap.Get(perm) == count);
                Assure(m_log, !permMap.Put(perm, count));
            }
            Assure(m_log, count == permMap.Size());
        }
        catch (AssureException) {
        }
        m_log << "DoTest4 End\n";
    }
};

#ifdef SYMBOL_DO_TEST
UnitTest obj(std::clog);  // do test
#endif // SYMBOL_DO_TEST

ANONYMOUS_NAMESPACE_END

 

 

Conversion algorithms between "decimal number <--> variable-radix number <--> permutation"

#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>
#include <exception>
#include <cassert>

using namespace std;

// Converts a decimal number to a variable-radix number and returns its number of digits
// varNumber[0] is the lowest digit of the variable-radix number
int DecimalToVariableRadix(size_t decimalNumber, vector<int> &varNumber)
{
    varNumber.clear();
    int carry = 2;
    while (decimalNumber > 0) {
        varNumber.push_back(decimalNumber % carry);
        decimalNumber /= carry;
        ++carry;
    }

    if (varNumber.empty())
        varNumber.push_back(0);

    return varNumber.size();
}

// Converts a decimal number to a variable-radix number with the given number of digits
// (high digits padded with 0) and returns the actual number of significant digits
// If the result needs more digits than requested, the requested digit count has no effect
// varNumber[0] is the lowest digit of the variable-radix number
int DecimalToVariableRadix(size_t decimalNumber, vector<int> &varNumber, int num)
{
    varNumber.clear();
    int carry = 2;
    while (decimalNumber > 0) {
        varNumber.push_back(decimalNumber % carry);
        decimalNumber /= carry;
        ++carry;
    }

    int size = varNumber.size();
    if (size < num)
        varNumber.insert(varNumber.end(), num - size, 0);

    return size;
}

// Converts a variable-radix number to a decimal number
// varNumber[0] is the lowest digit of the variable-radix number
size_t VariableRadixToDecimal(const int varNumber[], int num)
{
    size_t factor = 1;
    size_t result = 0;
    for (int i = 0; i < num; ++i) {
        result += varNumber[i] * factor;
        factor *= i + 2;
    }
    return result;
}

// Converts a permutation to a variable-radix number; the high digits may contain several 0s
// varNumber[0] is the lowest digit of the variable-radix number
template <typename ElemType>
void PermutationToVariableRadix(const ElemType permutation[], int num, vector<int> &varNumber)
{
    for (int i = 1; i < num; ++i) {
        int count = 0;
        for (int k = 0; k < i; ++k) {
            if (permutation[k] > permutation[i])
                ++count;
        }
        varNumber.push_back(count);
    }
}

// Converts a variable-radix number to a permutation; the elements passed in must be sorted (ascending),
// and the number of digits (including high-order 0s) must be exactly one less than the number of elements
// varNumber[0] is the lowest digit of the variable-radix number
template <typename ElemType>
void VariableRadixToPermutation(const int varNumber[], int num, ElemType perm[])
{
    for (int k = num - 1; k >= 0; --k) {
        // Move the (varNumber[k] + 1)-th largest element of the current unplaced subset
        // behind the subsequence that follows it
        int m = k + 1;             // index of the last element of the current unplaced subset
        int j = m - varNumber[k];  // index of the (varNumber[k] + 1)-th largest element of that subset
#if 0
        // does the same job as std::rotate
        ElemType tmp = perm[j];
        for (; j < m; ++j)
            perm[j] = perm[j + 1];
        perm[m] = tmp;
#else
        rotate(perm + j, perm + j + 1, perm + m + 1);
#endif
    }
}

//////////////////////////////////////////////////////////////////////////////////////

class AssureException: public std::exception
{
};

#define Assure(os, x) (void)((!!(x)) || (ShowFailedMessage(os, #x, __FILE__, __LINE__), 0))

inline void ShowFailedMessage(std::ostream &os, const char* expr, const char *file, size_t line)
{
    os << "Failed: " << expr << ", file \"" << file << "\", line " << line << '\n';
    throw AssureException();
}

void ShowUsage1()
{
    try {
        size_t num = 235;
        vector<int> varNumber;
        DecimalToVariableRadix(num, varNumber);

        cout << "Decimal number: " << num;
        cout << "\nConverted to variable radix number (low -> high): ";
        copy(varNumber.begin(), varNumber.end(), ostream_iterator<int>(cout, " "));

        size_t newNum = VariableRadixToDecimal(&varNumber[0], varNumber.size());
        cout << "\nConverted back to decimal number: " << newNum << '\n';

        Assure(cout, num == newNum);
        cout << endl;
    }
    catch (AssureException) {
    }
}

void ShowUsage2()
{
    try {
        char perm[] = {'d', 'e', 'a', 'b', 'f', 'c', 'g'};
        const int NUM = sizeof(perm) / sizeof(perm[0]);
        vector<int> varNumber;
        PermutationToVariableRadix(perm, NUM, varNumber);

        cout << "Permutation: ";
        copy(perm, perm + NUM, ostream_iterator<char>(cout));
        cout << "\nConverted to variable radix number (low -> high): ";
        copy(varNumber.begin(), varNumber.end(), ostream_iterator<int>(cout, " "));

        char newPerm[NUM] = {'a', 'b', 'c', 'd', 'e', 'f', 'g'};
        VariableRadixToPermutation(&varNumber[0], varNumber.size(), newPerm);
        cout << "\nConverted back to permutation: ";
        copy(newPerm, newPerm + NUM, ostream_iterator<char>(cout));
        cout << '\n';

        Assure(cout, equal(perm, perm + NUM, newPerm));
        cout << endl;
    }
    catch (AssureException) {
    }
}

void Test()
{
    try {
        cout << "testing \"permutation -> variable radix -> decimal -> "
                "variable radix -> permutation\"...";

        const int NUM = 9;
        char perm[NUM] = {'1', '2', '3', '4', '5', '6', '7', '8', '9'};

        do {
            // permutation will be converted to variable radix number
            vector<int> varNumber;
            PermutationToVariableRadix(perm, NUM, varNumber);

            // variable radix number will be converted to decimal number
            size_t decimalNumber = VariableRadixToDecimal(&varNumber[0], varNumber.size());

            // decimal number will be converted back to variable radix number
            vector<int> newVarNumber;
            DecimalToVariableRadix(decimalNumber, newVarNumber, NUM - 1);

            // variable radix number will be converted back to permutation
            char newPerm[NUM] = {'1', '2', '3', '4', '5', '6', '7', '8', '9'};
            VariableRadixToPermutation(&newVarNumber[0], newVarNumber.size(), newPerm);

            Assure(cout, equal(varNumber.begin(), varNumber.end(), newVarNumber.begin()));
            Assure(cout, equal(perm, perm + NUM, newPerm));
        } while (next_permutation(perm, perm + NUM));

        cout << "done. Ok!" << endl;
    }
    catch (AssureException) {
    }
}

int main()
{
    ShowUsage1();
    ShowUsage2();
    Test();
}

 

 

 
