Multithreading C++ Out of Core Sorting for Massive Data

First, some context: this is groundwork for implementing PantaRay, or an out-of-core point-cloud GI solution like the one from DreamWorks. It is the first step toward ray tracing massive point clouds. In the real application the int type would be replaced by a 64-bit uint64_t holding a spatial hash key. All the code is written with the STL and Boost at a reasonably high level of abstraction, so readers can adapt it to their own needs.

Let's prepare the test data first. Its size is constrained: in my x86_64 environment a single array allocated with malloc/new was limited to about 1G, so one in-memory sort (qsort or std::stable_sort) could not handle more than that. For this test I therefore generated a single file of roughly 962M containing 246324610 ints.

The following program generates the test data: uniformly distributed integers drawn from a Mersenne Twister (MT19937) generator.

#include <cstdio>
#include <cstdlib>
#include <iostream>

#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int_distribution.hpp>

int main(int argc, char *argv[])
{
    // Skip the program name; expect <output path> <item count>.
    -- argc, ++ argv;
    if (argc != 2)
    {
        return 1;
    }
    char * szPath = argv[0];
    int iCount = atoi(argv[1]);
    std::cout << szPath << " " << iCount << std::endl;

    boost::random::mt19937 cGen;
    boost::random::uniform_int_distribution<> cDist(0, 99999999);

    FILE * pFile = fopen(szPath, "wb");
    if (! pFile)
    {
        return 1;
    }
    for (int i = 0; i < iCount; ++ i)
    {
        int iRandom = cDist(cGen);
        fwrite(& iRandom, sizeof(int), 1, pFile);
    }

    fclose(pFile);
    return 0;
}
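The same idea can be written with the C++11 `<random>` header and block-wise writes instead of one fwrite per int. This is only a sketch under assumptions: `WriteTestData` and its parameters are hypothetical names, `std::mt19937` stands in for `boost::random::mt19937`, and because the mapping used by `std::uniform_int_distribution` is implementation-defined, the output may not be byte-identical to the Boost version.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <random>
#include <sstream>   // std::ostringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: writes nCount ints drawn uniformly from [0, 99999999]
// to a binary stream, nBlockItems values per write call.
void WriteTestData(std::ostream & cOutput, std::size_t nCount, std::size_t nBlockItems = 1 << 16)
{
    assert(nBlockItems > 0);

    std::mt19937 cGen;  // default seed: the same build always produces the same sequence
    std::uniform_int_distribution<int> cDist(0, 99999999);

    std::vector<int> vBlock;
    vBlock.reserve(nBlockItems);
    for (std::size_t i = 0; i < nCount; ++ i)
    {
        vBlock.push_back(cDist(cGen));
        // Flush when the block is full or when this was the last value.
        if (vBlock.size() == nBlockItems || i + 1 == nCount)
        {
            cOutput.write(reinterpret_cast<const char *>(vBlock.data()),
                          static_cast<std::streamsize>(vBlock.size() * sizeof(int)));
            vBlock.clear();
        }
    }
}
```

To write a real file, pass a `std::ofstream` opened with `std::ios_base::binary`. The block size only changes how often the stream is written, not the bytes produced.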

Next, sort the data entirely in memory and store the result in a separate file, so that the out-of-core result can be compared against it.

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <functional>

int main(int argc, char * argv[])
{
    // Skip the program name; expect <unsorted input> <sorted output>.
    -- argc, ++ argv;
    if (argc != 2)
    {
        return EXIT_FAILURE;
    }

    FILE * pOriginalFile = fopen(argv[0], "rb");
    if (! pOriginalFile)
    {
        return EXIT_FAILURE;
    }
    fseek(pOriginalFile, 0, SEEK_END);
    long lSize = ftell(pOriginalFile);
    fseek(pOriginalFile, 0, SEEK_SET);

    // Load the whole file and sort it in memory.
    int iNumItems = static_cast<int>(lSize / sizeof(int));
    int * pData = new int[iNumItems];
    fread(pData, sizeof(int), iNumItems, pOriginalFile);
    fclose(pOriginalFile);
    std::stable_sort(pData, pData + iNumItems, std::less<int>());

    FILE * pSortedFile = fopen(argv[1], "wb");
    if (! pSortedFile)
    {
        delete [] pData;
        return EXIT_FAILURE;
    }
    fwrite(pData, sizeof(int), iNumItems, pSortedFile);
    fclose(pSortedFile);

    delete [] pData;

    return EXIT_SUCCESS;
}
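Besides comparing against the reference file, the output of the external sort can be checked on its own without loading it into memory. The following is a minimal sketch, not part of the original program: `IsSortedStream` is a hypothetical helper that scans a binary stream of ints block by block and verifies non-decreasing order.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: returns true when the binary stream holds ints in
// non-decreasing order. Only one block of nBlockItems ints is in memory.
bool IsSortedStream(std::istream & cInput, std::size_t nBlockItems = 1 << 16)
{
    assert(nBlockItems > 0);

    std::vector<int> vBlock(nBlockItems);
    bool bHasPrevious = false;
    int iPrevious = 0;
    while (true)
    {
        cInput.read(reinterpret_cast<char *>(& vBlock[0]),
                    static_cast<std::streamsize>(vBlock.size() * sizeof(int)));
        // gcount() reports how many bytes the last read actually delivered.
        const std::size_t nRead = static_cast<std::size_t>(cInput.gcount()) / sizeof(int);
        if (nRead == 0)
        {
            break;
        }
        for (std::size_t i = 0; i < nRead; ++ i)
        {
            // The previous value is carried across block boundaries.
            if (bHasPrevious && vBlock[i] < iPrevious)
            {
                return false;
            }
            iPrevious = vBlock[i];
            bHasPrevious = true;
        }
    }
    return true;
}
```

For a real file, pass a `std::ifstream` opened with `std::ios_base::binary`. This only proves the order, not that the values match the input, so the byte-for-byte comparison with the reference file is still the stronger test.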

On the design side: a single mechanical disk effectively serves one request at a time, so reads and writes issued from several threads mostly end up serialized. All reading is therefore done in the main thread. The sorting threads write out their own results, and to keep disk writing a small fraction of the total processing time, each worker thread is given as much data as possible per job.

Let's first define a class named Job. As the name suggests, it represents one unit of work: each job has its own index and a block of unsorted int data.

class Job
{
public:

    // An empty job: zero items means "nothing to do".
    Job()
    :
    m_iIndex(0),
    m_iNumItems(0)
    {
    }

    Job(int iIndex, int iNumItems, const boost::shared_array<int> & aData)
    :
    m_iIndex(iIndex),
    m_iNumItems(iNumItems),
    m_aData(aData)
    {
    }

    // Copies share the same underlying array through shared_array.
    Job(const Job & cCopy)
    :
    m_iIndex(cCopy.m_iIndex),
    m_iNumItems(cCopy.m_iNumItems),
    m_aData(cCopy.m_aData)
    {
    }

public:

    int m_iIndex;                       // Batch index, also used for the temp file name.
    int m_iNumItems;                    // Number of ints in m_aData.
    boost::shared_array<int> m_aData;   // The unsorted data.
};

Next comes Context, which holds the data shared between threads: the job queue, plus the mutex and condition variable needed for synchronization.

class Context
{
public:

    Context(int iNumSortingThread)
    :
    m_iNumSortingThread(iNumSortingThread),
    m_bHasMoreData(true)
    {
    }

public:

    int m_iNumSortingThread;

    // Set to false by the main thread once the whole input has been queued.
    bool m_bHasMoreData;

    // m_cMutex protects the job queue; m_cEvent wakes the main thread
    // when a worker wants more data.
    boost::mutex m_cMutex;
    boost::condition_variable m_cEvent;

    std::list<Job > m_lJobQueue;
};

Here is the worker thread, which contains the actual work. Access to the queue in Context must happen under the lock: the worker takes one job out as local data, sorts it, and writes it out as a temporary cache file. At the end of each iteration it greedily tells the main thread that it wants more data, and if there really is no data left it exits.

class SortingThread : public boost::thread
{
public:

    // The boost::thread base is constructed, and starts running, before any
    // data member is initialized. The context is therefore bound into the
    // thread function instead of being read from a member.
    SortingThread(const boost::shared_ptr<Context> & pContext)
    :
    boost::thread(boost::bind(& SortingThread::Sort, pContext))
    {
    }

    static void Sort(boost::shared_ptr<Context> pContext)
    {
        while (1)
        {
            Job cJob;
            {
                // The queue and the end-of-data flag are only read under the lock.
                boost::unique_lock<boost::mutex> cLock(pContext->m_cMutex);
                if (pContext->m_lJobQueue.size())
                {
                    // Get a job.
                    //
                    cJob = pContext->m_lJobQueue.front();
                    pContext->m_lJobQueue.pop_front();
                }
                else if (! pContext->m_bHasMoreData)
                {
                    break;
                }
            }

            if (cJob.m_iNumItems)
            {
                std::stable_sort(cJob.m_aData.get(), cJob.m_aData.get() + cJob.m_iNumItems, std::less<int>());

                // Write out the sorted data.
                //
                char aBuffer[256];
                sprintf(aBuffer, "%.06d.tmp", cJob.m_iIndex);
                std::ofstream cOutput(aBuffer, std::ios_base::binary);
                cOutput.write(reinterpret_cast<const char *>(cJob.m_aData.get()), cJob.m_iNumItems * sizeof(int));
            }
            else
            {
                // Nothing queued yet: give up the time slice instead of spinning hard.
                boost::this_thread::yield();
            }

            // Tell the main thread we need more data here.
            //
            pContext->m_cEvent.notify_one();
        }
    }
};

All the threads are put into a thread group, which serves as a simple thread pool, so they can be started and joined together.

class SortingThreadGroup : public boost::thread_group
{
public:

    SortingThreadGroup(const boost::shared_ptr<Context> & pContext)
    :
    m_pContext(pContext)
    {
        for (int i = 0; i < m_pContext->m_iNumSortingThread; ++ i)
        {
            SortingThread * pSortingThread = new SortingThread(pContext);
            add_thread(pSortingThread);
        }
    }

private:

    boost::shared_ptr<Context> m_pContext;
};

The main thread reads data from the input file and fills Job objects. It keeps the amount of data waiting in the queue within a fixed bound so that memory usage stays small; otherwise the whole point of an external sort would be lost.

bool Sort(const char * szPath, int iNumSortingThreads, int iNumLocalItems)
{
    try
    {
        if (iNumSortingThreads <= 0 || iNumLocalItems <= 0)
        {
            std::cerr << "Invalid thread count or batch size. " << std::endl;
            return false;
        }

        // Calculate real size.
        //
        std::ifstream cUnSortedFile(szPath, std::ios_base::binary);
        if (! cUnSortedFile)
        {
            std::cerr << "Can't open " << szPath << " to read. " << std::endl;
            return false;
        }
        boost::uintmax_t ullSize = boost::filesystem::file_size(szPath);
        boost::uintmax_t ullNumItems = ullSize / sizeof(int);

        int iNumBatches = static_cast<int>(ullNumItems / iNumLocalItems);
        std::vector<int> vNumItemsPerBatch(iNumBatches, iNumLocalItems);
        int iNumRestItems = static_cast<int>(ullNumItems % iNumLocalItems);
        if (iNumRestItems)
        {
            vNumItemsPerBatch.push_back(iNumRestItems);
        }
        std::cout << "Number of Items   : " << ullNumItems << std::endl
                  << "Number of Batches : " << vNumItemsPerBatch.size() << std::endl;

        boost::shared_ptr<Context> pContext(new Context(iNumSortingThreads));
        boost::scoped_ptr<SortingThreadGroup> pSortingThreadGroup(new SortingThreadGroup(pContext));

        boost::timer::auto_cpu_timer cTimer;
        for (std::size_t i = 0; i < vNumItemsPerBatch.size(); ++ i)
        {
            boost::shared_array<int> aData(new int[vNumItemsPerBatch[i]]);
            cUnSortedFile.read(reinterpret_cast<char *>(aData.get()), vNumItemsPerBatch[i] * sizeof(int));

            Job cJob(static_cast<int>(i), vNumItemsPerBatch[i], aData);

            // Block while the queue is full. The loop re-checks the condition
            // after every wakeup, which also covers spurious wakeups.
            //
            boost::unique_lock<boost::mutex> cLock(pContext->m_cMutex);
            while (pContext->m_lJobQueue.size() > static_cast<std::size_t>(iNumSortingThreads) * 2)
            {
                pContext->m_cEvent.wait(cLock);
            }
            pContext->m_lJobQueue.push_back(cJob);
        }
        std::cout << std::endl;
        {
            // Publish the end-of-data flag under the lock that guards the queue.
            boost::unique_lock<boost::mutex> cLock(pContext->m_cMutex);
            pContext->m_bHasMoreData = false;
        }

        pSortingThreadGroup->join_all();

        return true;
    }
    catch(const std::exception & cE)
    {
        std::cerr << cE.what() << std::endl;
    }
    catch(...)
    {
        std::cerr << __LINE__ << std::endl;
    }

    return false;
}
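In the code above, a worker that finds the queue empty simply loops and tries again. One common alternative is a bounded queue with two condition variables, so that both sides sleep until there is something to do. The sketch below is not the author's design: it assumes C++11 `std::mutex` and `std::condition_variable` in place of Boost, and `BoundedQueue` with its `Push`, `Pop` and `Close` methods is a hypothetical name chosen for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded queue. Producers block while the queue is full;
// consumers block while it is empty and not yet closed.
template <typename T>
class BoundedQueue
{
public:
    explicit BoundedQueue(std::size_t nCapacity)
    :
    m_nCapacity(nCapacity),
    m_bClosed(false)
    {
        assert(nCapacity > 0);
    }

    void Push(T cItem)
    {
        std::unique_lock<std::mutex> cLock(m_cMutex);
        // The predicate form of wait() re-checks after every wakeup.
        m_cNotFull.wait(cLock, [this] { return m_cItems.size() < m_nCapacity; });
        m_cItems.push_back(std::move(cItem));
        m_cNotEmpty.notify_one();
    }

    // Returns false once the queue has been closed and drained.
    bool Pop(T & cItem)
    {
        std::unique_lock<std::mutex> cLock(m_cMutex);
        m_cNotEmpty.wait(cLock, [this] { return m_bClosed || ! m_cItems.empty(); });
        if (m_cItems.empty())
        {
            return false;
        }
        cItem = std::move(m_cItems.front());
        m_cItems.pop_front();
        m_cNotFull.notify_one();
        return true;
    }

    // Called by the producer when there is no more data.
    void Close()
    {
        std::lock_guard<std::mutex> cLock(m_cMutex);
        m_bClosed = true;
        m_cNotEmpty.notify_all();
    }

private:
    std::size_t m_nCapacity;
    bool m_bClosed;
    std::mutex m_cMutex;
    std::condition_variable m_cNotFull;
    std::condition_variable m_cNotEmpty;
    std::deque<T> m_cItems;
};
```

With such a queue the main thread would call `Push` for every batch and `Close` at the end, while each worker runs `while (cQueue.Pop(cJob)) { ... }`. `Close` takes the place of the m_bHasMoreData flag, and the capacity takes the place of the `iNumSortingThreads * 2` bound.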

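The second pass is a classical k-way merge. The idea is simple: open all the temporary files and maintain a queue holding the current head value of each one. Each step outputs the smallest value in the queue, then reads the next value from the same file into its slot. When a file is exhausted, that stream is dropped and the queue shrinks accordingly. This pass is single-threaded.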

bool Merge(const char * szPath, int iNumBatches)
{
    try
    {
        //TODO : There is the limitation about the max number of opened file in process.
        //
        std::vector<boost::shared_ptr<std::ifstream> > vTempFiles;
        for (int i = 0; i < iNumBatches; ++ i)
        {
            char aBuffer[256];
            sprintf(aBuffer, "%.06d.tmp", i);
            boost::shared_ptr<std::ifstream> pTempFile(new std::ifstream(aBuffer, std::ios_base::binary));
            if (! pTempFile->is_open())
            {
                std::cerr << "Can't open " << aBuffer << " to read. " << std::endl;
                return false;
            }
            vTempFiles.push_back(pTempFile);
        }

        std::ofstream cSortedFile(szPath, std::ios_base::binary);
        if (! cSortedFile)
        {
            std::cerr << "Can't open " << szPath << " to write. " << std::endl;
            return false;
        }

        //
        boost::timer::auto_cpu_timer cTimer;

        // Output buffer, flushed to disk whenever it fills up.
        std::vector<int> vCache;
        vCache.reserve(10 * 1024 * 1024);

        // vQueue[i] holds the current head value of vTempFiles[i]; the two
        // vectors are kept index-aligned.
        std::vector<int> vQueue;
        std::vector<boost::shared_ptr<std::ifstream> >::iterator iFile = vTempFiles.begin();
        while (iFile != vTempFiles.end())
        {
            int iNumber = - 1;
            if ((* iFile)->read(reinterpret_cast<char *>(& iNumber), sizeof(int)))
            {
                vQueue.push_back(iNumber);
                ++ iFile;
            }
            else
            {
                // Drop empty files so the indices stay aligned.
                iFile = vTempFiles.erase(iFile);
            }
        }
        while (vQueue.size())
        {
            std::vector<int>::iterator iMinPos = std::min_element(vQueue.begin(), vQueue.end());
            vCache.push_back(* iMinPos);
            if (vCache.size() == vCache.capacity())
            {
                cSortedFile.write(reinterpret_cast<const char *>(& vCache[0]), vCache.size() * sizeof(int));
                vCache.clear();
            }

            // Refill the slot from the file the minimum came from.
            iFile = vTempFiles.begin() + (iMinPos - vQueue.begin());
            int iNumber = - 1;
            if ((* iFile)->read(reinterpret_cast<char *>(& iNumber), sizeof(int)))
            {
                (* iMinPos) = iNumber;
            }
            else
            {
                // This file is exhausted: remove the stream and its slot.
                vTempFiles.erase(iFile);
                vQueue.erase(iMinPos);
            }
        }
        if (vCache.size())
        {
            cSortedFile.write(reinterpret_cast<const char *>(& vCache[0]), vCache.size() * sizeof(int));
        }

        return true;
    }
    catch(const std::exception & cE)
    {
        std::cerr << cE.what() << std::endl;
    }
    catch(...)
    {
        std::cerr << __LINE__ << std::endl;
    }

    return false;
}
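The merge above scans all k head values with std::min_element for every output value. A min-heap reduces that to O(log k) per value, which may matter once the number of temporary files grows. The following is a sketch under assumptions rather than a drop-in replacement: in-memory vectors stand in for the temporary files, and `MergeRuns` is a hypothetical function name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for the temp files: each inner vector is one sorted
// run. Returns all values merged into non-decreasing order.
std::vector<int> MergeRuns(const std::vector<std::vector<int> > & vRuns)
{
    // Heap entries are (value, run index); std::greater makes
    // std::priority_queue behave as a min-heap.
    typedef std::pair<int, std::size_t> Entry;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > cHeap;

    // vNext[r] is the position of the next unread value in run r.
    std::vector<std::size_t> vNext(vRuns.size(), 0);
    std::size_t nTotal = 0;
    for (std::size_t r = 0; r < vRuns.size(); ++ r)
    {
        nTotal += vRuns[r].size();
        if (! vRuns[r].empty())
        {
            cHeap.push(Entry(vRuns[r][0], r));
            vNext[r] = 1;
        }
    }

    std::vector<int> vMerged;
    vMerged.reserve(nTotal);
    while (! cHeap.empty())
    {
        // Emit the smallest head value.
        const Entry cTop = cHeap.top();
        cHeap.pop();
        vMerged.push_back(cTop.first);

        // Refill from the run that value came from, if it has more.
        const std::size_t r = cTop.second;
        if (vNext[r] < vRuns[r].size())
        {
            cHeap.push(Entry(vRuns[r][vNext[r]], r));
            ++ vNext[r];
        }
    }
    assert(vMerged.size() == nTotal);
    return vMerged;
}
```

In the file-based version the refill step would read the next int from the stream with index `r`, and an exhausted stream is simply never pushed again, so nothing has to be erased from the middle of a vector.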

The test machine has a single Xeon E5-2603 at 1.8GHz with 4 hardware threads. The data length of each Job was set to 80M, so each worker thread sorts 20M ints at a time. The disk is a Western Digital Blue: a purely mechanical drive, neither an SSD nor a hybrid.

The first pass, Sort, took 19.111468s wall, 52.369536s user + 4.243227s system = 56.612763s CPU (296.2%). That is a CPU efficiency of 296.2% / 300% = 98.7%, and almost all of the time is spent inside std::stable_sort.

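The second pass, Merge, took 33.082600s wall, 29.874191s user + 3.010819s system = 32.885011s CPU (99.4%); the time goes mainly to disk writing and sorting. Giving each file stream its own cache could improve performance significantly, but any cache increases memory usage, and if it grows too large the merge loses its purpose.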
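As a rough illustration of such a per-stream cache, the sketch below reads ints from a stream one block at a time and hands them out one by one. It is an assumption about how the idea could look, not code from the original program; `BufferedRun`, `Next` and the block size parameter are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical reader: pulls ints from a binary stream in blocks of
// nBlockItems and returns them one at a time.
class BufferedRun
{
public:
    BufferedRun(std::istream & cInput, std::size_t nBlockItems)
    :
    m_cInput(cInput),
    m_vBlock(nBlockItems),
    m_nValid(0),
    m_nNext(0)
    {
        assert(nBlockItems > 0);
    }

    // Stores the next value in iValue; returns false when the run is exhausted.
    bool Next(int & iValue)
    {
        if (m_nNext == m_nValid && ! Refill())
        {
            return false;
        }
        iValue = m_vBlock[m_nNext ++];
        return true;
    }

private:
    // Reads the next block; returns false if no complete int could be read.
    bool Refill()
    {
        m_cInput.read(reinterpret_cast<char *>(& m_vBlock[0]),
                      static_cast<std::streamsize>(m_vBlock.size() * sizeof(int)));
        m_nValid = static_cast<std::size_t>(m_cInput.gcount()) / sizeof(int);
        m_nNext = 0;
        return m_nValid > 0;
    }

    std::istream & m_cInput;
    std::vector<int> m_vBlock;
    std::size_t m_nValid;   // Number of ints currently held in m_vBlock.
    std::size_t m_nNext;    // Index of the next int to hand out.
};
```

The memory cost is the number of open runs times the block size, which is exactly the trade-off described above: larger blocks mean fewer reads per run but a larger resident footprint.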

Readers may wonder about the repeated new in the main thread. Since Vista, memory allocation on Windows has been pooled, and in any case allocation is not the bottleneck here; disk IO and sorting are, so there is no need to optimize it. There is also not much to gain architecturally, because this is not the traditional multiple-reader + single-writer setup but rather multiple reader-writers + a single writer, so the structure differs somewhat from the classic producer/consumer threading model. In the future I may try a lock-free design instead of the mutex, but that is a topic for later.

Here is the full code.

/**
 * Multithreading C++ Out of Core Sorting for Massive Data
 *
 * Copyright (c) 2013 Bo Zhou<[email protected]>
 * All rights reserved.
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *     * Redistributions of source code must retain the above copyright
 *       notice, this list of conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *     * Neither the name of the University of California, Berkeley nor the
 *       names of its contributors may be used to endorse or promote products
 *       derived from this software without specific prior written permission.
 */

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <functional>
#include <iostream>
#include <list>
#include <queue>
#include <vector>

#include <boost/bind.hpp>
#include <boost/filesystem.hpp>
#include <boost/smart_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/timer/timer.hpp>

class Job
{
public:

    Job()
    :
    m_iIndex(0),
    m_iNumItems(0)
    {
    }

    Job(int iIndex, int iNumItems, const boost::shared_array<int> & aData)
    :
    m_iIndex(iIndex),
    m_iNumItems(iNumItems),
    m_aData(aData)
    {
    }

    Job(const Job & cCopy)
    :
    m_iIndex(cCopy.m_iIndex),
    m_iNumItems(cCopy.m_iNumItems),
    m_aData(cCopy.m_aData)
    {
    }

public:

    int m_iIndex;
    int m_iNumItems;
    boost::shared_array<int> m_aData;
};

class Context
{
public:

    Context(int iNumSortingThread)
    :
    m_iNumSortingThread(iNumSortingThread),
    m_bHasMoreData(true)
    {
    }

public:

    int m_iNumSortingThread;

    bool m_bHasMoreData;

    boost::mutex m_cMutex;
    boost::condition_variable m_cEvent;

    std::list<Job > m_lJobQueue;
};

class SortingThread : public boost::thread
{
public:

    // The boost::thread base is constructed, and starts running, before any
    // data member is initialized. The context is therefore bound into the
    // thread function instead of being read from a member.
    SortingThread(const boost::shared_ptr<Context> & pContext)
    :
    boost::thread(boost::bind(& SortingThread::Sort, pContext))
    {
    }

    static void Sort(boost::shared_ptr<Context> pContext)
    {
        while (1)
        {
            Job cJob;
            {
                // The queue and the end-of-data flag are only read under the lock.
                boost::unique_lock<boost::mutex> cLock(pContext->m_cMutex);
                if (pContext->m_lJobQueue.size())
                {
                    // Get a job.
                    //
                    cJob = pContext->m_lJobQueue.front();
                    pContext->m_lJobQueue.pop_front();
                }
                else if (! pContext->m_bHasMoreData)
                {
                    break;
                }
            }

            if (cJob.m_iNumItems)
            {
                std::stable_sort(cJob.m_aData.get(), cJob.m_aData.get() + cJob.m_iNumItems, std::less<int>());

                // Write out the sorted data.
                //
                char aBuffer[256];
                sprintf(aBuffer, "%.06d.tmp", cJob.m_iIndex);
                std::ofstream cOutput(aBuffer, std::ios_base::binary);
                cOutput.write(reinterpret_cast<const char *>(cJob.m_aData.get()), cJob.m_iNumItems * sizeof(int));
            }
            else
            {
                // Nothing queued yet: give up the time slice instead of spinning hard.
                boost::this_thread::yield();
            }

            // Tell the main thread we need more data here.
            //
            pContext->m_cEvent.notify_one();
        }
    }
};

class SortingThreadGroup : public boost::thread_group
{
public:

    SortingThreadGroup(const boost::shared_ptr<Context> & pContext)
    :
    m_pContext(pContext)
    {
        for (int i = 0; i < m_pContext->m_iNumSortingThread; ++ i)
        {
            SortingThread * pSortingThread = new SortingThread(pContext);
            add_thread(pSortingThread);
        }
    }

private:

    boost::shared_ptr<Context> m_pContext;
};

///////////////////////////////////////////////////////////////////////////////////////////////////

bool Sort(const char * szPath, int iNumSortingThreads, int iNumLocalItems)
{
    try
    {
        if (iNumSortingThreads <= 0 || iNumLocalItems <= 0)
        {
            std::cerr << "Invalid thread count or batch size. " << std::endl;
            return false;
        }

        // Calculate real size.
        //
        std::ifstream cUnSortedFile(szPath, std::ios_base::binary);
        if (! cUnSortedFile)
        {
            std::cerr << "Can't open " << szPath << " to read. " << std::endl;
            return false;
        }
        boost::uintmax_t ullSize = boost::filesystem::file_size(szPath);
        boost::uintmax_t ullNumItems = ullSize / sizeof(int);

        int iNumBatches = static_cast<int>(ullNumItems / iNumLocalItems);
        std::vector<int> vNumItemsPerBatch(iNumBatches, iNumLocalItems);
        int iNumRestItems = static_cast<int>(ullNumItems % iNumLocalItems);
        if (iNumRestItems)
        {
            vNumItemsPerBatch.push_back(iNumRestItems);
        }
        std::cout << "Number of Items   : " << ullNumItems << std::endl
                  << "Number of Batches : " << vNumItemsPerBatch.size() << std::endl;

        boost::shared_ptr<Context> pContext(new Context(iNumSortingThreads));
        boost::scoped_ptr<SortingThreadGroup> pSortingThreadGroup(new SortingThreadGroup(pContext));

        boost::timer::auto_cpu_timer cTimer;
        for (std::size_t i = 0; i < vNumItemsPerBatch.size(); ++ i)
        {
            boost::shared_array<int> aData(new int[vNumItemsPerBatch[i]]);
            cUnSortedFile.read(reinterpret_cast<char *>(aData.get()), vNumItemsPerBatch[i] * sizeof(int));

            Job cJob(static_cast<int>(i), vNumItemsPerBatch[i], aData);

            // Block while the queue is full. The loop re-checks the condition
            // after every wakeup, which also covers spurious wakeups.
            //
            boost::unique_lock<boost::mutex> cLock(pContext->m_cMutex);
            while (pContext->m_lJobQueue.size() > static_cast<std::size_t>(iNumSortingThreads) * 2)
            {
                pContext->m_cEvent.wait(cLock);
            }
            pContext->m_lJobQueue.push_back(cJob);
        }
        std::cout << std::endl;
        {
            // Publish the end-of-data flag under the lock that guards the queue.
            boost::unique_lock<boost::mutex> cLock(pContext->m_cMutex);
            pContext->m_bHasMoreData = false;
        }

        pSortingThreadGroup->join_all();

        return true;
    }
    catch(const std::exception & cE)
    {
        std::cerr << cE.what() << std::endl;
    }
    catch(...)
    {
        std::cerr << __LINE__ << std::endl;
    }

    return false;
}

///////////////////////////////////////////////////////////////////////////////////////////////////

bool Merge(const char * szPath, int iNumBatches)
{
    try
    {
        //TODO : There is the limitation about the max number of opened file in process.
        //
        std::vector<boost::shared_ptr<std::ifstream> > vTempFiles;
        for (int i = 0; i < iNumBatches; ++ i)
        {
            char aBuffer[256];
            sprintf(aBuffer, "%.06d.tmp", i);
            boost::shared_ptr<std::ifstream> pTempFile(new std::ifstream(aBuffer, std::ios_base::binary));
            if (! pTempFile->is_open())
            {
                std::cerr << "Can't open " << aBuffer << " to read. " << std::endl;
                return false;
            }
            vTempFiles.push_back(pTempFile);
        }

        std::ofstream cSortedFile(szPath, std::ios_base::binary);
        if (! cSortedFile)
        {
            std::cerr << "Can't open " << szPath << " to write. " << std::endl;
            return false;
        }

        //
        boost::timer::auto_cpu_timer cTimer;

        // Output buffer, flushed to disk whenever it fills up.
        std::vector<int> vCache;
        vCache.reserve(10 * 1024 * 1024);

        // vQueue[i] holds the current head value of vTempFiles[i]; the two
        // vectors are kept index-aligned.
        std::vector<int> vQueue;
        std::vector<boost::shared_ptr<std::ifstream> >::iterator iFile = vTempFiles.begin();
        while (iFile != vTempFiles.end())
        {
            int iNumber = - 1;
            if ((* iFile)->read(reinterpret_cast<char *>(& iNumber), sizeof(int)))
            {
                vQueue.push_back(iNumber);
                ++ iFile;
            }
            else
            {
                // Drop empty files so the indices stay aligned.
                iFile = vTempFiles.erase(iFile);
            }
        }
        while (vQueue.size())
        {
            std::vector<int>::iterator iMinPos = std::min_element(vQueue.begin(), vQueue.end());
            vCache.push_back(* iMinPos);
            if (vCache.size() == vCache.capacity())
            {
                cSortedFile.write(reinterpret_cast<const char *>(& vCache[0]), vCache.size() * sizeof(int));
                vCache.clear();
            }

            // Refill the slot from the file the minimum came from.
            iFile = vTempFiles.begin() + (iMinPos - vQueue.begin());
            int iNumber = - 1;
            if ((* iFile)->read(reinterpret_cast<char *>(& iNumber), sizeof(int)))
            {
                (* iMinPos) = iNumber;
            }
            else
            {
                // This file is exhausted: remove the stream and its slot.
                vTempFiles.erase(iFile);
                vQueue.erase(iMinPos);
            }
        }
        if (vCache.size())
        {
            cSortedFile.write(reinterpret_cast<const char *>(& vCache[0]), vCache.size() * sizeof(int));
        }

        return true;
    }
    catch(const std::exception & cE)
    {
        std::cerr << cE.what() << std::endl;
    }
    catch(...)
    {
        std::cerr << __LINE__ << std::endl;
    }

    return false;
}

int main(int argc, char * argv[])
{
    int iRet = EXIT_FAILURE;

    //
    char * szPath = NULL;

    int iNumSortingThreads = 0;
    int iNumLocalItems = 0;

    int iNumBatches = 0;

    // Three arguments run the Sort pass: <input> <threads> <items per batch, in M>.
    // Two arguments run the Merge pass: <output> <number of batches>.
    //
    -- argc, ++ argv;
    if (argc == 3)
    {
        szPath = argv[0];
        iNumSortingThreads = atoi(argv[1]);
        iNumLocalItems = atoi(argv[2]) * 1024 * 1024;
        if (Sort(szPath, iNumSortingThreads, iNumLocalItems))
        {
            iRet = EXIT_SUCCESS;
        }
    }
    else if (argc == 2)
    {
        szPath = argv[0];
        iNumBatches = atoi(argv[1]);
        if (Merge(szPath, iNumBatches))
        {
            iRet = EXIT_SUCCESS;
        }
    }

    return iRet;
}