External Sorting

External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and must instead reside in slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted subfiles are combined into a single larger file. (Source: Wikipedia, External sorting)


One example of external sorting is the external merge sort algorithm, which sorts chunks that each fit in RAM and then merges the sorted chunks together. We first divide the file into runs, each small enough to fit into main memory. We then sort each run in main memory using merge sort. Finally, we merge the resulting runs into successively bigger runs until the whole file is sorted.
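To get a feel for how many runs and merge passes this produces, here is a small sketch. The names count_runs_and_passes, total_items, memory_items and fan_in are made up for illustration, and the sketch assumes every merge pass combines at most fan_in runs at a time.

```python
import math


def count_runs_and_passes(total_items, memory_items, fan_in):
    """Return (initial runs, merge passes) for an external merge sort.

    total_items  -- number of records in the file (assumed known)
    memory_items -- how many records fit in main memory at once
    fan_in       -- how many runs are merged together in one step
    """
    # Each run holds at most memory_items records.
    runs = math.ceil(total_items / memory_items)

    # Every pass merges groups of fan_in runs into one bigger run.
    passes = 0
    remaining = runs
    while remaining > 1:
        remaining = math.ceil(remaining / fan_in)
        passes += 1
    return runs, passes
```

For example, 1000 records with room for 100 in memory give 10 runs. With fan_in=10 they are merged in a single pass; with fan_in=2 the runs shrink 10, 5, 3, 2, 1, which takes four passes.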

Prerequisites for the algorithm/code:
MergeSort: used to sort individual runs (a run is a part of the file that is small enough to fit in main memory)
Merge K Sorted Arrays: used to merge the sorted runs.
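
As a refresher, both prerequisites can be sketched in memory roughly like this. It is a minimal illustration, not the referenced implementation; the function names merge_sort and merge_k_sorted are chosen here, and the k-way merge assumes a min-heap built with Python's heapq module.

```python
import heapq


def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_k_sorted(arrays):
    """Merge k sorted lists into one sorted list with a min-heap."""
    # Heap entries are (value, array index, position inside that array).
    heap = [(arr[0], idx, 0) for idx, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)

    out = []
    while heap:
        value, idx, pos = heapq.heappop(heap)
        out.append(value)
        # Refill the heap from the array the smallest value came from.
        if pos + 1 < len(arrays[idx]):
            heapq.heappush(heap, (arrays[idx][pos + 1], idx, pos + 1))
    return out
```

The heap never holds more than k entries, one per array, so merging n values in total costs O(n log k).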

Implementation from GeeksforGeeks


How it works:

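1. Allocate a very small buffer in memory and read the large file input.txt into it line by line. Whenever the buffer is full or the end of the large file is reached, sort the buffer contents with an internal sort (merge sort, time complexity O(n log n)), then write the sorted sequence to a file on disk (files 0, 1, ..., 9).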

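2. Perform a 10-way merge of the sorted files 0 through 9 into one large file, output.txt, using a min-heap: the root of the heap is always the smallest remaining value, so it is the next value written to output.txt.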
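
The two steps above can be sketched as follows. This is a rough illustration under several assumptions, not the GeeksforGeeks code: the input holds one integer per line, the helper names write_sorted_runs, merge_runs and external_sort and the buffer_lines parameter are invented here, and Python's built-in sorted (Timsort) stands in for the merge sort used in step 1.

```python
import heapq
import os
import tempfile
from itertools import islice


def write_sorted_runs(input_path, run_dir, buffer_lines):
    """Step 1: sort buffer-sized chunks and write them as run files 0, 1, ..."""
    run_paths = []
    with open(input_path) as src:
        while True:
            # Read at most buffer_lines lines; an empty chunk means end of file.
            chunk = list(islice(src, buffer_lines))
            if not chunk:
                break
            buffer = sorted(int(line) for line in chunk if line.strip())
            run_path = os.path.join(run_dir, str(len(run_paths)))
            with open(run_path, "w") as run:
                run.writelines(f"{value}\n" for value in buffer)
            run_paths.append(run_path)
    return run_paths


def merge_runs(run_paths, output_path):
    """Step 2: k-way merge of the run files using a min-heap."""
    files = [open(path) for path in run_paths]
    try:
        # Seed the heap with the first value of every non-empty run.
        heap = []
        for idx, run in enumerate(files):
            line = run.readline()
            if line:
                heap.append((int(line), idx))
        heapq.heapify(heap)

        with open(output_path, "w") as out:
            while heap:
                value, idx = heap[0]  # the root is the smallest value
                out.write(f"{value}\n")
                line = files[idx].readline()
                if line:
                    # Replace the root with the next value from the same run.
                    heapq.heapreplace(heap, (int(line), idx))
                else:
                    heapq.heappop(heap)  # this run is exhausted
    finally:
        for run in files:
            run.close()


def external_sort(input_path, output_path, buffer_lines=1000):
    """Sort input_path into output_path, holding few lines in memory."""
    with tempfile.TemporaryDirectory() as run_dir:
        run_paths = write_sorted_runs(input_path, run_dir, buffer_lines)
        merge_runs(run_paths, output_path)
```

Calling external_sort("input.txt", "output.txt", buffer_lines=1000) would write the sorted numbers to output.txt. During the merge only one value per run sits in the heap, so memory use depends on the number of runs, not on the size of the input.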


