Translated from: How to implement classic sorting algorithms in modern C++?
The std::sort algorithm (and its cousins std::partial_sort and std::nth_element) from the C++ Standard Library is in most implementations a complicated and hybrid amalgamation of more elementary sorting algorithms, such as selection sort, insertion sort, quick sort, merge sort, or heap sort.
There are many questions here and on sister sites such as https://codereview.stackexchange.com/ related to bugs, complexity and other aspects of implementations of these classic sorting algorithms. Most of the offered implementations consist of raw loops, use index manipulation and concrete types, and are generally non-trivial to analyse in terms of correctness and efficiency.
Question: how can the above-mentioned classic sorting algorithms be implemented using modern C++?

- no raw loops, but combining the Standard Library's algorithmic building blocks from <algorithm>
- C++14 style, including the full Standard Library, as well as syntactic noise reducers such as auto, template aliases, transparent comparators and polymorphic lambdas.

Notes:
- According to Sean Parent's conventions (slide 39), a raw loop is a for-loop longer than the composition of two functions with an operator. So f(g(x)); or f(x); g(x); or f(x) + g(x); are not raw loops, and neither are the loops in selection_sort and insertion_sort below.

Reference: https://stackoom.com/question/1fQkk/如何在现代C-中实现经典排序算法
We begin by assembling the algorithmic building blocks from the Standard Library:
#include <algorithm>    // min_element, iter_swap,
                        // upper_bound, rotate,
                        // partition,
                        // inplace_merge,
                        // make_heap, sort_heap, push_heap, pop_heap,
                        // is_heap, is_sorted
#include <cassert>      // assert
#include <functional>   // less
#include <iterator>     // distance, begin, end, next
- Iterator tools such as non-member std::begin() / std::end() as well as std::next() are only available as of C++11 and beyond. For C++98, one needs to write these oneself. There are substitutes from Boost.Range in boost::begin() / boost::end(), and from Boost.Utility in boost::next().
- The std::is_sorted algorithm is only available for C++11 and beyond. For C++98, this can be implemented in terms of std::adjacent_find and a hand-written function object. Boost.Algorithm also provides a boost::algorithm::is_sorted as a substitute.
- The std::is_heap algorithm is only available for C++11 and beyond.

C++14 provides transparent comparators of the form std::less<> that act polymorphically on their arguments. This avoids having to provide an iterator's type. This can be used in combination with C++11's default function template arguments to create a single overload for sorting algorithms that take < as comparison and those that have a user-defined comparison function object.
template<class It, class Compare = std::less<>>
void xxx_sort(It first, It last, Compare cmp = Compare{});
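To see the single-overload idea in action, here is a minimal sketch. The name demo_sort is a hypothetical stand-in for xxx_sort that simply forwards to std::sort; it is not one of the algorithms developed in this answer.

```cpp
#include <algorithm>   // sort
#include <cassert>     // assert
#include <functional>  // less, greater
#include <string>
#include <vector>

// Hypothetical stand-in for xxx_sort: forwards to std::sort for illustration only.
template<class It, class Compare = std::less<>>
void demo_sort(It first, It last, Compare cmp = Compare{})
{
    std::sort(first, last, cmp);
}

// The same overload serves both the default < ordering and a user-supplied comparator.
inline void demo_sort_usage()
{
    std::vector<std::string> words{"pear", "apple", "fig"};
    demo_sort(words.begin(), words.end());                    // uses std::less<>
    assert(words.front() == "apple");
    demo_sort(words.begin(), words.end(), std::greater<>{});  // user-defined order
    assert(words.front() == "pear");
}
```

Because std::less<> deduces its argument types at the call site, the signature never has to mention the iterator's value type.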
In C++11, one can define a reusable template alias to extract an iterator's value type, which adds minor clutter to the sort algorithms' signatures:
template<class It>
using value_type_t = typename std::iterator_traits<It>::value_type;

template<class It, class Compare = std::less<value_type_t<It>>>
void xxx_sort(It first, It last, Compare cmp = Compare{});
In C++98, one needs to write two overloads and use the verbose typename xxx<yyy>::type syntax:
template<class It, class Compare>
void xxx_sort(It first, It last, Compare cmp); // general implementation

template<class It>
void xxx_sort(It first, It last)
{
    xxx_sort(first, last, std::less<typename std::iterator_traits<It>::value_type>());
}
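In the same C++98 spirit, the std::is_sorted substitute mentioned earlier (std::adjacent_find plus a hand-written function object) might look roughly like the sketch below. The names out_of_order and is_sorted_98 are hypothetical, chosen only for this illustration.

```cpp
#include <algorithm>   // adjacent_find
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // iterator_traits

// Hand-written function object (hypothetical name): true when b must come
// before a, i.e. when the adjacent pair (a, b) is out of order.
template<class Compare>
struct out_of_order
{
    explicit out_of_order(Compare c) : cmp(c) {}

    template<class T>
    bool operator()(T const& a, T const& b) const { return cmp(b, a); }

    Compare cmp;
};

// General implementation: sorted means no adjacent pair is out of order.
template<class FwdIt, class Compare>
bool is_sorted_98(FwdIt first, FwdIt last, Compare cmp)
{
    return std::adjacent_find(first, last, out_of_order<Compare>(cmp)) == last;
}

// Second overload supplying std::less for the iterator's value type.
template<class FwdIt>
bool is_sorted_98(FwdIt first, FwdIt last)
{
    return is_sorted_98(first, last,
        std::less<typename std::iterator_traits<FwdIt>::value_type>());
}

inline void is_sorted_98_usage()
{
    int const sorted[]   = {1, 2, 2, 3};
    int const unsorted[] = {1, 3, 2};
    assert(is_sorted_98(sorted, sorted + 4));
    assert(!is_sorted_98(unsorted, unsorted + 3));
}
```

Note how the two-overload pattern and the spelled-out typename ... ::value_type both reappear here, which is exactly the verbosity that the C++11 and C++14 versions remove.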
- Another syntactical nicety is that C++14 facilitates wrapping user-defined comparators through polymorphic lambdas (with auto parameters that are deduced like function template arguments).
- C++11 only has monomorphic lambdas, which require the use of the above template alias value_type_t.
- In C++98, one either needs to write a standalone function object or resort to the verbose std::bind1st / std::bind2nd / std::not1 type of syntax.
- Boost.Bind improves this with boost::bind and the _1 / _2 placeholder syntax.
- C++11 and beyond also have std::find_if_not, whereas C++98 needs std::find_if with a std::not1 around a function object.

There is no generally acceptable C++14 style yet. For better or for worse, I closely follow Scott Meyers's draft Effective Modern C++ and Herb Sutter's revamped GotW. I use the following style recommendations:
- Scott Meyers's "Distinguish () and {} when creating objects": consistently choose braced-initialization {} instead of the good old parenthesized initialization () (in order to side-step all most-vexing-parse issues in generic code).
- Scott Meyers's "Prefer alias declarations to typedefs": for templates this is a must anyway, and using it everywhere instead of typedef saves time and adds consistency.
- I use a for (auto it = first; it != last; ++it) pattern in some places, in order to allow for loop invariant checking for already sorted sub-ranges. In production code, the use of while (first != last) and a ++first somewhere inside the loop might be slightly better.

Selection sort does not adapt to the data in any way, so its runtime is always O(N²). However, selection sort has the property of minimizing the number of swaps. In applications where the cost of swapping items is high, selection sort may very well be the algorithm of choice.
To implement it using the Standard Library, repeatedly use std::min_element to find the remaining minimum element, and std::iter_swap to swap it into place:
template<class FwdIt, class Compare = std::less<>>
void selection_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const selection = std::min_element(it, last, cmp);
        std::iter_swap(selection, it);
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}
Note that selection_sort has the already processed range [first, it) sorted as its loop invariant. The minimal requirements are forward iterators, compared to std::sort's random access iterators.
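As a small illustration of the forward iterator requirement, the sketch below runs the algorithm on a std::forward_list, a container that std::sort cannot handle. The function is repeated so the snippet is self-contained; selection_sort_usage is a hypothetical helper name.

```cpp
#include <algorithm>     // min_element, iter_swap, is_sorted
#include <cassert>       // assert
#include <forward_list>
#include <functional>    // less
#include <iterator>      // next

// Same algorithm as above, repeated so the sketch is self-contained.
template<class FwdIt, class Compare = std::less<>>
void selection_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const selection = std::min_element(it, last, cmp);
        std::iter_swap(selection, it);
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}

inline void selection_sort_usage()
{
    // A singly linked list only offers forward iterators.
    std::forward_list<int> data{5, 3, 8, 1, 3};
    selection_sort(data.begin(), data.end());
    assert(std::is_sorted(data.begin(), data.end()));
}
```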
Details omitted:

- Selection sort can be optimized with an early test if (std::distance(first, last) <= 1) return; (or for forward / bidirectional iterators: if (first == last || std::next(first) == last) return;).
- For bidirectional iterators, the above test can be combined with a loop over the interval [first, std::prev(last)), because the last element is guaranteed to be the minimal remaining element and doesn't require a swap.

Although it is one of the elementary sorting algorithms with O(N²) worst-case time, insertion sort is the algorithm of choice either when the data is nearly sorted (because it is adaptive) or when the problem size is small (because it has low overhead). For these reasons, and because it is also stable, insertion sort is often used as the recursive base case (when the problem size is small) for higher overhead divide-and-conquer sorting algorithms, such as merge sort or quick sort.
To implement insertion_sort with the Standard Library, repeatedly use std::upper_bound to find the location where the current element needs to go, and use std::rotate to shift the remaining elements upward in the input range:
template<class FwdIt, class Compare = std::less<>>
void insertion_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const insertion = std::upper_bound(first, it, *it, cmp);
        std::rotate(insertion, it, std::next(it));
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}
Note that insertion_sort has the already processed range [first, it) sorted as its loop invariant. Insertion sort also works with forward iterators.
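The stability claim can be checked with a short sketch: sort key/tag pairs by key only and confirm that equal keys keep their original relative order. The function is repeated for self-containment, and insertion_sort_stability_check is a hypothetical helper name.

```cpp
#include <algorithm>   // upper_bound, rotate, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // next
#include <utility>     // pair
#include <vector>

// Same algorithm as above, repeated so the sketch is self-contained.
template<class FwdIt, class Compare = std::less<>>
void insertion_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last; ++it) {
        auto const insertion = std::upper_bound(first, it, *it, cmp);
        std::rotate(insertion, it, std::next(it));
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}

inline void insertion_sort_stability_check()
{
    using Item = std::pair<int, char>;
    std::vector<Item> items{{2, 'a'}, {1, 'b'}, {2, 'c'}, {1, 'd'}};
    // Compare on the key only, so items with equal keys are equivalent.
    insertion_sort(items.begin(), items.end(),
        [](Item const& lhs, Item const& rhs){ return lhs.first < rhs.first; });
    // 'b' still precedes 'd', and 'a' still precedes 'c'.
    std::vector<Item> const expected{{1, 'b'}, {1, 'd'}, {2, 'a'}, {2, 'c'}};
    assert(items == expected);
}
```

Stability follows from using std::upper_bound: a new element is always inserted after any equivalent elements that are already in the sorted prefix.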
Details omitted:

- Insertion sort can be optimized with an early test if (std::distance(first, last) <= 1) return; (or for forward / bidirectional iterators: if (first == last || std::next(first) == last) return;) and a loop over the interval [std::next(first), last), because the first element is guaranteed to be in place and doesn't require a rotate.
- For bidirectional iterators, the binary search to find the insertion point can be replaced with a reverse linear search using the Standard Library's std::find_if_not algorithm.

Four Live Examples (C++14, C++11, C++98 and Boost, C++98) for the fragment below:
using RevIt = std::reverse_iterator<BiDirIt>;
auto const insertion = std::find_if_not(RevIt(it), RevIt(first),
    [=](auto const& elem){ return cmp(*it, elem); }
).base();
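Dropped into a complete function, the fragment might be used as in the sketch below. The name insertion_sort_linear is hypothetical; apart from the search step, the body mirrors insertion_sort.

```cpp
#include <algorithm>   // find_if_not, rotate, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // reverse_iterator, next
#include <list>

// Insertion sort with a reverse linear search for the insertion point.
// Requires bidirectional iterators because of std::reverse_iterator.
template<class BiDirIt, class Compare = std::less<>>
void insertion_sort_linear(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    using RevIt = std::reverse_iterator<BiDirIt>;
    for (auto it = first; it != last; ++it) {
        // Walk backwards over the sorted prefix while *it is smaller than elem.
        auto const insertion = std::find_if_not(RevIt(it), RevIt(first),
            [=](auto const& elem){ return cmp(*it, elem); }
        ).base();
        std::rotate(insertion, it, std::next(it));
        assert(std::is_sorted(first, std::next(it), cmp));
    }
}

inline void insertion_sort_linear_usage()
{
    std::list<int> data{4, 2, 5, 1, 3};
    insertion_sort_linear(data.begin(), data.end());
    assert(std::is_sorted(data.begin(), data.end()));
}
```

The backward scan stops at the first element that is not greater than *it, so the insertion point matches the one std::upper_bound would return and the sort stays stable.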
For random inputs this gives O(N²) comparisons, but this improves to O(N) comparisons for almost sorted inputs. The binary search always uses O(N log N) comparisons.

When carefully implemented, quick sort is robust and has O(N log N) expected complexity, but with O(N²) worst-case complexity that can be triggered with adversarially chosen input data. When a stable sort is not needed, quick sort is an excellent general-purpose sort.
Even for the simplest versions, quick sort is quite a bit more complicated to implement using the Standard Library than the other classic sorting algorithms. The approach below uses a few iterator utilities to locate the middle element of the input range [first, last) as the pivot, then uses two calls to std::partition (which are O(N)) to three-way partition the input range into segments of elements that are smaller than, equal to, and larger than the selected pivot, respectively. Finally the two outer segments with elements smaller than and larger than the pivot are recursively sorted:
template<class FwdIt, class Compare = std::less<>>
void quick_sort(FwdIt first, FwdIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;
    auto const pivot = *std::next(first, N / 2);
    auto const middle1 = std::partition(first, last, [=](auto const& elem){
        return cmp(elem, pivot);
    });
    auto const middle2 = std::partition(middle1, last, [=](auto const& elem){
        return !cmp(pivot, elem);
    });
    quick_sort(first, middle1, cmp); // assert(std::is_sorted(first, middle1, cmp));
    quick_sort(middle2, last, cmp);  // assert(std::is_sorted(middle2, last, cmp));
}
However, quick sort is rather tricky to get correct and efficient, as each of the above steps has to be carefully checked and optimized for production level code. In particular, for O(N log N) complexity, the pivot has to result in a balanced partition of the input data, which cannot be guaranteed in general for an O(1) pivot, but which can be guaranteed if one sets the pivot as the O(N) median of the input range.
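The median-pivot idea can be sketched for random access iterators with std::nth_element. The name quick_sort_median is hypothetical, and note that the Standard only specifies std::nth_element as linear on average, so the overall bound depends on the library implementation.

```cpp
#include <algorithm>   // nth_element, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // distance, next
#include <vector>

// Quick sort that partitions around the true median of the range.
template<class RandomIt, class Compare = std::less<>>
void quick_sort_median(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;
    auto const middle = std::next(first, N / 2);
    // Afterwards *middle is the element a full sort would put there,
    // nothing before it is greater and nothing after it is smaller.
    std::nth_element(first, middle, last, cmp);
    quick_sort_median(first, middle, cmp);
    quick_sort_median(middle, last, cmp);
}

inline void quick_sort_median_usage()
{
    // "Organ pipe" input, which is a bad case for a middle-element pivot.
    std::vector<int> data{1, 2, 3, 4, 3, 2, 1};
    quick_sort_median(data.begin(), data.end());
    assert(std::is_sorted(data.begin(), data.end()));
}
```

Both halves have at most half the elements (rounded up), so the recursion depth is logarithmic regardless of the input order.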
Details omitted:

- The above implementation is particularly vulnerable to special inputs, e.g. it has O(N²) complexity for the "organ pipe" input 1, 2, 3, ..., N/2, ... 3, 2, 1 (because the middle is always larger than all other elements).
- Median-of-3 pivot selection from randomly chosen elements from the input range guards against almost sorted inputs for which the complexity would otherwise deteriorate to O(N²).
- 3-way partitioning (separating elements smaller than, equal to and larger than the pivot) as shown by the two calls to std::partition is not the most efficient O(N) algorithm to achieve this result.
- For random access iterators, a guaranteed O(N log N) complexity can be achieved through median pivot selection using std::nth_element(first, middle, last), followed by recursive calls to quick_sort(first, middle, cmp) and quick_sort(middle, last, cmp).
- This guarantee comes at a cost, however, because the constant factor of the O(N) complexity of std::nth_element can be more expensive than that of the O(1) complexity of a median-of-3 pivot followed by an O(N) call to std::partition (which is a cache-friendly single forward pass over the data).

If using O(N) extra space is of no concern, then merge sort is an excellent choice: it is the only stable O(N log N) sorting algorithm among those discussed here.
It is simple to implement using Standard algorithms: use a few iterator utilities to locate the middle of the input range [first, last) and combine two recursively sorted segments with a std::inplace_merge:
template<class BiDirIt, class Compare = std::less<>>
void merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;
    auto const middle = std::next(first, N / 2);
    merge_sort(first, middle, cmp); // assert(std::is_sorted(first, middle, cmp));
    merge_sort(middle, last, cmp);  // assert(std::is_sorted(middle, last, cmp));
    std::inplace_merge(first, middle, last, cmp); // assert(std::is_sorted(first, last, cmp));
}
Merge sort requires bidirectional iterators, the bottleneck being std::inplace_merge. Note that when sorting linked lists, merge sort requires only O(log N) extra space (for recursion). The latter algorithm is implemented by std::list<T>::sort in the Standard Library.
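As a quick check of the bidirectional iterator requirement, the sketch below runs the generic merge_sort on a std::list and compares the result with the container's own sort member. The function is repeated for self-containment; merge_sort_usage is a hypothetical helper name.

```cpp
#include <algorithm>   // inplace_merge
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // distance, next
#include <list>

// Same algorithm as above, repeated so the sketch is self-contained.
template<class BiDirIt, class Compare = std::less<>>
void merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare{})
{
    auto const N = std::distance(first, last);
    if (N <= 1) return;
    auto const middle = std::next(first, N / 2);
    merge_sort(first, middle, cmp);
    merge_sort(middle, last, cmp);
    std::inplace_merge(first, middle, last, cmp);
}

inline void merge_sort_usage()
{
    std::list<int> a{4, 1, 3, 1, 5, 9, 2, 6};
    std::list<int> b = a;
    merge_sort(a.begin(), a.end());  // generic version, moves element values
    b.sort();                        // member function, relinks list nodes
    assert(a == b);
}
```

The generic version moves values around, whereas the member function can splice nodes, which is why it is usually the better choice for lists in practice.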
Heap sort is simple to implement, performs an O(N log N) in-place sort, but is not stable.

The first loop, the O(N) "heapify" phase, puts the array into heap order. The second loop, the O(N log N) "sortdown" phase, repeatedly extracts the maximum and restores heap order. The Standard Library makes this extremely straightforward:
template<class RandomIt, class Compare = std::less<>>
void heap_sort(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    lib::make_heap(first, last, cmp); // assert(std::is_heap(first, last, cmp));
    lib::sort_heap(first, last, cmp); // assert(std::is_sorted(first, last, cmp));
}
In case you consider it "cheating" to use std::make_heap and std::sort_heap, you can go one level deeper and write those functions yourself in terms of std::push_heap and std::pop_heap, respectively:
namespace lib {

// NOTE: is O(N log N), not O(N) as std::make_heap
template<class RandomIt, class Compare = std::less<>>
void make_heap(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    for (auto it = first; it != last;) {
        std::push_heap(first, ++it, cmp);
        assert(std::is_heap(first, it, cmp));
    }
}

template<class RandomIt, class Compare = std::less<>>
void sort_heap(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    for (auto it = last; it != first;) {
        std::pop_heap(first, it--, cmp);
        assert(std::is_heap(first, it, cmp));
    }
}

}   // namespace lib
The Standard Library specifies both push_heap and pop_heap as complexity O(log N). Note however that the outer loop over the range [first, last) results in O(N log N) complexity for make_heap, whereas std::make_heap has only O(N) complexity. For the overall O(N log N) complexity of heap_sort it doesn't matter.
Details omitted: O(N) implementation of make_heap
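One possible O(N) make_heap is the classic bottom-up construction: sift every non-leaf node down, starting from the last parent. The sketch below uses hypothetical names (sift_down, make_heap_linear) and an explicit loop rather than a Standard Library building block; it is not a claim about how any particular library implements std::make_heap.

```cpp
#include <algorithm>   // iter_swap, is_heap, sort_heap, is_sorted
#include <cassert>     // assert
#include <functional>  // less
#include <iterator>    // iterator_traits, distance
#include <vector>

// Restore the max-heap property for the subtree rooted at index root,
// looking only at the first n elements.
template<class RandomIt, class Compare>
void sift_down(RandomIt first,
               typename std::iterator_traits<RandomIt>::difference_type root,
               typename std::iterator_traits<RandomIt>::difference_type n,
               Compare cmp)
{
    for (;;) {
        auto largest = root;
        auto const left = 2 * root + 1;
        auto const right = left + 1;
        if (left < n && cmp(first[largest], first[left])) largest = left;
        if (right < n && cmp(first[largest], first[right])) largest = right;
        if (largest == root) return;   // both children are in order
        std::iter_swap(first + root, first + largest);
        root = largest;
    }
}

// Bottom-up heap construction: O(N) because most nodes sit near the leaves
// and therefore sift down only a few levels.
template<class RandomIt, class Compare = std::less<>>
void make_heap_linear(RandomIt first, RandomIt last, Compare cmp = Compare{})
{
    auto const n = std::distance(first, last);
    for (auto i = n / 2; i-- > 0;) {   // visits indices n/2 - 1 down to 0
        sift_down(first, i, n, cmp);
    }
    assert(std::is_heap(first, last, cmp));
}

inline void make_heap_linear_usage()
{
    std::vector<int> data{3, 1, 4, 1, 5, 9, 2, 6};
    make_heap_linear(data.begin(), data.end());
    std::sort_heap(data.begin(), data.end());
    assert(std::is_sorted(data.begin(), data.end()));
}
```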
Here are four Live Examples (C++14, C++11, C++98 and Boost, C++98) testing all five algorithms on a variety of inputs (not meant to be exhaustive or rigorous). Just note the huge differences in the LOC: C++11/C++14 need around 130 LOC, C++98 and Boost 190 (+50%) and C++98 more than 270 (+100%).
Another small and rather elegant one, originally found on Code Review. I thought it was worth sharing.
While it is rather specialized, counting sort is a simple integer sorting algorithm and can often be really fast provided the values of the integers to sort are not too far apart. It's probably ideal if one ever needs to sort a collection of one million integers known to be between 0 and 100, for example.

To implement a very simple counting sort that works with both signed and unsigned integers, one needs to find the smallest and greatest elements in the collection to sort; their difference will tell the size of the array of counts to allocate. Then, a second pass through the collection is done to count the number of occurrences of every element. Finally, we write the required number of every integer back to the original collection.
#include <algorithm>   // minmax_element, fill_n
#include <iterator>    // iterator_traits, next
#include <vector>

template<typename ForwardIterator>
void counting_sort(ForwardIterator first, ForwardIterator last)
{
    if (first == last || std::next(first) == last) return;

    auto minmax = std::minmax_element(first, last);  // avoid if possible.
    auto min = *minmax.first;
    auto max = *minmax.second;
    if (min == max) return;

    using difference_type = typename std::iterator_traits<ForwardIterator>::difference_type;
    std::vector<difference_type> counts(max - min + 1, 0);

    for (auto it = first ; it != last ; ++it) {
        ++counts[*it - min];
    }

    for (auto count: counts) {
        first = std::fill_n(first, count, min++);
    }
}
While it is only useful when the range of the integers to sort is known to be small (generally not larger than the size of the collection to sort), making counting sort more generic would make it slower for its best cases. If the range is not known to be small, another algorithm such as radix sort, ska_sort or spreadsort can be used instead.
Details omitted:

We could have passed the bounds of the range of values accepted by the algorithm as parameters to totally get rid of the first std::minmax_element pass through the collection. This will make the algorithm even faster when a usefully small range limit is known by other means. (It doesn't have to be exact; passing a constant 0 to 100 is still much better than an extra pass over a million elements to find out that the true bounds are 1 to 95. Even 0 to 1000 would be worth it; the extra elements are written once with zero and read once.)
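A variant taking the bounds as parameters could look like the sketch below. The name counting_sort_bounded and its signature are assumptions made for this illustration; the caller is responsible for guaranteeing that every element lies in [min, max].

```cpp
#include <algorithm>   // fill_n
#include <cassert>     // assert
#include <cstddef>     // size_t
#include <iterator>    // iterator_traits
#include <vector>

// Counting sort with caller-supplied bounds, so no minmax_element pass is needed.
// Precondition: min <= *it <= max for every element in [first, last).
template<class ForwardIterator>
void counting_sort_bounded(ForwardIterator first, ForwardIterator last,
    typename std::iterator_traits<ForwardIterator>::value_type min,
    typename std::iterator_traits<ForwardIterator>::value_type max)
{
    using difference_type = typename std::iterator_traits<ForwardIterator>::difference_type;
    assert(min <= max);
    std::vector<difference_type> counts(static_cast<std::size_t>(max - min) + 1, 0);

    for (auto it = first; it != last; ++it) {
        assert(min <= *it && *it <= max);   // the caller promised these bounds
        ++counts[*it - min];
    }

    for (auto count : counts) {
        first = std::fill_n(first, count, min++);
    }
}

inline void counting_sort_bounded_usage()
{
    std::vector<int> data{5, 0, 100, 42, 5};
    counting_sort_bounded(data.begin(), data.end(), 0, 100);   // loose bounds are fine
    assert((data == std::vector<int>{0, 5, 5, 42, 100}));
}
```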
Growing counts on the fly is another way to avoid a separate first pass. Doubling the counts size each time it has to grow gives amortized O(1) time per sorted element (see hash table insertion cost analysis for the proof that exponential growth is the key). Growing at the end for a new max is easy with std::vector::resize to add new zeroed elements. Changing min on the fly and inserting new zeroed elements at the front can be done with std::copy_backward after growing the vector, followed by std::fill to zero the new elements.
The counts increment loop is a histogram. If the data is likely to be highly repetitive, and the number of bins is small, it can be worth unrolling over multiple arrays to reduce the serializing data dependency bottleneck of store/reload to the same bin. This means more counts to zero at the start, and more to loop over at the end, but should be worth it on most CPUs for our example of millions of 0 to 100 numbers, especially if the input might already be (partially) sorted and have long runs of the same number.
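A rough sketch of the multiple-arrays idea follows. The name counting_sort_lanes, the bounds parameters and the choice of four lanes are all assumptions made for illustration; whether this actually runs faster depends on the CPU and the data, so it should be measured rather than assumed.

```cpp
#include <algorithm>   // fill_n
#include <array>
#include <cassert>     // assert
#include <cstddef>     // size_t
#include <iterator>    // iterator_traits
#include <vector>

// Counting sort that spreads consecutive increments over several histograms,
// so that a run of equal values does not keep hitting the same counter.
// Precondition: min <= *it <= max for every element in [first, last).
template<class ForwardIterator>
void counting_sort_lanes(ForwardIterator first, ForwardIterator last,
    typename std::iterator_traits<ForwardIterator>::value_type min,
    typename std::iterator_traits<ForwardIterator>::value_type max)
{
    using difference_type = typename std::iterator_traits<ForwardIterator>::difference_type;
    constexpr std::size_t lanes = 4;   // arbitrary illustrative choice
    auto const bins = static_cast<std::size_t>(max - min) + 1;

    std::array<std::vector<difference_type>, lanes> counts;
    for (auto& lane_counts : counts) lane_counts.assign(bins, 0);

    std::size_t lane = 0;
    for (auto it = first; it != last; ++it) {
        ++counts[lane][*it - min];     // neighbouring elements use different arrays
        lane = (lane + 1) % lanes;
    }

    for (std::size_t bin = 0; bin != bins; ++bin) {
        difference_type total = 0;     // sum the per-lane counts for this value
        for (auto const& lane_counts : counts) total += lane_counts[bin];
        first = std::fill_n(first, total, min++);
    }
}

inline void counting_sort_lanes_usage()
{
    std::vector<int> data{3, 3, 3, 1, 0, 2, 3, 1};
    counting_sort_lanes(data.begin(), data.end(), 0, 3);
    assert((data == std::vector<int>{0, 1, 1, 2, 3, 3, 3, 3}));
}
```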
In the algorithm above, we use a min == max check to return early when every element has the same value (in which case the collection is sorted). It is actually possible to instead fully check whether the collection is already sorted while finding the extreme values of a collection with no additional time wasted (if the first pass is still memory bottlenecked with the extra work of updating min and max). However such an algorithm does not exist in the standard library and writing one would be more tedious than writing the rest of counting sort itself. It is left as an exercise for the reader.
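For readers who want a starting point for that exercise, here is one simple, unoptimized way such a combined pass might look. The names minmax_sorted_result and minmax_is_sorted are hypothetical, and the sketch only supports operator< rather than a custom comparator.

```cpp
#include <cassert>     // assert
#include <iterator>    // next
#include <vector>

// Hypothetical result type: iterators to the extremes plus a sortedness flag.
template<class FwdIt>
struct minmax_sorted_result
{
    FwdIt min;
    FwdIt max;
    bool  sorted;
};

// Single pass that tracks the smallest element, the largest element, and
// whether any adjacent pair is out of order. For an empty range both
// iterators equal last and sorted is true.
template<class FwdIt>
minmax_sorted_result<FwdIt> minmax_is_sorted(FwdIt first, FwdIt last)
{
    minmax_sorted_result<FwdIt> result{first, first, true};
    if (first == last) return result;

    auto prev = first;
    for (auto it = std::next(first); it != last; ++it) {
        if (*it < *prev)          result.sorted = false;
        if (*it < *result.min)    result.min = it;
        if (!(*it < *result.max)) result.max = it;   // keeps the last maximum
        prev = it;
    }
    return result;
}

inline void minmax_is_sorted_usage()
{
    std::vector<int> const data{3, 1, 4, 1};
    auto const result = minmax_is_sorted(data.begin(), data.end());
    assert(!result.sorted);
    assert(*result.min == 1 && *result.max == 4);
}
```

A counting sort could call this first and return immediately when sorted is true, reusing min and max otherwise.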
Since the algorithm only works with integer values, static assertions could be used to prevent users from making obvious type mistakes. In some contexts, a substitution failure with std::enable_if_t might be preferred.
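The static assertion variant might look like the sketch below. The name counting_sort_checked and the wording of the diagnostic are assumptions; the body otherwise follows the counting sort shown earlier.

```cpp
#include <algorithm>     // minmax_element, fill_n
#include <cassert>       // assert
#include <cstddef>       // size_t
#include <iterator>      // iterator_traits, next
#include <type_traits>   // is_integral
#include <vector>

template<class ForwardIterator>
void counting_sort_checked(ForwardIterator first, ForwardIterator last)
{
    using value_type = typename std::iterator_traits<ForwardIterator>::value_type;
    using difference_type = typename std::iterator_traits<ForwardIterator>::difference_type;

    // Reject floating-point values, strings, etc. with a readable message.
    static_assert(std::is_integral<value_type>::value,
                  "counting_sort_checked requires an integer value type");

    if (first == last || std::next(first) == last) return;

    auto const minmax = std::minmax_element(first, last);
    auto min = *minmax.first;
    auto const max = *minmax.second;
    if (min == max) return;

    std::vector<difference_type> counts(static_cast<std::size_t>(max - min) + 1, 0);
    for (auto it = first; it != last; ++it) {
        ++counts[*it - min];
    }
    for (auto count : counts) {
        first = std::fill_n(first, count, min++);
    }
}

inline void counting_sort_checked_usage()
{
    std::vector<int> data{3, -1, 2, -1, 0};
    counting_sort_checked(data.begin(), data.end());
    assert((data == std::vector<int>{-1, -1, 0, 2, 3}));
    // A std::vector<double> argument would fail to compile with the message above.
}
```

A static_assert gives a hard error with a clear message, while std::enable_if_t removes the function from overload resolution, which matters only when another overload should be picked instead.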
While modern C++ is cool, future C++ could be even cooler: structured bindings and some parts of the Ranges TS would make the algorithm even cleaner.