Order I Say!

from : http://cpp-next.com/archive/2010/02/order-i-say/

 

The C++ standard library’s sorting functions and its associative containers all require a strict weak ordering criterion. But what is that, really?

Well, I thought I had an intuitive feeling for what “strict weak” means, but recently I realized I needed to nail it down. If you look up the definition, you might encounter something like this:

a strict weak ordering is a strict partial order in which incomparability is transitive.

Strict Partial What?!

So naturally, my next instinct was to track down the definition of “strict weak,” and then of whatever that was based on, and so on, so I could parse the meaning of words like “strict,” “weak,” “total,” etc. Bad idea .

The problem is that some of these words don’t act like ordinary adjectives, and if you try to think of them that way, you’ll just get horribly confused. For example, a strict total order is-not-a total order . In fact, instead of creating a refined ordering concept, as one might expect, adding the word “strict” seems to create a sibling concept . It takes away the reflexivity requirement (i.e. ∀ x , x φ x ) and adds irreflexivity (∀ x , ¬(x φ x )). So, “≤” over the integers is a total order (∀ x εZ, xx ), “<” over integers is a strict total order (∀ x εZ, xx ), and a relation φ for which x φ x is sometimes true and sometimes false is neither “total” nor “strict total.” 1

Fig. 1: ≤ over the integers 1-6 is a total order

Okay. Having said all that, a strict weak order lies somewhere in the territory between a total order and a partial order, so let’s take a look at those.

Total Orderings

Fig. 2: Hasse Diagram for < or ≤ over the integers 1-6

In a total order φ, every pair of elements x and y are related. That is, either (x φ y ) or (y φ x ) is true, but—unless x is identical to y —not both. If you represent the ordering relation as a directed graph where an edge from x to y means (x φ y ), you always get something like Fig 1: every vertex has a self-loop, and every pair of vertices are connected by a single edge. The graph for a “strict total” order looks just the same, except that there are no self-loops.

Since all orderings are transitive relations, you can reduce the complexity of thinking about them by looking only at the graph’s transitive reduction , which eliminates nearly all of its redundant information. If you arrange the the result vertically from “greater” to “lesser,” and drop any self-loops, you get what’s known as a Hasse Diagram . The Hasse Diagram for total orders and strict total orders is always a nice, linear list as in Fig 2.

Partial Orderings

Fig. 3: vxu and vzu

In a partial order, the Hasse diagram is a Directed Acyclic Graph (DAG)—a directional network with no loops.

Notice that in Fig. 3, x and z are not ordered with respect to one another; they are called incomparable . That is perhaps a badly-chosen name, because it is possible to compare x and z ; if you do, though, you’ll find that neither xz nor zx .

Incomparability

Fig. 4: divisibility over the integers 1-6 is a partial order

If the whole idea of incomparability seems a bit implausible, consider the relation of human ancestry. By this criterion, you are preceded by your parents, who in turn are preceded by their parents, etc. However, your maternal grandparents are incomparable to your paternal grandparents.

The divisibility ordering over integers is another example. As you can see from Fig. 4, the prime numbers line up together above 1, and are incomparable with one another.

Order of the Weak

Fig. 5: ⌊x /3⌋ < ⌊y /3⌋ over the integers is a strict weak order

Perhaps counterintuitively, a strict weak ordering actually has stronger requirements than a strict partial ordering does. In a strict weak ordering, every pair of incomparable elements are related—or incomparable—to every other element in exactly the same way. Thus, the Hasse diagrams of all strict weak orderings follow the same general pattern shown in Fig. 5.: every element is related to all the elements at the next-higher level (and to no elements at the same level).

Strictly Speaking, Weak ⇒ Strict?

There’s some disagreement in the literature I’ve found over whether there’s such a thing as a “non-strict” weak order. The definition of “weak order” in Elements of Programming implies strictness. I like this definition because it’s simple and elegant. Any definition of “non-strict weak order” seems to require an extra rule to add reflexivity.

In other contexts I found on the web, though, the unqualified term “weak order” actually implies reflexivity, i.e. non-strictness . Probably for this reason, “weak order” is seldom used alone, but instead is usually prefixed either by “strict,” “non-strict,” or “reflexive” (sometimes parenthesized). That seems like a good idea to me.

Properties of Incomparability

Fig. 6: Equivalence classes of the strict weak ordering ⌊x/3⌋ < ⌊y/3⌋

Incomparability can be viewed as a relation in its own right:

xyxyyx

The incomparability relation is symmetric, almost by definition: xy implies yx , and vice versa. In a strict ordering, where nothing is ordered with respect to itself, it’s also reflexive: xx for all x . In a strict weak ordering, it’s also transitive. Since an equivalence relation is by definition symmetric, reflexive, and transitive, when taken together, these three properties allow us to group incomparable elements of a strict weak order into equivalence classes as shown in Fig. 6.

Notice that applying the strict weak ordering relation between representatives of these equivalence classes results in a strict total order of the equivalence classes themselves. In fact, we could view a strict weak ordering as the result of ignoring some of the elements’ distinguishing characteristics and applying a strict total ordering to the rest. With that understanding in hand, it becomes easy to recognize many of the orderings we can legitimately use with std::sort . For example, the “is a brighter color than” comparison ignores hue and saturation, arranging colors based on a total ordering over brightness. A case-insensitive string comparison ignores case and performs a (total) lexicographical ordering of the lowercased characters.

How Does It Work?

So why is this “strict weak ordering” property so essential to efficient sorting? Well, since all incomparable elements are interchangeable as far as sort order is concerned, a sorting algorithm can avoid making comparisons. In particular, the transitivity property of incomparability in a weak ordering means that once you know that xy and yz , you also don’t have to compare x and z to get the sort order right.

Are You Pondering What I’m Pondering?

A couple of thought-provoking ideas came to me as I was writing this article and I thought it would be more fun to leave them as open questions than to answer them in detail myself. Maybe we’ll get some good answers in the comments

Maps

The sorting criterion for a std::map is a strict weak ordering that ignores the value field of its (key , value ) pairs and orders the elements… based on a strict weak ordering of the keys alone. But we said above that a strict weak ordering could be viewed as a total ordering over the non-ignored characteristics. Is there an inconsistency here?

What About Topological Sort?

Those of you who were awake during graph theory class might remember that it’s possible to efficiently sort the elements of a DAG so that the arrows all point in one direction after sorting. In fact, you can do it in linear time . We said that every partial ordering could be represented as a DAG. So why does std::sort require a strict weak order? Shouldn’t this allow a super-efficient sort on any partial ordering criterion, even one that isn’t weak?

你可能感兴趣的:(Order I Say!)