aifangzsrxp4574

A functional C# (type)switch

转自http://community.bartdesmet.net/blogs/bart/archive/2008/03/30/a-functional-c-type-switch.aspx

A while ago I posted a functional way of exception handling, introducing similar functionality as exception filters (VB's When keyword). I admitted it was a crazy Sunday afternoon idea, maybe I should create a category entitled "Crazy Sundays" since this post very much belongs to that same category (update: I did create the category).

It all started a few weeks ago when I was explaining the way LINQ works, starting by focusing on the concept of extension methods and mentioning sexy words like monads, continuations, etc. After the session somebody came to me and wondered what other ideas could be expressed in a similar way like LINQ's query operator chaining. I came up with a couple of uses and this post concretizes one of those.

The rules

Okay, let's get started. What's up? Cloning the switch functionality as is? Well, almost, but adding some other stuff to it. First, take a look at what we have today, baked in by the language (section 15.7.2 of the C# Language Specification):

switch-statement:
     switch ( expression ) switch-block

switch-block:
     { [switch-sections] }

switch-sections:
     switch-section
     switch-sections switch-section

switch-section:
    switch-labels statement-list

switch-labels:
     switch-label
     switch-labels switch-label

switch-label:
     case constant-expression :
     default :

What's interesting about the above are the limitations imposed by the switch statement. First of all there's the governing type defined by the switch expression, which needs to be of (or convertible into using one user-defined implicit conversion) a built-in integral numerical type or a string (the exact list of types is specified in 15.7.2). You can think of it as a big if-else if-else statement although the implementation might be quite different, using switching tables (IL switch) and for (more than 5 I believe) strings a Dictionary. I won't go in further IL detail nor dive into the subtleties of nullables (maybe another time).

Another thing to know about is the fact the language has a "no fall through" rule (read: forgetting a break statement), which eliminates a series of common problems in other curly brace languages that will remain unnamed. In addition to this, one can reorder the switch sections at will without affecting the semantics of the switch. And last but not least, all of the (values of the) labels should be unique.

All of this being said, we're going to break certain rules over the course of this fun activity. Beware of this, especially when you'd be tempted (I doubt) to embrace this idea.

Simple switch

We'll start by defining simple switch logic. How could we mimic a switch statement by means of method calls? Right, extension methods. Before we go there, it's quite important to pick a target for those extensions (the 'this' parameter). Sure, we could go for System.Object but do we want to spoil such a fundamental type with (seemingly) additional methods? I'm tempting to say no, but feel free to have another opinion, so we'll define a wrapper. Quick-n-dirty, here it is:

class Switch
{
    public Switch(object o)
    {
        Object = o;
    }

    public object Object { get; private set; }
}

Exercise: Adding a special class for switching logic has some drawbacks. What about a struct? Predict what would happen if you trade the class keyword for the struct keyword above. Will the fragment still compile? If not, what needs to change? Try to push forward the choice of a struct in the rest of this post if you're convinced about the alternative.

In fact, we could even forget about extension methods at this point in time, since we own the Switch class. You can choose either way, but to keep myself honest on the goal of Crazy Sunday posts I'll stick with extension methods.

Exercise: Abandon my idea of using extension methods and go with instance methods from here on.

A first difference has become apparent already, we'll support all types to be used in our switch. Next, we have to pick the syntax we're aiming for. It should go along those lines:

void Do(int age)
{
     new Switch(age)
          .Case(a => (int)a < 18, a =>
          {
               Console.WriteLine("Young");
          })
          .Case(18, a =>
          {
               Console.WriteLine("Middle-age");
          })
          .Default(a =>
          {
               Console.WriteLine("Old");
          });
}

There are a couple of remarkable things in here. Let's analyze case by case:

We use System.Object as our base type, so the first switch needs a cast. Further on, we'll do something about this.
Again in the first switch, notice we use a Func as the switching condition. This goes beyond the simply constant-based comparison of the typical switch.
The second switch is a typical one, comparing just a value for equality.
Finally, we have the familiar default base case.

The whole thing 'returns' void, but to allow for chaining we need to pass through objects between the Case 'labels' obviously. We could go further and make the whole thing a valued expression, but let's not go there for now.

Another notable (but obvious) thing is the lack of break keywords. There's nothing to break after all, so we need to bake the semantics into the method calls. We'll stick with "no fall-through by default" but will provide an overload:

void Do(string name)
{
     new Switch(name)
          .Case(s => ((string)s).StartsWith("B"), s =>
          {
               Console.WriteLine(((string)s) + " starts with B.");
          }, true)
          .Case(s => ((string)s).StartsWith("Ba"), s =>
          {
               Console.WriteLine(((string)s) + " starts with Ba.");
          })
          .Default(s =>
          {
               Console.WriteLine(((string)s) + " starts with who knows what.");
          });
}

The true parameter to the first Case call indicates to fall through. Time for some implementation work. Here's a first set of (extension) methods:

static class SwitchExtensions
{
     public static Switch Case(this Switch s, object o, Action<object> a)
     {
          return Case(s, o, a, false);
     }

     public static Switch Case(this Switch s, object o, Action<object> a, bool fallThrough)
     {
          return Case(s, x => object.Equals(x, o), a, fallThrough);
     }

     public static Switch Case(this Switch s, Func<object, bool> c, Action<object> a)
     {
          return Case(s, c, a, false);
     }

     public static Switch Case(this Switch s, Func<object, bool> c, Action<object> a, bool fallThrough)
     {
          if (s == null)
          {
               return null;
          }
          else if (c(s.Object))
          {
               a(s.Object);
               return fallThrough ? s : null;
          }

          return s;
     }
}

Notice the way chaining works, by returning null to break the chain. Extension methods and classes make sense after all, although (exercise) you can still (?) work around it (what about a Switch.Break thingy?). Let's bring Default on the scene too:

     public static void Default(this Switch s, Action<object> a)
     {
          if (s != null)
          {
               a(s.Object);
          }
     }

This is where we close the loop by returning void, so that no subsequent Case or Default calls can be made (which really wouldn't make sense).

Exercise: What would it take to turn the whole thing in a valued expression?

Generic switch

Remember the first case 'label' of our first sample? A reminder:

           .Case(a => (int)a < 18, a =>
          {
               Console.WriteLine("Young");
          })

The cast is ugly and this became even more apparent in the second sample where we had to cast a string multiple times. Not only is this inefficient, it's ugly and is a bummer for IntelliSense. Let's fix this by introducing a generic Switch class:

class Switch
{
    public Switch(T o)
    {
        Object = o;
    }

    public T Object { get; private set; }
}

The extensions are simple once more:

     public static Switch Case(this Switch s, T t, Action a)
     {
          return Case(s, t, a, false);
     }

     public static Switch Case(this Switch s, T t, Action a, bool fallThrough)
     {
          return Case(s, x => object.Equals(x, t), a, fallThrough);
     }

     public static Switch Case(this Switch s, Funcbool> c, Action a)
     {
          return Case(s, c, a, false);
     }

     public static Switch Case(this Switch s, Funcbool> c, Action a, bool fallThrough)
     {
          if (s == null)
          {
               return null;
          }
          else if (c(s.Object))
          {
               a(s.Object);
               return fallThrough ? s : null;
          }

          return s;
     }

     public static void Default(this Switch s, Action a)
     {
          if (s != null)
          {
               a(s.Object);
          }
     }

This allows us to write our previous samples more concise:

void Do( string name)
{
     new Switch< string>(name)
          .Case(s => s.StartsWith( "B"), s =>
          {
               Console.WriteLine(s + " starts with B.");
          }, true)
          .Case(s => s.StartsWith( "Ba"), s =>
          {
               Console.WriteLine(s + " starts with Ba.");
          })
          .Default(s =>
          {
               Console.WriteLine(s + " starts with who knows what.");
          });
}

Much cleaner.

Type switch

Crazy or not, there's ~~most of the time~~ always something useful to it. What about capturing the following pattern?

void Do(Control c)
{
     if (c is Label)
     {
          Label l = (Label)c;
          // ...
     }
     else if (c is Button)
     {
          Button b = (Button)c;
          // ...
     }
     else
     {
          // ...
     }
}

This is a common pattern when dealing with extensions to UI code that need to process all sorts of controls, or when writing parsers as with System.Linq.Expressions where you have to switch on the type of the expression. Unfortunately, the code above isn't the most efficient one. First we do type checks, followed by raw casts. Use of the as keyword is better (even FxCop will tell you):

void Do(Control c)
{
     Label l;
     Button b;
     if ((l as Label) != null)
     {
          // ...
     }
     else if ((c as Button) != null)
     {
          // ...
     }
     else
     {
          // ...
     }
}

But soon it starts to become uglier. I'm not claiming to improve things concerning readability or efficiency in this post, suffice to say I'm capturing a pattern. Enter our type switch. We'd like to be able to rewrite the code above as:

void Do(Control c)
{
     new Switch(c)
          .Case<Label>(l =>
          {
               // ...
          })
          .Case<Button>(b =>
          {
               // ...
          })
          .Default(cc =>
          {
               // ...
          });
}

First of all, notice we piggyback on the non-generic switch. Every Case-'label' already has type information and can only be entered if the switch expression is of the specified type, therefore the body of each label's action body will have the original expression casted to the specified type. E.g. when typing b. in the second label, you'll see the IntelliSense list for a Button variable. The only drawback is the Default block where cc won't be of a more specific type. Obviously you could make it Default, passing in Control in the sample above.

Exercise: Think about the reason not to use the generic Switch in here (tip: see the implementation below).

On to the implementation. Almost trivial again:

public static Switch Case(this Switch s, Action a) where T : class
{
    return Case(s, o => true, a, false);
}

public static Switch Case(this Switch s, Action a, bool fallThrough) where T : class
{
    return Case(s, o => true, a, fallThrough);
}

public static Switch Case(this Switch s, Funcbool> c, Action a) where T : class
{
    return Case(s, c, a, false);
}

public static Switch Case(this Switch s, Funcbool> c, Action a, bool fallThrough) where T : class
{
    if (s == null)
    {
        return null;
    }
    else
    {
        T t = s.Object as T;
        if (t != null)
        {
            if (c(t))
            {
                a(t);
                return fallThrough ? s : null;
            }
        }
    }

    return s;
}

Default has been specified already although you could have a Default as well (as outlined previously).

Exercise: Why the generic constraint in the code above? Any way around it (without using plain casts and exception handling obviously...)? What about nullables?

Ultimately the same chaining is made possible with the above, but this time by switching on types. Notice fall-through is still relevant, not just because we have all sorts of conditions through Func but also because the type hierarchy we're dealing with. That is (one of the rules we're breaking): order matters.

To show you the above works like a charm:

Yes, there are ways around this with a classic switch using the Expression.NodeType enum value, but sometimes you want more or other (sealed) object-hierarchies lack such infrastructure vehicles.

Valued switches

So, what would it take to make the switch valued, meaning it doesn't return a void but any "projection" you want? In fact, this is much like functional languages where we have if-expressions (instead of statements), much like we have the ternary operator in curly brace languages (and the new If in VB 9.0). I won't nag too much about this, but such a construct isn't seldom seen. Take LISP for example, with:

(cond (e1 e1') (e2 e2') ... (en en'))

so that

(if e1 e2 e3) = (cond (e1 e2) ('T e3))

where if is redefined in terms of cond: if e1 evaluates true, e2 is returned; otherwise, car (old name for head, standing for current address register, a historical name) of the second argument is evaluated (i.e. 'T = true) and if that returns true (i.e. always) e3 is returned.

In order to enable this, we'll need to create a new generic Switch object that not only takes in a type specifying the source but also a target type. This is our definition:

class Switch
{
    public Switch(T o)
    {
        Object = o;
    }

    public T Object { get; private set; }
    public bool HasValue { get; private set; }
    public R Value { get; private set; }

    public void Set(R value)
    {
        Value = value;
        HasValue = true;
    }
}

It looks a bit like Nullable with the HasValue and Value properties. Essentially, once a value has been assigned (through Set), HasValue will flip to true which indicates we've found a match. The semantics are that the first match in a Switch-expression wins, although one could easily adapt this. However, notice this is less efficient that an early return from a function since we'll have to forward the result till the end of the method call chain that makes up the Switch-expression. Let's make it concrete with just three functions (it only gets easier it seems):

public static Switch Case(this Switch s, T t, Func f)
{
    return Case(s, x => object.Equals(x, t), f);
}

public static Switch Case(this Switch s, Funcbool> c, Func f)
{
    if (!s.HasValue && c(s.Object))
    {
        s.Set(f(s.Object));
    }

    return s;
}

public static R Default(this Switch s, Func f)
{
    if (!s.HasValue)
    {
        s.Set(f(s.Object));
    }

    return s.Value;
}

Actually this starts to look a little LINQ-familiar, with Func being a predicate (as in Where) and Func being a projection (as in Select*). The idea is simple: a case evaluates the condition only if the switch hasn't a final value yet. If the test (c) passes, the projection is carried out (f) and the value is set (Set). Default is unconditional and has to be supplied as the final 'projection' but it could well be trivial (especially when case labels are present for all cases that can occur, e.g. when switching on an enumeration or a fixed object hierarchy). Notice there's no type switch functionality (exercise). Here's a trivial sample of this switch at work:

var res =
     from x in typeof(string).GetMembers()
     select new Switch<MemberInfo, string>(x)
            .Case(m => m is MethodInfo, m => m.Name + " is a method")
            .Case(m => m is PropertyInfo, m => m.Name + " is a property")
            .Default(m => m.Name + " is something else");

foreach (var s in res)
    Console.WriteLine(s);

producing the following result:

Conclusion

Crazy but lot of fun. And much room for follow-up. Just a few ideas: Expression, Reflection.Emit. Anyway, enough for now. Have a nice week!