Switch Statement

A SwitchStatement is a more structured way (than multiple if...else statements) to express conditional logic in C-like languages. Usually looks something like this:

 switch(expression) {
   case a:
     DoSomething();
     break;
   case b:
     DoSomethingElse();
     break;
   default:
     HandleGeneralCase();
     break;
 }

Other non-C-like languages have similar constructs, often with the keywords "case" or "select", often without the need for an explicit "break".

A switch can be invaluable in parsing, or as part of an interface between ObjectOriented code and ProceduralCode. A switch is easier to read than a long if-then-else stack, and almost always faster as well.

Discussion on speed and Optimization

"...almost always..."

Can you substantiate this? Does it apply to all languages with switch statements?

[It is optimized by the compiler as a whole into typically one of three different machine code approaches, which is made possible by the fact that the case expressions are constants. If the case expressions form a dense range, then the whole thing is turned into an indirect jump through a table, for instance. Basically it is optimized to be as fast as it possibly could be. A series of if-then-else statements almost always has to be simply implemented as it was coded, therefore e.g. a sequence of 50 if-then-elses will be as much as 50 times slower than a switch. This is all well-known in the compiler world; it is not something controversial.]

[Some languages do not require the case expressions to be constants, e.g. Lisp (cond), in which case it is just syntactic sugar on top of if-then-elses and not a true SwitchStatement as invented in Pascal and C. COND is not a switch! A switch is spelled CASE in Lisp (see below).]

With optimizing compilers and so many different languages to consider, it is dangerous to generalize, but the basic speedup that a switch exploits is that expression is evaluated only once, then compared against each case until one matches. Also, since switch statements are so common, they tend not to be just syntactic sugar, but directly supported right down to the hardware.

: The big advantage is that most computers, in assembly language, support computed indirect jumps. In other words, I can make an array of target addresses, and then in order to switch off on a value X, I just use X as an index into this array, fetch the address of a piece of code, and jump to it. This is faster than if... else if... else if... in the same way that an array is faster than a linked list.

Actually, a single conditional jump is often faster than a single computed jump, mostly because of pipelining issues, but of course there is a trade-off point at which the computed jump is faster than the corresponding conditional jump cascade.

The advantage of a SwitchStatement is that you can have the compiler decide the trade-off point; i.e. if the SwitchStatement has only a few clauses, the compiler will probably produce a conditional jump cascade, but for many clauses, it will switch (sic) to a jump table approach. (Or one of several other approaches, depending on the data.)

And SwitchStatements almost always have many clauses, in the wild.

I did this for the ARM compiler 14 years ago. I'm sure it's commonplace now, but I was proud of it at the time :-) See this 6 year old posting http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&selm=96-07-159%40comp.compilers

Fall Through

A feature (or some would say misfeature) of C's switch construct is that control flow can "fall through" from one case to another. For example:

  switch (command) {
    case CMD_SHOW_HELP_AND_EXIT:
      do_show_help();
      /* Fall thru */
    case CMD_EXIT:
      do_exit();
      break;
    case CMD_OTHER:
      do_other();
      break;
    /* ... etc. ... */
  }

Because the do_show_help(); line is not followed by a break, the do_exit() call will be executed after do_show_help() when CMD_SHOW_HELP_AND_EXIT is the command. It is traditional to use a comment such as /* Fall thru */ in the example to indicate to readers that a break was not left out unintentionally (a common mistake).

Proponents say that this technique leads to more efficient control dispatching when multiple cases share code. Critics say that using this technique leads to unstructured hard-to-maintain code, and is little better than using gotos.

Many optimizing C compilers will emit the same object code for the above code as for the below -- they merge the common trailing portions of case blocks.

  switch (command) {
    case CMD_SHOW_HELP_AND_EXIT:
      do_show_help();
      do_exit();
      break;
    case CMD_EXIT:
      do_exit();
      break;
    case CMD_OTHER:
      do_other();
      break;
    /* ... etc. ... */
  }

CeeLanguage switch (DuffsDevice)

    int q = (n+7)/8;
    switch (n%8) {
    case 0:do {foo();// Great C hack, Tom,
    case 7:foo();// but it's not valid here.
    case 6:foo();
    case 5:foo();
    case 4:foo();
    case 3:foo();
    case 2:foo();
    case 1:foo();
           } while (--q > 0);
    }

ForthLanguage (jump table from a chess program)

 1 CONSTANT Pawn  ( ... )  6 CONSTANT King
 CREATE genVector  \ each word is ( sq -- sq )
   ' NOOP , ' genPawn , ' genKnight , ' genBishop ,
   ' genRook , ' genQueen , ' genKing , ' genError ,

 : genSquare ( sq -- sq )
   DUP piece@ CELLS genVector + @ EXECUTE ;
 : genMoves
   ['] genSquare forEachSq ;

ForthLanguage (ANS Forth Core Extension wordset)

   : X ( n -- )
       CASE
       test1 OF ( -- ) ... ENDOF
       testn OF ( -- ) ... ENDOF
       ( n -- ) ... ( default )
       ENDCASE ...
   ;

which is semantically equivalent to cascading IF statements:

   : Y ( n -- )
     test1 OVER = IF DROP ... ELSE
     testn OVER = IF DROP ... ELSE
     ( n -- ) ... THEN THEN  ( big list of THENs to match the IFs)
   ;

CommonLisp switch

       (case k ((1 2) 'clause1)
               (3 'clause2)
               (nil 'no-keys-so-never-seen)
               ((nil) 'nilslot)
               ((:four #\v) 'clause4)
               ((t) 'tslot)
               (otherwise 'others)))

HaskellLanguage switch

Using explicit matching:

 take m ys               = case (m,ys) of
                            (0,_)       ->  []
                            (_,[])      ->  []
                            (n,x:xs)    ->  x : take (n-1) xs

Using implicit matching:

 take 0 _      = []
 take _ []     = []
 take n (x:xs) = x : take (n - 1) xs

JavaLanguage switch

       switch (k) {
       case 1:
       case 2:
          return "clause1";
       case 3:
          return "clause2";
       case 0:
          return "nilslot";
       case '4':
       case 'v':
          return "clause4";
       case 't':
          return "tslot";
       default:
          return "others";
       }

PerlLanguage switch (this is Perl 6: Perl 5 doesn't have explicit switches, although the language implementation documentation claims that the compiler will always optimize if-else into switch whenever possible; see e.g. "Programming Perl" by Larry Wall et al)

    given $number {
        when &is_prime   { warn "$_ is prime\n"; continue; }
        when /[13579]$/  { warn "$_ is odd"; }
        when /[02468]$/  { warn "$_ is even"; }
    }

In Perl 5, you can use something like this (adapted slightly from a common idiom of mine):

    &{{
        "foo" => sub { bar(); baz(); },
        "quux" => sub { wibble(); spam(); }
    }->{ $selector }};

If you have a dense set of integer cases, you could use an array instead of a hash. No good idea about implementing fall-through, though :)

    &{[
        handle_0,
        handle_1,
        complain_about_other_digits
    ]->[$digit]};

(Usually I use this setup to just pick a scalar value based on the input. Here I adjust that slightly by using code references for the values, and then invoking the relevant code.)

[Please note that this idiom (in Perl 5) -- using a scalar as an offset into an array or hash of function addresses -- is the basic paradigm of CeeLanguage "object-oriented" method dispatch, as in CeePlusPlus, ObjectiveCee, and various homebrews. Switch statements are generally viewed as a CodeSmell in object-oriented environments because they pessimize message dispatch, which a good OO environment makes faster than the corresponding switch. Also see the discussion under "SmalltalkLanguage switch", below.]

There is a working example of this mechanism in the interpreters for SnuspLanguage.

PlSql switch

 CASE grade
   WHEN 'A' THEN dbms_output.put_line('Excellent');
   WHEN 'B' THEN dbms_output.put_line('Very Good');
   WHEN 'C' THEN dbms_output.put_line('Good');
   WHEN 'D' THEN dbms_output.put_line('Fair');
   WHEN 'F' THEN dbms_output.put_line('Poor');
   ELSE dbms_output.put_line('No such grade');
 END CASE;

RubyLanguage switch

case $age when 0 .. 2 "baby" when 3 .. 6 "little child" when 7 .. 12 "child" when 12 .. 18 # Note: 12 already matched by "child" "youth" else "adult" end

SchemeLanguage switch

 (case (car '(c d))
   ((a e i o u) 'vowel)
   ((w y) 'semivowel)
   (else 'consonant))                    ===>  consonant

VHDL switch (combinational)

 library ieee;
 use ieee.std_logic_1164.all;

 entity CombinationalSwitch? is port
 (  A: in std_logic_vector (2 downto 0);
    B: in std_logic_vector (7 downto 0);
    F: out std_logic  );
 end CombinationalSwitch?;

 architecture RTL of CombinationalSwitch? is begin
    with A select
       F <= B(0) when "000",
            B(1) when "001",
            B(2) when "010",
            B(3) when "011",
            B(4) when "100",
            B(5) when "101",
            B(6) when "110",
            B(7) when "111",
            0 when others;
 end RTL;

VHDL switch (sequential)

 library ieee;
 use ieee.std_logic_1164.all;

 entity SequentialSwitch? is port
 (  A: in std_logic_vector (2 downto 0);
    B: in std_logic_vector (7 downto 0);
    F: out std_logic  );
 end CombinationalSwitch?; 

 architecture RTL of SequentialSwitch? is begin
    process (A, B) begin
       case A is
          when "000" => F <= B(0);
          when "001" => F <= B(1);
          when "010" => F <= B(2);
          when "011" => F <= B(3);
          when "100" => F <= B(4);
          when "101" => F <= B(5);
          when "110" => F <= B(6);
          when "111" => F <= B(7);
          when others => F <= 'Z';
       end case;
    end process;
 end RTL;

Notice that, since VHDL describes hardware, the Switch statement here is basically a multiplexer. In this case, this is an 8-to-1 multiplexer.

VisualBasic switch

Dim Number Number = 8 Select Case Number Case 1 To 5 Debug.Print "Between 1 and 5" Case 6, 7, 8 Debug.Print "Between 6 and 8" Case 9 To 10 Debug.Print "Greater than 8" Case Else Debug.Print "Not between 1 and 10" End Select

Hebrew (Ivrit)

    בדוק משתנה
    אם משתנה = 1 אזי
    שלום
    אם משתנה = 2 אזי
    להתראות
    אחרת
    לך-יא-מניאאק
    סיים

DelphiLanguage switch note uses sets, can work with characters and enumerated types instead of integers

 case I of
   1..5      : Caption := 'Low';
   6..9      : Caption := 'High';
   0, 10..99 : Caption := 'Out of range';
 else
   Caption := '';
 end;

 case Letter of
   Vowels      : Caption := 'Vowel';
   Consonants  : Caption := 'Consonant';
 end;

where Letter is a char and Vowels and Consonants are Sets.

SmalltalkLanguage switch

anObject aMessage

where aMessage is a method defined on the class of anObject.

(One can also define syntax for a SmalltalkCaseStatement within the language.)

How does that work when you're switching on 1..5, 6..10, and "other"?

A Smalltalk program implementing such a dispatch (and they are exceedingly rare, because the object/message paradigm so richly moots most situations that provoke a Switch) typically looks up a method name in an Array or Dictionary which they then perform: -- the equivalent of the C expression (*action[selector])().

In the interest of advancing the discussion, I built a test harness and measured the results. I simplified your example to dispatch on a number between 1 and 16 -- this gets at the most important aspects of your question. I defined a new class, SwitchTest, with sixteen methods with selectors #m01, #m02, ... , #m16. For the sake of testing, these methods were no-ops. I then built and measured the performance of three dispatch approaches:

"Array Dispatch": I constructed a "dispatchArray" of size 16, where the contents of index N is the integer offset of method m<N> in the methodsArray of SwitchTest. I then constructed a test method that counts an index from 1 to 16, looks up the corresponding offset in dispatchArray, then loads and evaluates the CompiledMethod at that offset.

 (aMethodsArray at: (dispatchArray at: anIndex)) executeWithReceiver: self andArguments: #()

"Perform": I constructed a dispatchArray of size 16, containing 16 symbols, which were #m01, m02, ..., #m16, such that the symbol at index N was #m<N>. I then constructed a test method that counts an index from 1 to 16, looks up the symbol corresponding to the index in dispatchArray, then performs that symbol in the receiver.

 self perform: (dispatchArray at: anIndex)

"Message send": I open coded the sixteen message sends -- this corresponds to the most common idiom in Smalltalk.

 self
 m01;
m02;
"..."
m16.

The above is a load of bull. This is hardcoding the evaluation of m01, etc. when in reality the values for switch statements are variable. Add the fact that objects must be coaxed into selectors before they can be sent to objects.

Smalltalk needs a switch statement, because nested ifTrue/ifFalse's are ripe with design smell.

I then used Time>>millisecondsToRun: to measure the time required to execute 1000000 repetitions of each of the above three cases (on a 3Ghz Pentium with 1G RAM running IBM Smalltalk). Here are the times I measured:

Array Dispatch Test: 637.5
Perform Test: 284.4
Message Send Test: 189.1

My interpretation: using the "switch statement" approach (perform) introduced about a factor of two in dispatch overhead, in comparison to a "regular" message send. It might be worth comparing that to the other languages and environments here.

Even CeeLanguage imposes some overhead to setup the switch statement, and it might be interesting measure that overhead in comparison to a straight procedure call on each of the resulting branches. (But of course a C switch/case statement is faster than a series of if-elses.)

My point in putting up the example is to put some quantitative bounds (in this case a factor of two) on the switch statement dispatch overhead imposed by the language. If I was parsing something and cared about the performance, I would probably write and debug the parser in Smalltalk and then recode the lexer into C, calling the lexer through the external function API (AlternateHardAndSoftLayers).

The interesting thing is the relative overhead of a switch-like mechanism in Smalltalk, as compared to its natural behavior (not to C). My point is that 99% of what one does with switch/case in a procedural language is mooted by a good polymorphic language (like Smalltalk). Switch/case is needed in C because you don't have polymorphic message sends (unless you want to roll your own tables of function pointers).

The C-style switch isn't needed in Smalltalk (as a separate language construct) because its function is implicit in the method dispatching behavior. If, in the pathological Smalltalk case where you DO need switch, I showed that it's about twice as slow as a "straight" message send (in Smalltalk). The performance arguments don't make sense until you look at much larger grain sizes, where the industry has learned (particularly help from AlternateHardAndSoftLayers) that C's fine-grained performance advantages do not, in general, translate to corresponding application performance differences.

CobolLanguage evaluate-when (COBOL 85)

 evaluate policy-holder-age also policy-holder-sex also smoking-flag
   when   20 thru 40        also "F"               also ANY           perform young-female-policy
   when   65 thru 999       also "M"               also "Y"           perform old-smokey
   when other                                                         perform all-other-cases
 end-evaluate

It's amazing; you can do anything. (I'm not claiming it can do anything well, just that it has an amazing array of expressive options. ;-)

Let's standardize on one particular form. A simple substitution process can then generate the correct syntax for your chosen language.

Um... not true, since different languages switching capabilities are different. No one generic form can be translated to each language.

Long ago, when memory was very limited, there wasn't room for either a jump list or a series of tests in a particular program, so I calculated the required address as a linear function of the switch expression (using a shift instruction for the multiplication), and then made a jump to the calculated address. Those were the days. ;-)

See SwitchStatementsSmell, CaseStatementsConsideredHarmful (summary: use OO/generic dispatch on the type (InternalPolymorphism) in preference to switch on the type ("TypeCase", "ExternalPolymorphism"); when looking at things other than the type, if Polymorphism is impossible or introduces too much extra machinery, then use switch in preference to if-else whenever possible. Contrary to some Purists, in some languages it is not possible to replace every switch statement with polymorphism, and in other languages it is not always desirable. Classic example: switch on char value in a lexical analyzer. But switch-on-type should indeed always be replaced.)