Pattern Matching

In the context of pure functional languages and of this page, PatternMatching is a dispatch mechanism: choosing which variant of a function is the correct one to call. The notation is inspired by standard mathematical practice.

It is a feature of FunctionalProgramming and LogicProgramming languages (not to be confused with MatchingStrings, though AwkLanguage gives a good taste of how expressive pattern matching can be even when restricted to strings and arbitrarily separated fields - ScottVokes) and is fundamental to making these paradigms declarative rather than imperative. Secondarily, implementing ExternalPolymorphism and the TranslatorPattern is trivial in languages with PatternMatching.

Example:

 factorial(0) ::= 1
 factorial(n) ::= n * factorial(n-1)
The function has been given two definitions. Which one is dispatched to when a call is made depends, in this case, on whether the actual parameter matches 0. The matching is not limited to constants; in general, functions have a TypeSignature? used for matching, and currying is integrated into this mechanism.

Pattern matching is the compiler's ability to compare both the form and the content of two expressions. It is used in two ways: to destructure a value, binding variables to its parts, and to select which of several function clauses applies to a given argument.

In both these uses, a pattern match has the additional benefit of making an assertion about the form and/or content of a variable.

For an example of the first use (using ErlangLanguage), suppose we have the following data structure:

 Book =
  {book,
   "Extreme Programming Installed",
   {isbn, 0201708426},
   1,
   ["Ron Jeffries", "Ann Anderson", "Chet Hendrickson"]}.
Then, evaluating the following expression:

 {book, _, {isbn, IsbnNumber}, _, [FirstAuthor|_]} = Book.

binds IsbnNumber to 0201708426 and FirstAuthor to "Ron Jeffries", while also asserting that Book really is a five-element book tuple.
An example of the second use would be a function with multiple clause definitions:

 search({book, _, ISBN, _, Authors}) -> ...
 search({video, _, ISBN, Cast, Length}) -> ...
 search({audio, _, ISBN, Tracks, Some, Other, Metadata}) -> ...
etc.

This uses pattern matching to simultaneously do function overloading and extract fields from the structure into local variables. The overloading is based on the actual values, though, not types.
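For comparison, a rough Haskell analog (a minimal sketch; the Item type and its fields are invented for illustration) dispatches on data constructors rather than on tagged tuples:

 data Item = Book  String [String]      -- ISBN, authors
           | Video String [String] Int  -- ISBN, cast, length in minutes

 search :: Item -> String
 search (Book isbn _authors)     = "searching for book " ++ isbn
 search (Video isbn _cast _mins) = "searching for video " ++ isbn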


A few things to note: the list of patterns to match is ordered, in that when determining a match the various clauses are tried in order, and there is no requirement that the clauses be disjoint. In the factorial example above, the second clause doesn't specify n>0 as a precondition; the case n==0 is handled by the first clause. (It might be nice if it did, though, for other reasons: the second pattern will happily match n==-1, which results in an endless loop as implemented.)
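In Haskell, for instance, the second clause could carry a guard so that negative arguments simply fail to match (a minimal sketch):

 factorial :: Integer -> Integer
 factorial 0 = 1
 factorial n | n > 0 = n * factorial (n - 1)

With this definition, factorial (-1) raises a pattern-match failure at runtime instead of looping forever.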

Many examples of pattern matching in Haskell use a match-anything wild-card as the last clause, which matches any and all arguments.

It is an error if no patterns match.


 (* Define type `expr' to have two constructors:
    Constant(anInteger) represents a constant;
    Sum(anExpr, anotherExpr) represents the sum of two expressions. *)

 datatype expr = Constant of int | Sum of expr * expr

 fun eval (Constant c) = c               (* a constant evaluates to itself *)
   | eval (Sum (x, y)) = eval x + eval y (* eval(x+y) = eval(x) + eval(y) *)
The compiler infers that eval takes an expr as its argument, and it makes sure that both Constant and Sum are handled. If someone comes back later and adds, say, a Product constructor to expr, any pattern-match that dissects an expr without somehow handling the case of Product will raise a warning.
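The same example in Haskell (a sketch; with GHC, the -Wincomplete-patterns warning reports any missing case):

 data Expr = Constant Int | Sum Expr Expr

 eval :: Expr -> Int
 eval (Constant c) = c               -- a constant evaluates to itself
 eval (Sum x y)    = eval x + eval y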


Also, multiple assignment in ML is a special case of pattern matching on tuples. I can say:

 let (a, b) = (b, a+b)
(The parentheses are part of the tuple syntax in SmlLanguage. (But IIRC not in e.g. OcamlLanguage.))

ML just extends pattern matching to all datatypes: for example, because there is an expression

 elem :: rest
which returns a list where the element elem is prepended to the list rest, there is a pattern

 let (elem :: rest) = alist
to deconstruct a list into the first element and the rest. Consider this:

 let rec pairwiseswap = function
   | x :: y :: rest -> y :: x :: pairwiseswap rest
   | [x] -> [x]
   | [] -> []
You can also define new "tagged" types, as in:

 type length = Meters of float
 let myHeight = Meters 1.8
 ...
 let Meters heightAsFloat = myHeight
After this, heightAsFloat has the value 1.8. "Meters" acts as a tagged wrapper to ensure type safety. Also:

 let combinedlength (Meters x) (Meters y) = Meters (x +. y)
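The Haskell counterpart uses a newtype wrapper (a minimal sketch mirroring the ML above):

 newtype Length = Meters Double

 combinedLength :: Length -> Length -> Length
 combinedLength (Meters x) (Meters y) = Meters (x + y)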


"(not to be confused with MatchingStrings)"

Actually, matching strings and matching data structures are really special cases of the same thing. Look at SchemaLanguage?s like TREX or RELAX: they define grammars over trees, but are otherwise based on the same constructs as string RegularExpressions. It is possible to define pattern languages that work equally well over directed graphs, trees, sequences/strings, and combinations of these. There is also a connection with type systems based on SemanticSubtyping.


It is useful to compare with the way this is implemented in a traditional imperative language, just to see the differences and their consequences. A recursive factorial implementation in C (we'll ignore issues of overflow and error-handling, and stay recursive so we have an apples-to-apples comparison) might be:

 int factorial (int n)
 {
     if (n == 0) return 1; /* base case */
     else return n * factorial(n - 1);
 }
In the C version, the if/else logic replaces the pattern matching.

Some might say, so what?

In the Haskell case, the factorial implementation is - in many ways - two overloaded functions, rather than a single function. This has several nice consequences:

1) If the compiler can show which pattern would be matched by a given function invocation, it can emit code to call it directly, resulting in a performance optimization.
2) If the compiler can show that no pattern would be matched, it can produce an error.
3) The patterns are often useful as documentation and as a way of expressing the programmer's intent.
4) They are a succinct way of specifying preconditions.

Note that Haskell compilers can emit an "if-then" sequence if the pattern to use cannot be statically determined. In many cases, though, it can, due to Haskell's powerful TypeInference capabilities. Also note that a Haskell programmer could write functions the C way (without the pattern-matching), were she to wish to.
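For example, the pattern-free style looks much like the C version (a sketch):

 factorial :: Integer -> Integer
 factorial n = if n == 0 then 1 else n * factorial (n - 1)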


The DestructuringMacro page has some discussion of PatternMatching in Lisp (seems specific to CommonLisp). Destructuring-bind is standard in CommonLisp. Does it do only "assignment"?

This topic is much referenced on this wiki.

PatternMatching is not standard in the SchemeLanguage. The most advanced pattern matching for scheme seems to be Andrew Wright's "Pattern Matching for Scheme", http://www.call-with-current-continuation.org/match.ps (an earlier version is at http://download.plt-scheme.org/doc/103p1/html/match/). Unlike CommonLisp (surprise), Wright's mechanism specializes each of the binding forms, e.g. "match-let". He has a MzScheme implementation (match.ss, best found by Googling), and an older, apparently portable implementation (Google for match.tar.Z). You may find Wright's papers lacking in explanation; a page from the GaucheLisp? reference document may help http://gauche.sourceforge.jp/doc/gauche-refe_318.html.

Any other solutions for scheme?


It should be noted that for all the complexity above, PatternMatching is something children do (without even a calculator mind you) with simple algebra such as

a*x+b*x = x*(a+b)

Maybe the teaching is poor, but many children (even at age seventeen) make errors in exams relating to algebra as simple as that.

Not to mention the author who blew the pattern by commuting the * operator

Part of the problem may be that the rules are not made explicit. They can see the teacher write 3*p+p*4 = 7*p and it kind of makes sense, yet there are several other rules involved, not just the one above; but that is considered "obvious" and relies on the other rules having been learned previously in the same semi-implicit way. Most do get it right, but as you say, many don't. Still, both children and adults were using PatternMatching as a MentalPattern long before computers; the point is that it should not be thought of as something esoteric.


Guarded PatternMatching: Patterns are matched only if predicate tests are proven true, e.g. max n m = n when (n>m). The when (n > m) is called a guard, and is different from other sorts of tests in that the whole max n m = line is simply considered irrelevant when the guard doesn't pass; it needn't be evaluated. The use of guards complicates pattern matching a bit by preventing matches from being decided on purely structural data: actual checks on values become required. Testing guards adds runtime complexity to the function, and optimizing the guards (e.g. to avoid redundant or unnecessary tests) can be quite difficult. OTOH, if guards are allowed, then patterns become the only necessary means to discriminate branches during evaluation.
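In Haskell's concrete syntax the guard is written with | (a minimal sketch; max' is primed to avoid shadowing the Prelude's max):

 max' :: Ord a => a -> a -> a
 max' n m | n > m     = n
          | otherwise = m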

Structural PatternMatching: If one can match not just on the value, but on the syntactic or structural properties of the value, this leads to a very complex sort of pattern-matching. Among other things, it allows a function to handle a rather broad domain, including domains where different values have very different syntactic or structural properties.

  f 0 = whatever0                     -- f on 0
  f (n :: Nat) = whatever1            -- f on natural
  f (r :: Real) = whatever2           -- f on real
  f (x, (y :: Nat), z) = whatever3    -- f on tuple (product/covariant type)
  f (a:(Nat) | b:(Real)) = whatever4  -- f on variant (coproduct type)

  g (x :: Rational) = f x             -- full definition of g (Rational -> ???)

In this case 'g 0' should return 'whatever0', 'g 2/3' or 'g -1' should return 'whatever2', and 'g 8' should return 'whatever1'.

Note that this is different from the C/C++ style of static polymorphism, which resolves the match once, statically, from the declared type rather than from each runtime value. I.e. C++ would look at 'x :: Rational', decide that 'Rational <: Real', and promptly define 'g (x :: Rational)' = 'f (x :: Real)' = whatever2 for all cases.

Structural PatternMatching becomes far more complex when the match is heuristic rather than totally ordered, especially for product or coproduct types. E.g. if you have {a:1,b:2,c:3}, which does it "best match" of the structural patterns {a:Nat * b:Rat}, {a:Int * b:Int}, {b:Int * c:Int}, or {a:Real * b:Real * c:Real}? The coproduct is even more difficult to handle... i.e. if a value is a:n for some integer 'n', which does it "best match" of (a:Nat|b:Int), (a:Rat|c:Int), (a:Int|b:Real|c:Nat), etc.?

Anyhow, dispatch based on runtime parameter types requires an implementation of Structural PatternMatching, though the complexity of the implementation can vary based on the heuristic utilized to decide the best match and on the set of structures against which one is allowed to match.

If Structural or type descriptions are allowed to possess guards in the given language, then Structural PatternMatching is at least as broad as Guarded PatternMatching.

Partial-Structural PatternMatching: The above describes matching against complete structures, but it is possible to perform matches against partial or abstracted structures, where variables exist also in the structure description.

  f (s :: Stream 't)= whatever5  -- f on template stream-structure (incompletely defined over 't')

Allowing partial structure matches, such as the :: Stream 't described above, adds some complication to the matching. In particular, Stream 't may be a partial evaluation on a function that returns the 'stream' type (i.e. a 'template' function in C++), thus use of Stream 't requires that this function be reversible from its result so one can go from the resulting type back to find the 't'. Deciding whether the 't' should be passed into the body whatever5 is an open question. If it is passed as an extra parameter, then something like f ((x :: 'X),(y :: Nat), z) would pass 'X' to the body as the type-descriptor for 'x'. In a dependent-typing system, such type-descriptors are easy to find... but not particularly useful over a single value (since 'X' would be the type that accepts ONLY 'x'). In either case, this adds a lot of complexity to the function descriptors and to either the runtime or compile-time environment.


Heuristic PatternMatching: When multiple patterns match and there is no total order for which pattern to select, then some sort of heuristic resolution is needed to select the result. If more than one pattern is allowed to match (e.g. for multi-values and parallel or backtracking evaluation), then heuristic resolution is often still desired for best match. In languages with polymorphism on functions, this best match is the "most specific type" or, more relevantly, the most specific (structural) pattern among a finite set of such patterns, according to some concept of "most specific".

Performing heuristic pattern-matching requires a function that can offer an ordering on patterns given a value either at runtime or compile-time. A typical means would be to leverage a distance function, such that f value pattern offers a numerical distance, with some special value indicating infinite distance. The choice of this ordering function or distance function is often of enormous consequence. Among other things, it enables or disables multiple-dispatch vs. single-dispatch in language design.
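A hypothetical Haskell sketch of distance-based clause selection (the dispatch function and Distance type are invented for illustration):

 import Data.List (minimumBy)
 import Data.Ord (comparing)

 -- Nothing stands in for "infinite distance" (no match at all).
 type Distance = Maybe Int

 -- Each clause pairs a distance function with a body; the clause
 -- whose pattern lies "closest" to the value wins.
 dispatch :: [(a -> Distance, a -> b)] -> a -> b
 dispatch clauses v =
   case [ (d, body) | (dist, body) <- clauses, Just d <- [dist v] ] of
     []      -> error "no pattern matches"
     matches -> snd (minimumBy (comparing fst) matches) v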

A very big advantage of heuristic pattern matching in function description is that it does not require that the programmer be aware of how future programmers extend or specialize a function. It allows decentralization of function description.

Ordered PatternMatching: Many functional languages (including Haskell and ML) avoid the issues behind heuristic pattern matching by imposing a total order on function variants. This order may be implicit (e.g. based on order of presentation) or explicit (e.g. a number on each pattern descriptor). In practice, this means that a programmer must be aware of all other variants to which a function might dispatch in order to properly place a new variant or specialization within that total order. This, in turn, means that ordered pattern matching ought, in practice, to be centralized, which diminishes the need for explicit descriptors. Thus, in practice, all ordered PatternMatching defines functions in a centralized manner and specifies ordering implicitly.

  fib 0 = 1
  fib n = (fib (n-1)) + (fib (n-2))
  fib 1 = 1  -- never matched! 'fib n' implicitly matched prior to 'fib 1'.
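Reordering the clauses so the specific cases come first fixes this (a sketch):

  fib 0 = 1
  fib 1 = 1
  fib n = (fib (n-1)) + (fib (n-2))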

The big cost of Ordered PatternMatching is that, in practice, it prevents decentralization and specialization of a function description. It isn't possible to extend the domain, or to add corner-cases that optimize or handle special exceptions, at least not without wrapping the original function in another.

The big advantage is that the centralization makes it easier to verify properties on the total set of patterns, such as completeness on the domain or calculation of the domain. Additionally, it completely avoids issues such as how to unite functions and such at link-time or load-time.


Matching On Context

One major variant is to dispatch not only on inputs, but also on the return value. This moves a bit beyond mathematical functions, though; mathematical functions map a Domain to a Range and are independent of context. This moves beyond that into context-rich syntax and grammars.

Essentially, the structure and properties of the return value are either implied or made explicit by the context to which the result of a function is returned. To match on context requires that one determine the properties that must be possessed by the return value, and then find variants of the function that best match, or at least overlap, those properties. I.e. if one asks for '.1' and '.2' parameters on the return value, one should consider only function variants whose return values are type-compatible with a tuple-pair. More explicit constraints may be offered through ascription or assertion. In fact, all such properties can be considered constraints; matching on context is a problem of ConstraintsBasedProgramming?.

With matching on context, one automatically gains context-dependent definitions, since any unit of a given Type can also be described as a function of ()->Type.
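Haskell's type classes already provide a limited, static form of matching on context: a class method can be dispatched purely on the type the context demands (a minimal sketch; the Default class and def method are made-up names):

 class Default a where
   def :: a

 instance Default Int  where def = 0
 instance Default Bool where def = False

 -- The instance is chosen by the type the context expects:
 zero :: Int
 zero = def    -- 0

 nope :: Bool
 nope = def    -- False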

Allowing matching on context, though expensive to parse, is of fantastic value in making a language highly expressive. It allows a programming language to almost approach natural language in the richness and depth of evaluated statements, with only macro expansion and elimination missing (since matching on context implies exactly one return value).

Since Matching on Context isn't associated with mathematical functions, it may be a better feature to reserve for macros and syntax extensions than for functions.


Flexibility vs. Speed

Heuristic Partial-Structural PatternMatching over decentralized function variants, with a heuristic that allows full multiple-dispatch on covariant types, sequential inputs (e.g. tuples, records, functions returning functions), and return type: that's about as flexible as it gets! It is, unfortunately, also about as slow as it gets.

In some fields, the cost of such advanced PatternMatching might be a necessary complexity. One example is the parsing and compilation of a language that allows for syntax extensions; there the speed cost doesn't hurt, since it will be paid whether you like it or not. In general, however, the generality provided by this system will tend to result in slow runtimes and very complex link- and load-time operations. For example, when loading two program units together to intercommunicate, if both possess variants of a particular function that must be the same function between the two components, then those variants must be united. And if finding the "best match" must wait until load-time, many static optimizations become impossible to perform.

If a language that offers pattern-matching function dispatch and decentralized function description is to also allow advanced static optimizations, it must enforce some rigidity in order to allow for partial-evaluations prior to load-time. PartialEvaluation is, after all, the mother of all optimizations (with the father being RedundancyElimination?). In any case, this means that the functions are allowed to be decentralized to, at most, the set of link-time components (supposing a compile/link/load sequence). If, in this environment, functions are still desired that vary at or after load-time, they will probably require allowing load-time actions that manipulate state or access services, which breaks away from truly functional programming.


See also: PredicateDispatching


CategoryFunctionalProgramming CategoryCoding

