Cartesian Product

Cartesian product is an operator on sets (tuples, bags, ...). The result from the cartesian product of n sets, is a set of all possible ordered 2-tuples, containing on i-th place an element from i-th set. CartesianProducts are not concerned with any tuples larger than 2. To represent those you need to go dive into MatroidTheory.

For example, the cartesian product of {a,b,c} and {c,d,e} is {(a,c),(a,d),(a,e),(b,c),(b,d),(b,e),(c,c),(c,d),(c,e)}.

Formally, the cartesian product of two sets A and B (written as A x B) is the set of tuples (a, b) with a element of A and b element of B.

In CategoryTheory, the product of A and B is any object C for which there are projections p1 and p2 such that p1 : C -> A, p2: C -> B and that for any other object C' for which there are similar projections p1' and p2' there is a unique morphism norm : C' -> C that makes p1 . norm = p1' and p2 . norm = p2'. The cartesian product is any instance of a product in cartesian closed categories (SETS is an example of one).


I think it helps to picture this in basic arithmetic. The 'product' of two numbers is the sum of all possible combinations of the elements of those numbers. For example, 3 x 4 means that each "element" of 3 is matched at least once with each "element" of 4. The result is 12 pairs of elements. If we count the number of pairs, we get the arithmetic product of two numbers. If we get the set of all pairings, however, then we get the cartesian product.

e.g. 3 x 4 is analogous to something like (a,b,c) x (1,2,3,4):

If we count the above, we get 12 pairs (arithmetic product). If we just grab the collection of all pairs, then we have the cartesian product.


StructuredQueryLanguage query: select * from a, b

results in the cartesian product from rows in a table and rows in b table.

PythonLanguage ListComprehension: cartProd = [(x,y) for x in list0 for y in list1]


I've found the following to be an oddly useful ReFactoring tool.

When you have two axes of variation - a situation when you might want MultipleDispatch, for instance - recognize that you are dealing with the Cartesian product of the possibilities. You can implement it by iterating over axis A first and then axis B, or over axis B first and then axis A, or implement each pair in the product separately. For different problems, a different choice is usually appropriate, but whoever wrote your legacy code probably chose the wrong one. And in some cases you can find a way to avoid the Cartesian product entirely (perhaps with a BridgePattern).

Identifying hidden Cartesian products in a design is my favorite way of spotting code which can be refactored with really big benefits.


EditHint: Consider merge with CartesianJoin


EditText of this page (last edited November 17, 2014) or FindPage with title or text search