Partitioned Graph

A Partitioned Graph (in category RelationalPatterns) is a group of several tables of a RelationalDatabase schema, in which all records belong to partitions based on a key value, and no foreign key relations are allowed between rows not in the same partition. An example would be a single database instance used by multiple companies that do not share data.

The issue here is how to get a relational database to enforce the partitioning across all tables in the graph, and still follow the LawOfDemeter as much as possible, not propagating a direct dependency on the master partition key across an entire large graph. We can achieve a good compromise arrangement by implementing sub-partitions where appropriate, and having each table include a partition key value referring for its own level of the hierarchy, but no others.

Let's say that our database partitions data by company, and we want to represent companies, brands, product categories, products, product style, and inventory where a product can have a brand and a category, each product has multiple styles, and specified combinations of style, and product are allowed. An inventory item is of a specific product style.

Here's how we might make a partitioned graph schema for this case.

 company
  *company_id
   company_name

brand *brand_id company_id (fk to company) brand_name

product_category *product_category_id company_id (fk to company) product_category_name

product *product_id brand_id (part of fk to brand) product_category_id (part of fk to product_category) company_id (part of fk to both brand and product_category) (enforces partitioning by company) product_name

product_style *product_style_id product_category_id (fk to product_category (no partitioning key relation required for simple 1-m) product_style_name

product_product_style (m-m junction table) *product_id (part of fk to product) *product_style_id (part of fk to product-style) product_category_id (part of fk to both product and product-style ) (enforces partitioning by product_category - sub-partition of company)

inventory_lot *inventory_lot_id product_style quantity

Notice that the junction between product and product_style no longer needs a direct association to the company partition because product category is a sub-partition of company that constrains records in both of those tables as well as the junctions between them. As the graph becomes more complex, there can be many more levels of sub-partition, and in each case, changes at the higher levels need not ripple down more than one layer of the partition hierarchy.

If we extend the graph above, we might have to add some missing partition key fields to some of these tables, but effects don't ripple far, and we can even insert levels of partitioning at the top without suffering a catastrophic ripple across the entire graph.

Down-sides:

Alternatives: Both of the above alternatives have the advantage that if a business rule requires overriding a partitioning restriction in some cases, that can be done without changing the schema. Both alternatives also share the down-side that the partitioning rules are not evident in the schema or by looking at the EntityRelationshipDiagram.

-- SteveJorgensen


One could argue that perhaps each company should simply have its own database. See also WebStoresDiscussion.

-- Yes, one could argue that, and one would often be right, but not always.

There are also other types of situation in which the same issue arises and multiple-databases option is definitely not a good idea. For instance, let's say you need a metaschema to describe another database so you can describe data transformation operations to/from that database. You might have a table definitions, field definitions, index definitions, and relationship definitions. A table has fields, and an index has fields that must exist in its table. A relationship has relations between 2 tables based on fields that must belong to those same 2 tables.


See also:


EditText of this page (last edited October 13, 2009) or FindPage with title or text search