Slice Result Vertically

This is an SqlPattern.


Intent Construct complex tablic result by using a series of queries adding columns to a growing core table.


Context You need to extract complex tablic result generated by several groupings and joins from a relational database.


Forces


Solution Retrieve result from a stored procedure that creates a temporary table and fills it using several queries, each filling one or more columns with data.

The temporary table must be created with create table statement defining the entire structure, allowing NULLs in columns filled with the queries after the first. The first query is insert into .. select .. type statement, establishing 'the core' result: the columns uniquely identifying the resulting records with all auxiliary info that can be gathered in one statement. Subsequent queries (again insert into .. select ..) do not add new rows to the table, but are joins of the temporary table with others, using existing key columns for join condition.

The stored procedure returns the result built from several queries.


Resulting context You manage to execute much more complex operations in the database, avoiding:

However,


Rationale Implementing a complex composition of relational operations can be difficult with the limited expressive power of SQL. By building the result from vertical slices you manage the complexity of the result, building it on several steps.

-- NikolaToshev


This is a CodeSmell that often indicates a dysfunctional database schema. On occasions it is necessary, since we don't all have the luxury of changing the schema. Bad things to look for are use of temporary tables for selecting max() elements (as in time-series extractions of rows with latest date on some subkey), and grouped keys.

Well, the transaction processing part of your program would never need this if it is an OO program using relational database for persistance. If you have to make some analytical reports but full-scale OLAP is overkill, it becomes normal that SQL is too limited to get everything you need at once, even though the database is normalized and fine. -- NikolaToshev

This sounds like trying to fit a StarSchema in one table? Is the point of this to create one-temp-table-per-query instead of one-temp-table-per-dimension? Please compare and contrast with StarSchema.

StarSchema is about creating a schema suitable for analytical querying (no temporary tables here). SliceResultVertically is about constructing a query when you have the schema given (suitable to your needs to various degree).

Example

Let me try to show that you may need to SliceResultVertically with a good database design - even design suitable for analysis. Suppose you work with the schema from the StarSchema description (see http://c2.com/ppr/stars.html#5):

Now, you are asked to produce a report showing monthly sales of Gum Balls as a percentage from the yearly sales:

 Customer_Name      Month         Year        Monthly_Sales            Percentage_From_Yearly_Sales
 ---------------------------------------------------------------------------------------------------
 ....

You have to write a stored procedure that returns a single result set (it will be fed in the reports engine). Meaning of the fields should be obvious, Percentage_From_Yearly_Sales being Monthly_Sales/Yearly_Sales for that customer. The company has 1000 customers and operational data for 20 years. No additional filtering just to simplify things.

There are different solutions, can you see them? (I'll post one applying the pattern in a couple of days - I am not sure if it is the best in this case). -- NikolaToshev

It is probably the simplest and thus understandable approach. Nevertheless, temporary tables can unreasonably load the database since reports are often huge. So in this case I probably would generate the smallest table that does the job which would be yearly sales per customer per product. 20000 short rows. As opposed to the naive approach which would generate a much larger temporary table of 240000 longer rows. The resulting query requires a join to the synthesized table so it should be indexed appropriately. Since it is smaller it may remain in cache making the join almost free. I suppose my point is that this technique can be expensive, and can make it difficult to run OLAP with DSS in the same database. -- RichardHenderson.

Funny, there's no pages on OLAP (OnLineAnalyticalProcessing) or DSS (DecisionSupportSystems?).


I'm experiencing a similar context right now. I've got an online report which lists players in a trivia game, along with their scores for each of the rounds of trivia completed so far. Since each round is generated weekly, the number of columns for each player keeps expanding. I'm currently generating the contest scores by running a query once for each player to retrieve all the player's scores, then turning all those rows into columns to fit into the final report (one player per row). But since I run the score-generating query once for each player, as the number of players grows, I'm quickly reaching ASP's default 90 second time limit. I could always up the time limit, but I'd like to find a better solution that would not take too long to implement. I'm also limited in technology since the client's website only supports Access97 and ASP. I don't believe it's possible to dynamically generate temp tables in Access. Perhaps the best solution with these limitations is to run the query once for all players (rather than once for each player) and 'manually' join the columns in code? Any suggestions?


EditText of this page (last edited December 13, 2004) or FindPage with title or text search