Replication War Stories

Replication of data is a useful technique to provide a copy of data in a more compact, faster, local, or more digestible form for a particular user set. However, it is also fraught with potential gotchas.

Some of the more common ones include:

Some of the more involved ones include:


Batch Data Adjustment

Sometimes somebody spots a an error or omission in the data which may go back years. If the replication is based on a change-date, then a mass fix could overwhelm processes that are designed around a "typical" load size. I've seen update counts jump 2 orders of magnitude.


Junk Characters

I had a nightly process humming along for more than a year. One night the process get stuck on one record because it couldn't execute the SQL. After futzing around I discovered that it had unprintable characters in a "note" column that were confusing the SQL parser. I added an ASCII range character scrubber to every character column after that.

Of course, this problem is not unique to using replication.

Ideally I should have added scrubbing up-front, but I YagNi'd it for good or bad. The added scrubbing did add to the processing time, but turned out an acceptable cost in terms of resources.


CategoryStory, CategoryDatabase


EditText of this page (last edited June 29, 2010) or FindPage with title or text search