Replication of data is a useful technique to provide a copy of data in a more compact, faster, local, or more digestible form for a particular user set. However, it is also fraught with potential gotchas.
Some of the more common ones include:
Batch Data Adjustment
Sometimes somebody spots a an error or omission in the data which may go back years. If the replication is based on a change-date, then a mass fix could overwhelm processes that are designed around a "typical" load size. I've seen update counts jump 2 orders of magnitude.
Junk Characters
I had a nightly process humming along for more than a year. One night the process get stuck on one record because it couldn't execute the SQL. After futzing around I discovered that it had unprintable characters in a "note" column that were confusing the SQL parser. I added an ASCII range character scrubber to every character column after that.
Of course, this problem is not unique to using replication.
Ideally I should have added scrubbing up-front, but I YagNi'd it for good or bad. The added scrubbing did add to the processing time, but turned out an acceptable cost in terms of resources.
CategoryStory, CategoryDatabase