Data Comparing Algorithms

Here is a general algorithm for comparing two similar tables (or copies of tables).

 for r = each record of t1 not in t2  // based on primary key
   display record
 end-for

for r = each record of t2 not in t1 display record end-for

for key = each record common to both

r1 = getRecord(t1, key) r2 = getRecord(t2, key)

for each attribute in r1 not in r2 display key, attribute, and value end-for

for each attribute in r2 not in r1 display key, attribute, and value end-for

for attr = each attribute in r1 if r1[attr] not-equal-to r2[attr] display key, attribute, r1[attr], r2[attr] end-if end-for

end-for


Trees

One technique for comparing trees is to print them as text, and then use text-based difference finders common in most operating systems. Use recursion to create the outline-like format, and perhaps disable white-space sensativity so that different levels won't be over-emphasized. You may want to try it with and without the full node path, as each will emphasize different kinds of changes. Here is a pre-differenced sample text file:

  node 1 foo=7 bar=9
  ..node 1.2 foo=12 bar=7
  ..node 1.3 foo=9
  node 2 bar=6
  ..node 2.1 foo=12
  ....node 2.1.1 foo=12 bar=99
  ..node 2.2 bar=786  

(Dots used to prevent TabMunging, but may not otherwise be needed in actual file.)


EditText of this page (last edited January 31, 2007) or FindPage with title or text search