The way to correct GuruChecksOutput:
With RegressionTesting the GuruChecksOutput the first time the program is run, to see if it is correct, but future checks are done automatically. When later changes cause a test to fail, a guru may have to check the output to see if the change in output is a bug, or an intended new feature. This can save a lot of time.
Realize that manual checking of program output is tedious and error prone (in addition to being slow, as mentioned above). Your guru will make many mistakes, missing many errors, until they quit in frustration. -- JeffGrigg
Your guru can't be trusted to identify correct output manually, but can be trusted to write a test that blesses that output? Perhaps you should RefactorYourGuru?. -- DanilSuits
Put it this way, you cannot rely on a human (guru included) to do the same thing right every time, but you can rely on the computer to do the same thing the same way every time. So you tell the guru to automate the test and do it right only once, and you get the computer to do the correct checking every time. -- OliverChung
If your guru makes a mistake writing the TestCase initially, at least it is in code so it can be identified and fixed later. Making a mistake in a manual test like skipping a test condition will be impossible to identify at a later date: "Why isn't this piece of functionality working any more? I'm sure I tested it..."
Software is really too complex to manually check; for a guru or anyone else. One also needs to consider the familiarity problem. No matter how thorough someone may be the first time he test a program, he will be less thorough the second time, even less thorough the third time. What would you expect to happen by the time the program is 1 year old, 2 years old, 5 years old, 10 years old? The tester quickly begins to see what he expects to see and will miss "obvious" mistakes.
I have mediocre to negative experience of GuruChecksOutputDiffs?. Granted, it's better than no test (or GuruChecksOutput, which degrades to "untested" when the guru gets bored or complacent), but I think it can be too low-level in some cases.
The context is "source document goes in, changes are made, reformatted document is saved". Extra complexity is added because the guru is composite: a programmer who understands HTML, parsers and regexps plus a technical writer/editor who specifies layout etc. but does not necessarily want to know or care about the HTML level details.
I think the problem is caused by the order of coding:
Sorry, I've used far too many words to describe a vague problem. -- MatthewAstley
I think completely automated testing will succumb to the CaptchaProblem? in some (probably not most, certainly not all) situations. The solution is to "phrase" the output of a test in such a way that the property being testing for is as explicit as possible. In the case of a flickery border between two 3d objects, one tries to reduce it to a single boundary, with no other visual features. In the case of html formatting, to a rendering of the smallest piece of that html which makes sense. One must avoid the temptation to test more than one thing at a time, for the same reason one would avoid testing more than one thing that could possibly break in a single assertion (at least, with out some previous attempt to distinguish such cases... null pointer handling vs instanceof checks in java come to mind).
A possible rule of thumb would be that if a test case couldn't be mechanically updated as one updates the code, or hand-updated before one updates the code, then a completely automated test may not be practical. Even so, one should try quite hard to stay completely automated; failing that, to automate as much of the process as possible, and then limit the extent of the resulting captcha as much as possible.