Interpreters Are For Testing

The development process in some interpreted languages is test-driven by default. That is why interpreters are so valuable. When I am writing a Scheme function, for example:

The first thing I do is write a comment that shows what the function should do. Sometimes there is a particular example, and the rest of the time the description is general but it is easy to come up with particular examples. The whole idea here is that the comment is precise enough that it allows programmers to think of ways to test the function.
Then I try coding the function according to my own description.
Then I fire up the interpreter and see if the function really does behave the way my description specifies. I call the function several times, from the interpreter. If it works, great! I'm done. If not, I have to examine the function and re-code it.

This basically is TestDrivenDesign, and almost CodeUnitTestFirst, just like ExtremeProgramming except for these things:

Tests are not automatic. You have to go to the interpreter and test your functions manually. You also have to be able to look at the return values and recognize that they are correct.
Tests are not artifacts. When you type something at the interpreter, it runs, and then there is only the ghost of it on your screen, and when that scrolls off, there is nothing. However, you can always run the test again, or one like it. The artifact is the comment that describes the function's behavior in sufficient detail that you can write tests.
Any programmer should be able to make a new test for the function. If the tests are all slightly different, that's good, because then the code is more thoroughly exercised.

It's easy to change interpreted testing into ExtremeProgramming. Write functions that recognize correct return values automatically, and return "true" or "false." Embed the tests in the code, so that they become artifacts and run automatically.

However, interpreted testing has one advantage over ExtremeProgramming: Testing of Generalizations.

Suppose you want to write a test artifact that demonstrates that a sort function really does sort its input correctly. Your test should have the ability to generate random items. Then it should generate a list containing a random number of random items. The test should clone that list and sort the cloned list. Then it should loop through the original list and make sure that every item in the original list is in the sorted list. Then it should loop through the sorted list and make sure it is in sorted order. This is the only reasonable way a test artifact can express the idea that any items can be sorted. If the test is hard-coded to sort a specific list, then it only demonstrates that the function sorts that specific list. Programmers will feel the urge to write another hard-coded list into there to make sure that it gets sorted, too. Writing a loop to generate all possible lists and sort them would consume too much time; that's why the random numbers are used. (There also has to be some assurance that no one has "loaded" the random number generator.)

: Random testing creates tests that fail randomly. A sort algorithm might work correctly except in some strange corner case and a random test might have only a small probability of creating that corner case.

If you have an interpreter, you can handle the generation of items yourself. You know that the coder of the function could not have predicted what numbers you are going to ask it to sort, so you know it hasn't been coded to sort your specific numbers. You can run it a few times, or you can run it once on an input that is random enough that it couldn't have possibly been anticipated and random enough that the function probably has to call upon all its resources to solve it -- and determine, by induction and to your satisfaction, that it works.

It is possible, but apparently outside the scope of ExtremeProgramming, to write interactive tests for functions. Anybody could run such a thing and see for himself whether or not a function works. However, such a program might be hard to write in C++ or Java. In an interpreted language, the interpreter basically is such a program.

And so, I think interpreters are being sold wrong. I read of them being "faster than compilers," in the sense that when you type code into an interpreter, you don't have to wait for it to compile. That is the wrong tack because compilers are fast enough today to compile code in a few seconds. Milliseconds versus seconds is no difference in an interactive program. It's when you do something automatically, in a loop, that such time differences matter.

The best way to sell an interpreter is as a general-purpose testing environment. You can test your functions manually right after you code them. InterpretersAreForTesting.

-- EdwardKiser [Mon, Apr 16, 2001, 2:21 PM UTC]

This sounds like one of the things that TowardsEmpiricalComputerScience was trying to say.