Unit Tests And Databases

moved from ComplexDataSetupForAutomatedTests:

I'm currently working on a project developing a large information system. The biggest part of the effort in writing automated UnitTests and/or AcceptanceTests goes into setting up data in the (relational) DBMS, because the system has a relatively large number of highly interdependent domain objects (business objects).

Using SQL scripts to load data before executing tests and deleting data after that is very tedious, especially when the database schema changes. Setting up and cleaning up data from test code is somewhat better, but still tedious - maybe we need to think about some sort of test data generator...

I was wondering whether anybody has experience with a similar situation and how you dealt with it...

-- OliverKamps

ShuntPattern feels your pain. Specifically, you use DependencyInversionPrinciple, a.k.a. AdapterPattern, to wrap your database with a simple persistence layer. Then at run-time you plug into the adapter a real database layer, but at test time you plug in a quickie Spoof layer that returns the fake biz objects you want the tests to use. This procedure is very common in that subset of the programming industry that tests its own code, and it's half the motivation to provide a persistence layer at all. -- PhlIp
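To make the suggestion concrete, here is a minimal Java sketch of the adapter idea. All the names (Person, PersonStore, SpoofPersonStore) are hypothetical; the point is only that the business logic depends on the interface, never on the database directly.

 import java.util.HashMap;
 import java.util.Map;

 // Hypothetical domain object, kept trivial for the sketch.
 class Person {
     final String name;
     Person(String name) { this.name = name; }
 }

 // The application code depends only on this interface.
 interface PersonStore {
     Person findByName(String name);
     void save(Person person);
 }

 // Test-time "spoof" layer: returns canned objects from memory,
 // so UnitTests never touch the real database.
 class SpoofPersonStore implements PersonStore {
     private final Map<String, Person> people = new HashMap<>();
     public Person findByName(String name) { return people.get(name); }
     public void save(Person person) { people.put(person.name, person); }
 }

At run time, a JDBC-backed implementation of PersonStore would be plugged in instead; the tests and the production code then exercise exactly the same business logic.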

That's definitely an interesting approach. I'll have to think about it some more, since a big part of the complexity in our system is persistence-related: relationships between domain objects are complex, we have to provide history for most of the domain objects, etc. The tricky part is to retrieve and save the right versions of the objects. I'm not yet sure whether it's good enough to prove that the persistence layer works and that the application logic layer works with the fake data I'm passing to it. For AcceptanceTests, I might have to prove that the application logic layer works correctly with the persistence layer. -- OliverKamps


I've hit this problem a number of times, and figure that the only solution is to buckle down and write some code to populate (and clean) the tables with test data. Maybe next time I'll use XML to represent the test cases. -- JeffGrigg

You might want to look at JXUnit at http://jxunit.sourceforge.net/ then. When I first checked it out I was a bit confused, but it really is a good tool for setting up your JUnit tests with the data in XML files instead of hard-coding it or reading it in some other way. One of the nicest parts is the ability to pass data from one test case to the next. -- StevenNewton

Next time came around (quite some time later). I did not write and clean trees of data in the real database. I did MockObject. (MockObjectsInVb) It turned out to be easier and more effective than I had expected. -- JeffGrigg

I'm very much against using mocks if we're talking about mocking the connection object. More here: http://aperiplus.sourceforge.net/testing-data-access-classes.php. A simple string comparison (a mock expectation) cannot test the logic in an SQL string. Even if the SQL is OK, we still don't know whether it will have the desired effect when it finally collides with the real schema. -- NoelDarlow


Solution: Automated, repeatable unit testing of your JDBC layer

Like everyone else here, I've found this to be a fairly regular issue. In fact, I've taken pains to solve it and have written a blog article describing how to get it going. It's not something that could replace integration testing against a database, but it will allow your rapid-fire unit tests to proceed with a known solution.

View it here: http://www.bluedevel.com/blog/?p=12 - Jason Koch <jasonk@bluedevel.com>


There are two things that need to be tested - access to the database, and the business logic. You can simplify testing the business logic by using FakeTheSideEffects. By allowing your UnitTest to 'inject' query results and to intercept updates, you can test the business logic without going through the rigmarole of prepopulating tables and examining them later.
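A minimal Java sketch of what 'injecting query results and intercepting updates' might look like. The gateway interface and its methods are hypothetical; the essential point is that the fake records every update so the test can assert on it afterwards.

 import java.util.ArrayList;
 import java.util.List;

 // Hypothetical gateway the business logic talks to.
 interface OrderGateway {
     List<String> findOverdueOrderIds();
     void markCancelled(String orderId);
 }

 // Fake used by the UnitTest: query results are injected up front,
 // and updates are merely recorded instead of touching any table.
 class FakeOrderGateway implements OrderGateway {
     final List<String> overdueIds = new ArrayList<>();  // injected by the test
     final List<String> cancelled = new ArrayList<>();   // intercepted updates

     public List<String> findOverdueOrderIds() { return overdueIds; }
     public void markCancelled(String orderId) { cancelled.add(orderId); }
 }

A test populates overdueIds, runs the business logic against the fake, and then asserts that cancelled contains exactly the expected ids - no table setup or inspection required.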

For testing the database access code itself, there's no substitute for having a volatile ('scratch') copy of your database that the UnitTest is free to destroy.


Write a simulator, and run it to populate your DB. See SimulateTheIndustry.


moved and copied from UnitTests:

When using a relational DB for persistence, an under-used but universally supported UnitTest aid is the database's own RelationalIntegrity checks. -- MarkSwanson
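One way to read this advice: let the constraints themselves act as assertions. A sketch follows, using the in-memory HypersonicSql database mentioned later on this page (the hsqldb jar must be on the classpath; table names and JDBC URL are made up). It checks that a foreign key rejects an orphan row.

 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.SQLException;
 import java.sql.Statement;

 public class ReferentialIntegrityTest {
     public static void main(String[] args) throws Exception {
         try (Connection conn = DriverManager.getConnection("jdbc:hsqldb:mem:ritest", "sa", "");
              Statement stmt = conn.createStatement()) {
             stmt.execute("CREATE TABLE customer (id INT PRIMARY KEY)");
             stmt.execute("CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT, "
                        + "FOREIGN KEY (customer_id) REFERENCES customer(id))");
             try {
                 // No customer 42 exists, so the database itself must object.
                 stmt.execute("INSERT INTO orders VALUES (1, 42)");
                 System.out.println("FAIL: orphan row was accepted");
             } catch (SQLException expected) {
                 System.out.println("PASS: integrity constraint fired");
             }
         }
     }
 }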


...and I have a question about the ChryslerComprehensiveCompensation project. The tests run in 10 minutes. Surely you don't have data imports or exports. But if you have to test all the cases against a database, how can you finish in such a short time? ... -- SebastianPetzelberger

We have data imports and exports. We have a few tests that go to the database, and quite a few more that test file reads and writes. Most of these tests we run against internal streams that do no I/O. Then it is sufficient to have just a few tests that actually exercise the gateway to the database or the connection to the file system. -- RonJeffries
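A small Java sketch of the internal-streams idea. The import code is written against Reader rather than a file, so a test can feed it an in-memory StringReader and never touch the file system. The class name and record format here are invented for illustration.

 import java.io.BufferedReader;
 import java.io.Reader;
 import java.io.StringReader;
 import java.util.ArrayList;
 import java.util.List;

 class PayRecordImporter {
     // Parses one record per line; the first comma-separated field is the name.
     List<String> readNames(Reader source) throws Exception {
         List<String> names = new ArrayList<>();
         BufferedReader in = new BufferedReader(source);
         for (String line; (line = in.readLine()) != null; ) {
             names.add(line.split(",")[0]);
         }
         return names;
     }

     public static void main(String[] args) throws Exception {
         // The test supplies an in-memory stream - no file I/O at all.
         List<String> names = new PayRecordImporter()
                 .readNames(new StringReader("fred,100\nwilma,200\n"));
         System.out.println(names);   // [fred, wilma]
     }
 }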


Question: What's the best way of UnitTesting Oracle Stored Procedures and related database table constraints?

I have done some crude UnitTests in the Oracle world, but they were a bit draconian:

  1. The basic method was to run the test, redirect its output to a file, and compare that with the expected output. Inside the test, I call a procedure, then SELECT * FROM table; for every table that mattered, ensuring the procedure had the right side effects. The test has several procedure-call and SELECT cycles in it.
  2. Because the output includes the system date, I had to use a sed script to change the date to a fixed word prior to the diff.
  3. The first step in the test is to DELETE all the data from the tables that mattered. That is quite draconian when other developers share the database. I also had to DROP and re-CREATE the SEQUENCEs to make them restart at 1.

Deleting everybody's data is draconian in a shared environment. It comes from DoTheSimplestThingThatCouldPossiblyWork.

From my own thinking (ExtremeProgrammingChallengeThirteenPointFive doesn't add much), I have a few unappealing options:

  1. Ask for my own environment
  2. Continue and hope that there aren't too many active developers trying to do stuff when I want to UnitTest
  3. Invest time and energy to see how I can normalize the output without restarting the sequences
  4. Invest time and energy to see how I can suppress the output of other people's data in the database
  5. Invest time and energy to see how I can restore the database to what it was before the test (there are complications with ROLLBACK).

At least the UnitTest is fully automatic.

Any ideas on this one?

-- NickBishop


Question: What's the best way of UnitTesting Oracle Stored Procedures and related database table constraints?

Have you looked at utPLSQL (http://utplsql.sourceforge.net)? This is an open source UnitTesting framework for PL/SQL (hence the mouthful of a name) developed by StevenFeuerstein. I think it'll do what you want.

As for deleting other people's data, I would recommend each developer having their own schema in which to play during development.

In my own project, I have set up a cron job to do the following each night:

-- ChrisRimmer


For years I had problems with the above-mentioned 'populating tables with relevant data'. It's much easier with a generator. Try http://tdgteam.com

-- Igor Jese


My team is just now thinking about how to create UnitTests for the vast quantity of SQL queries in our system. This same question came up in our discussion. We decided that for INSERT/UPDATE/DELETE queries, we would issue BEGIN TRAN and ROLLBACK TRAN before and after each test in order to ensure that the tables remained unchanged after each test.

We haven't done this yet, so maybe you can save us some heartache. You mentioned that "there are complications with ROLLBACK", what were you referring to?

-- KenMcKelvey

I'm testing INSERT/UPDATE/DELETE with transactions and rollback. (November 2001) Works for me. I have more trouble with RI than transactions: You have to make sure the related records are there -- and records related to those records, etc... -- JeffGrigg
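For what it's worth, the begin/rollback pattern usually looks something like the sketch below (JUnit 3 style; the connection details are made up). One well-known complication, possibly the one NickBishop hinted at above: Oracle commits implicitly whenever DDL runs, so things like DROP/CREATE SEQUENCE cannot be rolled back.

 import java.sql.Connection;
 import java.sql.DriverManager;
 import junit.framework.TestCase;

 public abstract class RollbackTestCase extends TestCase {
     protected Connection conn;

     protected void setUp() throws Exception {
         conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@devbox:1521:dev", "scott", "tiger");
         conn.setAutoCommit(false);   // nothing commits unless we say so
     }

     protected void tearDown() throws Exception {
         conn.rollback();             // every change the test made disappears
         conn.close();
     }
 }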


Does everyone have his or her own Oracle account? That's how we did it in our XpLeiden project. There's a file schema.sql in our CVS repository, and the UnitTests automatically delete whatever is in your schema and then recreate it with schema.sql. Individual tests insert a few records, perform a query, compare the results with the expected output, and then remove all records again. -- MartijnMeijering


I'm not sure how relevant or useful this is, but here is what we do at Memetrics. We use CommonLisp, and have a macro which defines an object-relational mapping between Lisp objects and relational tables. Now, because of the object model, most objects are "owned" by a long chain of objects, rooted at the class/table Customer. Most of our tests look like this:

 (deftest :package "Some test"
   (with-test-customer-tree (customer space experiment x-def1 ...)
     (body of test)))
This creates new objects customer, space, experiment, x-def1, etc., which are used and tested by the body, and then removed unconditionally as part of the cleanup of the WITH-TEST-CUSTOMER-TREE form.

As another strategy, when we only want to test the semantics/methods of some class Foo, regardless of its database ownership relations to other classes (which are presumably tested in some other test), we do something like:

 (deftest :package "Test Foo Class"
   (without-database ()
     (let ((the-foo (make-instance 'Foo :somearg 3 :etc etc)))
       (run some tests on the-foo here))))
The WITHOUT-DATABASE form lets us create an instance of Foo that is not present in the DB (otherwise there would be constraint violations, since we did not create the-foo's owner, etc.).

Thus we try to test the DB semantics (ownership, constraints, creation, deletion, etc.) separately from what those objects *do* once they've been instantiated.

Anyway, I'm not sure how useful this is outside of an object<->relational setting, so feel free to DeleteMe.

-- AlainPicard


I've recently been throwing together a really quick, read-only ObjectRelationalMapping layer (Java + Oracle), and I ended up using MockObjects to simulate most of the actual JDBC objects: Connection, Statement, ResultSet, etc. Not that I wanted to do it. But I was starting to have bugs involving how my ORM layer was actually talking to Oracle, and this seemed like the best way to CaptureBugsWithTests.

For example: I was having problems with Statement leaks - forgetting to close them when I was done - so I wrote hooks into MockStatement that allowed me to check whether it had been closed. Then in my test class, I wrote a tearDown method that verified that there were no open statements. -- FrancisHwang
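A rough sketch of that leak-detection hook, reduced to its essentials. A real MockStatement would implement java.sql.Statement; these skeletal classes are hypothetical and show only the close-tracking part.

 import java.util.ArrayList;
 import java.util.List;

 class MockStatement {
     private boolean closed = false;
     void close() { closed = true; }
     boolean isClosed() { return closed; }
 }

 class MockConnection {
     private final List<MockStatement> statements = new ArrayList<>();

     MockStatement createStatement() {
         MockStatement stmt = new MockStatement();
         statements.add(stmt);   // remember every statement handed out
         return stmt;
     }

     // Called from the test's tearDown() to catch Statement leaks.
     void verifyAllStatementsClosed() {
         for (MockStatement stmt : statements) {
             if (!stmt.isClosed()) {
                 throw new AssertionError("Statement leak: statement was never closed");
             }
         }
     }
 }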

With UnitTestsAndDatabases, I typically follow two rules.

  1. Do not let UnitTests commit to the database. Writing code to clean up after yourself, or refreshing the db, quickly becomes a pain. If you do find that you need to commit, then you are probably writing an AcceptanceTest, and that is a different story. If you use TransactionIsolation to your advantage, it is easy to isolate your UnitTests and ensure nothing is ever persisted. The easiest way to do this is to start a transaction for each TestCase and roll it back each time.
  2. Do not let UnitTests depend on any values in the database. During the running of each test, the testcase can populate the database if needed and even query for this data, but this data should not exist before or after each test. This depends on having a robust, reliable O/R layer: it's easier to create an object and persist it with code than it is to write the SQL to do the same.

-- MikeRettig


Moved from UnitTest

Has anyone got a TestingFramework or experience in testing the correctness of DataConversion programs (lots of SQL)? -- TomCrossland


I work on a "shrink-wrapped" system that is database-centric. The fact that the program is designed to run at remote sites on customers' own databases might make our environment a little different from a typical IT shop, where you have a single database and don't necessarily have to wrap up schema updates in a repeatable package. In our case, the whole schema update process is automated so that our program checks a database's schema version, then runs all the relevant schema updates in sequence to bring it up to current schema. This approach is essential; the database has to be largely self-managing in the field.

We leverage this feature in our unit tests. We keep a pool of test databases, each in its original state and schema version. Some of these were sent in by customers in the course of diagnosing bugs in the field, others were generated at some point as tests by hand or by throwaway programs. Keeping these databases in their original versions (or at least in versions no older than the last shipped version of the product) protects us against instability in the schema updates during development. If a schema update turns out to have problems, we haven't damaged our original reference copies of these databases.

When a unit test needs such a database, several things happen:

Part of the motivation for this elaborate process is that the schema updates can take a long time. (We have about six or seven years' worth of them now, I think.) A separate unit test program exercises schema updates specifically, so we don't need to repeat this testing in every other unit test.

We're just using a file-based DBMS at the moment, so this was relatively easy to set up. However, I would think that this could be adapted to a client/server DBMS, perhaps using backup/restore facilities for making the copies. Of course, for a very large database, the time overhead could become onerous, but I don't see how to avoid that kind of overhead.

The whole technique hinges on the fact that our schema updates are automated and repeatable. The check against the schema update binary date means that this stuff works correctly even if the developer is making changes in that part of the system.

-- DanMuller
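Since this is a file-based DBMS, the 'checkout' step presumably reduces to a file copy. A hypothetical sketch (all paths invented): the pristine reference copy is never touched, and each run gets a scratch copy it is free to destroy.

 import java.nio.file.Files;
 import java.nio.file.Path;
 import java.nio.file.StandardCopyOption;

 class TestDatabasePool {
     // Copies a reference database to a scratch location for one test run.
     static Path checkout(String name) throws Exception {
         Path reference = Path.of("testdata", "reference", name);
         Path scratch = Path.of("testdata", "scratch", name);
         Files.createDirectories(scratch.getParent());
         Files.copy(reference, scratch, StandardCopyOption.REPLACE_EXISTING);
         return scratch;   // the automated schema updates then run against this copy
     }
 }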


Using SQL scripts to load data before executing tests and deleting data after that is very tedious

Try pushing the set up and tear down SQL scripts into database packages or code, then calling them from the unit test. I have found that it is useful to have common Initialize() and Terminate() functions called before and after the unit tests run. In our Initialize(), we first delete the old data and then create the new. We found this useful, because it is often helpful to look at the state of the database, especially after a failed test. (We have a parallel development database; if one has to run out of the production database, it may be necessary to do the clean up immediately following the test). I have also found that once the set up and tear down utilities are available in test code, one tends to find uses for them in the production code as well.
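A sketch of how the test-harness side of this might look in Java, assuming the set-up and tear-down logic lives in a database package. The package and procedure names (test_data.initialize / test_data.terminate) are hypothetical.

 import java.sql.CallableStatement;
 import java.sql.Connection;

 class TestDataHarness {
     private final Connection conn;

     TestDataHarness(Connection conn) { this.conn = conn; }

     // Server-side set-up: deletes stale test data, then creates a fresh set,
     // so the database state is still inspectable after a failed test.
     void initialize() throws Exception {
         try (CallableStatement cs = conn.prepareCall("{call test_data.initialize()}")) {
             cs.execute();
         }
     }

     // Server-side tear-down, called after the unit tests have run.
     void terminate() throws Exception {
         try (CallableStatement cs = conn.prepareCall("{call test_data.terminate()}")) {
             cs.execute();
         }
     }
 }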


I've used various methods to write unit tests against a DB. The basic challenge is to create a clean sandbox for each test.

1. Begin a transaction in setUp() and roll back in tearDown(). A very effective and simple technique. In your unit test you have free rein to insert, update, and delete at will, but in the end your transaction is rolled back, so the changes magically disappear. It can be performance-intensive.

2. Mock the DB layer - I found that when I started mocking the DB layer, I was essentially writing my own JDBC driver or adding a complex abstraction layer that took a lot of time to maintain. Rather than spend time with this, I prefer to use a lightweight DB.

3. Lightweight DB - HypersonicSql works perfectly as a unit test database (a sketch follows below). It can be run in memory, or with simple flat files. I prefer to run it in memory. The tests are lightning fast, and you still get the warm fuzzy feeling of testing against a real database. In .NET, I use the SqlServer desktop edition that ships with the SDK. If you are running in memory, then you don't have a persistent schema, so tests need to create the schema on the fly or load it with each run.

4. Deletion of business objects by the O/R layer - With an advanced O/R layer, it should be possible to register DB changes with the mapping layer. At the end of the test run, the changes are rolled back at the object level. This technique is particularly useful for AcceptanceTests: you can't reasonably keep a db transaction open for the duration of an acceptance test, you need real commits, and refreshing the DB before each run can be quite slow. When this works, the results can be quite impressive and give you a lot of confidence in your O/R layer. On the project where we did this, our acceptance tests ran for about an hour, inserting and updating thousands of records in the database. At the end of the test run, the database would be in the exact state it was in when the tests started, including full compliance with referential integrity.

-- MikeRettig
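Here is the sketch for technique 3: an in-memory HypersonicSql (HSQLDB) database, created fresh for each run. The table is of course only an example, and the hsqldb jar is assumed to be on the classpath.

 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.Statement;

 public class InMemoryDbTest {
     public static void main(String[] args) throws Exception {
         // "mem:" means the whole database lives in memory and
         // evaporates when the JVM exits - no cleanup needed.
         try (Connection conn = DriverManager.getConnection("jdbc:hsqldb:mem:unittest", "sa", "");
              Statement stmt = conn.createStatement()) {
             // No persistent schema, so create it on the fly.
             stmt.execute("CREATE TABLE account (id INT PRIMARY KEY, balance INT)");
             stmt.execute("INSERT INTO account VALUES (1, 100)");
             // ... exercise the code under test against conn here ...
         }
     }
 }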


Question:

We develop a product that basically provides a mapping layer to different types of datasources: LDAP directories and RDBMS for the moment.

UnitTesting for us involves defining domain objects and their mapping to a datasource, using that mapping to insert, update and delete objects and then checking that things are as expected by re-reading and searching them in the datasource through the mapping layer. Our unit tests run against different RDBMSs and directories.

We use the same strategy as the XpLeiden project: our unit test setup consists of creating a datasource instance from scratch, including the schema creation. Datasource instances and schemas created by the tests are contextualized using information specific to the client running the test (actually the name of the client machine). This way there is no interference between test runs, and developers can run tests concurrently. It does imply that a global setup is necessary before the test suite can be run; conversely, a global teardown removes the datasource instance. The FakeTheSideEffects strategy doesn't fit for us, because we want to test and validate our framework against different datasource vendors, including connection and authentication on the target datasource, pooling strategies, and so on.

Up to now we have considered that going all the way to the datasource is UnitTesting and not part of AcceptanceTesting (FunctionalTesting). Is this approach too coarse-grained?

-- StephMader
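A tiny sketch of the machine-name contextualization described above (the prefix and naming scheme are invented): each developer machine deterministically gets its own schema name, so concurrent test runs cannot collide.

 import java.net.InetAddress;

 class SchemaNames {
     // e.g. "UNITTEST_DEVBOX7" on a machine called devbox7.
     static String testSchemaName() throws Exception {
         String host = InetAddress.getLocalHost().getHostName();
         return "UNITTEST_" + host.replaceAll("[^A-Za-z0-9]", "_").toUpperCase();
     }
 }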


CategoryTesting

