Yaxin Wang

Proud to be a ThoughtWorker


Choice of messaging:


Export database as sql script or as domain objects represented by xml? The benefit of sql script is simplicity, there is no difference between exporting this table and that table. The problem of sql script is that it does not have a meaningful business view, hard to add more content as script since the reference entity may be damaged(the domain objects graph) Serialize domain objects to xml? How to handle Hashtable or List kind of structure?


Process Only Classes

Basically the challenge is that we have to create objects which do not exist in real life in object oriented programming. Some times we need to create factory object which can produce other objects, some times we need to create register object which can record activities of other objects, more important is that lots of tasks we need to accomplish in our system is natually a process/procedure which need the orchestration of multiple objects. We need to encapsulate this process somewhere (mostly in XXXManager, XXXController, XXXOperator kind of objects) because this process will be reused and need to be tested seperatly.

For example to squeeze an orange, we can have an Orange object which just represent the natural orange, it has color, taste and other properties, it can be squeezed, sliced, fried, but we do not want the Orange object to know or couple to the squeeze/slice/fry process. To squeeze the orange is naturally an process in which we need to find/create a Squeezer, find/create an Orange and send the squeeze message to the Squeezer by call it's Squeeze method.

[Code] public Juice ProduceOrangeJuice?() {

    Squeezer squeezer = new Squeezer();
    IFruit orange = new Orange();
    return squeezer.Squezze(orange);
} [/Code]

We may want to create a JuiceProducer? object to encapsulate this process if we need to produce various kind of juice in our system.

[Code] public class JuiceProducer? {

    public static Juice ProduceAppleJuice?() {...}
    public static Juice ProduceOrangeJuice?() {...}
    public static Juice ProduceStrawberryJuice?() {...}
} [/Code]

The JuiceProducer? is not a natural object, it is created to encapsulate the juice producing process so that we can reuse this process instead of writting duplicated code.

There is a danger of overusing the JuiceProducer? kind of objects to encapsulate processes. It is not easy to create objects which has the right properties and behaviors to naturally represent the business domain and accomplishes the tasks. People new to object oriented programming tend to create lots of static classes which are actually function libraries instead of create the right objects.

We should try to minimize the numer of process only classes which contain those static methods. Actually most processes naturally fit into the behavior/method of the right object. Like the Squeeze method in the Squeezer contains the process of squeezing which may call methods in the Fruit object in a specifc sequence and may need the help of other objects too.

I mentioned the word "Natural" a lot. The object oriented programming has great power because it naturally represents the world around us, it is much easier to understand things which are natural, the complexity of a large system is significantly reduced if it can be represented naturally and easily understandable. You can not create/change/enhance a system unless you understand it.

A famous ThoughtWorker interviewed me when I tried to join ThoughtWorks, he said that he did not like any objects like XXXManager or XXXController because they are unnatural and not well object oriented. He think that objects in programming can be more powerful than objects in real life which can contain additional behaviors which are not possible in real life objects. The number of process only objects can be minimized if the regular objects can be more powerful. I still do not fully understand what he said. In my view I would rather keep regular objects natural and put those unnatural behaviors in additional objects than make regular objects less natural by putting unnatural behaviors there.

The current system contains a lot process only objects which contain a lot static methods. One problem of static methods is that they are hard to test. If ClassA.Method_1() calls ClassA.Method_2() and ClassB.Method_1() and we want to test ClassA.Method_1() only, we may need to mock out the ClassA.Method_2() and ClassB.Method_1(), we can mock out the methods contained in the same class by extending the class and override the methods that we want to mock, we can mock out methods contained in other classes by pass the mock instance into the caller, we can not do either of this for static methods. We can solve this problem by convert the classes which only contain the static methods into Singleton, the behavior of the system is not changed but the testability will be improved a lot.


Run time code generation

We can use System.Reflection.Emit to generate IL code directly, or we can generate C# code and use the System.CodeDom?.Compiler and Miscrosoft.CSharp to compile the generated c# code into IL code.

Delegate or Interface

When using delegate we are actually referencing the object being delegated in a less explicit way, the use of interface makes this reference more explicit. With the use of events we may actually reference a lot objects but forget to remove the reference later which results in memory leaks!

Duplicated Code in Unit Test

We need redirections to remove the duplications, redirections make code less explicit. In unit test code I prefer explicity!

High Cohesion, Low Coupling and Be Explicit

 High cohesion means that an object should concentrate on a single task, this is always a good practice in OOP.
 Low coupling means that while an object may need other objects' help to complete it's job, it should not depend on other objects' implementation.
 Low coupling is not always good in OOP because the interaction among objects can become so implicit which makes the code really hard to understand.

MartinFowler explains why we want to be explicit in his paper "http://www.martinfowler.com/ieeeSoftware/explicit.pdf"

The following are six ways that objects interact with each other ordered from the most explicit to the least explicit.

Bad: The caller object is coupled to the callee object's constructor.
Bad: Really hard to inject MockObject.
Good: We know which callee object is used in compile time.
Good: The usage of the callee object in the caller object is explicit.

Bad: The caller object is coupled to the callee object's constructor.
Bad: Need to provide hook for MockObject.
Bad: The usage of the callee object in the caller object becomes less explicit.
Good: We know which callee object is used in compile time.

Bad: We don't know which callee object is actually used.
Good: Be able to inject the MockObject.
Good: The usage of the callee object in the caller object is explicit.
Good: The caller object is not coupled to the callee object's constructor.

Bad: The creation of objects is managed by the container, it can be error-prone and sometimes it is hard to figure out which object is actually
injected in the runtime.
Bad: The callee object is injected as instance variable, it's usage in the caller object is less explicit.
Good: Can inject the MockObject
Good: The caller object's dependency is satisfied when it is created, the caller object is fully functional once it is created
Good: The caller object is not coupled to the callee object's constructor.

Bad: Same container problem as Constructor Injection, there is a distance between the type registration and it's actual usage.
Bad: The caller object is not fully functional until the dependent objects are injected.
Bad: The callee object is injected as instance variable, it's usage in the caller object is less explicit.
Good: Can inject the MockObject
Good: The caller object is not coupled to the callee object's constructor.

Bad: Produces the least explicit code if used improperly. In the runtime objects subscribe to and unsubscribe from events at different
locations and different time based on different scenarios. It can be extremly difficult to figure out what is going on by looking at
the code which represents the compile time structure, you have to debug it again and again in different scenarios to figure out how
the system works.
Good: Very low coupling among objects.

Waste of Resource in Agile - Continues testing by QA

Yes, it is a waste of time for QA to test features that maybe changed or removed in the future. This waste of resource is inevitable and recognized in the agile process. Even though QA spend much more time on the testing (throughout the development lifecycle and beyond) and some of the time is wasted, the whole team benefits tremendously from QA's effort, the developers get feedback much faster and the code quality is maintained, the business requirements are always satisfied and we have a shorter time to market rather than wait several months for QA to test the code after the development is done. Just like driving a car with a V6 4L engine, I know that I am going to waste more gas, but I will have a faster acceleration/speed and will probably reach the destination earlier!

Waste of Resource in Agile - Pair Programming

Obviously that pair programming is a waste of resource if things are so simple which can be easily handled by a single developer. But the problem is that we really do not know which part is simple and which part is hard. In many cases we think that the code should be simple enough to be done by a single developer but find out we can produce much better code by pair programming. Writing working code is easy which does not require pair programming in many cases, but writing working code with high testability, high maintainability, high cohesion and low coupling is very hard! From time to time I was surprised that how pairing programming can produce high quality elegant code even from seemingly simple codes. Let's assume that 50% of the seemingly simple codes are truly simple which does not need pair programming and the other 50% seemingly simple codes can actually be improved a lot by pair programming, I would rather waste some time pair programming on those truly simple codes than missing the opportunity of improving the quality of those seemingly simple but actually improvable codes. Waste is inevitable in pair programming because of our inability to predicate which part is truly simple and which part is only seemingly simple, but it is worthwhile to pair in all scenarios so that we won't miss any opportunity to make our code better! And another side effect for not pair programming on simple codes is that developers who are new to pair programming have a tendency to under-estimate the power of pair programming and always assume that the code is simple enough to be done by a single developer.

Write Document in Agile

Doing agile does not mean that we should not write document. We do emphasize more on face to face communication than written document because face to face communication is faster and is always up to date and preparing the document usually takes too much time and not being used after it is done because the document just does not conform to the reality. But there are many things which are hard to be expressed by spoken language or to be memorized which the written document/table/diagram. In agile, when we create documents it means that we really need those douments and those documents will be used by the team in the project. Just like every line of code should be used and be covered by test, every document we created in agile has a purpose and will actually be used.

Acceptance Testing in Agile

Do we still need testers in agile since the developers can do unit testing and acceptance testing? There is nothing that testers can do which developers can not do. Actually, a good developer can work as a project manager, a business analyst, a tester or even a mechanic if he wants to and has the determination to do it. The developers write code because they are trained to write code, the developers write unit test code because it is very important and only the developers can do it, not the testers. The developers may do some acceptance testing to verify their code works fine, but they should not do thorough acceptance testing because even though the acceptance testing is super important, it does not require coding skills and testers are specially trained to do it and can do it much better than developers. It's just like I can repair my car by myself but I prefer to let a mechanic do it, because he is trained to do it and he can repair the car better and faster than me.

The tester's role is changed in agile mainly because we require automated acceptance testing. The iterations from 1,2,3,to n-1 must be tested again when the iteration n is completed to make sure that iteration n does not break any previous iterations. This continuous acceptance testing is impossible to be done manually and has to be automated. It means that the traditional testers who do the manual testing by written test plan have to learn how to use automated testing tools and how to write test scripts. It is really hard for traditional testers and actually we had really strong resistance from client testers in the beginning, and eventually they love it because the automated testing did save them much time in the long run.

One thing I am not clear on is how to maintain the acceptance test scripts. Unlike the unit testing code, which can be compiled and run multiple times a day in CruiseControl, the acceptance test scripts cannot be compiled, are hard to refactor, and are too detailed and too slow to run in CruiseControl. It is OK to maintain 1 million lines of production code with 5 millions lines of unit testing code, but I cannot imagine how to maintain 1 million lines of acceptance test scripts. Actually the small UI changes are causing a lot of headache for us to update the existing test scripts even though we have only a few test plans.

I really doubt whether the testers can do effective acceptance testing just by oral communication with the business analysts or customers without a detailed written specification. Information lost or misunderstanding is inevitable in oral communication and it can be partially compensated in development phase because the developers have to get the information in order to complete the story card. But there is nothing (no mechanism in agile) to push the testers to get the complete information they need to test the application. If the acceptance testing is based on the tester's understanding of the system, how do we know the tester really understands the system even though the tester thinks that he understands the system? The testers can only test the application superficially, based on their superficial understanding of the system. With TDD and continuous integration, we can make the application 99% correct just by face to face communication with BAs and customers; we need the extra rigor of written document and tester's patience and thoroughness to make it 99.99% correct. The reason we hate written written document is because it takes too much time and it is not always correct. But by start with a simple story card and evolve it into a rigorous document (or test script, but test script may not be expressive enough) we have the time of the whole iteration to enhance the story card into the specification and it is always correct with the reality and reflects the intention of our development. The testers can do thorough testing of the latest completed iteration based on the newly completed specification and this written document will be maintained/synchronized with code just like the acceptance testing script has to be maintained/synchronized with code.

Can the BA just writes simple story cards but must include very detailed success criteria which the testers can use to test the story card? After all it is BA who knows best what the customer wants and understands the business logic and they are the final gate keeper.

I feel the hardest thing in agile is the communication. We encountered problems, including: The customers do not really know what they want and give us wrong requirements. The customers assume developers understand what they want and do not actively check/verify what has been done in the last iteration. The BAs do not understand what the customer wants. The BAs assume the developers know what the customer wants. The developers do not actively pursue information from the BAs or the customers. The developers assume they know what the customers want just by story card. The developers have misunderstanding/confusion about what the customers want between themselves. The testers do not actively pursue information from the BAs or the customers. The team has the habit to rely on written document and people just do not like talking. Lots of confusion in the team and sometimes I feel it is chaos instead of agile.

Put DB Updates Unit Test Code in Transaction

And rollback the transaction after the test is done. It is a fast and easy way to keep the Database state unchanged.

Good Document or No Document

It is great to have a good document which is accurate, complete and updated. It is also good to have no document at all so that people have to ask/talk/communicate all the time so that the information is always in the brains of the whole team which is always updated. If the information is so complicated that we are not able to remember it, then we can create/improve our unit test code, test script, production code to let the code tell us what is going on. After all, only the code is true/real, all the others can be false, and the code can be compiled, tested. That is why we prefer to either let the Story Card have title only without any content so that the developers and QAs have to ask the questions all the time, or we start with a simple Story Card and constantly update it with more details as we discover them and finally evolve the Story Card into a specification which the whole team can reference from time to time. It is very bad for the team members to take a simplified Story Card as spec and do only what is in the Story Card without asking/exploring further deeper questions.

Communication

Whether it is WaterFall or Agile, document or conversation, we need clear communications in order to make sure that we are doing the right thing. Well written document is very welcomed but it takes too much time to create and hard to maintain. In agile we rely more on face to face communication. But what if people do not like to talk to each other, the developers, BAs, QAs do not want to talk to each other, it is their culture, it is their personality, how do we implement agile in this environment?

Conquer the Complexity

When the lines of code increases, the complexity of the system is increased and the system becomes hard to understand and maintain because you do not know how it will affect the working system by what you are going to change. Developers may choose to add tons of duplicated code just to add a small new feature because they do not want to change the existing working code, they are afraid of breaking the working system, this only makes the existing complex system even more complex. This problem can be partially solved by good Object Oriented design, modular approach and layered approach. But the same problem still exists that the change in one module may break other modules and the module itself may be hard to understand.

What we need is a mechanism that can tell us what we are doing is right or wrong immediately so that we are not afraid of making changes, we can either resolve the problems or rollback our changes, no damage will be caused to the working system. We can make changes even though we are not fully understand the system since this mechanism can tell us what we are doing is right or wrong.And we can even try some refactoring to simplify the system knowing that our refactoring won't break anything.

This mechanism is a HIGH TEST COVERAGE!

Do not be afraid of making changes, do not be afraid of to simplify code and improve the design and implementation by refactoring, our overall high test coverage will ensure that what we have done is correct!

Do the right thing and Keep the right thing from going wrong! If a test is executed for 1000 times, only the first execution serves the purpose of doing the right thing, the other 999 times of execution serves the purpose of keep the right thing from going wrong, which is more important.

Refactoring and Unit Testing: In small scale, unit testing actually impedes the refactoring because you not only have to refactor the production code, you also have to refactor the unit testing code, but in the large scale the high test coverage (the mini integration testing, functional testing) ensures that our refactoring does not break the system. I would rather write extra code than having the risk of breaking the system. The confidence in refactoring is the corner stone of tackle the complexity, otherwise we won't dare to touch the working code and ended up scatter the duplicated code all over the system, which causes even more complexity and make the maintenance a nightmare!

The Iterative Development is super important in tackling the complexity, the overall system may seem to be overwhelming, by using the iterative approach we can attack the problem one small piece a time, but the crucial thing is that although we may move slow, we are moving ahead and won't go back! It means that our Iteration N will not break any working code from Iteration 0 to Iteration N-1. And it is guaranteed by the high test coverage. Once the Iteration N is done, we run all the tests from Iteration 0 to Iteration N and make sure everything is still working!

The cost of writing unit tests is a sequential value compared to the cost of writing production code; suppose that we spend one hour on unit testing for each hour spent on production coding, the relation of the cost will be 2x. But the cost of fighting broken and messed codes (back and forth so many times) and finding hidden bugs which are caused by not having Unit Tests is an exponential value. It maybe a waste of time to write unit tests for a simplest project which has only one class and takes only one hour to code it, but it will only cost 1 million hours to write the unit tests for a project which has 10,000 classes and takes 1 million hours to write the production code instead of putting 10 million hours and ended up project failure.

Why Unit Testing

Why TDD


Half of Yaxin and Xiaoyan.

Xiaoyan did her thesis defense on Oct 6th 2004, Xiaoyan is a Ph.D of Pharmacology now!

Yaxin and Xiaoyan got married on Oct 7th 2004!

Yaxin became a ThoughtWorker on Nov 1st 2004!

Xiaoyan got offer from http://www.scripps.edu Scripps Institute on Jan 31th 2005!

04/01/2005 - Travel to San Diego

Hilton Head????


CategoryHomePage


EditText of this page (last edited August 29, 2011) or FindPage with title or text search