Space Station Manager Post Mortem

Note: the PostMortem below is available at http://web.archive.org/web/20041009220859/http://www.mistaril.com/about/post_mortem_ssm.php.

I [PhlIp] couldn't easily find this PostMortem anywhere, so I decided to park it here and link it to XP.

A few years ago, Kai-Peter Backman wrote up a post-mortem of his Space Station Manager which was developed using XP. It is, to my knowledge, the only game ever developed this way. His site seems to be down at the moment, but I have a copy of the text:

Post Mortem: Space Station Manager by KaiPeterBackman, Mistaril 18th of February 2004

This is a post mortem of Space Station Manager, the first downloadable game from www.mistaril.com . The goal for the project was to create a game for entering the downloadable games market with and develop the required e-commerce infrastructure for distribution and marketing over the Internet. A secondary goal was to produce enough profit from the game to sustain continued game development activities. The project met both of these goals.

From a technical standpoint the project was interesting because it employed a slightly modified version of the ExtremeProgramming (XP) software development process. XP is an AgileProcess that strives to maximize the output from the project and minimize the associated effort. It is also very suitable for high-risk projects with a lot of uncertainty in the requirements. The process relies on a small set of best practises that are fundamentally easy, but might require some experience to make sense. In general, the XP process proved to be unsuitable for the project and a more specific conclusion can be found at the end of the PostMortem.

Things that went right

1. Player involvement

Having players involved in the game, right from the first installable builds, was the biggest success during the project. The first tester group consisted of just a few interested players, but having a few more pairs of eyes on the game made a huge difference in productivity and quality. Early controlled playtesting sessions also uncovered a lot of UI improvements and changed many fundamental things about gameplay.

After the public beta release, the number of testers multiplied overnight. Forming a close relationship with actual players of the game helped focus on the important changes. Having a high degree of player involvement also helped build the player community, much to the delight of the players and the developer.

2. Frequent iterative releases before launch

Releasing early and releasing often was successfully used in minimizing risks in the project. Having the first version in the hands of testers in about a month after initial project start helped solve a lot of simple problems early in the development cycle. The quality of the game was kept at a high level from the beginning, which helped in supporting a relatively rapid development schedule.

The weekly releases also formed an important pulse for development. Identified risks requiring a lot of player feedback, like the user interface, could be placed early in the schedule to manage the risk more proactively. A weekly release also helped the project reach closure each week, and the visible progress kept both the developer and player morale high through the project.

3. Supporting the game after release

Working with the game after the release turned out to be a solid business decision. Small changes in the game could radically alter the conversion rate from downloads to sales or lower the refund rate (there is a 60-day, no-questions-asked refund period after you buy the game). Experimenting with these changes was a very small development effort, but measuring the effects of a change could take days or weeks. Consistently improving on these things made the product revenues sustainable much sooner, and improving on both conversion and refund rate created more satisfied players. Outside game quality itself, optimizing the demo for selling probably had the greatest effect on the bottom line.

4. Going full time from the beginning

One of the key successes of the project was working full time from the beginning. There were two types of benefits experienced: external and internal. The external benefits were focus related: having a solid workweek to organize development and business tasks greatly helped increase productivity. There are several layers of context switching when you work on two or more jobs (as all part time independent developers know by heart) and the boost in productivity comes both from an extended number of absolute hours and a higher efficiency of those hours as you are able to minimize the time lost in these switches.

The internal benefits from going full time were mainly improved motivation and job satisfaction. Going full time and cutting out any other commitments helped to focus and run the business more consistently. It was much easier to sustain high levels of professionalism when the immediate and long-term future depended on it. Working full time also helped ease the relationship with players and other developers; it was easier to spare time for building lasting relationships.

5. Investing in solid tools

Full time commitment makes larger investments in tools and development practises more viable. Investing in tools was a particularly good choice in content production: the toolchain consisted of proven tools like Maya, Photoshop and Sound Forge. The initial investment in time and money was higher than with other available tools, but this translated to much greater productivity later in the development cycle.

Things that went wrong

1. Frequent iterative releases after launch

When all you have is a hammer everything looks like a nail. Weekly iterative releases, which worked fantastically during initial development and the public beta period, broke down when the game reached initial release as 1.0. Game regression testing was impossible to automate as most really interesting test cases depended almost purely on human intuition. Small changes sometimes affected play balance in large and unplanned ways, adding weeks of stabilizing effort to get everything back on an even keel. All this greatly destabilized the game after launch and crippled any further attempts at changing things.

2. Too little up-front design

The low level engine libraries benefited greatly from the agility inherent in XP. However, the gameplay code would have needed more design from the beginning. This was painfully evident when player modding was implemented later: something that should have been a breeze turned out to swallow a lot of resources just to keep the game stable. Designing for more flexibility would have made the bigger changes easier in the later stages of pre-launch development, as well as those changes that occurred after launch. The lack of clear up-front design also stifled the codebase, making it more or less unsuitable for upcoming projects.

3. Underestimating e-commerce set-up costs

Setting up the e-commerce site physically was just a two-day job. Finding the correct way to start, in this case using an external payment processor, took a little longer. The real job was all those small things that needed to be done after the initial site was launched. Most of these tasks are one-timers, which means that you only need to set them up once, irrespective of whether you release more games or not. Still, the number of tasks was overwhelming at times. Many times the tasks themselves took just some hours or days worth of effort, but researching and learning to understand the backgrounds and making the right choices took much, much longer.

Here is one key area where one can benefit greatly by asking for advice from fellow business owners. Many times a seasoned developer can shed light on a hard problem and make it essentially a no-brainer. Setting up is basically easy, but to someone inexperienced with selling software over the Internet there are many new things to learn.

4. Communication about project status

Communication about project status with testers and other players worked well before and during the public beta. Estimates for implementation status of features and their schedules were reasonably accurate. Tester feedback was quickly and appropriately responded to, with most testers feeling fairly confident in the status of their feedback. The confidence in handling feedback also helped most testers be frank in their responses, which contributed greatly to improving overall game quality.

After release this communication was no longer as accurate. First, due to time requirements to run the e-commerce structure about half of the development time was no longer available. Much worse, time required to work on the e-commerce tasks was needed irregularly and usually with great haste and no warnings. Second, after launch large changes were no longer easy to incorporate. Valid estimates and estimation techniques during the prelaunch period turned out to be very unreliable after launch.

5. Using a custom 3D engine

Building a custom graphics engine for the game was a double-edged sword. Initially building the graphics engine using an existing in-house GUI library made the start up phase faster and helped get the game to a playable point. As the engine was small and built for one purpose, it was easy to get working and relatively stable. However, as the engine had no previous real world exposure, there were plenty of defects found later on in the cycle. Fixing these later problems with stability and adding new functionality slowly grew the codebase and made the engine grow in complexity. In retrospect it would have been worthwhile to pick an external general-purpose engine instead, and accept the added up-front effort required in adapting it to the game. The other external libraries used on the project turned out to be stable and easy to integrate. This supports the decision to adopt an external graphics engine as well.

Conclusions

The most important conclusion from this project is that the eXtreme Programming process turned out to be an unsuitable life-cycle model. The main reason was that the project violated two of the basic preconditions for using XP:

Lack of automatic testing for core functionality like play balance and the fidelity of audiovisual output
Inflexibility of art assets when refactoring the game for improved design

The first point stems from the fact that most games are highly balanced systems, where many of the balancing variables can easily create almost chaotic effects through small changes. Testing unstable systems like this is error prone and takes a lot of human resources. Specific cases of balance may be testable through automatic simulations, but the more general notion of gameplay balance is very difficult to quantify. It is also difficult to calculate an absolute value for game difficulty. Other important aspects like fidelity of audio and graphics are also hard to measure using automatic tests. Changes in the program code can result in changes to either, and the result can only be verified by a human developer.
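As an illustration of the kind of narrow, automatic balance simulation mentioned above, here is a minimal Python sketch. Everything in it is assumed for the example: the Tuning fields, the numbers, the scripted "buy when affordable" player and the 0 to 500 credit band are hypothetical and do not come from the actual game. It checks one specific case only (a scripted economy neither goes bankrupt nor snowballs) and says nothing about general gameplay balance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tuning:
    """Hypothetical balance variables; none are taken from the real game."""
    income_per_module: int = 12   # credits earned per module per tick
    upkeep_per_module: int = 9    # credits spent per module per tick
    module_cost: int = 100        # credits needed to build one module
    starting_credits: int = 100
    max_modules: int = 3          # scripted player stops building here


def simulate(tuning, ticks):
    """Run a scripted player and return the credit total after each tick."""
    credits = tuning.starting_credits
    modules = 1
    history = []
    for _ in range(ticks):
        credits += modules * (tuning.income_per_module - tuning.upkeep_per_module)
        # Scripted player: buy a module whenever one is affordable, up to the cap.
        if modules < tuning.max_modules and credits >= tuning.module_cost:
            credits -= tuning.module_cost
            modules += 1
        history.append(credits)
    return history


def is_balanced(tuning, ticks=50, floor=0, ceiling=500):
    """True if credits stay inside [floor, ceiling] for the whole run."""
    return all(floor <= c <= ceiling for c in simulate(tuning, ticks))
```

A regression suite could call is_balanced(Tuning()) after every tuning change and fail the build when a small tweak, such as a different upkeep_per_module, pushes the scripted run outside the band. Whether the game is actually fun at those values still needs a human.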

Secondly, many art assets are produced in discrete steps, with some of these steps requiring destructive operations disconnecting later stages from earlier. As the amount of content grows the program code is severely limited in refactoring choices. The project was hampered by this several times, leading to compromises when refactoring and subsequently a lowered quality of the program code.

Not identifying these limitations and using XP through the whole development process is the direct reason for problems 1, 2 and 4. The main reason this was not realized earlier was that many parts of the XP process were working. Iterative and incremental releases and pushing risk intensive tasks to the front of the schedule greatly helped risk management for the game. Code review and pair programming sessions helped keep the code clean from several problems. Also, low-level utility libraries were compatible with XP and completing them early reinforced the false belief. Also not adhering completely to the XP process, mainly by designing in flexibility now and then, helped get the program code over the first few bumps. All these issues contributed to the fact that the unsuitability of XP was only seen after the project was well beyond the point of no return.

Recommendations for future projects

The proper software development life cycle model for a game project needs to have requirements gathering, analysis and design front loaded in the development schedule. Agile models, like the Spiral model, might be appropriate to minimize initial risk, but when content production starts in full, redoing earlier stages to any great extent is difficult. This needs to be reflected in the software development life cycle as well. During initial development, daily builds for the development team and weekly builds for testers seem like the most appropriate iteration length. However, after launch, the iteration period should be about one month.

The size of changes should slowly diminish towards initial product launch, and then sharply drop off. If larger changes are needed a separate upgrade project should be started. This project should be scheduled and managed outside the maintenance branch.

To minimize the inflexibility inherent in working with more linear software development life cycles, tested libraries should be used when possible. This includes implementing core features as an internal library. Even if this limits the technical options in building the game, it greatly reduces project size and thus inherent risk.

There should be a clearly marked point when a project transitions from development to maintenance. This transition, and the changes in development methods it entails, should be clearly communicated to the testers and players.

Final words

All in all, the first downloadable project from Mistaril was a great success. The game found an appreciating target audience and managed to generate enough profit to let the company continue developing games. A lot was learnt during the development process, which translates into more efficient and fun development in the future. A satisfying first project.

reposted, without permission, from the XP mailing list:

..The goal for the project was to create a game for entering the downloadable games market with and develop the required e-commerce infrastructure for distribution and marketing over the Internet. A secondary goal was to produce enough profit from the game to sustain continued game development activities. The project met both of these goals.

Hmm, met *both* goals.

Hmm, *great success*

Yet the overall conclusion is that "XP failed" because in their mind some straw man alternative process that they haven't even tried yet *might* have been better?

Not to mention that it clearly sounds like their first attempt at XP and they clearly don't get many of its principles. ("Also not adhering completely to the XP process, mainly by designing in flexibility now and then, helped get the program code over the first few bumps.") I guess I forgot the XP practice of "Never design in flexibility."

Mind boggling.

DarylRichter http://itsallsemantics.com

Designing for flexibility that is not specifically addressing a requirement (e.g. making solutions more generalist than DoTheSimplestThingThatCouldPossiblyWork) is frequently proscribed as YagNi. In this case, one can argue incomplete requirement analysis is why the team had to opt for higher flexibility. Perhaps the concepts of YagNi and flexibility should be reconsidered in the case of ambiguous requirements?

I think this postmortem was very insightful - it highlights the primary limitation of XP - that is, BigDesignUpFront is needed in cases where the usage of the system will generate large amounts of brittle assets that will need to be supported. The interface to these assets must be thoroughly thought out before the content-creators are allowed to handle them, since the brittleness of content obstructs refactoring. The same thing happens with programming languages, public libraries, etc. There will always be external content (code, artwork, etc) that, for whatever reason, cannot be refactored along with the codebase. As such, the interface benefits from BigDesignUpFront. Likewise it also brings another (rather obvious) problem to light: systems with extremely complex emergent behaviour (like gameplay balancing) are very difficult to UnitTest. However, BigDesignUpFront equally fails miserably at designing play balance - it seems that gameplay balancing requires a unique intermediate process that focusses on stabilizing the gameplay to avoid the cyclic mayhem of nerfage/buffage and the creation of endless lists of "exceptions" to gameplay rules that make understanding the gameplay impossible. Rather than unit-testing the gameplay balance, I would've focussed on unit-testing the individual attributes of the gameplay for stability (simply asserting that action X still has Y effect) and then left game balance as an exercise for human testers.
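A rough sketch of what "asserting that action X still has Y effect" could look like, using Python's standard unittest module. The Station class, the build_solar_panel action and all of the numbers are made up for illustration, since I have not seen the game's code; the point is only that each test pins down one action's direct effect and leaves overall balance alone.

```python
import unittest
from dataclasses import dataclass


@dataclass
class Station:
    """Hypothetical game state; the real game's classes are unknown."""
    credits: int = 500
    power: int = 0
    crew_capacity: int = 0


def build_solar_panel(station):
    """Hypothetical action X: costs 100 credits and adds 5 power."""
    if station.credits < 100:
        raise ValueError("not enough credits")
    station.credits -= 100
    station.power += 5


class SolarPanelEffectTest(unittest.TestCase):
    def test_adds_power_and_costs_credits(self):
        station = Station()
        build_solar_panel(station)
        self.assertEqual(station.power, 5)          # effect Y still holds
        self.assertEqual(station.credits, 400)      # cost is unchanged
        self.assertEqual(station.crew_capacity, 0)  # no unrelated side effect

    def test_rejected_when_unaffordable(self):
        station = Station(credits=99)
        with self.assertRaises(ValueError):
            build_solar_panel(station)
        self.assertEqual(station.credits, 99)       # state left untouched
```

Saved in a file, this runs with "python -m unittest". A failing test here means an action's direct effect changed, which is a much narrower claim than "the game is out of balance".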

I think the PostMortem, to me, shows that a mixed approach is required for gaming - since in gaming, brittle content (the artwork) is being developed in parallel to the product - along with the usual XP challenge of maintaining backwards-compatibility with user-developed content post-release.

Of course, this is all being written neither having played the game nor seen the code - so I'm just ArmchairDeveloping.

-- MartinZarate

Get more abuse at: http://tech.groups.yahoo.com/group/extremeprogramming/message/122251

CategoryXpCritique