Agile Architecture: 4 Common Strategies
At ThoughtWorks, our preferred way to start a project is by doing a set of workshops and sessions with stakeholders for about two weeks. That is what we call Inception. After the Inception, we usually have a product backlog for the project and are ready to start writing production code.
During this period we come up with a technical vision for the system. It is a very incomplete artefact, pretty much just a system metaphor, barely good enough to start the project. At this stage I always get funny looks from the client’s architecture or development team, and questions like: “So, in this agile thing we have no architecture? We just start coding without thinking about how things are structured at a higher level?”
Well, not really. We do tackle architecture problems, but in a way that is very different from what other methodologies may suggest. When a project starts, I do not care about having a sound and complete architecture. What I do care about is having a reasonable strategy for dealing with architecture decisions.
In an agile project, User Stories contain both business value and technical investigation or architecture work. The usual way to deal with that is to pile those cards into an iteration backlog and play them in priority order. This works really well when the new feature requires just design changes, but is not so great for architecture decisions. Architecture, remember, may be defined as “things that people perceive as hard to change”.
As a Tech Lead or Iteration Manager, I have learned and experimented with many different strategies for dealing with those “hard to change” decisions. In this article I want to discuss four of them.
1 - Iteration Zero
One way of dealing with architecture issues is to address them in the very first iteration for a project, in what is often called Iteration Zero. The team works on all the technical investigation and architecture decisions they know will have to be addressed at once, before they actually start writing production code.
In theory, this strategy benefits from some economy of scope: it is probably more efficient to group your technical and architectural problems while solving them. User Stories would then leverage the architecture defined in the first iteration; they should contain 100% business value and no known technical risk should have to be addressed while playing them.
This has some hidden problems, though:
- Wasteful: You will be spending time designing and architecting things for cards that may never be played.
- Doesn’t deliver business value: You are introducing agile methods to a client, and the first thing you want to tell them is that every iteration they will get some business value. After all the work involved in convincing them, you start the project and do not deliver anything in the first iteration. This can become such a big problem that account managers inside ThoughtWorks sometimes ask teams following this strategy to call this “Preparation”, “Technical Investigation Phase”, or “anything but an iteration!”.
- You don’t know what you don’t know: You may try to address all technical investigation now, but if your project is really agile, what you know now is just a tiny fraction of what you will know in two weeks’ time. You are not making commitments at the last responsible moment, and that really reduces the benefits of agile methods.
- Not enough time: It is very common for Iteration Zero to break the timebox and go on for many weeks, even months. The amount of technical investigation and work — which, remember, may or may not be necessary — simply will not fit into a regular iteration.
I do use Iteration Zero, but I do not use it as the main mechanism for technical investigation and design. My Iteration Zero aims to get a story done, from end to end. This has to be an important story, so that the work performed to deliver it can be reused elsewhere. Given that we have no structure in place yet, this is often too much work for a single developer or pair to accomplish. What I usually do is break the story into tasks, Scrum style.
That way we will have something to show the client in the showcase and, according to the Rule of the Second Card, will have done just enough design to reduce risk for the upcoming iterations.
2 - Tech Story
This is the style suggested by Philippe Kruchten. In this strategy, architecture and design work have their own cards that get prioritised and played together with business-value cards. We call those cards Tech Stories, as opposed to User Stories. Theoretically, most architecture decisions will be made while playing the Tech Stories, and the developer playing a User Story will benefit from the fact that all decisions were already made.
The biggest benefit of this strategy is that it creates visibility into the technical risks and design work required for a project. When you mix technical and business requirements in a single story, it is very hard to understand how much of the effort is technical and how much is business-related; you often have to justify to the business why “add a link to the home page” is actually a lot of work because your architecture was not ready for that.
It is also good that this approach does not hurt the just-in-time nature of agile methods. You can schedule Tech Stories to be played in any iteration, ideally just before playing a dependent User Story.
In my experience, though, this is not so great:
- Wasteful: Just like the Iteration Zero approach, Tech Stories mean that you may waste time investigating and designing things that will not be useful when requirements change. And they will change.
- Architecture becomes low priority: If those cards are put into the product backlog, that means they have to be prioritised by the business, and it is really hard for them to understand the value of those technical tasks. Developers have to keep having the same debates, trying to convince people that those acronyms have real value.
- Creates a dependency chain: The dependency between Tech Stories and User Stories is really hard to manage. If a Tech Story takes longer than expected, it may block not one but many User Stories.
- There is still something missing: Regardless of how much up-front technical investigation you do, you will not be able to cover all possible issues for all User Stories. Technical investigation and design will still happen inside a User Story anyway.
Tech Stories will happen in any non-trivial project. I try to minimise their number. I do not want my client paying up front for technical effort that they may or may not need. I want the client to understand that every time we add a feature there is a cost, and that part of this cost comes from purely technical work.
I also hate asking the client to prioritise things they do not understand. How can I explain to a product manager that “Refactor user class” is more important than “Edit user details”? Many important technical decisions cannot be directly mapped to a tangible benefit. You can try to convince your client, but you are going to have this same conversation every single week for the rest of the project.
When I work on projects where, for some reason, Tech Stories are the norm, I usually apply a technique called a technical budget (Kruchten calls it “buffers”). In this technique, every iteration has a fixed capacity — often expressed as a percentage of velocity — allocated to Tech Stories. This works, but it is very inefficient. Budget sizes will always be too large or too small; you often end up having to renegotiate buffer increases every iteration.
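To make the budget mechanics concrete, here is a minimal sketch of how a fixed technical budget might drive iteration planning. All of it — the 25% figure, the story names, the point values — is an illustrative assumption, not a description of any real planning tool:

```python
# Hedged sketch: filling an iteration with a fixed "technical budget".
# The 25% budget, story fields, and backlog contents are illustrative
# assumptions, not a real planning tool's data model.

def plan_iteration(stories, velocity, tech_budget_pct=0.25):
    """Fill an iteration: tech stories may only consume the budget slice."""
    tech_capacity = velocity * tech_budget_pct
    planned, tech_used, points_used = [], 0.0, 0.0
    for story in stories:  # stories assumed pre-sorted by priority
        if points_used + story["points"] > velocity:
            continue  # iteration is full
        if story["tech"] and tech_used + story["points"] > tech_capacity:
            continue  # budget exhausted: this Tech Story waits
        planned.append(story["name"])
        points_used += story["points"]
        if story["tech"]:
            tech_used += story["points"]
    return planned

backlog = [
    {"name": "Upgrade ORM",         "points": 3, "tech": True},
    {"name": "Edit user details",   "points": 5, "tech": False},
    {"name": "Refactor user class", "points": 3, "tech": True},
    {"name": "Password reset",      "points": 5, "tech": False},
]
print(plan_iteration(backlog, velocity=13))
```

With a velocity of 13 and a 25% budget, only one of the two Tech Stories fits; the other waits for the next iteration. That is exactly the inefficiency described above: the budget is rarely the right size, so you end up renegotiating it.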
3 - Spike
Doing any up-front design or investigation is mostly about mitigating risk. Risk can only be mitigated by generating information (see Managing the Design Factory for an excellent discussion on this). The canonical way to generate information in agile projects is called a Spike.
A Spike is an experiment whose goal is to generate information. It may be something like “can we use JSON.Net to serialise our data instead of doing it with a template engine?” or “should we split this blob into a three-layer system?”.
A developer or pair playing a Spike will try to generate the required information, often by writing some code. This effort is time-boxed and the Spike does not add to the team’s velocity. A Spike will not remove the need for technical investigation or risk from User Stories completely; it will just reduce the required effort by providing some information before the card is played.
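A Spike of the serialisation kind mentioned above might look something like the sketch below: a time-boxed experiment whose only output is a finding. The record shape, function names, and the template stand-in are all assumptions made for illustration:

```python
# Hedged sketch of a Spike: "can the standard json module replace our
# template-based serialisation?" Data shape and names are illustrative.
import json
import time

record = {"id": 42, "name": "Ada", "roles": ["admin", "editor"]}

def template_serialise(r):
    # Stand-in for the existing template-engine approach.
    return '{"id": %d, "name": "%s", "roles": %s}' % (
        r["id"], r["name"], json.dumps(r["roles"]))

def spike(timebox_seconds=2.0):
    """Generate information within a time-box; the finding is the output."""
    deadline = time.monotonic() + timebox_seconds
    findings = {}
    if time.monotonic() < deadline:
        findings["equivalent"] = (
            json.loads(json.dumps(record))
            == json.loads(template_serialise(record)))
    return findings

print(spike())
```

The point is not the code itself — which is disposable — but the finding it produces: whether the two approaches are interchangeable for the data at hand.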
My problems with this technique:
- Architecture becomes low priority: Just like with Tech Stories, Spikes need to be prioritised by the product manager.
- Some will be useless: Spikes may end up generating no useful information whatsoever. The developers will certainly learn something, but that information may not be relevant to the project at all.
- Throw-away: The code written for a Spike should be disposed of once its goal is met, i.e. once the information has been generated. Spike code tends to be of very low quality, and developers can end up hacking together a quick solution only to find later that it is not applicable to production code.
I use Spikes all the time. Most of them are not disposable code, though. I often require my team to apply the same quality standards to Spikes that we apply to production code.
As for prioritisation, it is less of a problem than it is with Tech Stories. Tech Stories are supposed to have value by themselves, and that is hard to communicate to business people. With Spikes, though, I usually ask for two estimates: one for “if this card is played before the Spike” and another for “if the card is played after the Spike”. In many projects, not playing the Spike before the User Story means an estimate that is 60% higher.
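The two-estimates technique reduces to a simple comparison that product managers can follow. A minimal sketch, with all numbers invented for illustration:

```python
# Hedged sketch: is a Spike worth playing, given the two estimates
# described above? All point values are illustrative assumptions.

def spike_pays_off(estimate_without, estimate_with, spike_cost):
    """A Spike is worth playing when the points it saves exceed its cost."""
    return (estimate_without - estimate_with) > spike_cost

# e.g. a story estimated at 8 points without the Spike, 5 with it,
# and a Spike time-boxed at roughly 2 points of effort:
print(spike_pays_off(8, 5, 2))  # True: saves 3 points for 2
```

Framed this way, the conversation is about cost, not about the value of an acronym, which is why the prioritisation debate tends to be shorter than it is with Tech Stories.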
4 - Assembly Line
The last strategy I want to discuss today is what I call an Assembly Line. In this strategy, the development part of a User Story is split into two or more sections: one for technical investigation or architecture, and another for actual development, i.e. meeting the acceptance criteria.
User Stories are first placed in some Technical Investigation queue and, after someone does the necessary investigation and design, they are moved to the actual Development queue.
This is really useful when there is a huge difference in the work and skill set required to design a solution, as opposed to just meeting the acceptance criteria.
The problems I see here:
- What is “done”? It is really tricky to define what “done” means for the work performed in the investigation queue. Expect many endless and pointless discussions.
- Cards are blocked forever: When using this strategy, I have often had up to two development pairs idle with no cards free to be played. They were all in the Technical Investigation queue, and this queue often has fewer developers than the Development queue.
- Developers vs “Architects”: It is far too easy to create a distinction between people who do technical investigation and people who only play cards. The developers who are only playing cards often end up with very little interest in, or commitment to, the project. When faced with a technical problem, they are much more likely to move the card back to the Technical Investigation queue instead of working on it.
- Throw over the wall: Separating the work into two steps tends to create a “throw over the wall” culture.
This technique is, at times, the exact opposite of what we are trying to achieve with agile methods. I see the Assembly Line as a strategy that only sometimes makes sense — recovering a troubled project is one example — and even then only for a limited period. The impact on morale for a team using this strategy for longer than strictly necessary can be massive, and the division of work creates all sorts of problems around information sharing and blame culture.
It is possible that your project requires some heavy up-front design for some cards, but if this is a common scenario for you there is probably something wrong. Design should not be separate from implementation.