Keeping the build fast is a necessity if CI is to provide rapid feedback. Developers need to know quickly that the build has been broken so they can fix it and let their teammates work with a functional version. But what do you do when building your project takes a long time? Say, two hours? Or when you have extensive testing that involves other servers and takes about a day to complete? Should you reject the idea of continuous integration altogether, or is there a solution for you?
The answer to these questions is a staged build. Basically, it means dividing one large build into smaller pieces, called stages, and defining relationships between them.
Say your compile takes two hours. Maintaining rapid feedback from the whole compilation is then impossible. But you can split the compile into module layers, e.g. data access, business layer, myapplication. Add to or modify these to your liking, and in each stage compile only the modules/libraries that fall into the corresponding application layer. Next, set up your build server so that the stages form a dependency tree, in whatever way your build server supports. The result will look like one of the following pictures.
Figure 1 - Triggered build
Figure 2 - Build dependencies
What is different with this setup?
- Each time there is a change somewhere in data access, the whole chain of dependent stages is recompiled, but you see the status of the data access compilation much earlier than before.
- Each time there is a change somewhere in myapplication, only myapplication is recompiled, and again you see its status sooner than before.
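As a rough illustration of the two cases above, the following Python sketch works out which stages would be rebuilt after a change in a given layer. The stage names follow the example layers; the `DEPENDS_ON` table and the `stages_to_rebuild` function are hypothetical and not part of any build server. A real server expresses the same thing through its own trigger or dependency configuration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical layers: each stage maps to the stages it depends on.
DEPENDS_ON = {
    "data_access": [],
    "business_layer": ["data_access"],
    "myapplication": ["business_layer"],
}

def stages_to_rebuild(changed, depends_on=DEPENDS_ON):
    """Return the changed stage and every stage downstream of it, in build order."""
    affected = {changed}
    ordered = []
    # static_order() yields a stage only after all of its dependencies,
    # so a single pass is enough to collect everything downstream.
    for stage in TopologicalSorter(depends_on).static_order():
        if stage == changed or any(dep in affected for dep in depends_on[stage]):
            affected.add(stage)
            ordered.append(stage)
    return ordered

print(stages_to_rebuild("data_access"))
print(stages_to_rebuild("myapplication"))
```

A change in data access returns all three stages, with data access first so its status is known earliest. A change in myapplication returns only myapplication.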
The same technique applies to tests. Large automated test suites that need a long time to complete can be split into several stages, and usually those stages can even run in parallel. Consider this dependency schema.
Figure 3 - Tests schema
The idea is straightforward. The tests form a dependency tree. If any of them fails, the whole branch fails as well (or does not continue integrating, depending on your setup). In the figure, the feature tests from groups 1 and 2 both wait for the db tests to complete, then run in parallel, possibly on different machines. Only when both have finished does the last project, myapplication integrated, run. That project does not actually do anything itself (except perhaps deployment); it just signals the final health status of all stages and provides a functional, well-tested build artifact.
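The behaviour of such a test tree can be sketched in a few lines of Python. This is only a minimal model under stated assumptions: the stage names mirror the schema, while `PREREQS`, `run_pipeline` and the `run_stage` callback are hypothetical names, and threads stand in for separate build machines. Stages are run in waves, so a stage starts only after every prerequisite has been decided, and it is skipped if any prerequisite did not pass.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stage names mirroring the schema: each stage lists its prerequisites.
PREREQS = {
    "db_tests": [],
    "feature_tests_1": ["db_tests"],
    "feature_tests_2": ["db_tests"],
    "myapplication_integrated": ["feature_tests_1", "feature_tests_2"],
}

def run_pipeline(run_stage, prereqs=PREREQS):
    """Run stages wave by wave; run_stage(name) returns True when the stage passes."""
    status = {}
    pending = dict(prereqs)
    with ThreadPoolExecutor() as pool:
        while pending:
            # Stages whose prerequisites have all been decided form the next wave.
            ready = [s for s, deps in pending.items() if all(d in status for d in deps)]
            if not ready:
                raise ValueError("dependency cycle or unknown prerequisite")
            # Only stages whose prerequisites all passed are actually run.
            runnable = [s for s in ready if all(status[d] == "passed" for d in pending[s])]
            for s in ready:
                if s not in runnable:
                    status[s] = "skipped"
            # Stages in the same wave do not depend on each other, so run them in parallel.
            for s, ok in zip(runnable, pool.map(run_stage, runnable)):
                status[s] = "passed" if ok else "failed"
            for s in ready:
                del pending[s]
    return status

# Example: simulate a failure in one feature test group.
print(run_pipeline(lambda stage: stage != "feature_tests_1"))
```

In the simulated run, the db tests and feature tests group 2 pass, group 1 fails, and myapplication integrated is skipped, so the final stage never reports a healthy build on top of a failed branch.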
When to use staged build
A multi-stage build has more advantages than just keeping build time low. It brings additional clarity to the continuous integration process. When the build fails, you get much better feedback about where exactly (in which module, phase or stage) it failed. If it is set up properly, it is possible to avoid promoting errors from low-priority modules into big applications. The whole idea of integration makes more sense with this setup. It is especially useful for bigger teams, or for companies with several teams working on large projects.
It does not make much sense for a single .NET solution that compiles in under 5 minutes and has no more than 3 people working on it.
How many stages does your project need? There is no metric for that. Use your own judgment and follow two basic practices:
- Keep the build time fast.
- Keep the project structure as complex as necessary, but no more.