
SCM Documentation is Useful

It’s very common for software engineers to document procedures. Perhaps it should be required for everyone. For some, it’s always been this way. To others, this is new. I’m not sure why it’s not always done right up front since documentation makes us all more efficient. But what makes good documentation?

First, let’s talk about the most common type of documentation: the data dump. This is the kind of documentation that barely meets the requirement of ‘it’s documented.’ Technically it is documented, but so much information is presented that it may not be useful. This is the kind of documentation you get from people who don’t believe in documentation but are forced to do it; it is created when the writer’s time is deemed more important than the reader’s time.

I like to follow the principles outlined in the book Presenting to Win: The Art of Telling Your Story by Jerry Weissman. It means looking over the pile of information and deciding how it needs to be presented. It means putting the user’s time ahead of the writer’s time. After having done this for many process documentation projects, I now have a pattern to share that improved my documentation. Hopefully, it will improve yours.

Start with the different types of information contained in the documentation, such as domain information and technical information. An example of domain knowledge is a specific server’s name and port, or the policies regarding naming conventions, such as “all build machines are on the build.xxx.com subnet.” Technical knowledge is the command used to change the name. So if the documentation includes changing a host name to meet the standard convention, there is some domain knowledge and some technical knowledge involved. Start by separating the two types.

When considering the user of the software engineering documentation, the user needs can be found in the four quadrants of the matrix below, based on domain vs technical knowledge.

|                | Junior Software Engineer | Senior Software Engineer |
|----------------|--------------------------|--------------------------|
| New to domain  | both                     | domain information       |
| Domain veteran | technical instructions   | none                     |

Since you won’t always know which quadrant the user is in, it’s best to write documentation with the varying user needs in mind. Assuming senior engineers are more highly paid than junior engineers, domain information should come first. A new senior-level system administrator will have the technical knowledge but not the domain knowledge, so make it easy for them to find the domain knowledge without having to read through all the technical details. Here are a couple of examples:

  1. Change the system name to meet the standard naming convention
    1. Naming conventions are found here…
    2. Instructions for changing system names are found here…
  2. Change the IP address
    1. Available IP addresses are found here…
    2. Instructions for changing system IP addresses are found here…

The senior engineer who has been with the company can stop after reading the top-level steps. The senior engineer who just got hired will read the top-level steps, plus dive into the second-level sections as needed. The junior engineer who has been with the company will read the top-level steps and will skip some second-level sections but read others. The junior engineer who just got hired will probably need all sections of the documentation.
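The reading paths above can be sketched in a few lines of Python. This is only an illustration of the layering idea, not a documentation tool: the `Step` class, the `sections_for` function, and the two boolean flags are hypothetical names made up for this sketch, and the step text is copied from the example list.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One top-level step plus its two second-level sections (hypothetical model)."""
    summary: str          # the top-level step every reader sees
    domain_info: str      # domain knowledge, listed first
    technical_info: str   # technical instructions, listed second


def sections_for(step: Step, knows_domain: bool, knows_tech: bool) -> list[str]:
    """Return only the lines a given reader needs for one step."""
    lines = [step.summary]
    if not knows_domain:
        lines.append(step.domain_info)
    if not knows_tech:
        lines.append(step.technical_info)
    return lines


rename = Step(
    summary="Change the system name to meet the standard naming convention",
    domain_info="Naming conventions are found here…",
    technical_info="Instructions for changing system names are found here…",
)

# A newly hired senior engineer: has the technical knowledge, lacks the domain knowledge.
for line in sections_for(rename, knows_domain=False, knows_tech=True):
    print(line)
```

Each quadrant of the matrix is one combination of the two flags, so the same document serves all four readers without anyone reading more than they need.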

The goal here is to make it easy for the end user to find just the right amount of information needed. Eliminate time wasted reading material the user already knows. At the same time, be sure to include all of the information needed so the SCM team isn’t needed.


SCM is Modular

Just as we design software to be modular, I like the idea of designing SCM to be modular. After all, SCM is software. A simple design pattern taught to me in the early 1990s has proven effective when applied to build engineering. Note that from my perspective, build is a general term for any integration activity, including compile, test, bundle, package, and deliver.

The basic input-processing-output pattern used by COBOL programmers has proven itself to be valuable as the basic building block pattern for continuous integration of enterprise software build products; and there is no reason to think it wouldn’t work just as well for continuous delivery, and any other continuous (i.e. fully automated) activity. The pattern is flexible and scalable to fit any size modular software component or product build.

The intent of using this as a build pattern is to keep your SCM system modular and flexible: write your source code to be version control system agnostic and capable of downloading binary sources from any type of artifact repository [the inputs]; design your builds and tests to be continuous integration system agnostic [the processing]; and maintain the ability to publish artifacts to any artifact repository or deliver software changes to any software system [the output]. In short, the pattern remains agnostic of the supporting SCM systems. Internally coupling any of the three parts of the build pattern to a single manufacturer’s application should be avoided.
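A minimal Python sketch of the input-processing-output pattern may make this concrete. Everything named here is an assumption made for illustration: `SourceProvider`, `ArtifactPublisher`, `run_build`, and `InMemoryStore` are hypothetical, and the in-memory store simply stands in for whatever version control system or artifact repository sits behind the interfaces.

```python
from typing import Callable, Protocol


class SourceProvider(Protocol):
    """Input side: any version control system or artifact repository."""
    def fetch(self, name: str) -> bytes: ...


class ArtifactPublisher(Protocol):
    """Output side: any artifact repository or delivery target."""
    def publish(self, name: str, data: bytes) -> None: ...


def run_build(source: SourceProvider,
              process: Callable[[bytes], bytes],
              publisher: ArtifactPublisher,
              input_name: str,
              output_name: str) -> None:
    """Input, processing, output: the build only ever sees the interfaces."""
    data = source.fetch(input_name)          # input
    result = process(data)                   # processing
    publisher.publish(output_name, result)   # output


class InMemoryStore:
    """Hypothetical stand-in that satisfies both interfaces for the demo."""
    def __init__(self) -> None:
        self.items: dict[str, bytes] = {}

    def fetch(self, name: str) -> bytes:
        return self.items[name]

    def publish(self, name: str, data: bytes) -> None:
        self.items[name] = data


store = InMemoryStore()
store.publish("app.src", b"hello")
# bytes.upper plays the role of the "compile" step in this toy example.
run_build(store, bytes.upper, store, "app.src", "app.bin")
```

Swapping in a different backend means writing another small class with the same two methods; `run_build` itself does not change, which is the point of keeping each of the three parts decoupled from any one vendor.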

Keeping the SCM system modular and flexible also means you must prevent instances of the core build pattern from becoming tightly coupled with each other. In other words, preventing external coupling between builds is just as important as preventing internal coupling if we are to meet the objective of maintaining system flexibility. This can be a challenge since there are multiple tools [and of course, more of those magical plugins] available to assist you with activities like triggering builds. You will be wise to stay out of these traps. In some cases, their main purpose seems to be keeping you locked into a specific vendor’s software and paying to lease it. Has this ever happened to you?

But let’s also be careful not to chain together builds [or tests, or deployments, or deliveries] by tightly coupling build processes together within a continuous integration system. It’s loosely coupled when you have build process A producing artifact A for consumption by build B, and more tightly coupled if the output of build A is directly input to build B. Both may seem the same, but they are quite different. The difference becomes apparent when adding a new build C that needs to consume the output of build A. With the loosely coupled solution, builds A, B, and C can all easily be executed on different continuous integration systems.
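The loosely coupled arrangement can be sketched as follows, assuming the artifact is handed over through some shared repository. The `ArtifactRepository` class, the three build functions, and the artifact names are hypothetical stand-ins invented for this example.

```python
class ArtifactRepository:
    """Hypothetical in-memory stand-in for any artifact repository."""
    def __init__(self) -> None:
        self._artifacts: dict[str, str] = {}

    def publish(self, name: str, content: str) -> None:
        self._artifacts[name] = content

    def fetch(self, name: str) -> str:
        return self._artifacts[name]


def build_a(repo: ArtifactRepository) -> None:
    # Build A knows nothing about who consumes its artifact.
    repo.publish("artifact-a", "A")


def build_b(repo: ArtifactRepository) -> None:
    # Build B depends on artifact A, not on build A itself.
    repo.publish("artifact-b", repo.fetch("artifact-a") + "+B")


def build_c(repo: ArtifactRepository) -> None:
    # Added later: consumes the same artifact without touching A or B.
    repo.publish("artifact-c", repo.fetch("artifact-a") + "+C")


repo = ArtifactRepository()
build_a(repo)
build_b(repo)
build_c(repo)
```

Because each build only talks to the repository, adding build C required no change to build A or build B, and nothing in the sketch ties the three builds to the same continuous integration system.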
