I’m constantly at odds with my developers over the importance of documenting why a piece of code does what it does. Having been in the code maintenance business for a long time, I have learned the hard way that a particular implementation is only valid for a particular set of conditions. Unless those conditions are well documented, there is no way to effectively determine whether the code is valid in another situation (or even in the same one, since without the documentation there is no way to know which situation it was written for).
Some examples of questions that should have documented “why”s:
- What types of processes are expected to communicate with the code and by what means (threaded / non-threaded, etc.)?
- What conditions are expected to never / always occur?
- Is a call expected to block?
- Is there an expected and required order to a particular set of calls?
- Do items automagically maintain themselves (e.g. will a map shrink as entries are removed)?
- Can an item be reused w/o reconstruction and what are the constraints to reuse?
- Does an item expect to be reused in a different set of conditions?
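To make the list concrete, here is a minimal sketch of how answers to several of these questions might be written down next to the code. The `ReusableBuffer` class, its methods, and the docstring layout are all hypothetical and exist only for illustration; this is one possible convention, not a prescribed one.

```python
class ReusableBuffer:
    """Hypothetical fixed-capacity byte buffer (illustration only).

    Threading:  not thread-safe; expected to be owned by a single thread.
    Blocking:   no call blocks.
    Call order: write() any number of times, then drain().
    Invariant:  the stored length never exceeds `capacity`.
    Upkeep:     the backing storage never shrinks; drain() only resets
                the length.
    Reuse:      after drain(), the instance can be reused without
                reconstruction.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data = bytearray(capacity)  # allocated once, never shrinks
        self._length = 0

    def write(self, chunk: bytes) -> None:
        # Enforce the documented invariant instead of silently relying on it.
        if self._length + len(chunk) > self.capacity:
            raise OverflowError("invariant violated: length would exceed capacity")
        self._data[self._length:self._length + len(chunk)] = chunk
        self._length += len(chunk)

    def drain(self) -> bytes:
        # Return the written bytes and reset the length; storage is kept for reuse.
        out = bytes(self._data[:self._length])
        self._length = 0
        return out
```

The docstring is the part that matters here: a maintainer who wants to share the buffer across threads, or who expects memory to be released on `drain()`, can see immediately that neither condition was part of the original design.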
Pulling an example from my own code:
“There are certain optimizations that have been made in the writer based
on the fact that the send timeout is a constant and is based on the time
at which a message is added to the queue (i.e. the queue will contain
monotonically increasing timeout values). This implies that until the
currently active message’s (the message currently being written) timeout
occurs, no other message in the queue needs to be checked.”
As time went on, it was determined that there would be messages that never timed out. This meant that the constraint that the timeout values are monotonically increasing was no longer valid, and therefore the implementation was no longer valid. Only because the conditions under which the code was written (the assumptions that were made) had been specified was it known that the implementation needed to be changed.
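The optimization described in that comment can be sketched roughly as follows. This is not the original code: `Message`, `Writer`, `add`, and `expire` are hypothetical names, and a deadline of `None` is an assumed way to model a message that never times out. The sketch also treats the assumption as "deadlines never decrease", and checks it explicitly so that a violation fails loudly instead of quietly producing wrong behavior.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Message:
    payload: str
    deadline: Optional[float]  # None models "never times out" (assumed encoding)


class Writer:
    """Hypothetical writer queue (illustration only).

    ASSUMPTION: every message has a deadline, and deadlines are added in
    non-decreasing order (constant send timeout, stamped at enqueue time).
    Because of this, only the head of the queue is ever checked for expiry.
    If a message can have no deadline, or an earlier one, this
    implementation is no longer valid.
    """

    def __init__(self) -> None:
        self._queue: Deque[Message] = deque()

    def add(self, msg: Message) -> None:
        # Check the documented assumption at the point where it could break.
        if msg.deadline is None:
            raise ValueError("assumption violated: message has no deadline")
        if self._queue and msg.deadline < self._queue[-1].deadline:
            raise ValueError("assumption violated: deadlines must not decrease")
        self._queue.append(msg)

    def expire(self, now: float) -> List[Message]:
        # Head-only check: valid solely because deadlines never decrease.
        expired = []
        while self._queue and self._queue[0].deadline <= now:
            expired.append(self._queue.popleft())
        return expired
```

With the assumption written down (and here also enforced), the arrival of messages that never time out shows up as an explicit conflict with a stated condition, which is exactly the signal that the head-only check has to be redesigned.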
It is common for the conditions under which an implementation is written to be defined in other systems or documents, such as requirements or the bug tracking system. Unless the conditions are presented either within the code itself or in the same directories as the code, the correlation is lost. Also, the implementation typically has its own specific set of conditions that would not be found in the requirements.
There is little actual overhead in serializing these conditions since, by definition, they are all known at development time; they simply must be written out. Once a suitable convention has been established for this documentation and the developers overcome the initial inertia of performing this task, it becomes very natural. The small amount of time spent typing is far outweighed by the extra level of communication that it provides.