The Psychology of Quality and More
CHAPTER 10 : Programming Usage
10.2 Performance programming
'Performance' is an excuse regularly used to justify code that is difficult to understand. Some programmers fall into the trap of treating performance as constantly the highest priority, degrading readability, sometimes seriously, in order to save microseconds (it is not uncommon for purported performance tweaks to actually take longer!).
Against this, there is the maxim, "Get it right, then make it faster," which proposes that all considerations of performance be ignored, with style constantly as the highest priority. This approach can result in gross underperformance, with consequent tweaks spoiling the original style of the code.
The best course is a compromise between these two extremes. Performance problems should be recognized and dealt with individually, preferably at the design stage. Even with this care, there will still be bottlenecks which will only be found after coding (typically with a profiling tool).
Any performance-critical points in the code should, of course, be clearly commented (to warn future maintainers of potential problems), with any decrease in readability being balanced with an increase in commenting.
This process can be clarified by specifying limits for required performance, outside which performance is or is not an issue. For example, in a screen animation system, a performance of less than 10 frames per second may be unacceptable, whilst performance above 25 frames per second would be unnecessary.
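As a minimal sketch of how such limits might be written down in code, the following uses the 10 and 25 frames per second figures from the animation example. The names MIN_ACCEPTABLE_FPS, MAX_USEFUL_FPS, PerfStatus and classify_frame_rate are hypothetical and chosen only for illustration.

```c
/* Hypothetical limits, taken from the animation example in the text. */
#define MIN_ACCEPTABLE_FPS 10.0
#define MAX_USEFUL_FPS     25.0

typedef enum {
    PERF_TOO_SLOW,          /* below the lower limit: must be fixed     */
    PERF_ACCEPTABLE,        /* between the limits: tune only if cheap   */
    PERF_MORE_THAN_NEEDED   /* above the upper limit: stop optimizing   */
} PerfStatus;

/* Classify a measured frame rate against the stated limits. */
PerfStatus classify_frame_rate(double frames_per_second)
{
    if (frames_per_second < MIN_ACCEPTABLE_FPS) {
        return PERF_TOO_SLOW;
    }
    if (frames_per_second > MAX_USEFUL_FPS) {
        return PERF_MORE_THAN_NEEDED;
    }
    return PERF_ACCEPTABLE;
}
```

Stating the limits as named constants keeps the performance requirement visible to maintainers, and gives a clear point at which further tuning can stop.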
There are a few classic situations where performance problems may be met:
10.2.1 Loops
The repetitive nature of loops means that much processing is likely to take place within them. They should thus be coded with care, particularly in known performance bottleneck areas. This situation is geometrically compounded in nested loops, where the deeper you go, the more critical the code becomes.
There are a few ways of attacking this problem. Items which are not a part of the loop's computation should be put outside the loop. Hints may be given to the compiler by using register and const variables (note that compilers often have only 2 or 3 registers available for register use). By contrast, volatile is not a speed hint: it prevents the compiler from optimizing accesses to a variable, so it should be used in critical loops only where it is genuinely required. Revising the design, possibly to a more data-driven approach, may also help to reduce loops.
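The first of these points, moving invariant items out of the loop, can be sketched as follows. The function names count_char_in_loop and count_char_hoisted are hypothetical; the only library call used is the standard strlen.

```c
#include <stddef.h>
#include <string.h>

/* Invariant left inside the loop: the condition is tested on every
   pass, so strlen(s) may be re-evaluated each time round. */
size_t count_char_in_loop(const char *s, char c)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < strlen(s); i++) {
        if (s[i] == c) {
            count++;
        }
    }
    return count;
}

/* Invariant hoisted: the length is computed once, before the loop. */
size_t count_char_hoisted(const char *s, char c)
{
    const size_t length = strlen(s);
    size_t i;
    size_t count = 0;

    for (i = 0; i < length; i++) {
        if (s[i] == c) {
            count++;
        }
    }
    return count;
}
```

Both functions return the same result. An optimizing compiler may perform this hoisting itself in simple cases, so the gain should be measured rather than assumed.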
10.2.2 Layered code
Multi-layered systems may be clearly designed, but can result in a simple piece of code calling a function which calls a function which calls a function, and so on, in order to perform a possibly simple task. If each layer of code calls the next layer from within a loop, then we have a hidden nested loop. Lower layers should take greater care over performance issues, and should 'understand' the performance needs of upper layers.
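A hidden nested loop of this kind might look like the sketch below, which assumes a hypothetical lower layer built on a singly linked list. The names Node, list_get, sum_by_index and sum_by_walk are invented for illustration.

```c
#include <stddef.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Lower layer: fetch the value at a given position. The walk from the
   head of the list is a loop that the caller cannot see. */
int list_get(const Node *head, size_t index)
{
    size_t i;

    for (i = 0; i < index && head != NULL; i++) {
        head = head->next;
    }
    return (head != NULL) ? head->value : 0;
}

/* Upper layer, naive: one visible loop, but each call to list_get()
   loops again, so the work grows with the square of the list length. */
long sum_by_index(const Node *head, size_t count)
{
    size_t i;
    long total = 0;

    for (i = 0; i < count; i++) {
        total += list_get(head, i);
    }
    return total;
}

/* Upper layer, aware of the lower layer's cost: a single pass. */
long sum_by_walk(const Node *head)
{
    long total = 0;

    for (; head != NULL; head = head->next) {
        total += head->value;
    }
    return total;
}
```

The two upper-layer functions give the same answer; only the second avoids repeating the lower layer's walk on every iteration.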
10.2.3 Interfaces
Interfaces are another possible bottleneck area, for example where a complex interface requires a lot of setting up before a call, or I/O interfacing results in long waits for data.
Care should be taken in the design of interfaces to minimize complexity of use. It may even be necessary to write new functions to perform specialized actions. For example, an editor may have to include a 'repaint entire screen' function, which accesses the hardware, rather than use system calls which take 10 seconds to repaint the screen! Note that it is easier to do this if the interface is legally extensible (see 10.1.3).
Interface performance problems often vary between different systems. Thus you should beware of ported programs becoming sluggish on other (and supposedly faster) machines (see 10.8).
10.2.4 Algorithms and designs
Different algorithms for the same type of situation (such as sorting) have different characteristics. Knowing which to use, and where, is usually more a matter of having the right reference book than of reinventing the wheel.
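In C, the nearest 'reference book' is often the standard library itself. The sketch below leans on the standard qsort and bsearch functions rather than hand-written sorting and searching; the wrapper names compare_ints, sort_ints and contains_int are hypothetical.

```c
#include <stdlib.h>

/* Comparison function in the form required by qsort() and bsearch().
   Comparing, rather than subtracting, avoids integer overflow. */
static int compare_ints(const void *a, const void *b)
{
    const int x = *(const int *)a;
    const int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Sort an array of ints into ascending order. */
void sort_ints(int *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_ints);
}

/* Return non-zero if key is present; the array must already be sorted. */
int contains_int(const int *sorted, size_t count, int key)
{
    return bsearch(&key, sorted, count, sizeof sorted[0],
                   compare_ints) != NULL;
}
```

Whether a library routine is fast enough for a particular bottleneck still has to be checked by measurement, but it is usually the right starting point.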
More generally, the design of a piece of code may have a significant effect on its performance; it is often better to spend time designing for performance, rather than trying to code it in.
10.2.5 Run-time problems
On one computer system, the performance bottleneck can change from being the processor to memory to the disc system, as more or less of each resource is made available to the program. In this situation, the program may contain controls to enable its operation to be changed, depending on the bottleneck. These may be manually adjusted by the user, such as the allocation of system file buffers to the program, or may be self-adjusting, such as a screen update scheme that varies with the amount of user interaction.
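One possible shape for a self-adjusting screen update scheme is sketched below. It assumes, purely for illustration, that the screen should be redrawn less often while user input is queued and more often when the user is idle; the limits and the name next_update_interval_ms are hypothetical.

```c
/* Hypothetical tuning limits for the delay between screen updates. */
#define MIN_INTERVAL_MS 40
#define MAX_INTERVAL_MS 500

/* Choose the next delay between screen updates.
   Assumption: when input events are waiting, back off (double the
   delay) so that input is serviced first; when none are waiting,
   recover (halve the delay) so that the display stays fresh. */
int next_update_interval_ms(int current_ms, int pending_input_events)
{
    int next_ms;

    if (pending_input_events > 0) {
        next_ms = current_ms * 2;
    } else {
        next_ms = current_ms / 2;
    }

    /* Keep the result within the stated limits. */
    if (next_ms < MIN_INTERVAL_MS) {
        next_ms = MIN_INTERVAL_MS;
    }
    if (next_ms > MAX_INTERVAL_MS) {
        next_ms = MAX_INTERVAL_MS;
    }
    return next_ms;
}
```

The clamping limits play the same role as the performance limits discussed earlier: they bound how far the program may adjust itself.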