The Psychology of Quality and More
Variation part 2: Measuring centre and spread
The measurements of a process can vary in two different ways, in terms of their centre and their spread, as illustrated in Fig. 1. The centre (also called accuracy or central tendency) of a process is the degree to which measurements gather around a target value. The spread (also called dispersion or precision) of the process is the degree of scatter of its output values.
Fig. 1. Spread and Centre
Ways of measuring the centre
Measuring the centring of a process requires that the centre point of the set of results be identified. The accuracy of the process can then be determined by comparing this centre point with the target value. There are three ways of measuring the centre point: the mean (or average), the median and the mode (see Fig. 2).
Variation in centre and spread
The most common way of measuring the centre point of a set of measurements is with the average, or mean (i.e. the sum of all measurements divided by the total number of measurements).
The mean is useful for further mathematical treatment, as it considers all values (although a few extreme values can cause the mean to become unrepresentative of the rest of the values).
If the measurements are listed in numeric order, then the median is the number half-way down the list. If there is an even number of measurements, it is half-way between the middle two numbers. The median is not distorted by extreme values, but it can be very unrepresentative of the other values, particularly in a distribution which is not symmetrical.
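As a minimal sketch of the point about extreme values, the following Python compares the mean and the median before and after one extreme measurement is added. The numbers are hypothetical, invented only for illustration, and the standard-library `statistics` module is assumed.

```python
from statistics import mean, median

# Hypothetical measurements, invented purely for illustration.
measurements = [10, 11, 12, 13, 14]

# The same measurements with one extreme value added.
with_extreme = measurements + [100]

# Without the extreme value, mean and median agree (both 12).
print("mean:", mean(measurements), "median:", median(measurements))

# With it, the mean is pulled well above most of the values,
# while the median (half-way between 12 and 13) barely moves.
print("mean:", mean(with_extreme), "median:", median(with_extreme))
```

With an even number of measurements, `median` returns the value half-way between the middle two, as described above.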
The mode is the most commonly occurring measurement. In a distribution graph, this is the highest point. The mode is also not distorted by extreme values, and is useful for describing quantities such as average earnings. However, there can be more than one mode, and it is not as good as the mean for mathematical treatment.
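The three measures of centre can be sketched in a few lines of Python. The data are hypothetical and chosen so that two values tie for most frequent, which shows how a set of measurements can have more than one mode. `multimode` is assumed to be available (Python 3.8 or later).

```python
from statistics import mean, median, multimode

# Hypothetical measurements, invented for illustration only.
measurements = [4, 5, 5, 6, 6, 7]

# multimode returns every value tied for most frequent,
# so a set with two modes yields a list of two values.
modes = multimode(measurements)
centre_mean = mean(measurements)
centre_median = median(measurements)

print("modes:", modes)
print("mean:", centre_mean, "median:", centre_median)
```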
In a symmetrical distribution such as a Normal distribution, these three measures are the same. In an asymmetrical (or skewed) distribution, as below, there is a simple rule-of-thumb formula which can be used to estimate one, given the other two:
Mean - Mode = 3 x (Mean - Median)
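The rule of thumb can be rearranged to estimate whichever measure is missing. The sketch below does this for the mode and the median; the function names and the example figures are hypothetical, and the results are only rough estimates for moderately skewed distributions.

```python
def estimate_mode(mean_value, median_value):
    """Rearranged rule of thumb: Mode = Mean - 3 x (Mean - Median)."""
    return mean_value - 3 * (mean_value - median_value)


def estimate_median(mean_value, mode_value):
    """Rearranged rule of thumb: Median = Mean - (Mean - Mode) / 3."""
    return mean_value - (mean_value - mode_value) / 3


# Hypothetical skewed distribution with mean 60 and median 56.
print(estimate_mode(60, 56))    # 60 - 3 x 4 = 48
print(estimate_median(60, 48))  # 60 - 12 / 3 = 56.0
```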
Ways of measuring spread
There are two main ways of measuring the degree of spread of a set of measurements: the range and the standard deviation.
The range of a set of measures is simply the difference between the largest and the smallest measurement value. This is easy to calculate, but there are several problems with using it:
· Special causes of variation can cause an unrealistically wide range.
· As more measurements are made, it will tend to increase.
· It gives no indication of the data between its values.
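The first and third problems listed above can be illustrated with a short Python sketch. All of the data sets are hypothetical: one has a single special-cause value added, and two others share the same range despite being distributed quite differently between their extremes.

```python
def value_range(values):
    """Range: largest measurement minus smallest measurement."""
    return max(values) - min(values)


# Hypothetical measurements, invented for illustration only.
typical = [50, 52, 55, 56, 58, 60]
with_special_cause = typical + [95]   # one unusual value

print(value_range(typical))             # 10
print(value_range(with_special_cause))  # 45

# Two sets with the same range but very different data in between.
clustered = [50, 55, 55, 55, 55, 60]
split = [50, 50, 50, 60, 60, 60]
print(value_range(clustered), value_range(split))  # 10 10
```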
The standard deviation is a number which is calculated using a simple mathematical trick (taking the square root of the average of the squared distances from the mean) to find an 'average' distance of the measurements from the mean, as in Fig. 3.
Fig. 3. Calculating standard deviation
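The calculation can be sketched step by step in Python. This follows the "square root of the average of squares" description literally, so it divides by the number of measurements (the population form); the sample form, which divides by one fewer, is a different calculation. The data and the function name are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def population_sd(values):
    """Square root of the average squared distance from the mean."""
    n = len(values)
    centre = sum(values) / n
    squared_distances = [(v - centre) ** 2 for v in values]
    return sqrt(sum(squared_distances) / n)


# Hypothetical measurements: mean is 5, squared distances sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_sd(data))  # sqrt(32 / 8) = 2.0
print(pstdev(data))         # standard-library equivalent
```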
The standard deviation is of particular value when used with the Normal distribution, where known proportions of the measurements fall within one, two and three standard deviations of the mean, as in Fig. 4.
Fig. 4. Standard deviation in the Normal distribution
Thus, given a set of measures, the mean and the standard deviation can be calculated, and from these can be derived the probability of future measures falling into the three bands, provided that the distribution is Normal. For example, with a mean of 56 and a standard deviation of 6:
68.3% of scores will be within +/- 1 s.d. of the mean (= between 50 and 62)
95.4% of scores will be within +/- 2 s.d. of the mean (= between 44 and 68)
99.7% of scores will be within +/- 3 s.d. of the mean (= between 38 and 74)
The remaining 0.3% of scores will be below 38 or above 74.
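The band limits quoted above (50 to 62, 44 to 68, 38 to 74) correspond to a mean of 56 and a standard deviation of 6. As a rough check, the following sketch recomputes the limits and the proportions using `NormalDist` from Python's standard library, assuming the measurements really are Normally distributed. The `band` helper is a hypothetical name used only here.

```python
from statistics import NormalDist

# Mean and standard deviation implied by the bands in the text.
mean_value, sd = 56, 6
dist = NormalDist(mu=mean_value, sigma=sd)


def band(k):
    """Limits of mean +/- k s.d. and the proportion falling inside."""
    low = mean_value - k * sd
    high = mean_value + k * sd
    proportion = dist.cdf(high) - dist.cdf(low)
    return low, high, proportion


for k in (1, 2, 3):
    low, high, proportion = band(k)
    print(f"+/- {k} s.d.: {low} to {high}, {proportion:.1%}")
```

If the distribution is noticeably skewed, these proportions no longer apply.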
Next time: Histograms part 1: creating them
This article first appeared in Quality World, the journal of the Institute for Quality Assurance