Appendix E: Useful Statistical Procedures
E.1 Distribution Functions
The following table shows how to calculate a standard uncertainty from the parameters of the three most important distribution functions, and gives an indication of the circumstances in which each should be used.
Example: A chemist estimates a contributory factor as not less than 7 or more than 10, but feels that the value could be anywhere in between, with no idea of whether any part of the range is more likely than another. This is a description of a rectangular distribution function with a range $2a = 3$ (semi-range of $a = 1.5$). Using the function below for a rectangular distribution, an estimate of the standard uncertainty can be calculated. Using the above range, $a = 1.5$, results in a standard uncertainty of $(1.5/\sqrt{3}) = 0.87$.
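| Rectangular distribution | ||
| Form | Use when: | Uncertainty |
| | | $u(x) = a/\sqrt{3}$, where $a$ is the semi-range |
| Triangular distribution | ||
| Form | Use when: | Uncertainty |
| | | $u(x) = a/\sqrt{6}$, where $a$ is the semi-range |
| Normal distribution | ||
| Form | Use when: | Uncertainty |
| | | $u(x) = s$ (standard deviation $s$ given) |
| | | $u(x) = x \cdot (s/\bar{x})$ (relative standard deviation $s/\bar{x}$ given) |
| | | $u(x) = (\%CV/100) \cdot x$ (coefficient of variation given) |
| | | $u(x) = c/2$ (interval $x \pm c$ given at 95% confidence) |
| | | $u(x) = c/3$ (interval $x \pm c$ given at 99.7% confidence) |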
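The conversions in this section can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the triangular divisor $\sqrt{6}$ is the standard result for a symmetric triangular distribution of semi-range $a$. The example values reproduce the chemist's estimate of "not less than 7 or more than 10".

```python
import math

def u_rectangular(a):
    """Standard uncertainty for a rectangular distribution of semi-range a."""
    return a / math.sqrt(3)

def u_triangular(a):
    """Standard uncertainty for a symmetric triangular distribution of semi-range a."""
    return a / math.sqrt(6)

def u_from_interval(c, divisor=2.0):
    """Standard uncertainty from an interval x +/- c, assuming normality.

    divisor=2 corresponds to c quoted at about 95 % confidence,
    divisor=3 to c quoted at about 99.7 % confidence.
    """
    return c / divisor

# Worked example from the text: limits 7 and 10, so range 2a = 3 and a = 1.5
lower, upper = 7.0, 10.0
a = (upper - lower) / 2
u_example = u_rectangular(a)
print(round(u_example, 2))  # 0.87
```

If the same limits were judged to describe a triangular distribution instead, `u_triangular(a)` would give a smaller value (about 0.61), reflecting the greater weight placed on values near the centre of the range.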
E.2 Spreadsheet Method for Uncertainty Calculation
E.2.1 Spreadsheet software can be used to simplify the calculations shown in Section 8. The procedure takes advantage of an approximate numerical method of differentiation, and requires knowledge only of the calculation used to derive the final result (including any necessary correction factors or influences) and of the numerical values of the parameters and their uncertainties. The description here follows that of Kragten [h.12].
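E.2.2 In the expression for u(y(x1, x2...xn))

$$u(y(x_1, x_2, \ldots, x_n)) = \sqrt{\sum_{i=1}^{n} \left(\frac{\partial y}{\partial x_i}\, u(x_i)\right)^2 + \sum_{i,k=1;\, i \neq k}^{n} \frac{\partial y}{\partial x_i}\, \frac{\partial y}{\partial x_k}\, u(x_i, x_k)}$$

(where $u(x_i, x_k)$ is the covariance between $x_i$ and $x_k$), provided that either y(x1, x2...xn) is linear in xi or u(xi) is small compared to xi, the partial differentials ($\partial y/\partial x_i$) can be approximated by:

$$\frac{\partial y}{\partial x_i} \approx \frac{y(x_i + u(x_i)) - y(x_i)}{u(x_i)}$$

Multiplying by u(xi) to obtain the uncertainty u(y,xi) in y due to the uncertainty in xi gives

$$u(y, x_i) \approx y(x_1, x_2, \ldots, (x_i + u(x_i)), \ldots, x_n) - y(x_1, x_2, \ldots, x_i, \ldots, x_n)$$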
Thus u(y,xi) is just the difference between the values of y calculated for [xi+u(xi)] and xi respectively.
E.2.3 The assumption of linearity or small values of u(xi)/xi will not be closely met in all cases. Nonetheless, the method does provide acceptable accuracy for practical purposes when considered against the necessary approximations made in estimating the values of u(xi). Reference h.12 discusses the point more fully and suggests methods of checking the validity of the assumption.
E.2.4 The basic spreadsheet is set up as follows, assuming that the result y is a function of the four parameters p, q, r, and s:
i) Enter the values of p, q, etc. and the formula for calculating y in column A of the spreadsheet. Copy column A across the following columns once for every variable in y (Figure E2.1). It is convenient to place the values of the uncertainties u(p), u(q) and so on in row 1 as shown.
ii) Add u(p) to p in cell B3, u(q) to q in cell C4 etc., as in Figure E2.2. On recalculating the spreadsheet, cell B8 then becomes f(p+u(p), q, r, ..) (denoted by f(p', q, r, ..) in Figures E2.2 and E2.3), cell C8 becomes f(p, q+u(q), r, ..) etc.
iii) In row 9 enter row 8 minus A8 (for example, cell B9 becomes B8-A8). This gives the values of u(y,p) as
  u(y,p) = f(p+u(p), q, r, ..) - f(p, q, r, ..) etc.
iv) To obtain the standard uncertainty on y, these individual contributions are squared, added together and then the square root taken, by entering u(y,p)² in row 10 (Figure E2.3) and putting the square root of their sum in A10. That is, cell A10 is set to the formula
  SQRT(SUM(B10+C10+D10+E10))
E.2.5 The contents of the cells B10, C10 etc. show the squared contributions $u(y,x_i)^2 = (c_i\,u(x_i))^2$ of the individual uncertainty components to the uncertainty on y, where $c_i = \partial y/\partial x_i$, and hence it is easy to see which components are significant.
E.2.6 It is straightforward to allow updated calculations as individual parameter values change or uncertainties are refined. In step i) above, rather than copying column A directly to columns B-E, copy the values p to s by reference, that is, cells B3 to E3 all reference A3, B4 to E4 reference A4 etc. The horizontal arrows in Figure E2.1 show the referencing for row 3. Note that cells B8 to E8 should still reference the values in columns B to E respectively, as shown for column B by the vertical arrows in Figure E2.1. In step ii) above, add the uncertainties in row 1 by reference (as shown by the arrows in Figure E2.1). For example, cell B3 becomes A3+B1, cell C4 becomes A4+C1 etc. Changes to either parameters or uncertainties will then be reflected immediately in the overall result at A8 and the combined standard uncertainty at A10.
E.2.7 If any of the variables are correlated, the necessary additional term is added to the SUM in A10. For example, if p and q are correlated, with a correlation coefficient r(p,q), then the extra term 2 × r(p,q) × u(y,p) × u(y,q) is added to the calculated sum before taking the square root. Correlation can therefore easily be included by adding suitable extra terms to the spreadsheet.
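The spreadsheet procedure of E.2.4 to E.2.7 translates directly into a short program. The following Python sketch is illustrative only: the function `kragten`, the measurement model `model` and all numerical values are hypothetical, chosen to mirror the four-parameter layout (p, q, r, s) used above. Each parameter is shifted by its standard uncertainty in turn (the spreadsheet columns B to E), the differences from the unshifted result give u(y,xi) (row 9), and the squared contributions are summed (row 10), with optional correlation terms as in E.2.7.

```python
import math

def kragten(f, values, uncertainties, correlations=None):
    """Numerical uncertainty propagation following the spreadsheet layout.

    values, uncertainties: dicts mapping parameter name -> number.
    correlations: optional dict mapping (name_i, name_k) -> correlation
    coefficient r, one entry per correlated pair.
    Returns (y, contributions, combined standard uncertainty).
    """
    y0 = f(**values)                       # cell A8
    contributions = {}
    for name, u in uncertainties.items():
        shifted = dict(values)
        shifted[name] = values[name] + u   # e.g. cell B3 = A3 + B1
        contributions[name] = f(**shifted) - y0   # row 9: B8 - A8
    total = sum(c * c for c in contributions.values())   # row 10
    for (i, k), r in (correlations or {}).items():
        # extra term 2 * r(p,q) * u(y,p) * u(y,q) for each correlated pair
        total += 2.0 * r * contributions[i] * contributions[k]
    return y0, contributions, math.sqrt(total)

def model(p, q, r, s):
    """Hypothetical measurement model y = p*q/(r+s)."""
    return p * q / (r + s)

values = {"p": 10.0, "q": 2.0, "r": 3.0, "s": 1.0}
uncertainties = {"p": 0.1, "q": 0.02, "r": 0.03, "s": 0.01}

y, contrib, u_y = kragten(model, values, uncertainties)
_, _, u_y_corr = kragten(model, values, uncertainties,
                         correlations={("p", "q"): 1.0})
print(y, contrib, u_y, u_y_corr)
```

Inspecting the squared entries of `contrib` shows which parameters dominate, exactly as cells B10 to E10 do in the spreadsheet.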
E.3 Uncertainties from Linear Least Squares Calibration
E.3.1 An analytical method or instrument is often calibrated by observing the responses, y, to different levels of the analyte, x. In most cases this relationship is taken to be linear, viz:

$$y = b_0 + b_1 x \qquad \text{(Eq. E3.1)}$$

This calibration line is then used to obtain the concentration xpred of the analyte from a sample which produces an observed response yobs from

$$x_{pred} = (y_{obs} - b_0)/b_1 \qquad \text{(Eq. E3.2)}$$

It is usual to determine the constants b1 and b0 by weighted or un-weighted least squares regression on a set of n pairs of values (xi, yi).
E.3.2 There are four main sources of uncertainty to consider in arriving at an uncertainty on the estimated concentration xpred:
- Random variations in measurement of y, affecting both the reference responses yi and the measured response yobs.
- Random effects resulting in errors in the assigned reference values xi.
- Values of xi and yi may be subject to a constant unknown offset, for example arising when the values of x are obtained from serial dilution of a stock solution.
- The assumption of linearity may not be valid.
Of these, the most significant for normal practice are random variations in y, and methods of estimating uncertainty for this source are detailed here. The remaining sources are also considered briefly to give an indication of methods available.
E.3.3 The uncertainty u(xpred, y) in a predicted value xpred due to variability in y can be estimated in several ways:
From calculated variance and covariance.
If the values of b1 and b0, their variances var(b1), var(b0) and their covariance, covar(b1,b0), are determined by the method of least squares, the variance on xpred, var(xpred), obtained using the formula in Chapter 8 and differentiating the normal equations, is given by

and the corresponding uncertainty u(xpred, y) is $\sqrt{\mathrm{var}(x_{pred})}$.
From the calibration data.
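The above formula for var(xpred) can be written in terms of the set of n data points, (xi, yi), used to determine the calibration function:

$$\mathrm{var}(x_{pred}) = \frac{\mathrm{var}(y_{obs})}{b_1^2} + \frac{S^2}{b_1^2} \left( \frac{1}{\sum w_i} + \frac{(x_{pred} - \bar{x})^2}{\sum (w_i x_i^2) - \left(\sum w_i x_i\right)^2 / \sum w_i} \right) \qquad \text{(Eq. E3.4)}$$

where $S^2 = \dfrac{\sum w_i (y_i - y_{fi})^2}{n - 2}$, $(y_i - y_{fi})$ is the residual for the ith point, n is the number of data points in the calibration, b1 the calculated best fit gradient, wi the weight assigned to yi and $(x_{pred} - \bar{x})$ the difference between xpred and the mean $\bar{x}$ of the n values x1, x2....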
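For unweighted data and where var(yobs) is based on p measurements, equation E3.4 becomes

$$\mathrm{var}(x_{pred}) = \frac{S^2}{b_1^2} \left( \frac{1}{p} + \frac{1}{n} + \frac{(x_{pred} - \bar{x})^2}{\sum (x_i^2) - \left(\sum x_i\right)^2 / n} \right) \qquad \text{(Eq. E3.5)}$$

This is the formula which is used in example 5 with $S_{xx} = \sum (x_i^2) - \left(\sum x_i\right)^2 / n = \sum (x_i - \bar{x})^2$.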
From information given by software used to derive calibration curves.
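Some software gives the value of S, variously described for example as RMS error or residual standard error. This can then be used in equation E3.4 or E3.5. However, some software may also give the standard deviation s(yc) on a value of y calculated from the fitted line for some new value of x, and this can be used to calculate var(xpred) since, for p=1,

$$s(y_c) = S \sqrt{1 + \frac{1}{n} + \frac{(x_{pred} - \bar{x})^2}{\sum (x_i^2) - \left(\sum x_i\right)^2 / n}}$$

giving, on comparison with equation E3.5,

$$\mathrm{var}(x_{pred}) = \left[ s(y_c) / b_1 \right]^2 \qquad \text{(Eq. E3.6)}$$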
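The unweighted calculation can be sketched in Python as follows. This is a minimal illustration, assuming the unweighted form var(xpred) = (S²/b1²)·(1/p + 1/n + (xpred − x̄)²/Sxx) with Sxx = Σ(xi − x̄)²; the function names and the calibration data are hypothetical.

```python
import math

def fit_line(x, y):
    """Unweighted least squares fit y = b0 + b1*x.

    Returns b0, b1, residual standard deviation S, mean of x, and Sxx.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    # S^2 = sum of squared residuals / (n - 2)
    s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return b0, b1, math.sqrt(s2), x_mean, sxx

def predict_with_uncertainty(y_obs, p, x, y):
    """Return (x_pred, u(x_pred, y)) for a response y_obs averaged over p readings."""
    b0, b1, s, x_mean, sxx = fit_line(x, y)
    n = len(x)
    x_pred = (y_obs - b0) / b1
    var = (s ** 2 / b1 ** 2) * (1 / p + 1 / n + (x_pred - x_mean) ** 2 / sxx)
    return x_pred, math.sqrt(var)

# Hypothetical calibration data: five standards and their responses
x_cal = [0.0, 1.0, 2.0, 3.0, 4.0]
y_cal = [0.1, 1.9, 4.1, 5.9, 8.1]

x_pred, u_single = predict_with_uncertainty(4.02, 1, x_cal, y_cal)
_, u_duplicate = predict_with_uncertainty(4.02, 2, x_cal, y_cal)
print(x_pred, u_single, u_duplicate)
```

Averaging more readings of the sample (larger p) reduces only the 1/p term, so the uncertainty falls but remains bounded below by the contribution from the calibration line itself.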
E.3.4 The reference values xi may each have uncertainties which propagate through to the final result. In practice, uncertainties in these values are usually small compared to uncertainties in the system responses yi, and may be ignored. An approximate estimate of the uncertainty u(xpred, xi) in a predicted value xpred due to uncertainty in a particular reference value xi is

$$u(x_{pred}, x_i) \approx u(x_i)/n \qquad \text{(Eq. E3.7)}$$

where n is the number of xi values used in the calibration. This expression can be used to check the significance of u(xpred, xi).
E.3.5 The uncertainty arising from the assumption of a linear relationship between y and x is not normally large enough to require an additional estimate. Providing the residuals show that there is no significant systematic deviation from this assumed relationship, the uncertainty arising from this assumption (in addition to that covered by the resulting increase in y variance) can be taken to be negligible. If the residuals show a systematic trend then it may be necessary to include higher terms in the calibration function. Methods of calculating var(x) in these cases are given in standard texts. It is also possible to make a judgement based on the size of the systematic trend.
E.3.6 The values of x and y may be subject to a constant unknown offset (e.g. arising when the values of x are obtained from serial dilution of a stock solution which has an uncertainty on its certified value). If the standard uncertainties on y and x from these effects are u(y, const) and u(x, const), then the uncertainty on the interpolated value xpred is given by:

$$u(x_{pred})^2 = u(x, const)^2 + \left( u(y, const)/b_1 \right)^2 + u(x)^2 \qquad \text{(Eq. E3.8)}$$
E.3.7 The four uncertainty components described in E.3.2 can be calculated using equations E3.3 to E3.8. The overall uncertainty arising from calculation from a linear calibration can then be calculated by combining these four components in the normal way.
E.4 Documenting Uncertainty Dependent on Analyte Level
E.4.1 Introduction
E.4.1.1 It is often observed in chemical measurement that, over a large range of analyte levels, dominant contributions to the overall uncertainty vary approximately proportionately to the level of analyte, that is $u(x) \propto x$. In such cases it is often sensible to quote uncertainties as relative standard deviations or, for example, coefficient of variation (%CV).
E.4.1.2 Where the uncertainty is unaffected by level, for example at low levels, or where a relatively narrow range of analyte level is involved, it is generally most sensible to quote an absolute value for the uncertainty.
E.4.1.3 In some cases, both constant and proportional effects are important. This section sets out a general approach to recording uncertainty information where variation of uncertainty with analyte level is an issue and reporting as a simple coefficient of variation is inadequate.
E.4.2 Basis of Approach
E.4.2.1 To allow for both proportionality of uncertainty and the possibility of an essentially constant value with level, the following general expression is used:

$$u(x) = \sqrt{s_0^2 + (x \cdot s_1)^2}$$
where
- u(x) is the combined standard uncertainty in the result x (that is, the uncertainty expressed as a standard deviation)
- s0 represents a constant contribution to the overall uncertainty
- s1 is a proportionality constant.
The expression is based on the normal method of combining of two contributions to overall uncertainty, assuming one contribution (s0) is constant and one (xs1) proportional to the result. Figure E.4.1 shows the form of this expression.

NOTE: The approach above is practical only where it is possible to calculate a large number of values. Where experimental study is employed, it will not often be possible to establish the relevant parabolic relationship. In such circumstances, an adequate approximation can be obtained by simple linear regression through four or more combined uncertainties obtained at different analyte concentrations. This procedure is consistent with that employed in studies of reproducibility and repeatability according to ISO 5725:1994. The relevant expression is then $u(x) \approx s'_0 + x \cdot s'_1$.
E.4.2.2 The figure can be divided into approximate regions (A to C on the figure):
- A: The uncertainty is dominated by the term s0, and is approximately constant and close to s0.
- B: Both terms contribute significantly; the resulting uncertainty is significantly higher than either s0 or xs1, and some curvature is visible.
- C: The term xs1 dominates; the uncertainty rises approximately linearly with increasing x and is close to xs1.
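The three regions can be illustrated numerically with a short Python sketch of the general expression u(x) = √(s0² + (x·s1)²). The function name `u_level` and the values of s0 and s1 are hypothetical, chosen so that the two terms are equal at x = 10.

```python
import math

def u_level(x, s0, s1):
    """Combined standard uncertainty at level x: sqrt(s0^2 + (x*s1)^2)."""
    return math.sqrt(s0 ** 2 + (x * s1) ** 2)

s0, s1 = 0.5, 0.05   # hypothetical constant and proportional terms

u_low = u_level(1.0, s0, s1)     # region A: close to s0
u_mid = u_level(10.0, s0, s1)    # region B: both terms contribute equally here
u_high = u_level(100.0, s0, s1)  # region C: close to x*s1 = 5.0

print(round(u_low, 3), round(u_mid, 3), round(u_high, 3))  # 0.502 0.707 5.025
```

At the low level the result differs from s0 by less than 1%, and at the high level it differs from x·s1 by less than 1%; only in the intermediate region does neither term alone give an adequate estimate.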
E.4.2.3 Note that in many experimental cases the complete form of the curve will not be apparent. Very often, the whole reporting range of analyte level permitted by the scope of the method falls within a single chart region; the result is a number of special cases dealt with in more detail below.
E.4.3 Documenting Level-Dependent Uncertainty Data
E.4.3.1 In general, uncertainties can be documented in the form of a value for each of s0 and s1. The values can be used to provide an uncertainty estimate across the scope of the method. This is particularly valuable when calculations for well characterised methods are implemented on computer systems, where the general form of the equation can be implemented independently of the values of the parameters (one of which may be zero - see below). It is accordingly recommended that, except in the special cases outlined below or where the dependence is strong but not linear*, uncertainties are documented in the form of values for a constant term represented by s0 and a variable term represented by s1.
E.4.4. Special Cases
E.4.4.1. Uncertainty not dependent on level of analyte (s0 dominant)
The uncertainty will generally be effectively independent of observed analyte concentration when:
- The result is close to zero (for example, within the stated detection limit for the method); this corresponds to Region A in Figure E.4.1.
- The possible range of results (stated in the method scope or in a statement of scope for the uncertainty estimate) is small compared to the observed level.
Under these circumstances, the value of s1 can be recorded as zero. s0 is normally the calculated standard uncertainty.
E.4.4.2. Uncertainty entirely dependent on analyte (s1 dominant)
Where the result is far from zero (for example, above a 'limit of determination') and there is clear evidence that the uncertainty changes proportionally with the level of analyte permitted within the scope of the method, the term xs1 dominates (see Region C in figure e.4.1). Under these circumstances, and where the method scope does not include levels of analyte near zero, s0 may reasonably be recorded as zero and s1 is simply the uncertainty expressed as a relative standard deviation.
E.4.4.3. Intermediate dependence
In intermediate cases, and in particular where the situation corresponds to region B in figure e.4.1, two approaches can be taken:
a) applying variable dependence
The more general approach is to determine, record and use both s0 and s1. Uncertainty estimates, when required, can then be produced on the basis of the reported result. This remains the recommended approach where practical.
NOTE: See the note to section e.4.2.
b) applying a fixed approximation
An alternative may be used in general testing where either
- the dependence is not strong (that is, evidence for proportionality is weak), or
- the range of results expected is moderate,
leading in either case to uncertainties which do not vary by more than about 15% from an average uncertainty estimate. It will then often be reasonable to calculate and quote a fixed value of uncertainty for general use, based on the mean value of results expected. That is,
either
a mean or typical value for x is used to calculate a fixed uncertainty estimate, and this is used in place of individually calculated estimates
or
a single standard deviation has been obtained, based on studies of materials covering the full range of analyte levels permitted (within the scope of the uncertainty estimate), and there is little evidence to justify an assumption of proportionality. This should generally be treated as a case of zero dependence, and the relevant standard deviation recorded as s0.
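Whether a fixed approximation is reasonable can be checked numerically once s0 and s1 are known. The Python sketch below is illustrative only: it assumes that the "average uncertainty estimate" is taken as the uncertainty at the mean of the expected levels, and the function names and numerical values are hypothetical.

```python
import math

def u_level(x, s0, s1):
    """Combined standard uncertainty at level x: sqrt(s0^2 + (x*s1)^2)."""
    return math.sqrt(s0 ** 2 + (x * s1) ** 2)

def fixed_uncertainty(levels, s0, s1, tolerance=0.15):
    """Return (fixed estimate, worst relative deviation, acceptable?).

    The fixed estimate is evaluated at the mean level (an assumption of
    this sketch); the deviation is the largest relative difference between
    the level-specific uncertainty and the fixed estimate.
    """
    typical_x = sum(levels) / len(levels)
    u_fixed = u_level(typical_x, s0, s1)
    worst = max(abs(u_level(x, s0, s1) - u_fixed) / u_fixed for x in levels)
    return u_fixed, worst, worst <= tolerance

levels = [10.0, 20.0, 30.0]   # hypothetical range of expected results

# Weak dependence on level: a fixed value is adequate
u_weak, worst_weak, ok_weak = fixed_uncertainty(levels, s0=0.5, s1=0.01)

# Strong dependence on level: a fixed value is not adequate
u_strong, worst_strong, ok_strong = fixed_uncertainty(levels, s0=0.5, s1=0.05)

print(ok_weak, ok_strong)
```

Where the check fails, the more general approach in a) of recording both s0 and s1 is the safer choice.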
E.4.5. Determining s0 and s1
E.4.5.1. In the special cases in which one term dominates, it will normally be sufficient to use the uncertainty as standard deviation or relative standard deviation respectively as values of s0 and s1. Where the dependence is less obvious, however, it may be necessary to determine s0 and s1 indirectly from a series of estimates of uncertainty at different analyte levels.
E.4.5.2. Given a calculation of combined uncertainty from the various components, some of which depend on analyte level while others do not, it will normally be possible to investigate the dependence of overall uncertainty on analyte level by simulation. The procedure is as follows:
- Calculate (or obtain experimentally) uncertainties $u(x_i)$ for at least ten levels $x_i$ of analyte, covering the full range permitted.
- Plot $u(x_i)^2$ against $x_i^2$.
- By linear regression, obtain estimates of m and c for the line $u(x)^2 = m x^2 + c$.
- Calculate s0 and s1 from $s_0 = \sqrt{c}$, $s_1 = \sqrt{m}$.
- Record s0 and s1.
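The procedure above can be sketched in Python as follows. This is a minimal illustration using ordinary least squares on u² against x²; the function name `fit_s0_s1` is hypothetical, and the ten uncertainty values are simulated from assumed values s0 = 0.5 and s1 = 0.05 rather than taken from any real method.

```python
import math

def fit_s0_s1(levels, uncertainties):
    """Estimate s0 and s1 by regressing u(x)^2 on x^2.

    Fits u^2 = m*x^2 + c by unweighted least squares and returns
    (sqrt(c), sqrt(m)). Raises ValueError if m or c is negative, in
    which case the model u^2 = s0^2 + (x*s1)^2 does not describe the data.
    """
    X = [x ** 2 for x in levels]
    Y = [u ** 2 for u in uncertainties]
    n = len(X)
    x_mean = sum(X) / n
    y_mean = sum(Y) / n
    m = (sum((a - x_mean) * (b - y_mean) for a, b in zip(X, Y))
         / sum((a - x_mean) ** 2 for a in X))
    c = y_mean - m * x_mean
    if m < 0 or c < 0:
        raise ValueError("negative slope or intercept; model not applicable")
    return math.sqrt(c), math.sqrt(m)

# Simulated uncertainties at ten analyte levels (hypothetical values)
levels = [5.0 * i for i in range(1, 11)]            # 5, 10, ..., 50
simulated_u = [math.sqrt(0.5 ** 2 + (x * 0.05) ** 2) for x in levels]

s0_fit, s1_fit = fit_s0_s1(levels, simulated_u)
print(s0_fit, s1_fit)
```

With experimentally obtained uncertainties the points will scatter about the line, and a negative intercept or slope is a sign that one of the special cases in E.4.4 applies instead.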
E.4.6. Reporting
E.4.6.1. The approach outlined here permits estimation of a standard uncertainty for any single result. In principle, where uncertainty information is to be reported, it will be in the form of
[result]±[uncertainty]
where the uncertainty as standard deviation is calculated as above, and if necessary expanded (usually by a factor of two) to give increased confidence. Where a number of results are reported together, however, it may be possible, and is perfectly acceptable, to give an estimate of uncertainty applicable to all results reported.
E.4.6.2. Table e.4.1 gives some examples. The uncertainty figures for a list of different analytes may usefully be tabulated following similar principles.
NOTE: Where a 'detection limit' or 'reporting limit' is used to give results in the form "<x" or "nd", it will normally be necessary to quote the limits used in addition to the uncertainties applicable to results above reporting limits.
Table E.4.1: Summarising uncertainty for several samples
| Situation | Dominant term | Reporting example(s) |
| Uncertainty essentially constant across all results | s0 or fixed approximation (sections E.4.4.1 or E.4.4.3.b) | Standard deviation; expanded uncertainty; 95% confidence interval |
| Uncertainty generally proportional to level | xs1 (see section E.4.4.2) | Relative standard deviation; coefficient of variation (%CV) |
| Mixture of proportionality and lower limiting value for uncertainty | Intermediate case (section E.4.4.3) | Quote %CV or relative standard deviation together with lower limit as standard deviation |
* An important example of non-linear dependence is the effect of instrument noise on absorbance measurement at high absorbances near the upper limit of the instrument capability. This is particularly pronounced where absorbance is calculated from transmittance (as in infrared spectroscopy). Under these circumstances, baseline noise causes very large uncertainties in high absorbance figures, and the uncertainty rises much faster than a simple linear estimate would predict. The usual approach is to reduce the absorbance, typically by dilution, to bring the absorbance figures well within the working range; the linear model used here will then normally be adequate. Other examples include the 'sigmoidal' response of some immunoassay methods.


