3. Analytical Measurement and Uncertainty
3.1. Method Validation
3.1.1. In practice, the fitness for purpose of analytical methods applied for routine testing is most commonly assessed through method validation studies [h.7]. Such studies produce data on overall performance and on individual influence factors which can be applied to the estimation of uncertainty associated with the results of the method in normal use.
3.1.2. Method validation studies rely on the determination of overall method performance parameters. These are obtained during method development and interlaboratory study or following in-house validation protocols. Individual sources of error or uncertainty are typically investigated only when significant compared to the overall precision measures in use. The emphasis is primarily on identifying and removing (rather than correcting for) significant effects. This leads to a situation in which the majority of potentially significant influence factors have been identified, checked for significance compared to overall precision, and shown to be negligible. Under these circumstances, the data available to analysts consists primarily of overall performance figures, together with evidence of insignificance of most effects and some measurements of any remaining significant effects.
3.1.3. Validation studies for quantitative analytical methods typically determine some or all of the following parameters:
Precision. The principal precision measures include repeatability standard deviation s_r, reproducibility standard deviation s_R (ISO 3534-1) and intermediate precision, sometimes denoted s_Zi, with i denoting the number of factors varied (ISO 5725-3:1994). The repeatability s_r indicates the variability observed within a laboratory, over a short time, using a single operator, item of equipment etc. s_r may be estimated within a laboratory or by interlaboratory study. The interlaboratory reproducibility standard deviation s_R for a particular method can be estimated directly only by interlaboratory study; it shows the variability obtained when different laboratories analyse the same sample. Intermediate precision relates to the variation in results observed when one or more factors, such as time, equipment and operator, are varied within a laboratory; different figures are obtained depending on which factors are held constant. Intermediate precision estimates are most commonly determined within laboratories but may also be determined by interlaboratory study. The observed precision of an analytical procedure is an essential component of overall uncertainty, whether determined by combination of individual variances or by study of the complete method in operation.
Bias. The bias of an analytical method is usually determined by study of relevant reference materials or by spiking studies. The determination of overall bias with respect to appropriate reference values is important in establishing traceability [b.12] to recognised standards (see section 3.3). Bias may be expressed as analytical recovery (value observed divided by value expected). Bias should be shown to be negligible or corrected for, but in either case the uncertainty associated with the determination of the bias remains an essential component of overall uncertainty.
Linearity. Linearity is an important property of methods used to make measurements at a range of concentrations. The linearity of the response to pure standards and to realistic samples may be determined. Linearity is not generally quantified, but is checked for by inspection or using significance tests for non-linearity. Significant non-linearity is usually corrected for by use of non-linear calibration functions or eliminated by choice of more restricted operating range. Any remaining deviations from linearity are normally sufficiently accounted for by overall precision estimates covering several concentrations, or within any uncertainties associated with calibration (appendix e.3).
Detection limit. During method validation, the detection limit is normally determined only to establish the lower end of the practical operating range of a method. Though uncertainties near the detection limit may require careful consideration and special treatment (appendix f), the detection limit, however determined, is not of direct relevance to uncertainty estimation.
Robustness or ruggedness. Many method development or validation protocols require that sensitivity to particular parameters be investigated directly. This is usually done by a preliminary 'ruggedness test', in which the effect of one or more parameter changes is observed. If significant (compared to the precision of the ruggedness test) a more detailed study is carried out to measure the size of the effect, and a permitted operating interval chosen accordingly. Ruggedness test data can therefore provide information on the effect of important parameters.
Selectivity/specificity. Though loosely defined, both terms relate to the degree to which a method responds uniquely to the required analyte. Typical selectivity studies investigate the effects of likely interferents, usually by adding the potential interferent to both blank and fortified samples and observing the response. The results are normally used to demonstrate that the practical effects are not significant. However, since the studies measure changes in response directly, it is possible to use the data to estimate the uncertainty associated with potential interferences, given knowledge of the range of interferent concentrations.
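As an illustration of the last point, the following minimal sketch converts selectivity-study data into an uncertainty contribution. It assumes (this is not stated in the text above) that the response changes linearly with interferent concentration and that the interferent is equally likely to take any value within its expected range, so that a rectangular distribution applies. The function name and the numerical figures are hypothetical.

```python
import math

def interference_uncertainty(sensitivity, conc_min, conc_max):
    """Standard uncertainty from an interferent whose concentration is
    assumed equally likely anywhere in [conc_min, conc_max].

    sensitivity: change in result per unit interferent concentration,
                 as measured in the selectivity study.
    """
    half_range = (conc_max - conc_min) / 2.0
    # Standard deviation of a rectangular distribution: half-range / sqrt(3).
    u_conc = half_range / math.sqrt(3.0)
    return abs(sensitivity) * u_conc

# Hypothetical figures: the result shifts by 0.02 mg/L per mg/L of interferent,
# and the interferent is expected between 0 and 5 mg/L in routine samples.
u = interference_uncertainty(0.02, 0.0, 5.0)
print(f"u(interference) = {u:.4f} mg/L")
```

This covers only the dispersion caused by the interferent. If the midpoint of the interferent range is not zero, the average shift in the result may need separate treatment as a bias.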
3.2. Conduct of Experimental Studies of Method Performance
3.2.1. The detailed design and execution of method validation and method performance studies is covered extensively elsewhere [h.7] and will not be repeated here. However, the main principles as they affect the relevance of a study applied to uncertainty estimation are pertinent and are considered below.
3.2.2. Representativeness is essential. That is, studies should, as far as possible, be conducted to provide a realistic survey of the number and range of effects operating during normal use of the method, as well as covering the concentration ranges and sample types within the scope of the method. Where a factor has been representatively varied during the course of a precision experiment, for example, the effects of that factor appear directly in the observed variance and need no additional study unless further method optimisation is desirable.
3.2.3. In this context, representative variation means that an influence parameter must take a distribution of values appropriate to the uncertainty in the parameter in question. For continuous parameters, this may be a permitted range or stated uncertainty; for discontinuous factors such as sample matrix, this range corresponds to the variety of types permitted or encountered in normal use of the method. Note that representativeness extends not only to the range of values, but to their distribution.
3.2.4. In selecting factors for variation, it is important to ensure that the larger effects are varied where possible. For example, where day-to-day variation (perhaps arising from recalibration effects) is substantial compared to repeatability, two determinations on each of five days will provide a better estimate of intermediate precision than five determinations on each of two days. Ten single determinations on separate days will be better still, subject to sufficient control, though this will provide no additional information on within-day repeatability.
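A design such as two determinations on each of five days can be evaluated by a balanced one-way analysis of variance. The sketch below is one common way of separating the within-day and between-day contributions; it is offered as an illustration rather than as a prescribed procedure, and the function name and data are hypothetical.

```python
import math
from statistics import mean

def precision_from_days(results_by_day):
    """Balanced one-way ANOVA; each inner list holds one day's replicates.

    Returns (s_repeatability, s_between_days, s_intermediate).
    """
    p = len(results_by_day)       # number of days
    n = len(results_by_day[0])    # replicates per day
    if p < 2 or n < 2 or any(len(day) != n for day in results_by_day):
        raise ValueError("need a balanced design with >= 2 days and >= 2 replicates")
    day_means = [mean(day) for day in results_by_day]
    grand_mean = mean(day_means)
    # The within-day mean square estimates the repeatability variance.
    ss_within = sum((x - m) ** 2
                    for day, m in zip(results_by_day, day_means) for x in day)
    ms_within = ss_within / (p * (n - 1))
    # The between-day mean square estimates n * var_between + var_repeatability.
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (p - 1)
    var_between = max(0.0, (ms_between - ms_within) / n)  # negative estimate set to zero
    return (math.sqrt(ms_within),
            math.sqrt(var_between),
            math.sqrt(ms_within + var_between))

# Hypothetical data: duplicate determinations on five days.
data = [[10.1, 10.3], [10.6, 10.4], [9.9, 10.1], [10.5, 10.7], [10.0, 10.2]]
s_rep, s_day, s_int = precision_from_days(data)
print(f"repeatability {s_rep:.3f}, between-day {s_day:.3f}, intermediate {s_int:.3f}")
```

With only two days the between-day variance would rest on a single degree of freedom, which is why spreading the same ten determinations over more days gives the better intermediate precision estimate.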
3.2.5. It is generally simpler to treat data obtained from random selection than from systematic variation. For example, experiments performed at random times over a sufficient period will usually include representative ambient temperature effects, while experiments performed systematically at 24-hour intervals may be subject to bias due to regular ambient temperature variation during the working day. The former experiment need only evaluate the overall standard deviation; in the latter, systematic variation of ambient temperature is required, followed by adjustment to allow for the actual distribution of temperatures. Random variation is, however, less efficient. A small number of systematic studies can quickly establish the size of an effect, whereas it will typically take well over 30 determinations to establish an uncertainty contribution to better than about 20% relative accuracy. Where possible, therefore, it is often preferable to investigate small numbers of major effects systematically.
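The figure of "well over 30 determinations" can be made plausible with a rough calculation. The sketch below uses a standard large-sample approximation for the relative standard error of a standard deviation, 1/sqrt(2(n-1)), which assumes independent, normally distributed results. Reading "20% relative accuracy" as an approximate 95% half-width (about two standard errors) is an interpretation made here, not one stated in the text.

```python
import math

def relative_se_of_sd(n):
    """Approximate relative standard error of a standard deviation
    estimated from n independent, normally distributed results."""
    if n < 2:
        raise ValueError("at least two determinations are needed")
    return 1.0 / math.sqrt(2.0 * (n - 1))

for n in (5, 10, 30, 50):
    rel = relative_se_of_sd(n)
    # Rough 95 % half-width, taken as about two standard errors.
    print(f"n={n:3d}  relative s.e. ~ {rel:.0%}  approx. 95 % half-width ~ {2 * rel:.0%}")
```

Under these assumptions the half-width is still above 20% at n = 30 and only approaches 20% at around 50 determinations.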
3.2.6. Where factors are known or suspected to interact, it is important to ensure that the effect of interaction is accounted for. This may be achieved either by ensuring random selection from different levels of interacting parameters, or by careful systematic design to obtain both variance and covariance information.
3.2.7. In carrying out studies of overall bias, it is important that the reference materials and values are relevant to the materials under routine test.
3.2.8. Any study undertaken to investigate and test for the significance of an effect should have sufficient power to detect such effects before they become practically significant.
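Whether a planned study has sufficient power can be checked before it is run. The following minimal sketch approximates the power of a two-sided test on a mean, assuming a known standard deviation and normally distributed results (a z-test); a real study with an estimated standard deviation, or one comparing two conditions, would need a t-based or two-sample calculation instead. The function names and default values are hypothetical.

```python
import math
from statistics import NormalDist

def power_to_detect(effect, sd, n, alpha=0.05):
    """Approximate power of a two-sided z-test to detect a mean shift of
    size `effect` from n results with known standard deviation `sd`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) * math.sqrt(n) / sd
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def replicates_needed(effect, sd, target_power=0.9, alpha=0.05, n_max=1000):
    """Smallest n reaching the target power, or None if n_max is exceeded."""
    for n in range(1, n_max + 1):
        if power_to_detect(effect, sd, n, alpha) >= target_power:
            return n
    return None

# Hypothetical case: the smallest effect of practical importance equals
# one repeatability standard deviation.
print(replicates_needed(effect=1.0, sd=1.0))
```

Setting `effect` to the smallest shift that would matter in practice gives the number of determinations needed to detect an effect before it becomes practically significant.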
3.3. Traceability
3.3.1. It is important to be able to compare results from different laboratories, or from the same laboratory at different times, with confidence. This is achieved by ensuring that all laboratories are using the same measurement scale, or the same 'reference points'. In many cases this is achieved by establishing a chain of calibrations leading to primary national or international standards, ideally (for long-term consistency) the Système International (SI) units of measurement. A familiar example is the case of analytical balances; each balance is calibrated using reference weights which are themselves checked (ultimately) against national standards and so on to the primary reference kilogram. This unbroken chain of comparisons leading to a known reference value provides 'traceability' to a common reference point, ensuring that different operators are using the same units of measurement. In routine measurement, the consistency of measurements between one laboratory (or time) and another is greatly aided by establishing traceability for all relevant intermediate measurements used to obtain or control a measurement result. Traceability is therefore an important concept in all branches of measurement.
3.3.2. Traceability is formally defined [h.4] as:
"The property of the result of a measurement or the value of a standard whereby it can be related to stated references, usually national or international standards, through an unbroken chain of comparisons all having stated uncertainties."
The reference to uncertainty arises because the agreement between laboratories is limited, in part, by uncertainties incurred in each laboratory's traceability chain. Traceability is accordingly intimately linked to uncertainty. Traceability provides the means of placing all related measurements on a consistent measurement scale, while uncertainty characterises the 'strength' of the links in the chain and the agreement to be expected between laboratories making similar measurements.
3.3.3. In general, the uncertainty on a result which is traceable to a particular reference will be the uncertainty on that reference together with the uncertainty on making the measurement relative to that reference.
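How the two contributions are put "together" depends on their relationship. A minimal sketch, assuming that both are expressed as standard uncertainties in the same units and are independent, so that they combine as the root sum of squares (the function name and figures are hypothetical):

```python
import math

def traceable_result_uncertainty(u_reference, u_measurement):
    """Combine the standard uncertainty of the reference with that of the
    measurement made relative to it, assuming the two are independent."""
    return math.hypot(u_reference, u_measurement)

# Hypothetical figures: reference value known to 0.3 units, comparison
# with the reference contributing a further 0.4 units.
print(traceable_result_uncertainty(0.3, 0.4))
```

Correlated contributions would require an additional covariance term and are not handled by this sketch.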
3.3.4. Traceability of the result of the complete analytical procedure should be established by a combination of the following procedures:
- Using traceable standards to calibrate the measuring equipment.
- Using, or comparing with the results of, a primary method.
- Using a pure substance reference material (RM).
- Using an appropriate matrix Certified Reference Material (CRM).
- Using an accepted, closely defined procedure.
Each procedure is discussed in turn below.
3.3.5. Calibration of measuring equipment
In all cases, the calibration of the measuring equipment used must be traceable to appropriate standards. The quantification stage of the analytical procedure is often calibrated using a pure substance reference material whose value is traceable to the SI. This practice provides traceability of the results to the SI for this part of the procedure. However, it is also necessary to establish traceability, using additional procedures, for the results of operations prior to the quantification stage, such as extraction and sample clean-up.
3.3.6. Measurements using primary methods
A primary method is currently described as follows:
"A primary method of measurement is a method having the highest metrological qualities, whose operation is completely described and understood in terms of SI units and whose results are accepted without reference to a standard of the same quantity."
The result of a primary method is normally traceable directly to the SI, and is of the smallest achievable uncertainty with respect to this reference. Primary methods are normally implemented only by National Measurement Institutes and are rarely applied to routine testing or calibration. Where applicable, traceability to the results of a primary method is achieved by direct comparison of measurement results between the primary method and test or calibration method.
3.3.7. Measurements using a pure substance reference material (RM)
As well as being used to calibrate equipment, a pure substance can be used to demonstrate traceability through measurement of a sample composed of, or containing a known quantity of, that substance. This may be achieved, for example, by spiking or by standard additions. However, it is always necessary to evaluate the difference in response of the measurement system to the standard used and the sample under test. Unfortunately, for many chemical analyses and in the particular case of spiking or standard additions, both the correction for the difference in response and its uncertainty may be large. Thus, although the traceability of the result to SI units can in principle be established, in practice, in all but the simplest cases, the uncertainty on the result may be unacceptably large or even unquantifiable. If the uncertainty is unquantifiable then traceability has not been established.
3.3.8. Measurement on a certified reference material (CRM)
Traceability may be demonstrated through comparison of measurement results on a certified matrix CRM with the certified value(s). Where a suitable matrix CRM is available, this procedure can reduce the uncertainty compared to the use of a pure substance RM. If the value of the CRM is traceable to SI, then these measurements provide traceability to SI units. The evaluation of uncertainty using reference materials is discussed in section 7.5. However, even in this case, the uncertainty on the result may be unacceptably large or even unquantifiable, particularly if there is not a good match between the composition of the sample and the reference material.
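The comparison with the certified value can be sketched as follows. This is a minimal illustration, assuming that the replicate results are independent, that the certificate supplies a standard uncertainty (not an expanded one), and that a difference larger than k = 2 combined standard uncertainties is treated as significant. The function name, the criterion and the data are hypothetical choices, not taken from the text.

```python
import math
from statistics import mean, stdev

def crm_bias_check(results, certified_value, u_certified, k=2.0):
    """Compare replicate results on a CRM with its certified value.

    Returns (bias, u_bias, significant), where u_bias combines the standard
    error of the mean with the certificate's standard uncertainty.
    """
    n = len(results)
    bias = mean(results) - certified_value
    u_mean = stdev(results) / math.sqrt(n)      # standard error of the mean
    u_bias = math.hypot(u_mean, u_certified)    # independent contributions
    return bias, u_bias, abs(bias) > k * u_bias

# Hypothetical data: four results on a CRM certified at 10.0 with u = 0.05.
bias, u_bias, significant = crm_bias_check([9.8, 10.0, 10.2, 10.0], 10.0, 0.05)
print(f"bias = {bias:.3f}, u(bias) = {u_bias:.3f}, significant: {significant}")
```

Whether or not the bias is found significant, u_bias remains a contribution to the uncertainty of routine results, and a poor match between sample and CRM matrix is not captured by this calculation.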
3.3.9. Measurement using an accepted procedure
Adequate comparability can often only be achieved through use of a closely defined and generally accepted procedure. The procedure will normally be defined in terms of input parameters; for example a specified set of extraction times, particle sizes etc. The results of applying such a procedure are considered traceable when the values of these input parameters are traceable to stated references in the usual way. The uncertainty on the results arises both from uncertainties in the specified input parameters and from the effects of incomplete specification and variability in execution (see section 7.8.1.). Where the results of an alternative method are expected to be comparable to the results of such an accepted procedure, traceability to the accepted values is achieved by comparing the results of the two methods.