## Abstract

A precision profile, the relationship between the concentration of a substance and its measured precision, is a convenient way of conveying the ability of an immunoassay to accurately measure the concentration of a substance in blood serum. A precision profile is characterized by the definition of precision. Historically, precision has been evaluated as the standard error of an estimator of the concentration in a sample conditional on the true concentration. In this paper, Bayesian predictive inference is used to develop a new measure of precision based on the accuracy with which an assay could infer the concentration in a hypothetical new sample. This leads to a natural procedure for evaluating a precision profile that avoids using approximations such as those inherent in traditional methods.

## 1. Introduction

In an assay, independent responses are observed for a set of standards (samples containing known concentrations of test substance) together with specimens containing unknown concentrations of test substance. The purpose of the assay is to make inferences about the unknown concentrations. The standards are the primary source of information about the dose–response curve, the relationship between the responses and the concentrations, from which the unknown concentrations are inferred. Standard and unknown samples are processed identically. Usually, several responses are obtained for each standard, and there are many more unknown samples than standard samples. Historically, replicated responses were also obtained for the unknown samples, although recently there has been a trend towards a singleton measurement (Sadler & Smith 1990*a*).

A precision profile for an immunoassay is the relationship between the concentration of a substance in blood serum and its measured precision, where precision quantifies the uncertainty or noise in a measurement (incorporating baseline errors, errors in dispensing samples and reagents and errors counting antibodies at the completion of the reaction). The precision evaluates the performance of an assay across the full range of concentrations of interest, whereas other measures of assay performance, such as the minimal detectable concentration (MDC; Davidian *et al*. 1988; Brown *et al*. 1996; O'Malley & Deely 2003), tend to focus on a specific concentration (e.g. 0). As such, the precision profile is extremely useful to practitioners trying to determine which assay is best for a given application.

The use of precision profiles in assays dates back to Ekins *et al*. (1972) and Ekins (1978, 1983). On occasion, they have also been referred to as imprecision profiles (e.g. Sadler *et al*. 1988; Sadler & Smith 1990*a*,*b*). To cater for five different analytical requirements of assays, Ekins (1983) defines five distinct precision profiles. Analytical requirements may encompass one or multiple samples of the substance being tested from the same individual (intra-sample versus inter-sample), multiple repetitions of the assay (intra-assay versus inter-assay) and multiple laboratories (inter-laboratory). The intra-sample, intra-assay precision profile that incorporates the error in the randomness of the response counts and the error in the fitted dose–response curve is of interest here. A precision profile along these lines appears to be generally desired by practitioners (O'Connell *et al*. 1993). However, as elaborated on in §6, the introduction of modern machine assays that use only singleton measurements has led to precision profiles that do not account for the error in the fitted dose–response curve. By contrast, the methods developed in this paper encapsulate both sources of error and can be adapted to correspond to a variety of analytical requirements.

Precision has been expressed in terms of the estimated variance, standard deviation or coefficient of variation of the sampling distribution of an estimator of the concentration in a sample by Ekins and co-workers (O'Connell *et al*. 1993; Sadler *et al*. 1988; Sadler & Smith 1990*a*). A concern with the use of measures based solely on variability is that the concept of ‘bias’ is not considered. For such a measure to be valid, bias must be negligible—if bias is not part of the evaluation, then constants become perfect estimators.

In this paper, a new measure of precision (and hence a new precision profile) is derived from a Bayesian model for the analysis of an immunoassay experiment. Although the use of Bayesian methods in the analysis of assays dates back at least to Ramsey (1972), Bayesian precision profiles have not been developed. However, Brown *et al*. (1996) and O'Malley & Deely (2003) incorporate Bayesian ideas in defining the MDC, and empirical Bayes methods have been used to characterize intra-assay variation (Giltinan & Davidian 1994). Similar to the Bayesian measures of MDC in O'Malley & Deely (2003), a Bayesian measure of precision is constructed using posterior predictive distributions.

The rest of this paper is organized as follows. Model details and notation are presented in §2. Existing measures of precision are reviewed in §3, while in §4 a Bayesian measure of precision is developed. In §5, the traditional and the new Bayesian precision profiles are applied to data from a serum thyroxine radioimmunoassay. The results are compared between the estimators and the implications of using different precision profiles in practice are described. Section 6 presents the concluding remarks.

## 2. Assay model and notation

For simplicity, no distinction will be made between random variables and their realizations. Lower-case letters in bold-italic denote vectors, while upper-case letters in bold-italic denote matrices and concatenated vectors.

Let *y*_{i} denote the vector of *r*_{i} (*i*=1, …, *n*) exchangeable responses at known concentrations *x*_{i} (*i*=1, …, *n*_{s}) or at unknown concentrations *η*_{i} (*i*=*n*_{s}+1, …, *n*_{s}+*n*_{u}), where *n*_{s} is the number of known specimens (standards); *n*_{u} is the number of unknown specimens; and *n*=*n*_{s}+*n*_{u} is the total number of specimens. The sets of indices for the standard and the unknown samples are denoted by *S* and *U*, respectively. Let ***x*** and ***η*** denote the vectors of known and unknown concentrations, respectively. The sample mean of *y*_{i} is denoted by *ȳ*_{i}. The concatenated vector ***Y*** denotes all the observed responses.

It will be convenient to discuss a ‘general’ or ‘future’ vector of *m*_{f} exchangeable responses. Such a response vector at a concentration *x*_{f} will be denoted by *y*_{f} and the associated mean (of the elements of *y*_{f}) by *ȳ*_{f}.

Let the probability distribution of *y*_{ij} at concentration *η*_{i} be *f*(*y*_{ij}|*η*_{i}, ***β***, ***θ***) with associated mean *m*(*η*_{i}, ***β***) and variance *v*(*m*, ***θ***), where ***β*** and ***θ*** are unknown vectors of parameters. Traditionally, *f* is a normal density (Brown *et al*. 1996).

The mean function is assumed to be monotonic in *η*_{i} but may depend nonlinearly on *η*_{i} and ***β***. Two commonly used mean functions are the sigmoid function (Rodbard & Hutt 1974; Finney 1976) and its affine-transformed version (O'Connell *et al*. 1993), equation (2.1), where ‘mt’ and ‘nsb’ are assay-specific constants. Equation (2.1) may be increasing or decreasing depending on the way in which the assay is performed (Davies 1994); in this paper, the decreasing case is assumed, implying ***β***≥**0**.

The variance function is typically an increasing function of the mean response such as the power function, equation (2.2). In most immunoassays, the variance of the response counts exceeds the variance of a Poisson-distributed random variable. This possibility is accommodated in (2.2) through the values permitted for ***θ***.
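As an illustration of the kind of mean and variance functions described above, the following is a minimal sketch, assuming a standard four-parameter logistic curve and a two-parameter power-of-the-mean variance. The exact forms of (2.1) and (2.2), including the ‘mt’ and ‘nsb’ constants, are not reproduced here; the parametrization, the function names and the numerical values are hypothetical and chosen only so that *β*_{1} and *β*_{4} are the expected counts at zero and infinite concentration.

```python
import numpy as np


def mean_response(eta, beta):
    """Assumed four-parameter logistic mean response (illustrative only).

    beta = (b1, b2, b3, b4): b1 is the expected count at zero concentration,
    b4 the expected count at infinite concentration, b3 the mid-point
    concentration and b2 > 0 the slope, so the curve decreases in eta.
    """
    b1, b2, b3, b4 = beta
    eta = np.asarray(eta, dtype=float)
    return b4 + (b1 - b4) / (1.0 + (eta / b3) ** b2)


def variance_response(m, theta):
    """Assumed power-of-the-mean variance: theta1 * m ** theta2."""
    t1, t2 = theta
    return t1 * np.asarray(m, dtype=float) ** t2
```

With the hypothetical values `beta = (20000, 1.2, 100, 3000)`, the curve falls from 20 000 counts at zero concentration towards 3000 counts as the concentration grows, and `theta = (1, 1)` would correspond to Poisson-like variance.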

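The prior distribution of a parameter *λ* will be denoted by *p*(*λ*) and the resulting posterior distribution, given data ***D***, is denoted by *p*(*λ*|***D***). From Bayes' theorem,

$$p(\lambda\mid\boldsymbol{D})=\frac{p(\boldsymbol{D}\mid\lambda)\,p(\lambda)}{p(\boldsymbol{D})},$$

where $p(\boldsymbol{D})=\int p(\boldsymbol{D}\mid\lambda)\,p(\lambda)\,\mathrm{d}\lambda$ is the marginal distribution of ***D***. Non-informative prior distributions will be assumed in this paper (see §5). An estimate of *λ* is denoted by *λ̂*.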

### (a) Estimation of unknown concentrations

Owing to the limited range of the mean function, the estimation of *η*_{i} for *i*∈*U* must be examined carefully. Classical estimators of *η*_{i} are typically determined by back-fitting the mean response *ȳ*_{i} to the corresponding concentration *η*_{i}. The mapping in the case of (2.1) is given by equation (2.3), where, by assumption, ***β***>**0**. In order for the mapping to be well defined, the inverse function needs to be defined over the full range of the data, which in the case of (2.1) requires that the affine-transformed mean response lie between the expected counts at zero (*β*_{1}) and infinite (*β*_{4}) concentrations. In practice, the standards are typically designed so that their concentrations span the concentrations observed in most (or almost all) clinical specimens. However, in very rare cases a mean response exceeds the fitted response at zero concentration, in which case a zero (or a smaller value than *L*, where *L* is an estimate of the smallest concentration that can be distinguished from zero) concentration is assigned since interpolation is precluded. Similarly, if a mean response is less than the fitted response of the highest concentration standard, then a larger than *M* concentration is reported, where *M* is the largest concentration that could reasonably be expected in practice. When using Bayesian inference, a prior distribution that specifies *η*_{i}≥0 with a probability of 1 will ensure that only non-negative values of *η*_{i} can be inferred, eliminating the need for *ad hoc* fixes and potentially yielding more efficient inferences.
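The back-fitting rule and its boundary cases can be sketched as follows. This is only an illustration under an assumed four-parameter logistic mean (not necessarily the form of (2.1) or (2.3)); the function name `back_fit` and the argument `upper` (playing the role of *M*) are hypothetical, and for simplicity the lower response limit is taken to be the asymptote *β*_{4} rather than the fitted response of the highest standard.

```python
import numpy as np


def back_fit(y_bar, beta, upper=np.inf):
    """Invert an assumed decreasing four-parameter logistic mean response.

    Responses at or above the zero-dose count b1 are mapped to 0, and
    responses at or below the infinite-dose count b4 are mapped to `upper`,
    mirroring the ad hoc boundary rules of classical estimators.
    """
    b1, b2, b3, b4 = beta
    if y_bar >= b1:
        return 0.0      # interpolation precluded: report zero concentration
    if y_bar <= b4:
        return upper    # interpolation precluded: report the upper limit
    eta = b3 * ((b1 - y_bar) / (y_bar - b4)) ** (1.0 / b2)
    return min(eta, upper)
```

A Bayesian analysis with a prior supported on non-negative concentrations makes these explicit boundary branches unnecessary, which is the point made in the paragraph above.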

Traditional estimators of *η*_{i} typically depend on the data from sample *i* exclusively through *ȳ*_{i} and on the data from other samples exclusively through ***β̂***; *η̂*(*ȳ*_{i}, ***β̂***) denotes such an estimator. However, because Bayesian inferences about *η*_{i} also incorporate information from the dependence of the variance function *v*(*m*, ***θ***) on the mean *m*, a more general notation is used for Bayesian estimators.

## 3. Current method of calculating precision

Historically, the precision of an assay at concentration *x*_{f} (the concentration in some hypothetical future sample) is based on the variance or coefficient of variation of the sampling distribution of *η̂*(*ȳ*_{f}, ***β̂***), where *y*_{f} is a response measurement associated with *x*_{f}. This calculation cannot, in general, be performed in a closed form. In this section, we describe two methods of approximating the sampling variance of *η̂*(*ȳ*_{f}, ***β̂***): the analytical method and the empirical method.

### (a) Analytical method

When *η̂* is a closed-form function of *ȳ*_{f} and ***β̂***, an approximation based on a Taylor series expansion (TSE) may be used to estimate the precision of an assay. The inverse estimator in (2.3), which corresponds to the pseudo-maximum-likelihood estimator of *η*_{i} for fixed ***θ***, has the required form. However, when the variance function depends on *η*_{f} directly or through the mean as in (2.2), the likelihood equations cannot, in general, be solved for *η*_{f}, implying that Taylor series methods may not be applicable to the full maximum-likelihood estimator.

Ekins (1983) uses a first-order TSE in *ȳ*_{f} alone to approximate the mean and variance of *η̂*(*ȳ*_{f}, ***β̂***). This ignores the variability in ***β*** (i.e. treats ***β*** as known) and so does not incorporate all of the uncertainty in the estimated concentrations. In §5, it is shown that this method yields smaller values of precision than those of other methods. The approach of O'Connell *et al*. (1993) is based on the TSE in both *ȳ*_{f} and ***β̂***, which is given by equation (3.1), where *η*_{z} denotes the partial derivative (or gradient function in the case of a vector argument) of *η*(*z*, *w*) with respect to *z*. The first-order approximations of the mean and variance of *η̂*(*ȳ*_{f}, ***β̂***) are thus given by equations (3.2) and (3.3), which involve the covariance matrix of ***β̂***. To enable (3.3) to be evaluated, parameter estimates are substituted for the unknown parameters.

The estimated precision at concentration *x*_{f} is the square root of the estimated variance in (3.3), or that square root divided by *x*_{f}, when precision is defined in terms of the standard deviation or the coefficient of variation, respectively. Equations (3.2) and (3.3) do not depend on the distribution of the response counts.
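A first-order Taylor series (delta method) calculation of this kind can be sketched numerically as follows. This is a hedged illustration and not the expressions (3.1) to (3.3) themselves: it assumes a four-parameter logistic mean, a power-of-the-mean variance, independence between the future mean response and the fitted parameters, and central finite differences in place of analytic derivatives. All function names and default values are hypothetical.

```python
import numpy as np


def mean_response(eta, beta):
    """Assumed decreasing four-parameter logistic mean (illustrative)."""
    b1, b2, b3, b4 = beta
    return b4 + (b1 - b4) / (1.0 + (eta / b3) ** b2)


def back_fit(z, w):
    """eta(z, w): concentration whose assumed mean response equals z."""
    b1, b2, b3, b4 = w
    return b3 * ((b1 - z) / (z - b4)) ** (1.0 / b2)


def delta_method_cv(x_f, beta_hat, theta_hat, cov_beta=None, m_f=2,
                    rel_step=1e-6):
    """First-order coefficient of variation of the back-fitted concentration.

    With cov_beta=None only the variability of the mean response is used;
    passing a covariance matrix for beta_hat adds the curve-fitting term.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    z0 = mean_response(x_f, beta_hat)
    # Variance of the mean of m_f responses under the assumed power variance.
    var_ybar = theta_hat[0] * z0 ** theta_hat[1] / m_f
    h = rel_step * abs(z0)
    eta_z = (back_fit(z0 + h, beta_hat) - back_fit(z0 - h, beta_hat)) / (2 * h)
    var_eta = eta_z ** 2 * var_ybar
    if cov_beta is not None:
        eta_w = np.empty(beta_hat.size)
        for k in range(beta_hat.size):
            step = np.zeros(beta_hat.size)
            step[k] = rel_step * max(abs(beta_hat[k]), 1.0)
            eta_w[k] = (back_fit(z0, beta_hat + step)
                        - back_fit(z0, beta_hat - step)) / (2 * step[k])
        var_eta += eta_w @ np.asarray(cov_beta, dtype=float) @ eta_w
    return np.sqrt(var_eta) / x_f
```

Calling the function without `cov_beta` corresponds to an expansion in the mean response alone, while supplying `cov_beta` also propagates the uncertainty in the fitted curve, so the second value can never be smaller than the first.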

Approximations based on second-order (or higher) TSEs are computed analogously. Although these have the potential to be more accurate, the resulting expressions tend to be unwieldy, even if the assumption that the response counts are normally distributed is invoked, and so are not displayed here. However, in §5, precision profiles based on first- and second-order TSEs are plotted.

### (b) Empirical method

An alternative approach to deriving the precision profile is to fit a model to a dataset of estimated concentrations. Let *η̂*_{i} denote the concentration back-fitted from *ȳ*_{i} and ***β̂***. A variance function of *η̂*_{i} in terms of the true concentration *η*_{i} is then estimated using a parametric model, such as equation (3.4), where ***α*** is a vector of unknown parameters and *N*(*μ*, *σ*^{2}) denotes the density of a normal distribution with mean *μ* and variance *σ*^{2}. We recommend the use of restricted maximum likelihood to fit the model. The efficiency of the estimate of ***α*** can be improved further (O'Malley *et al*. in press). Expressions for the precision are derived from the fitted model; for example, for *r*_{i}=2 the fitted variance function follows from (3.4).

The empirical method implicitly assumes that the fitted values of the model parameters are the true values or, in other words, that the variability of the response counts is the only source of uncertainty in the unknown concentrations that is of interest. This viewpoint is equivalent to that of the analytical method based on a TSE in *ȳ*_{f} alone. However, the empirical method involves fewer assumptions because it does not rely on the distribution, mean function or variance function of the response counts.

More extensive descriptions of the empirical method are given in Sadler *et al*. (1988) and Sadler & Smith (1990*a*,*b*). A key point is that the empirical method was not intended for use on data from a single batch of samples since, in order to compute the variability of the estimated concentrations, it relies on replicated samples (thus making it impossible to apply to modern machine assays). Rather, the empirical method has historically been used in situations where the assay includes replication of some sort. For example, if the same assay is calibrated at the start of each day (thus, separate batches of samples are obtained each day), a precision profile can be determined from the standards and any other samples that are repeatedly analysed. However, for comparison purposes, the empirical method is included in the analysis.

## 4. Bayesian measure of precision

The key terms in a Bayesian analysis of an assay are described in the following. For the model described in §2, the likelihood function is written using the indicator function *I*(event), where *I*(event)=1 if ‘event’ is true and 0 otherwise. The joint prior distribution of the unknown parameters is denoted by *p*(***β***, ***θ***, ***η***). Using Bayes' theorem, the posterior distribution of the model parameters (***β***, ***θ***) is computed by combining the likelihood function with the joint prior distribution. The posterior distribution contains all the information in the assay about (***β***, ***θ***).

The posterior distribution, conditional on (*y*_{f}, ***β***, ***θ***), of the unknown concentration *η*_{f} in a new sample is given by equation (4.1), where the prior for *η*_{f} does not depend on (***β***, ***θ***) if *η*_{f} is *a priori* independent of (***β***, ***θ***). To minimize the influence of the prior on the analysis of the serum thyroxine radioimmunoassay in §5, we set *η*_{f}∼*U*(0, *M*), where *U*(0, *M*) denotes the uniform distribution on the interval (0, *M*); as defined earlier, *M* is the largest concentration that could realistically be observed in practice. There is no point in using a value larger than *M* as an upper limit because such parameter values would never be supported by the data. In practice, the distribution of the concentration *η*_{i} of a substance in a test population is often well known. However, we chose not to incorporate such *a priori* information in this analysis as we desired to compare the procedures in terms of experimental information alone.

If *y*_{f} was observed, it would contain information about (***β***, ***θ***), and hence the posterior of *η*_{f} would be computed as in equation (4.2).

### (a) Concentration distribution of assay noise precision

We desire to describe the precision (or level of uncertainty) in an assay's measurement of an unknown concentration *η*_{f} when the true concentration is *x*_{f}. From the perspective of evaluating the precision of an assay conditional on the observed data (***Y***, ***x***), *y*_{f} has not been observed and so does not represent ‘data’. Therefore, *y*_{f} should not be used to update the posterior distribution of (***β***, ***θ***), and hence (4.2) is not appropriate for evaluating precision (see O'Malley & Deely 2003 for further comments). In the following, we overcome this problem.

Let (*y*_{f}) and (*x*_{f}) indicate that *y*_{f} and *x*_{f} are not used to update the posterior distribution in a given computation. To extend (4.1) to account for the variability in *y*_{f} at concentration *x*_{f}, compute the posterior distribution of *η*_{f} averaged over the distribution of *y*_{f} at *x*_{f}, conditional on (***β***, ***θ***). Finally, to account for the uncertainty in (***β***, ***θ***), average over their posterior distribution to obtain equation (4.3).

The probability distribution function on the l.h.s. of (4.3) is the predictive distribution of the posterior distribution that would be used for inference about *η*_{f} if (***β***, ***θ***) was known and *y*_{f} was unobserved. Moreover, it fully describes the precision (or noise) of an assay's measurement at *x*_{f}; this distribution is referred to as the *concentration distribution of assay noise* (*CDAN*) at *x*_{f}, written CDAN(*x*_{f}).
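In conventional notation, the construction just described can be sketched as a double average of the posterior of *η*_{f}. The display below is an illustrative restatement under assumed notation, which may differ from that of (4.3): *f* denotes the response density of §2, and the subscript CDAN on the left-hand side is introduced here only for readability.

$$
p_{\mathrm{CDAN}}(\eta_f \mid x_f, \boldsymbol{Y}, \boldsymbol{x})
= \iint p(\eta_f \mid y_f, \boldsymbol{\beta}, \boldsymbol{\theta})\,
f(y_f \mid x_f, \boldsymbol{\beta}, \boldsymbol{\theta})\,
p(\boldsymbol{\beta}, \boldsymbol{\theta} \mid \boldsymbol{Y}, \boldsymbol{x})\,
\mathrm{d}y_f \,\mathrm{d}(\boldsymbol{\beta}, \boldsymbol{\theta})
$$

The hypothetical response *y*_{f} appears only as a variable of integration, so it never updates the posterior distribution of (***β***, ***θ***).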

### (b) CDAN precision and CDAN precision profile

The final step in the evaluation of a Bayesian measure of precision is to reduce CDAN(*x*_{f}) to a summary statistic to be plotted against *x*_{f} to yield the precision profile. The expected root mean squared error (RMSE), or relative RMSE, about *x*_{f} of a random variable drawn from CDAN(*x*_{f}) is a natural choice. The mean squared error (MSE) of CDAN(*x*_{f}) is the posterior predictive value of (*η*_{f}−*x*_{f})^{2}. In this regard, the MSE of CDAN(*x*_{f}) is directly related to the quantity that would be reported if, having observed *y*_{f}, *η*_{f} was estimated by minimizing the expected squared-error loss. An advantage of RMSE relative to the standard deviation or the coefficient of variation is that it accounts for the bias or accuracy of the assay in addition to its variability. For a general loss function, the CDAN precision is evaluated as the expected loss under CDAN(*x*_{f}). For example, a loss function defined relative to a reference concentration *η*_{0} yields, in the special case where *η*_{0}=0, a measure of the MDC similar to the measures derived in O'Malley & Deely (2003).

Measures of the CDAN precision may be evaluated using Monte Carlo averages involving a large sample of *η*_{f} drawn from CDAN(*x*_{f}). The values of *η*_{f} are obtained using a Markov chain Monte Carlo (MCMC) algorithm, involving Gibbs sampler (Geman & Geman 1984; Gelfand & Smith 1990; Casella & George 1992) and Metropolis–Hastings (Metropolis *et al*. 1953; Hastings 1970) steps, to draw values of (***β***, ***θ***) from the joint posterior distribution. Next, conditional on each draw of (***β***, ***θ***), we generate a hypothetical new response *y*_{f} at concentration *x*_{f}. Finally, we generate *η*_{f} from the posterior distribution of *η*_{f} given (*y*_{f}, ***β***, ***θ***). After many draws of *η*_{f} (e.g. 10 000 or more), the precision of the assay at *x*_{f} can be accurately approximated. If interpolation is used to directly link the simulated values of precision, the resulting precision profile will depict the level of simulation error in the computation. To obtain a smooth precision profile, a parametric family of functions (e.g. the variance function in (3.4)) or a non-parametric regression function may be fitted to the simulated values of precision.
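The three sampling steps can be sketched as below. This is an illustrative sketch only: it assumes that posterior draws of (***β***, ***θ***) are already available from an MCMC run (none is performed here), a four-parameter logistic mean, a power-of-the-mean variance, normally distributed responses and a *U*(0, *M*) prior for *η*_{f}. The posterior of *η*_{f} is sampled on a grid rather than by Metropolis–Hastings, and the names `cdan_draws`, `relative_rmse`, `post_draws` and `grid_size` are hypothetical.

```python
import numpy as np


def mean_response(eta, beta):
    """Assumed decreasing four-parameter logistic mean (illustrative)."""
    b1, b2, b3, b4 = beta
    return b4 + (b1 - b4) / (1.0 + (np.asarray(eta, dtype=float) / b3) ** b2)


def cdan_draws(x_f, post_draws, m_f=2, M=400.0, grid_size=2001, rng=None):
    """Draw one eta_f per posterior draw of (beta, theta).

    post_draws: sequence of (beta, theta) pairs, assumed to come from MCMC.
    """
    rng = np.random.default_rng() if rng is None else rng
    grid = np.linspace(0.0, M, grid_size)      # support of the U(0, M) prior
    out = np.empty(len(post_draws))
    for k, (beta, theta) in enumerate(post_draws):
        # Step 2: simulate m_f hypothetical responses at concentration x_f.
        m_true = mean_response(x_f, beta)
        sd_true = np.sqrt(theta[0] * m_true ** theta[1])
        y_f = rng.normal(m_true, sd_true, size=m_f)
        # Step 3: posterior of eta_f given (y_f, beta, theta) on the grid;
        # the flat prior contributes only a constant.
        m_grid = mean_response(grid, beta)
        v_grid = theta[0] * m_grid ** theta[1]
        log_post = (-0.5 * m_f * np.log(v_grid)
                    - 0.5 * ((y_f[:, None] - m_grid) ** 2).sum(axis=0) / v_grid)
        w = np.exp(log_post - log_post.max())
        out[k] = rng.choice(grid, p=w / w.sum())
    return out


def relative_rmse(eta_draws, x_f):
    """Relative RMSE of the draws about the true concentration x_f > 0."""
    return np.sqrt(np.mean((eta_draws - x_f) ** 2)) / x_f
```

Repeating the calculation over a set of values of *x*_{f} and plotting `relative_rmse` against *x*_{f} gives a raw profile; the grid resolution adds a small discretization error on top of the simulation error mentioned above.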

## 5. Analysis of radioimmunoassay data

In this section, the Bayesian CDAN precision profile is evaluated and compared with traditional precision profiles using data from a serum thyroxine radioimmunoassay performed in the Department of Nuclear Medicine at Christchurch Public Hospital (NMCPH), New Zealand. Thyroxine is a hormone secreted by the thyroid gland and its measurement is a primary tool in the investigation of thyroid function. The radioimmunoassay uses seven standard samples with thyroxine concentrations of 0, 35, 70, 140, 210, 280 and 350 nmol l^{−1}, and it has a reference range (encompasses 95% of subjects free of thyroid disease) of 55–140 nmol l^{−1}. In these assays, the zero standard was replicated four times, while all other samples were measured in duplicate. The total number of unknown samples was 72, bringing the total number of distinct samples in the assay to 79. For further description of such assays and the data they generate, see O'Malley & Deely (2003) and O'Malley *et al*. (in press).

Because the response counts are large (ranging from 3000 to 20 000), the conditional distribution of the response count, given the concentration in the sample, is likely to be closely approximated by a normal distribution (see O'Malley *et al*. (in press) for justification). Therefore, we assume that the response counts follow a normal distribution with the mean and variance functions in (2.1) and (2.2), respectively. A non-informative prior is used for the Bayesian analysis; this is Jeffreys' prior in the case when *θ*_{2}=0 and it is assumed that *β*, *θ* and *η* are independent, suggesting that the data will have close to maximal weight on the ensuing inferences. For brevity, in the following we omit details of the parameter estimates and posterior distribution of (*β*, *θ*) and, instead, focus on the precision profiles derived from the fitted model.

Figure 1 presents six different precision profiles for this serum thyroxine radioimmunoassay. Figure 1*a*,*b* shows the precision profiles based on the first- and second-order TSEs in *ȳ*_{f} only and in *ȳ*_{f} and ***β̂***, respectively. Figure 1*c* is the precision profile obtained by applying the empirical approach with the variance function in (3.4). Finally, figure 1*d* is the Bayesian precision profile based on the CDAN. Precision is evaluated for two replicates using the coefficient of variation for the traditional definitions and the relative RMSE of the CDAN for the new Bayesian definition.

All of the precision profiles are decreasing functions for an initial range of concentrations and then are increasing functions over larger concentrations. Thus, each profile identifies a point of maximal relative precision and yields finite-width intervals of the concentrations that can be measured within various thresholds of precision (e.g. the 0.05 threshold is depicted in figure 1).
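Reading a working range off a profile at a given threshold can be sketched as follows. This is a minimal illustration on a grid of evaluated concentrations, assuming the profile is U-shaped so that the acceptable concentrations form a single interval; the function name `working_range` is hypothetical, and no interpolation between grid points is attempted.

```python
import numpy as np


def working_range(conc, precision, threshold=0.05):
    """Smallest and largest grid concentrations meeting the threshold.

    Returns None when no evaluated concentration satisfies the threshold.
    """
    conc = np.asarray(conc, dtype=float)
    precision = np.asarray(precision, dtype=float)
    ok = precision <= threshold
    if not ok.any():
        return None
    return float(conc[ok].min()), float(conc[ok].max())
```

A finer concentration grid, or a smoothed profile, would give a correspondingly sharper estimate of the interval end points.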

The first- and second-order analytical precision profiles in figure 1*a*,*b* are almost indistinguishable, implying that the addition of the second-order terms has little impact on the evaluated precision. Thus, results for the first- and second-order analytical approaches may be discussed as one.

The analytical precision profile in figure 1*a* and the empirical precision profile in figure 1*c* are close for *x*_{f}≤150 nmol l^{−1} after which the trajectory of the empirical precision profile is flatter. To study this finding in more detail, note the form of the empirical precision profile implied by the estimated parameters of the model in (3.4): as *x*_{f} becomes large, the empirical precision profile is a slowly increasing function of *x*_{f}.

The precision profiles in figure 1*a*,*c* incorporate a translated form of the error in the response counts but ignore the variability in ***β̂***, and hence are, not surprisingly, the lowest profiles. A precision profile that ignores the uncertainty in the fitted assay model would potentially be of interest when the statistical design (the number and concentration of the standard samples, and the number of replicated measurements performed on each sample) is not considered an attribute of the assay. In this case, precision reflects only the variability in the response counts. However, if, as in this paper, the objective of the analysis is to quantify an assay's ability to measure the concentration in an unknown sample, the uncertainty in the fitted model should be incorporated in the precision profile.

When the variability in ***β̂*** is accounted for, the analytical precision profiles (figure 1*b*) attain much higher values of precision, especially over larger concentrations, than the corresponding precision profiles that consider only the uncertainty in *ȳ*_{f} (figure 1*a*,*c*). However, for *x*_{f}≥80 nmol l^{−1}, the precision profile in figure 1*b* is well above the Bayesian precision profile in figure 1*d*. Thus, for a large range of concentrations, the analytical precision profile gives a more pessimistic assessment of the quality of the assay than the Bayesian (CDAN) precision profile. The discrepancy arises because the Bayesian precision profile uses all of the information in the assay and the impact of (differences in the level of) information is magnified when *x*_{f} is large. The analytical method culls information by invoking simplifying assumptions and ignoring the dependence of *v* on *m* and the information about ***θ*** in replicated unknown samples.

A reassuring feature of the Bayesian precision profile is that it is a compromise between the precision profiles in figure 1*a–c*. It is similar in shape to the plots in figure 1*a* but, by virtue of accounting for the uncertainty in the fitted assay model, shifts upwards and has a greater curvature (but as noted above, these adjustments are not as great as for the analytical precision profile in figure 1*b*).

### (a) CDAN for the NMCPH data

To study the precision of the assay in a broader sense, CDAN(*x*_{f}) is plotted for *x*_{f}=0, 2, 5, 10, 100 and 300 nmol l^{−1} (figure 2*a*–*f*, respectively). As *x*_{f} increases, the spread of CDAN(*x*_{f}) increases. As expected, CDAN(*x*_{f}) is skewed to the right when *x*_{f}≃0 and CDAN(*x*_{f}) is nearly symmetrical when *x*_{f} is large.

## 6. Conclusion

In this paper, Bayesian inference was used to develop a new precision profile of an assay. An advantage of the Bayesian CDAN precision profile is that mathematical approximations are avoided by using exact Bayesian inference (to the accuracy of the finite-length MCMC simulation) to evaluate moments of CDAN(*x*_{f}) at each *x*_{f}. Therefore, the new Bayesian precision profile takes full account of the information in both the standard and the unknown samples. The Bayesian precision profile would appear to yield a more informative representation of the quality of an assay than traditional methods that rely on simplified forms of estimators or other mathematical approximations.

The new Bayesian precision profile was compared with existing procedures using a serum thyroxine radioimmunoassay. The differences in the results are non-trivial, particularly over medium to large concentrations, implying that the use of traditional precision profiles may give misleading representations of the assay quality.

The Bayesian method developed in this paper can be easily extended to accommodate analytical requirements involving multiple samples or laboratories by specifying a hierarchical model that allows certain parameters to vary between samples, assays or laboratories but linking them through a common underlying distribution that describes the differences between them. If the same assay is repeated on multiple occasions (e.g. each day of the year), it is possible that the concentration of the standard samples will degrade over time. To accommodate drift in the concentration of the standards, the model can be extended by allowing the parameters to change over time (a description of the importance of modelling carry-over and drift is given in Daniel 1975).

## Acknowledgments

Appreciation is extended to William A. Sadler for supplying data from radioimmunoassay experiments at the NMCPH and to two anonymous reviewers for their valuable comments on an earlier version of the manuscript.

## Footnotes

One contribution of 13 to a Theme Issue ‘Mathematical and statistical methods for diagnoses and therapies’.

- © 2008 The Royal Society