Dimensionality and noise in energy selective x-ray imaging-Part 1

In the next few posts I will discuss my paper, Dimensionality and noise in energy selective x-ray imaging, available for free download here. I will elaborate on the physical and mathematical background and explain how to reproduce the figures.

With my approach to energy selective imaging[1], the x-ray attenuation coefficient is approximated as a linear combination of functions of energy multiplied by constants that are independent of energy. The number of functions required is the dimensionality. The basic premise of the paper is that the dimensionality is really a pragmatic tradeoff between more information, which requires larger dimensionality, and the increase in noise, which requires higher dose and more expensive equipment to reduce it to a level where the resultant images are clinically useful. The bottom line of the paper is that with biological materials such as soft tissue, bone, and fat, only two dimensions are practical but if an externally administered contrast agent with a high atomic number element such as iodine is included then three and maybe more dimensions are possible.

Dimensionality--intrinsic and noise based

Two approaches have evolved to define dimensionality--intrinsic and noise based. The intrinsic dimensionality is the number of functions required to approximate attenuation coefficients to an accuracy better than the errors in standard physics tables. I applied the singular value decomposition (SVD) to study the intrinsic dimensionality in the early 1980’s. My work at that time is summarized in a book chapter[2], which based on two Stanford technical reports that I wrote in 1981 and 1982, available here and here. Intrinsic dimensionality and the SVD is also discussed in several of my posts, here and here. These posts are included in my free ebook that you can get by emailing me.

My PhD dissertation and initial paper[1], used a definition of dimensionality that was actually close to the noise-based approach discussed in this post. I found that by using a two function basis set I was able to approximate the best attenuation coefficient data I could find[3] very accurately--more accurately than the noise in the CT scanners of that day. With this approach, the noise-based approach, the dimensionality depends not only on the properties of the attenuation coefficients but also, and perhaps more importantly, on the noise in the x-ray system measurements. These are limited by x-ray quantum noise and ultimately on the amount of dose that we are willing to use. This depends on the medical usefulness of the dimensionality information, so we need a method to study the trade-off between dose and dimensionality. That is the purpose of the paper.

The two approaches to defining dimensionality are, of course, closely related. Although the mathematics discussed in the paper does not explicitly refer to the intrinsic dimensionality, the basis functions are drawn from the space of attenuation coefficients and their properties affect the results. As I will show in this post, the increase in noise depends on the projection of the additional basis function on the other basis functions. Unless the additional function has a significant component orthogonal to the previous functions, the noise with the larger number of functions will be much larger than the noise with the smaller basis set. The ability to choose additional functions orthogonal to the previous functions depends on the intrinsic dimensionality of the attenuation coefficient space. If it is low, then you cannot find additional orthogonal functions and the noise will keep you from extracting the information.

Increase in CRLB covariance with dimension

The increase in noise covariance with the number of parameters estimated is well known and is covered in textbooks such as Parameter Estimation for Scientists and Engineers[4] by Adriaan van den Bos. Van den Bos bases his discussion on the Fisher information matrix, F, which is the inverse of the Cramèr-Rao lower bound (CRLB). I have discussed the CRLB in several posts, see the ebook.

I discussed the Fisher matrix in this post. This matrix, which I will denote by F, has components which are the negative of the expected value of the second derivative of the log of the likelihood ℒ = log[Pr(N;A)]

F_ij = − ⟨(∂²ℒ)/(∂A_i∂A_j)⟩.

Notice that if we add an additional A-vector component, the F matrix terms that correspond to the lower dimension components do not change. With this observation, we can apply Corollary B.4 of van den Bos[4] to compute the magnitude of the additional variance when we increase the dimension. Suppose F_K − 1 is the Information matrix with a K-1 function basis set and R_K − 1 = F^− 1_K − 1 is the corresponding CRLB covariance. If we add an additional basis function, then the K dimension information matrix can be partitioned as

(1) F_K = ⎡⎢⎢⎢⎢⎢⎢⎢⎣ F_K − 1 q \hline q^T F_KK ⎤⎥⎥⎥⎥⎥⎥⎥⎦

where q is a column vector and F_KK is a scalar. According to the corollary, the inverse of F_K is

(2) R_K = F^− 1_K = ⎡⎢⎢⎢⎢⎢⎢⎢⎣ R_K − 1 + (1)/(u)R_K − 1qq^TR_K − 1 − (1)/(u)R_K − 1q \hline − (1)/(u)q^TR_K − 1 (1)/(u) ⎤⎥⎥⎥⎥⎥⎥⎥⎦

where the scalar

(3) u = F_KK − q^TR_K − 1q.

Eqs. 1↑ and 2↑ are true for any information matrix. We can gain further insight by using the constant covariance approximation to the CRLB for energy selective data. This approximation assumes that the information matrix F does not depend on the A-vector. As discussed in the Appendix of the paper and in a previous post, the approximation is valid for the large expected counts normally used in medical imaging. With the approximation, the information matrix is

(4) F = M^TC^− 1_LM

where, as usual, M = ^∂L⁄_∂A is the effective basis functions for the measurement x-ray spectra and C_L is the covariance of the log of the measurements. See the paper or my previous post for a more detailed discussion.

Using Eq. 4↑ for the three function case, we can partition M as

M = ⎡⎢⎢⎢⎣ | | μ₁ | μ₂ | μ₃ | | ⎤⎥⎥⎥⎦.

With the partitioned matrix notation

(5) F₃ = M^TC^− 1_LM = ⎡⎢⎢⎢⎢⎢⎢⎢⎣ μ^T₁ \it − − − \it − − − \it − − − μ^T₂ \it − − − \it − − − \it − − − μ^T₃ ⎤⎥⎥⎥⎥⎥⎥⎥⎦C^− 1_L⎡⎢⎢⎢⎣ | | μ₁ | μ₂ | μ₃ | | ⎤⎥⎥⎥⎦

Comparing this to Eq. 1↑ for the additional covariance in going from two to three basis functions, the q vector is

(6) q = ⎡⎢⎣ μ^T₁C^− 1_Lμ₃ μ^T₂C^− 1_Lμ₃ ⎤⎥⎦.

That is, q is the projection of the additional basis vector μ₃ onto the first two basis vectors using a weighted inner product with the measurement spectra as the weighting function. From Eq. 2↑, if q is small, that is μ₃ is nearly orthogonal to the [μ₁ μ₂], then the additional covariance will be small. Conversely if the projection is large, then the additional basis function does not add much information and the additional covariance will be large.

The additional covariance multiplier, u

Since I wrote the paper, I studied the scalar, u, in Eq. 3↑. The additional covariance in Eq. 2↑ is multiplied by ¹⁄_u so it has a strong effect on it. Using the symbolic math program Maxima to do the matrix calculations, I was able to show that u is

(7) u = (|F_K|)/(|F_K − 1|),

where the notation | | denotes a determinant. The maxima code to demonstrate this result is included with the the code for this post.

Eq. 7↑ has an interesting interpretation. As the name implies, the Fisher matrix F behaves like information and its determinant in some respects measures the amount of information. Another way to look at this is that the CRLB, which is the best estimate covariance, is the inverse of F. Therefore, the larger the determinant of F the smaller the determinant of the covariance and a smaller determinant implies smaller variance. This fact is used in the design of data collected for experiments. Designs that maximize |F| are called D-optimal and are widely used. Matlab provides functions such as candgen in their Statistics toolbox to help to produce D-optimal designs. From any of these points of view, the expression for u in Eq. 7↑ implies that the larger the information with the additional parameter compared with the information with the previous parameters, the smaller ¹⁄_u and the additional covariance.

Numerical results

I tested Eqs. 2↑ and 7↑ numerically using Matlab. The relevant code is in the box below. The code computes the difference between the formulas for the additional covariance and the u constant is round-off error and their actual values. The complete code is available here.

1%%  test additional covariance formulas
2nbasis = 3;
3nbins = 6;
4specdat.idx_threshold = OptimNbinsThresholds(specdat,nbins);
5specdat.mus = zeros(numel(specdat.egys),nbasis); % use zeros since 
6  % using 0 thickness in PoissonLambda
7CLinv = diag(PoissonLambda(zeros(1,nbasis),specdat)); % inverse of covariance of log
8additional_materials = {’adipose’,’iodine’};
9ncases = length(additional_materials);
10v = zeros(ncases,4); % summary data
11mus_add = [mus_body_materials(:,3) musI]; % adipose, iodine atten. coeff.
12for kcase = 1:ncases
13    specdat.mus = [mus_body_materials(:,1:2) mus_add(:,kcase)];
14    M = mubarNbinNdim(specdat,zeros(1,nbasis));
15    F = M’*CLinv*M;
16    R = inv(F);
17    M2 = M(:,1:(nbasis-1));
18    F2 = M2’*CLinv*M2;
19    R2 = inv(F2);
20    q = F(1:(nbasis-1),nbasis);
21    u = F(nbasis,nbasis) - q’*R2*q;
22    u_calc = det(F)/det(F2);
23    u_error = u - u_calc;
24    z = R2*q;
25    R_add = (z*z’)/u;
26    R_error = norm(R(1:(nbasis-1),1:(nbasis-1)) - (R2 + R_add))/...
27        norm(R(1:(nbasis-1),1:(nbasis-1)));
28    v(kcase,:) = [u,u_calc ,u_error,R_error];
29    fprintf(’kcase %d nbasis %d nbins %d u-error %g addtl-cov error %g\n’, ...
30        kcase,nbasis,nbins,u_error, R_error);
31end

The numerical results are in Table 1↓. Notice that the errors, in the last two columns on the right, are essentially round-off errors.

Table 1 Numerical validation of additional covariance formula

	u	u calc	u error	frac-R error
adipose	1.30e-07	1.30e-07	5.79e-17	4.44e-10
iodine	3.57e-03	3.57e-03	2.07e-16	5.31e-14

Conclusions

I have presented the mathematics behind the increase in covariance with dimensionality and new results for the multiplication constant u in the formulas. The results show that the more information the additional parameter provides, the smaller the additional covariance. The next post will show simulations that will quantify the additional covariance for two cases that are of medical interest: imaging fat in addition to soft-tissue and bone, and selectively imaging an externally administered contrast agent.

--Bob Alvarez

Last edited Aug 19, 2014

Linking is allowed but reposting or mirroring is expressly forbidden.

References

[1] R. E Alvarez, A. Macovski: “Energy-selective reconstructions in X-ray computerized tomography”, Phys. Med. Biol., pp. 733—44, 1976.

[2] Leonard A Lehmann, Robert E Alvarez: Energy Selective Radiography: A Review. Plenum Press, 1986.

[3] Michael E Phelps, Edward J Hoffman, Michel M Ter-Pogossian: “Attenuation Coefficients of Various Body Tissues, Fluids, and Lesions at Photon Energies of 18 to 136 keV”, Radiology, pp. 573—583, 1975.

[4] Adriaan van den Bos: Parameter Estimation for Scientists and Engineers. Wiley-Interscience, 2007.