2  Algebra and statistics embedded in the simplex

In the previous section, we reviewed useful algebra and statistics tools for studying compositional data, which often relied on the isometric logratio transformations.
In this section, we develop a more intrinsic point of view on compositional linear algebra and statistics, in preparation for Chapter 3, where we propose an ICS method adapted to compositional data.

Code
source("../functions.R")
plot_data(BDDSegXcomp, grid = FALSE, col = TRUE)
Figure 2.1: Pairwise ternary diagrams of the BDDSegX data set (compositions).

From now on, we transform the volumes from the data set BDDSegX into compositions, denoted by \(\mathtt{S_A}, \mathtt{S_B}, \mathtt{S_C}, \mathtt{S_D}\) and \(\mathtt{S_E}\), by using the closure operator. Note that when \(D=3\), compositional data can be plotted in a ternary diagram. When \(D>3\), as in the example BDDSegX where \(D=5\), we can plot a matrix of ternary diagrams by considering two coordinates at a time and aggregating the remaining coordinates of the composition (see Figure 2.1).

2.1 Linear algebra in the simplex

Since \(\mathcal{S}^D\) is canonically isometric to the \(\mathop{\mathrm{clr}}\)-space \(\mathop{\mathrm{\mathcal{H}}}\), we can naturally identify linear mappings on the simplex and on \(\mathop{\mathrm{\mathcal{H}}}\), associating to a mapping \(u \in \mathcal{L}\left(\mathcal{S}^D\right)\) the mapping: \[\label{eq:Psi} \Psi\left(u\right): \begin{array}{rcl} \mathop{\mathrm{\mathcal{H}}}& \to & \mathop{\mathrm{\mathcal{H}}}\\ \mathbf{y}& \mapsto & \mathop{\mathrm{clr}}\left(u\left(\mathop{\mathrm{clr}}^{-1}\left(\mathbf{y}\right)\right)\right) \end{array} \in \mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right)\]
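To see \(\Psi\) in action on a concrete linear mapping, here is a small numerical sketch (Python/NumPy rather than the chapter's R code; the `clr`, `clr_inv` helpers and the powering map are ad hoc illustrations): powering by a scalar \(\alpha\) (componentwise power followed by closure) is a linear mapping on the simplex, and its conjugate \(\Psi(u)\) acts on \(\mathop{\mathrm{\mathcal{H}}}\) as multiplication by \(\alpha\).

```python
import numpy as np

def clr(x):
    """Centered logratio transform of a single composition."""
    lx = np.log(x)
    return lx - lx.mean()

def clr_inv(y):
    """Inverse clr: exponentiate, then apply the closure operator."""
    e = np.exp(y)
    return e / e.sum()

# Powering (componentwise power followed by closure) is a linear map on S^D
alpha = 2.5
def u(x):
    xa = x ** alpha
    return xa / xa.sum()

x = np.array([0.2, 0.3, 0.5])
y = clr(x)
# Psi(u): y -> clr(u(clr_inv(y))); for powering, this is simply y -> alpha * y
assert np.allclose(clr(u(clr_inv(y))), alpha * y)
```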

To get a matrix description of the algebra \(\mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right)\), it is more convenient to see it as embedded into \(\mathcal{L}\left(\mathbb{R}^D\right)\), through the isomorphism: \[\begin{array}{rcl} \left\{v \in \mathcal{L}\left(\mathbb{R}^D\right)\,|\,v\left(\mathop{\mathrm{\mathcal{H}}}\right) \subseteq \mathop{\mathrm{\mathcal{H}}}, v(\mathbf{1}_D) = 0\right\} & \to & \mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right) \\ v & \mapsto & v\vert_{\mathop{\mathrm{\mathcal{H}}}} \end{array}\] because then the conditions to belong to \(\mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right)\) can be translated into matrix conditions (called ‘zero-sum property’ in (Pawlowsky-Glahn, Egozcue, and Tolosana-Delgado 2015)).
These intuitions can be summarized in the following theorem:

Theorem 2.1  

  • The algebra \(\mathcal{L}\left(\mathcal{S}^D\right)\) of linear mappings on \(\mathcal{S}^D\) is isomorphic to the algebra: \[\mathcal{M}= \left\{\mathop{\mathrm{A}}\in \mathbb{R}^{D \times D}\,|\,{\mathbf{1}_D}^\top \mathop{\mathrm{A}}= 0 \text{ and } \mathop{\mathrm{A}}\mathbf{1}_D = 0\right\}\] of \(D \times D\)-matrices satisfying the zero-sum property (row-wise and column-wise), called ‘\(\mathop{\mathrm{clr}}\)-matrices’, on which the identity element with respect to the multiplication is the matrix: \[\mathop{\mathrm{G}}_D = \mathop{\mathrm{I}}_D - \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\] and the multiplicative inverse of a rank \(D-1\) matrix \(\mathop{\mathrm{A}}\in \mathcal{M}\) is: \[\mathop{\mathrm{A}}^{-1} = \left(\mathop{\mathrm{A}}+ \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\right)^{-1} - \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top = \mathop{\mathrm{ilr}}_\mathcal{B}^{-1}\left({\mathop{\mathrm{ilr}}_\mathcal{B}\left(\mathop{\mathrm{A}}\right)}^{-1}\right) \in \mathcal{M}\] for any orthonormal basis \(\mathcal{B}\).

  • The following mapping is an isomorphism: \[\Phi: \begin{array}{rcl} \mathcal{M}& \to & \mathcal{L}\left(\mathcal{S}^D\right) \\ \mathop{\mathrm{A}}& \mapsto & \left(\mathbf{x}\mapsto \mathop{\mathrm{A}}\boxdot \, \mathbf{x}\right) \end{array}\] and its inverse is: \[\Phi^{-1}: \begin{array}{rcl} \mathcal{L}\left(\mathcal{S}^D\right) & \to & \mathcal{M}\\ u & \mapsto & \left(\mathop{\mathrm{clr}}\left(u\left(\varepsilon_1\right)\right), \dots, \mathop{\mathrm{clr}}\left(u\left(\varepsilon_D\right)\right)\right) \end{array}\] where \(\left(\varepsilon_1, \dots, \varepsilon_D\right)\) is the generating set of \(\mathcal{S}^D\) obtained when applying \(\mathop{\mathrm{clr}}^{-1}\) to the canonical basis of \(\mathbb{R}^D\).

Proof.

  • We prove that \(\Phi\) is an isomorphism by writing \(\Phi = \Phi_2 \circ \Phi_1\), thanks to the compatibility of the matrix-vector product on the simplex from Proposition 1.5, where \(\Phi_2=\Psi^{-1}\) (with \(\Psi\) defined at the beginning of Section 2.1) is clearly an algebra isomorphism and: \[\Phi_1: \begin{array}{rcl} \mathcal{M}& \to & \mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right) \\ \mathop{\mathrm{A}}& \mapsto & \left(\mathbf{x}\mapsto \mathop{\mathrm{A}}\, \mathbf{x}\right) \end{array}\] is well defined, because if \(\mathop{\mathrm{A}}\in \mathcal{M}\) and \(u = \Phi_1\left(\mathop{\mathrm{A}}\right) \in \mathcal{L}\left(\mathbb{R}^D\right)\), we have \(\mathop{\mathrm{A}}\mathbf{1}_D = \mathbf{0}_D\) so \(u\left(\mathbf{1}_D\right) = \mathbf{0}_D\), and \({\mathbf{1}_D}^\top \mathop{\mathrm{A}}= {\mathbf{0}_D}^\top\) so for every \(\mathbf{x}\in \mathbb{R}^D\), \(\left\langle \mathbf{1}_D, u\left(\mathbf{x}\right)\right\rangle = 0\), i.e. \(u\left(\mathbf{x}\right) \in {\mathbf{1}_D}^\perp = \mathop{\mathrm{\mathcal{H}}}\), so \(\Phi_1\left(\mathop{\mathrm{A}}\right) \in \mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right)\). Then \(\Phi_1\) is the restriction/corestriction of an injective algebra homomorphism, and the subspaces \(\mathcal{M}\) and \(\mathcal{L}\left(\mathop{\mathrm{\mathcal{H}}}\right)\) have the same dimension \(\left(D-1\right)^2\).
  • To prove the expression of \(\Phi^{-1}\), it is sufficient to write it as \(\Phi_1^{-1} \circ \Phi_2^{-1}\).

  • We have \(\Phi^{-1}\left(\mathop{\mathrm{Id}}_{\mathcal{S}^D}\right)= \mathop{\mathrm{G}}_D\).

  • If \(\mathop{\mathrm{A}}\in \mathcal{M}\) is a rank \(D-1\) \(\mathop{\mathrm{clr}}\)-matrix, we consider the linear mapping \(v \in \mathcal{L}\left(\mathbb{R}^D\right)\) associated with \(\mathop{\mathrm{A}}+ \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\), which stabilizes \(\mathop{\mathrm{\mathcal{H}}}\) and \(\mathbb{R}\mathbf{1}_D\), is invertible and satisfies \(v_{\vert \mathop{\mathrm{\mathcal{H}}}} = \Phi_1\left(\mathop{\mathrm{A}}\right)\) and \(v_{\vert \mathbb{R}\mathbf{1}_D} = \mathop{\mathrm{Id}}\), from which we deduce the formula.
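As a sanity check, the algebra structure of \(\mathcal{M}\) can be verified numerically. The sketch below (Python/NumPy rather than the R code used elsewhere in this chapter; all names are ad hoc) draws a random rank \(D-1\) \(\mathop{\mathrm{clr}}\)-matrix and checks the zero-sum property, that \(\mathop{\mathrm{G}}_D\) is the multiplicative identity, and the inverse formula with its \(\frac{1}{D}\)-scaled rank-one correction.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 5
ones = np.ones((D, 1))
G = np.eye(D) - ones @ ones.T / D            # G_D, identity element of M

# A random clr-matrix: project an arbitrary matrix onto M from both sides
A = G @ rng.standard_normal((D, D)) @ G

# zero-sum property, row-wise and column-wise
assert np.allclose(A.sum(axis=0), 0) and np.allclose(A.sum(axis=1), 0)
# G_D is the multiplicative identity on M
assert np.allclose(G @ A, A) and np.allclose(A @ G, A)

# multiplicative inverse within M (note the 1/D scaling of the rank-one term)
A_inv = np.linalg.inv(A + ones @ ones.T / D) - ones @ ones.T / D
assert np.allclose(A_inv @ A, G) and np.allclose(A @ A_inv, G)
```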

Remark.

  • The isomorphisms from the theorem are canonical, in the sense that they do not rely on a choice of orthonormal basis of \(\mathcal{S}^D\).

  • This theorem can be seen as a particular case of Theorem 11.3.10 in (Egozcue et al. 2011), in which the linear mappings are endomorphisms, but it is more precise: it gives algebra isomorphisms and a formula to compute the inverse of a rank \(D-1\) \(\mathop{\mathrm{clr}}\)-matrix.

  • One could as easily prove a formula such as, for every \(\mathop{\mathrm{A}}\in \mathcal{M}\): \[\det\left(\mathop{\mathrm{A}}^*\right) = \det\left(\Phi\left(\mathop{\mathrm{A}}\right)\right) = \det\left(\mathop{\mathrm{A}}+ \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\right)\]
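This determinant identity can also be checked numerically; the sketch below (Python/NumPy, with an ad hoc Helmert-type contrast matrix standing in for a generic orthonormal basis) compares \(\det\left(\mathop{\mathrm{A}}^*\right)\) with the determinant of the correspondingly completed \(D \times D\) matrix.

```python
import numpy as np

def helmert_contrast(D):
    """Columns: an orthonormal basis of the hyperplane orthogonal to 1_D."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0
        V[j, j - 1] = -float(j)
        V[:, j - 1] /= np.sqrt(j * (j + 1))
    return V

rng = np.random.default_rng(1)
D = 4
ones = np.ones((D, 1))
G = np.eye(D) - ones @ ones.T / D
A = G @ rng.standard_normal((D, D)) @ G       # random clr-matrix
V = helmert_contrast(D)                       # contrast matrix of an ilr basis
A_star = V.T @ A @ V                          # ilr-coordinate matrix A* of A

# det(A*) = det(A + (1/D) 1_D 1_D^T), independently of the chosen basis
assert np.isclose(np.linalg.det(A_star), np.linalg.det(A + ones @ ones.T / D))
```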

2.2 Whitening data in the simplex

Thanks to the formula for the inverse of a rank \(D-1\) \(\mathop{\mathrm{clr}}\)-matrix \(\mathop{\mathrm{A}}\in \mathcal{M}\): \[\mathop{\mathrm{A}}^{-1} = \left(\mathop{\mathrm{A}}+ \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\right)^{-1} - \frac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\] it is possible to compute the inverse square root \(\mathbb{V}\left[\mathbf{X}\right]^{-\frac{1}{2}}\) of the (co-)variance \(\mathop{\mathrm{clr}}\)-matrix, in order to whiten a compositional random variable admitting finite first two moments, without computing any isometric logratio transformation: \[\mathbf{Z}= \mathbb{V}\left[\mathbf{X}\right]^{-\frac{1}{2}} \boxdot \left(\mathbf{X}\ominus \mathbf{g}\left[\mathbf{X}\right]\right) = \mathop{\mathrm{clr}}^{-1}\left(\mathbb{V}\left[\mathop{\mathrm{clr}}\left(\mathbf{X}\right)\right]^{-\frac{1}{2}} \left(\mathop{\mathrm{clr}}\left(\mathbf{X}\right) - \mathbb{E}\left[\mathop{\mathrm{clr}}\left(\mathbf{X}\right)\right]\right)\right)\] with \(\mathbb{V}\left[\mathbf{Z}\right] = \mathop{\mathrm{G}}_D\) and \(\mathbf{g}\left[\mathbf{Z}\right]= \frac{1}{D} \mathbf{1}_D\).
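The whitening step can be carried out entirely in \(\mathop{\mathrm{clr}}\) coordinates. Below is a minimal Python/NumPy sketch (the `clr`/`clr_inv` helpers and the data-generation step are ad hoc, not from any package), computing the inverse square root within the algebra via an eigendecomposition that discards the null direction \(\mathbf{1}_D\), then checking that the whitened sample has \(\mathop{\mathrm{clr}}\)-covariance \(\mathop{\mathrm{G}}_D\) and geometric mean \(\frac{1}{D}\mathbf{1}_D\).

```python
import numpy as np

def clr(x):
    """Row-wise centered logratio transform."""
    lx = np.log(x)
    return lx - lx.mean(axis=1, keepdims=True)

def clr_inv(y):
    """Row-wise inverse clr: exponentiate, then apply the closure operator."""
    e = np.exp(y)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
D, n = 5, 2000
# ad hoc random compositions with a non-trivial covariance structure
X = clr_inv(rng.standard_normal((n, D)) @ rng.standard_normal((D, D)))

Y = clr(X)
Yc = Y - Y.mean(axis=0)                      # center in clr coordinates
S = np.cov(Yc, rowvar=False, bias=True)      # clr (co-)variance matrix, rank D-1

# inverse square root within M: eigendecompose and drop the null direction 1_D
w, U = np.linalg.eigh(S)                     # eigenvalues in ascending order
S_inv_sqrt = (U[:, 1:] / np.sqrt(w[1:])) @ U[:, 1:].T

Z = clr_inv(Yc @ S_inv_sqrt.T)               # whitened compositions

G = np.eye(D) - np.ones((D, D)) / D
assert np.allclose(np.cov(clr(Z), rowvar=False, bias=True), G, atol=1e-8)
assert np.allclose(clr(Z).mean(axis=0), 0, atol=1e-8)  # geometric mean = 1_D / D
```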

Code
plot_data(
  list(acompmargin(BDDSegXcomp, c("S_A", "S_B")), whiten(acompmargin(
    BDDSegXcomp, c("S_A", "S_B")
  ))),
  col = "Blues",
  grid = TRUE,
  cov_ellipse = TRUE,
  width = 6,
  height = 3
)
Figure 2.2: Original (left) and whitened (right) subcompositions of BDDSegX data set (compositions).

Example 2.1 Figure 2.2 illustrates the impact of whitening data in the simplex on the BDDSegX data set. We consider \(\mathtt{S_A}\) and \(\mathtt{S_B}\) and aggregate the other coordinates in order to get compositions in 3 dimensions. The mean (geometric mean) of the data points is plotted as a small circle while the scatter is represented by an ‘ellipse’ transformed back to the simplex.

2.3 Elliptical distributions on the simplex

In this subsection, we will define elliptical distributions on the simplex, generalizing the definition of the Gaussian distribution on the simplex in Theorem 6.4 by Pawlowsky-Glahn, Egozcue, and Tolosana-Delgado (2015).

Example 2.2 In Figure 2.3, we generated \(n=100\) observations following a Gaussian distribution in \(\mathcal{S}^3\) and plotted the associated ternary diagram with an ‘ellipse’ contour on the left plot. The same data are plotted on the right part of the figure after whitening.

Code
gaussiancomp <- rmvnormmix.acomp(100,
  mu = c(1, 6, 4),
  sigma = 0.1 * identity.acomp(3)
)$data
plot_data(
  list(gaussiancomp, whiten(acomp(gaussiancomp))),
  width = 7,
  height = 4,
  grid = TRUE
)
Figure 2.3: Gaussian data in \(\mathcal{S}^3\), before (left plot) and after whitening (right plot).

Proposition 2.1 Consider two orthonormal bases \(\mathcal{B}_1\) and \(\mathcal{B}_2\) of the simplex \(\mathcal{S}^D\), with associated contrast matrices \(\mathop{\mathrm{V}}_1\) and \(\mathop{\mathrm{V}}_2\) respectively, and a random composition \(\mathbf{X}\). For \(i\) in \(\left\{1,2\right\}\), let \({\mathbf{X}_i}^* = \mathop{\mathrm{ilr}}_i \left(\mathbf{X}\right)\) denote its orthonormal coordinates according to \(\mathcal{B}_i\). If \(\,{\mathbf{X}_1}^*\) follows a \(\left(\mathbf{\mu}_1, \Sigma_1\right)\)-elliptical distribution, then \({\mathbf{X}_2}^*\) follows a \(\left(\mathbf{\mu}_2, \Sigma_2\right)\)-elliptical distribution, where \(\mathbf{\mu}_2 = {\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1 \mathbf{\mu}_1\) and \(\Sigma_2 = {\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1 \Sigma_1 {\mathop{\mathrm{V}}_1}^\top \mathop{\mathrm{V}}_2\).

Proof. As we know that \(\mathop{\mathrm{ilr}}_2\left(\mathbf{x}\right) = {\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1 \mathop{\mathrm{ilr}}_1\left(\mathbf{x}\right)\), it is enough to prove that \({\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1\) is a \(\left(D-1\right) \times \left(D-1\right)\)-orthogonal matrix, because then we only need to perform an orthonormal change of basis in the expression of the distribution of \(\mathbf{X}\) to obtain the result.
Let us verify that it is indeed true: \[\begin{aligned} \left({\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1\right)^\top {\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1 & = {\mathop{\mathrm{V}}_1}^\top \mathop{\mathrm{V}}_2 {\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1 = {\mathop{\mathrm{V}}_1}^\top \left(\mathop{\mathrm{I}}_D - \tfrac{1}{D} \mathbf{1}_D {\mathbf{1}_D}^\top\right) \mathop{\mathrm{V}}_1 \\ & = {\mathop{\mathrm{V}}_1}^\top \mathop{\mathrm{V}}_1 - \tfrac{1}{D} {\mathop{\mathrm{V}}_1}^\top \mathbf{1}_D {\mathbf{1}_D}^\top \mathop{\mathrm{V}}_1 = \mathop{\mathrm{I}}_{D-1} \end{aligned}\] since \({\mathop{\mathrm{V}}_1}^\top \mathbf{1}_D\) is the vector of the sums of columns of \(\mathop{\mathrm{V}}_1\) and is equal to \(\mathbf{0}_{D-1}\).

Remark. In this proof, we use the fact that the matrix \({\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1\) is a \(\left(D-1\right) \times \left(D-1\right)\)-orthogonal matrix. This is natural, since a change of contrast matrix corresponds to a different choice of orthonormal basis of \(\mathcal{S}^D\) (which is isometric to \(\mathbb{R}^{D-1}\)), but it is also very helpful for proving this kind of \(\mathop{\mathrm{ilr}}\)-independence result.
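The key fact of the proof is easy to check numerically. The sketch below (Python/NumPy; the Helmert-type contrast matrix and its row-permuted copy are ad hoc choices of two distinct orthonormal bases) verifies that \({\mathop{\mathrm{V}}_2}^\top \mathop{\mathrm{V}}_1\) is orthogonal and that it maps one set of \(\mathop{\mathrm{ilr}}\) coordinates to the other.

```python
import numpy as np

def helmert_contrast(D):
    """Columns form an orthonormal basis of the hyperplane orthogonal to 1_D."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0
        V[j, j - 1] = -float(j)
        V[:, j - 1] /= np.sqrt(j * (j + 1))
    return V

D = 4
V1 = helmert_contrast(D)
V2 = V1[::-1]            # permuting the parts yields another valid contrast matrix

Q = V2.T @ V1            # change-of-basis matrix between the two ilr systems
assert np.allclose(Q.T @ Q, np.eye(D - 1))   # Q is (D-1)x(D-1) orthogonal

# ilr_2(x) = Q ilr_1(x) for any composition x
x = np.array([0.1, 0.2, 0.3, 0.4])
clr_x = np.log(x) - np.log(x).mean()
assert np.allclose(V2.T @ clr_x, Q @ (V1.T @ clr_x))
```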

This allows us to define the elliptical distributions as push-forward measures:

Definition 2.1 If \(\mu \in \mathcal{S}^D\) and \(\Sigma \in \mathcal{M}\), a random composition \(\mathbf{X}\) on \(\mathcal{S}^D\) is said to follow a \(\left(\mu, \Sigma\right)\)-elliptical distribution if the random vector \(\mathop{\mathrm{ilr}}\left(\mathbf{X}\right)\) (where \(\mathop{\mathrm{ilr}}\) is any isometric logratio transformation) follows a \(\left(\mu^*, \Sigma^*\right)\)-elliptical distribution on \(\mathbb{R}^{D-1}\), where \(\mu^* = \mathop{\mathrm{ilr}}\left(\mu\right)\) and \(\Sigma^* = \mathop{\mathrm{ilr}}\left(\Sigma\right)\) denote the corresponding \(\mathop{\mathrm{ilr}}\) coordinates of the parameters.

In particular the Gaussian distribution on the simplex \(\mathcal{N}_\mathcal{S}\left(\mu, \Sigma\right)\) is defined as the push-forward measure by any \(\mathop{\mathrm{ilr}}^{-1}\) of the Gaussian distribution \(\mathcal{N}\left(\mu^*, \Sigma^*\right)\).
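The push-forward definition translates directly into a sampler: draw \(\mathop{\mathrm{ilr}}\) coordinates from \(\mathcal{N}\left(\mu^*, \Sigma^*\right)\) and map them back with \(\mathop{\mathrm{ilr}}^{-1}\). A minimal Python/NumPy sketch (the `helmert_contrast` helper and the parameter values \(\mu^*\), \(\Sigma^*\) are purely illustrative):

```python
import numpy as np

def helmert_contrast(D):
    """Columns form an orthonormal basis of the hyperplane orthogonal to 1_D."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0
        V[j, j - 1] = -float(j)
        V[:, j - 1] /= np.sqrt(j * (j + 1))
    return V

rng = np.random.default_rng(4)
D = 3
V = helmert_contrast(D)                     # contrast matrix of an ilr basis
mu_star = np.array([0.5, -0.2])             # illustrative ilr-space center
L = np.array([[0.3, 0.0], [0.1, 0.2]])      # Cholesky factor of Sigma*

# draw from N(mu*, Sigma*) in ilr coordinates, then push forward with ilr^{-1}
Y = mu_star + rng.standard_normal((1000, D - 1)) @ L.T
X = np.exp(Y @ V.T)
X /= X.sum(axis=1, keepdims=True)           # closure: X lives in S^3

assert np.allclose(X.sum(axis=1), 1.0) and (X > 0).all()
# mapping back recovers the ilr coordinates: ilr(X) == Y
clr_X = np.log(X) - np.log(X).mean(axis=1, keepdims=True)
assert np.allclose(clr_X @ V, Y)
```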