icspylab package.
Workshop on Invariant Coordinate Selection and Related Methods
University of Helsinki
Doctoral Workshop on Decision Mathematics and Statistics
June 4, 2026
icspylab package.
icspylab package.
Tip
Both scatters differ in shape: \(\Sigma\) is less attracted by the small group than \(\Sigma_4\). ICS jointly diagonalizes the two in order to compare them and to reveal the direction from group 0 to group 1.
If \(P\) is a distribution over \(\mathbb R^p\) with expectation \(\mu\), we define
Invariant Coordinate Selection (ICS) relies on eigendecomposition of \(\Sigma^{-1} \Sigma_4\) to reveal interesting projections, understand the structure of \(P\) (clusters, outliers) (Tyler et al. 2009).
| Assumption | \(P\) is an elliptical location mixture distribution with \(k\) groups |
|---|---|
| Prior work | Becquart et al. (2026) |
| Objectives | 1. How to compute \(\Sigma_4\) explicitly? 2. Which (simple) equation do the eigenvalues of ICS satisfy? 3. Why are some eigenvalues small and other large? 4. When do the eigenvectors of ICS span the discriminant subspace? 5. Which groups are separated by which eigenvectors? |
| Tools | Affine geometry, Barycentric coordinates, Symbolic calculus |
Let \(P = \sum_{i=0}^{k-1} \pi_i P_i\) be a mixture distribution on \(\mathbb R^p\), where the group proportions are \[(\pi_0, \dots, \pi_{k-1})^\top \in \mathcal S^k = \{ \pi \in (\mathbb R_+^*)^k | \sum_{i=0}^{k-1} \pi_i = 1 \}.\]
If each \(P_i\) has mean \(\mu_i\) then the mean of \(P\) is \[\mu = \sum_{i=0}^{k-1} \pi_i \mu_i.\]
If moreover \(P_0, \dots, P_{k-1}\) share the same within-group covariance matrix \(W\), then the covariance matrix of \(P\) is \[\Sigma = W + B\] where \(B = \sum_{i=0}^{k-1} \pi_i \delta_i \delta_i^\top\) is the between-group covariance matrix that is based on the group directions \[\delta_i = \mu_i - \mu, 0 \leq i \leq k-1.\]
How to compute the fourth-order covariance matrix \(\Sigma_4\)?
Theorem 1 (Peña, Prieto, and Viladomat (2010)) If the \(P_i, 0 \leq i \leq k-1\) are elliptical distributions, then \[(p+2) \Sigma_4 = \alpha \Sigma + \sum_{0 \leq i,j \leq k-1} \beta_{ij} \delta_i \delta_j^\top \tag{1}\] where
Moreover, \(\alpha\) is an eigenvalue of \((p+2) \Sigma^{-1} \Sigma_4\) with multiplicity greater than or equal to \(p-r\), where \(r \leq k-1\) is the dimension of the affine space generated by the group means.
But…
Equation 1 is incorrect.
Theorem 2 (Peña, Prieto, and Viladomat (2010), corrected) If the \(P_i, 0 \leq i \leq k-1\) are elliptical distributions, then \[\begin{aligned} (p+2) \Sigma_4 &= \underbrace{\alpha \Sigma}_{\color{#6a4c93}{B = 0 \text{ (elliptical)}}} \\ &\quad + \underbrace{(\tilde k - 1) \operatorname{tr} (W \Sigma^{-1}) W + (\bar k - 2) W \Sigma^{-1} W}_{\color{#3a7d5c}{B \neq 0 \text{ and } W \neq 0 \text{ (non-Gaussian)}}} \\ &\quad - \underbrace{\operatorname{tr} (B \Sigma^{-1}) B + 2 B \Sigma^{-1} B}_{\color{#2d6a8a}{B \neq 0 \text{ and } W \neq 0}} \\ &\quad + \underbrace{\sum_{i=0}^{k-1} \pi_i \delta_i^\top \Sigma^{-1} \delta_i \delta_i \delta_i^\top}_{\color{#b8860b}{W = 0 \text{ (discrete)}}} \end{aligned}\] where \(\alpha = \tilde k p + (1 − \tilde k) \sum_{i=0}^{k-1} \pi_i \delta_i^\top \Sigma^{-1} \delta_i + \bar k\) and \(\tilde k, \bar k\) are some ellipticity parameters.
Moreover, \(\alpha\) is an eigenvalue of \((p+2) \Sigma^{-1} \Sigma_4\) with multiplicity greater than or equal to \(p-r\), where \(r \leq k-1\) is the dimension of the affine space generated by the group means.
Corollary 1 (Peña, Prieto, and Viladomat (2010), Gaussian) If the \(P_i, 0 \leq i \leq k-1\) are Gaussian distributions, then \[\begin{aligned} (p+2) \Sigma_4 &= \underbrace{(p+2) \Sigma}_{\color{#6a4c93}{B = 0 \text{ (elliptical)}}} \\ &\quad - \underbrace{\left( \operatorname{tr} (B \Sigma^{-1}) B + 2 B \Sigma^{-1} B \right)}_{\color{#2d6a8a}{B \neq 0 \text{ and } W \neq 0}} \\ &\quad + \underbrace{\sum_{i=0}^{k-1} \pi_i \delta_i^\top \Sigma^{-1} \delta_i \, \delta_i \delta_i^\top}_{\color{#b8860b}{W = 0 \text{ (discrete)}}} \end{aligned}\] Moreover, \(1\) is an eigenvalue of \(\Sigma^{-1} \Sigma_4\) with multiplicity greater than or equal to \(p-r\), where \(r \leq k-1\) is the dimension of the affine space generated by the group means.
But…
This formula relies on \(\Sigma^{-1}\) that is not explicit in the parameters \(W\), \((\mu_0, \dots, \mu_{k-1}), (\pi_1, \dots, \pi_{k-1})\) of \(P\).
From now on we assume that \(P\) is a discrete distribution on simplex vertices, so that
In that case, Mahalanobis inner products between the group directions \(\delta_j \Sigma^{-1} \delta_i\) become tractable, and we prove that the eigenvalues of \(\Sigma^{-1} \Sigma_4\) depend only on group proportions.
An edge case
Let us begin by proving a simple formula for \(\Sigma_4\).
Lemma 1 Let \(P = \sum_{i=0}^{k-1} \pi_i \delta_{\mu_i}\) be a discrete distribution on the simplex \((\mu_0, \dots, \mu_{k-1})\) of \(\mathbb R^{k-1}\) and with \((\pi_0, \dots, \pi_{k-1})^\top \in \mathcal S^k\). With \(\mu = \sum_{i=0}^{k-1} \pi_i \mu_i\) and \(\delta_i = \mu_i - \mu, 0 \leq i \leq k-1\) \[ \Sigma_4 = \frac1{k+1} \sum_{i=0}^{k-1} (1 - \pi_i) \delta_i \delta_i^\top \]
Now we can derive a simple rational equation depending only on proportions satisfied by the eigenvalues of \(\Sigma_4 \Sigma^{-1}\) and express their corresponding eigenvectors.
Theorem 3 Let \(P = \sum_{i=0}^{k-1} \pi_i \delta_{\mu_i}\) be a discrete distribution on the simplex \((\mu_0, \dots, \mu_{k-1})\) of \(\mathbb R^{k-1}\) and \(\lambda > 0\). Let us denote
Then \(\lambda\) is an eigenvalue of \((k+1) \Sigma_4 \Sigma^{-1}\) if and only if one of the following holds:
We can reunite both cases in Theorem 3 using the characteristic polynomial \[ \chi = \det(t \, I_{k-1} - (k+1) \Sigma^{-1} \Sigma_4) \] which is the unit polynomial whose roots are the eigenvalues of \((k+1) \Sigma^{-1} \Sigma_4\) (with multiplicities).
Corollary 2 Let \(P = \sum_{i=0}^{k-1} \pi_i \delta_{\mu_i}\) be a discrete distribution on the simplex \((\mu_0, \dots, \mu_{k-1})\) of \(\mathbb R^{k-1}\) and \(\alpha_i = \frac{1-\pi_i}{\pi_i}\) the odds against group \(i \in \{ 0, \dots, k-1 \}\). Let \(Q = \prod_{i=0}^{k-1} (t - \alpha_i)\) and \(\chi = \det(t \, I_{k-1} - (k+1) \Sigma^{-1} \Sigma_4)\). Then \[ \begin{aligned} \chi &= \frac{Q + Q'}{t+1} = Q(t) \sum_{i=0}^{k-1} \frac{\pi_i}{t - \alpha_i} \\ &= \tfrac1{\prod_{i=0}^{k-1} \pi_i} \sum_{i=0}^k (-1)^{k-i} (t+1)^{i - 2} (t + 1 + i) e_i(\pi_0, \dots, \pi_{k-1}) \end{aligned} \] using the elementary symmetric polynomials \[e_i(\pi_0, \dots, \pi_{k-1}) = \sum_{0\leq j_1<j_2<\cdots<j_i\leq k-1} \pi_{j_1}\pi_{j_2}\dotsm \pi_{j_i}, 0 \leq i \leq k.\]
Theorem 4 (Tyler et al. (2009)) If \(P\) is a mixture of \(k\) elliptical distributions with same within-group covariance, there exists a noise eigenvalue that has multiplicity at least \(p-r\) where \(r \leq k-1\) is the dimension of the affine space generated by the group means. If its multiplicity is minimal, then the remaining eigenvectors span the discriminant subspace.
Step 1: Identifying the noise eigenvalue. For \((\Sigma, (k+1) \Sigma_4)\), Theorem 2 shows that it is \(\alpha\), which corresponds to the proportionality constant when \(B=0\) (elliptical case).
Step 2: Characterising the parameters for which the multiplicity of this eigenvalue is minimal. In the case of a discrete distribution on a simplex, this means that \(k+1\) is not an eigenvalue of \((k+1)\Sigma_4 \Sigma^{-1}\), i.e. \[\chi (k+1) \neq 0\]
\[6 \pi_{0} \pi_{1} - 1 \neq 0 \iff \pi_0 \notin \{ \tfrac12 - \tfrac{\sqrt{3}}6, \tfrac12 + \tfrac{\sqrt{3}}6 \} \approx \{ 0.2113, 0.7887 \}.\]
\[40 \pi_{0} \pi_{1} \pi_{2} - 7 \pi_{0} \pi_{1} - 7 \pi_{0} \pi_{2} - 7 \pi_{1} \pi_{2} + 1\neq 0 \]
//
// === Build callable det functions from the sympy-derived strings ===
//
math = {
const args = "pi1, pi2, m1x, m1y, m2x, m2y, w00, w01, w11";
return {
N00: new Function(args, "return " + N00_src + ";"),
N01: new Function(args, "return " + N01_src + ";"),
N11: new Function(args, "return " + N11_src + ";"),
Ddet: new Function(args, "return " + D_src + ";")
};
}//
// === Single d3 panel: centers triangle + zero curve inside ===
//
// Important pattern for buttons + reactive update:
// We expose a global `window.__icsRender()` from this cell.
// The buttons cell mutates window.__icsState then calls __icsRender().
// This avoids the OJS-viewof-event-routing tangle.
//
board = {
const MU0 = [-0.5, -Math.sqrt(3)/6];
const DEFAULT_MU1 = [0.5, -Math.sqrt(3)/6];
const DEFAULT_MU2 = [0.0, Math.sqrt(3)/3];
// State lives on window so the buttons cell can mutate it directly.
window.__icsState = window.__icsState || {
mu1: [...DEFAULT_MU1],
mu2: [...DEFAULT_MU2],
covA: 0, covB: 0, covTheta: 0,
showSign: true
};
const s = window.__icsState;
// Keep DEFAULT_* accessible to the buttons cell.
window.__icsDefaults = { MU0, DEFAULT_MU1, DEFAULT_MU2,
DEFAULT_COV: { a: 0.18, b: 0.18, theta: 0 } };
// --- Set up SVG (idempotent: clear container first in case of re-run) ---
const container = d3.select("#ics-board");
container.selectAll("*").remove();
const W = 720, H = 520;
const svg = container.append("svg")
.attr("viewBox", [0, 0, W, H])
.attr("style", "width: min(75vh, 75vw); display: block; margin: 0 auto; max-width:900px; background:#fff; border:1px solid #d8d4c4; user-select:none;");
const x = d3.scaleLinear().domain([-1.7, 1.7]).range([60, W - 60]);
const y = d3.scaleLinear().domain([-1.2, 1.2]).range([H - 30, 30]);
// Background grid
for (let v = -1.5; v <= 1.5 + 1e-9; v += 0.5) {
if (Math.abs(v) < 1e-9) continue;
svg.append("line").attr("x1", x(v)).attr("x2", x(v))
.attr("y1", y(-1.15)).attr("y2", y(1.15))
.attr("stroke", "#e8e4d4").attr("stroke-width", 1);
}
for (let v = -1.0; v <= 1.0 + 1e-9; v += 0.5) {
if (Math.abs(v) < 1e-9) continue;
svg.append("line").attr("y1", y(v)).attr("y2", y(v))
.attr("x1", x(-1.65)).attr("x2", x(1.65))
.attr("stroke", "#e8e4d4").attr("stroke-width", 1);
}
svg.append("line").attr("x1", x(-1.65)).attr("x2", x(1.65))
.attr("y1", y(0)).attr("y2", y(0))
.attr("stroke", "#1a1a1a").attr("opacity", 0.15);
svg.append("line").attr("x1", x(0)).attr("x2", x(0))
.attr("y1", y(-1.15)).attr("y2", y(1.15))
.attr("stroke", "#1a1a1a").attr("opacity", 0.15);
// Layers (drawn in order)
const gShade = svg.append("g");
const gCurve = svg.append("g");
const gTriangle = svg.append("g");
const gEllipses = svg.append("g");
const gHandles = svg.append("g");
// --- Math: Σ_W matrix + det evaluator ---
function sigmaW() {
const c = Math.cos(s.covTheta), si = Math.sin(s.covTheta);
return {
w00: c*c*s.covA**2 + si*si*s.covB**2,
w01: c*si*(s.covA**2 - s.covB**2),
w11: si*si*s.covA**2 + c*c*s.covB**2
};
}
function detAt(pi1, pi2) {
if (pi1 < 0 || pi2 < 0 || pi1 + pi2 > 1) return NaN;
const m1x = s.mu1[0] - MU0[0], m1y = s.mu1[1] - MU0[1];
const m2x = s.mu2[0] - MU0[0], m2y = s.mu2[1] - MU0[1];
const {w00, w01, w11} = sigmaW();
const D = math.Ddet(pi1, pi2, m1x, m1y, m2x, m2y, w00, w01, w11);
if (Math.abs(D) < 1e-14) return NaN;
const n00 = math.N00(pi1, pi2, m1x, m1y, m2x, m2y, w00, w01, w11);
const n01 = math.N01(pi1, pi2, m1x, m1y, m2x, m2y, w00, w01, w11);
const n11 = math.N11(pi1, pi2, m1x, m1y, m2x, m2y, w00, w01, w11);
return n00 * n11 - n01 * n01;
}
function bary(pi1, pi2) {
const pi0 = 1 - pi1 - pi2;
return [
pi0*MU0[0] + pi1*s.mu1[0] + pi2*s.mu2[0],
pi0*MU0[1] + pi1*s.mu1[1] + pi2*s.mu2[1]
];
}
// --- Render loop: rebuilds shading, curve, triangle, ellipses, handles ---
function render() {
const N = 140;
const F = new Float64Array((N+1)*(N+1));
const inside = new Uint8Array((N+1)*(N+1));
for (let j = 0; j <= N; j++) {
for (let i = 0; i <= N; i++) {
const pi1 = i/N, pi2 = j/N;
const idx = j*(N+1)+i;
if (pi1 + pi2 <= 1.0 + 1e-9) {
const v = detAt(pi1, pi2);
F[idx] = v;
inside[idx] = isNaN(v) ? 0 : 1;
}
}
}
// Sign shading
gShade.selectAll("*").remove();
if (s.showSign) {
const posPath = [], negPath = [];
for (let j = 0; j < N; j++) {
for (let i = 0; i < N; i++) {
const i00 = j*(N+1)+i, i10 = j*(N+1)+i+1, i01 = (j+1)*(N+1)+i, i11 = (j+1)*(N+1)+i+1;
if (inside[i00] && inside[i10] && inside[i01]) {
const avg = (F[i00] + F[i10] + F[i01]) / 3;
const w0 = bary(i/N, j/N), w1 = bary((i+1)/N, j/N), w2 = bary(i/N, (j+1)/N);
const seg = `M${x(w0[0])},${y(w0[1])}L${x(w1[0])},${y(w1[1])}L${x(w2[0])},${y(w2[1])}Z`;
(avg >= 0 ? posPath : negPath).push(seg);
}
if (inside[i10] && inside[i11] && inside[i01]) {
const avg = (F[i10] + F[i11] + F[i01]) / 3;
const w0 = bary((i+1)/N, j/N), w1 = bary((i+1)/N, (j+1)/N), w2 = bary(i/N, (j+1)/N);
const seg = `M${x(w0[0])},${y(w0[1])}L${x(w1[0])},${y(w1[1])}L${x(w2[0])},${y(w2[1])}Z`;
(avg >= 0 ? posPath : negPath).push(seg);
}
}
}
if (posPath.length) gShade.append("path").attr("d", posPath.join("")).attr("fill", "#bd5d3a").attr("fill-opacity", 0.42);
if (negPath.length) gShade.append("path").attr("d", negPath.join("")).attr("fill", "#3d6478").attr("fill-opacity", 0.42);
}
// Zero curve
gCurve.selectAll("*").remove();
const segs = [];
function interp(pa, va, pb, vb) {
const t = va / (va - vb);
return [pa[0] + t*(pb[0]-pa[0]), pa[1] + t*(pb[1]-pa[1])];
}
function trace(pa, va, pb, vb, pc, vc) {
const code = (va>0?1:0) | (vb>0?2:0) | (vc>0?4:0);
if (code === 0 || code === 7) return;
let q1, q2;
if (code === 1 || code === 6) { q1 = interp(pa, va, pb, vb); q2 = interp(pa, va, pc, vc); }
else if (code === 2 || code === 5) { q1 = interp(pb, vb, pa, va); q2 = interp(pb, vb, pc, vc); }
else { q1 = interp(pc, vc, pa, va); q2 = interp(pc, vc, pb, vb); }
segs.push(`M${x(q1[0])},${y(q1[1])}L${x(q2[0])},${y(q2[1])}`);
}
for (let j = 0; j < N; j++) {
for (let i = 0; i < N; i++) {
const i00 = j*(N+1)+i, i10 = j*(N+1)+i+1, i01 = (j+1)*(N+1)+i, i11 = (j+1)*(N+1)+i+1;
if (inside[i00] && inside[i10] && inside[i01])
trace(bary(i/N,j/N), F[i00], bary((i+1)/N,j/N), F[i10], bary(i/N,(j+1)/N), F[i01]);
if (inside[i10] && inside[i11] && inside[i01])
trace(bary((i+1)/N,j/N), F[i10], bary((i+1)/N,(j+1)/N), F[i11], bary(i/N,(j+1)/N), F[i01]);
}
}
if (segs.length) gCurve.append("path").attr("d", segs.join("")).attr("fill", "none").attr("stroke", "#1a1a1a").attr("stroke-width", 1.6);
// Triangle outline
gTriangle.selectAll("*").remove();
gTriangle.append("path")
.attr("d", `M${x(MU0[0])},${y(MU0[1])}L${x(s.mu1[0])},${y(s.mu1[1])}L${x(s.mu2[0])},${y(s.mu2[1])}Z`)
.attr("fill", "none").attr("stroke", "#1a1a1a").attr("stroke-width", 1.4);
// Σ_W ellipses at each center
gEllipses.selectAll("*").remove();
const scale = x(1) - x(0);
for (const [cx, cy] of [MU0, s.mu1, s.mu2]) {
gEllipses.append("ellipse")
.attr("cx", x(cx)).attr("cy", y(cy))
.attr("rx", s.covA * scale).attr("ry", s.covB * scale)
.attr("transform", `rotate(${-s.covTheta*180/Math.PI} ${x(cx)} ${y(cy)})`)
.attr("fill", "rgba(122,31,31,0.06)").attr("stroke", "#7a1f1f").attr("stroke-width", 1.2);
}
// Handles
const c = Math.cos(s.covTheta), si = Math.sin(s.covTheta);
const a1 = [MU0[0] + s.covA*c, MU0[1] + s.covA*si];
const a2 = [MU0[0] - s.covB*si, MU0[1] + s.covB*c];
gHandles.selectAll("*").remove();
gHandles.append("circle").attr("cx", x(MU0[0])).attr("cy", y(MU0[1])).attr("r", 4).attr("fill", "#1a1a1a");
gHandles.append("text").attr("x", x(MU0[0]) + 11).attr("y", y(MU0[1]) - 9)
.attr("font-family", "serif").attr("font-style", "italic").attr("font-size", 15).text("μ₀");
const handles = [
{ type: "center", which: 1, pos: s.mu1, label: "μ₁" },
{ type: "center", which: 2, pos: s.mu2, label: "μ₂" },
{ type: "covA", pos: a1 },
{ type: "covB", pos: a2 }
];
const hg = gHandles.selectAll("g.h").data(handles).join("g").attr("class", "h")
.attr("transform", d => `translate(${x(d.pos[0])}, ${y(d.pos[1])})`)
.style("cursor", "grab");
hg.append("circle").attr("r", 18).attr("fill", "transparent");
hg.filter(d => d.type === "center").append("circle").attr("r", 7)
.attr("fill", "#fff").attr("stroke", "#1a1a1a").attr("stroke-width", 1.5);
hg.filter(d => d.type === "covA").append("rect").attr("x", -6).attr("y", -6).attr("width", 12).attr("height", 12)
.attr("fill", "#fff").attr("stroke", "#7a1f1f").attr("stroke-width", 1.4);
hg.filter(d => d.type === "covB").append("polygon").attr("points", "0,-7 -6,5 6,5")
.attr("fill", "#fff").attr("stroke", "#7a1f1f").attr("stroke-width", 1.4);
hg.filter(d => d.label).append("text").attr("x", 11).attr("y", -9)
.attr("font-family", "serif").attr("font-style", "italic").attr("font-size", 15).text(d => d.label);
hg.call(d3.drag()
.on("drag", (event, d) => {
const wx = x.invert(event.x), wy = y.invert(event.y);
if (d.type === "center") {
const t = d.which === 1 ? s.mu1 : s.mu2;
t[0] = Math.max(-1.6, Math.min(1.6, wx));
t[1] = Math.max(-1.15, Math.min(1.15, wy));
} else if (d.type === "covA") {
const dx = wx - MU0[0], dy = wy - MU0[1];
const r = Math.hypot(dx, dy);
if (r > 0.01) { s.covA = Math.min(1.0, r); s.covTheta = Math.atan2(dy, dx); }
} else if (d.type === "covB") {
const dx = wx - MU0[0], dy = wy - MU0[1];
const perp = -Math.sin(s.covTheta)*dx + Math.cos(s.covTheta)*dy;
s.covB = Math.min(1.0, Math.max(0.01, Math.abs(perp)));
}
render();
})
);
// Readout
const f = v => v.toFixed(3);
const w = sigmaW();
document.getElementById("ics-readout").innerHTML =
`μ₁ = (${f(s.mu1[0])}, ${f(s.mu1[1])}) · μ₂ = (${f(s.mu2[0])}, ${f(s.mu2[1])}) · ` +
`Σ_W axes = (${f(s.covA)}, ${f(s.covB)}) at ${(s.covTheta*180/Math.PI).toFixed(1)}°`;
}
// Expose the render function globally so the buttons cell can call it.
window.__icsRender = render;
// Wire buttons here too (idempotent: replace any previous handlers).
const D = window.__icsDefaults;
document.getElementById("b-equil").onclick = () => {
s.mu1 = [...D.DEFAULT_MU1]; s.mu2 = [...D.DEFAULT_MU2]; render();
};
document.getElementById("b-align").onclick = () => {
s.mu1 = [D.MU0[0] + 1.0, D.MU0[1]];
s.mu2 = [D.MU0[0] - 0.6, D.MU0[1]]; render();
};
document.getElementById("b-circ").onclick = () => {
s.covA = D.DEFAULT_COV.a; s.covB = D.DEFAULT_COV.b; s.covTheta = D.DEFAULT_COV.theta;
render();
};
document.getElementById("b-zero").onclick = () => {
s.covA = 0.001; s.covB = 0.001; s.covTheta = 0; render();
};
document.getElementById("b-sign").onchange = (e) => {
s.showSign = e.target.checked; render();
};
render();
return svg.node();
}Since \(P = \sum_i \pi_i \delta_{\mu_i}\), expanding the definition of \(\Sigma_4\) gives: \[(k+1)\Sigma_4 = \sum_{i=0}^{k-1} \pi_i (\delta_i^\top \Sigma^{-1} \delta_i) \, \delta_i \delta_i^\top.\] It remains to compute the Mahalanobis distances \(\delta_i^\top \Sigma^{-1} \delta_i, 0 \leq i \leq k-1\) between the group means and the global mean. Since \(\Sigma = \sum_{i=0}^{k-1} \pi_i \delta_i \delta_i^\top\) and \(\sum_{i=0}^{k-1} \pi_i \delta_i = 0\), then for any \(0 \leq j \leq k-1\): \[\delta_j = \Sigma \Sigma^{-1} \delta_j = \sum_{i=0}^{k-1} \pi_i \left( \delta_i^\top \Sigma^{-1}\delta_j + 1 \right) \delta_i, \qquad \sum_{i=0}^{k-1} \pi_i \bigl(\delta_i^\top \Sigma^{-1}\delta_j + 1\bigr) = 1.\] By uniqueness of the barycentric coordinates of \(\delta_j\) in the affine basis \((\delta_0, \dots, \delta_{k-1})\), we get \[\pi_j(\delta_j^\top \Sigma^{-1}\delta_j + 1) = 1, \qquad (k+1)\Sigma_4 = \sum_{i=0}^{k-1}(1-\pi_i)\,\delta_i\delta_i^\top.\]
Let \(M_\lambda = (k+1) (\Sigma_4 \Sigma^{-1} - \lambda I_{k-1})\), \(E_\lambda = \ker M_\lambda\), \(F_\lambda = \ker(\Sigma^{-1}\Sigma_4 - \lambda I_{k-1}) = \Sigma^{-1} E_\lambda\), \(\beta_i = \alpha_i - (k+1)\lambda\) and \(\pi_\lambda = \frac1{(k+1)\lambda+1}\) for \(0 \leq i \leq k-1\). Note that \(\beta_i = 0 \iff \pi_i = \pi_\lambda\). Let \(v = \sum_{i=0}^{k-1} v_i \delta_i \in \mathbb R^{k-1}\) with \(\sum_{i=0}^{k-1} v_i = 1\) (barycentric coordinates). From the proof of Lemma 1, \(\delta_j^\top \Sigma^{-1} \delta_i = \frac{\delta_{ij} - \pi_j}{\pi_j}\) for \(0 \leq i,j \leq k-1\), so \[(k+1)\Sigma_4\Sigma^{-1}\,\delta_i = \sum_{j=0}^{k-1} (1-\pi_j)\frac{\delta_{ij} - \pi_j}{\pi_j}\,\delta_j = \alpha_i\,\delta_i - \sum_{j=0}^{k-1} \delta_j, \quad M_\lambda v = \sum_{j=0}^{k-1} (\beta_j v_j - 1) \delta_j.\] Since \(\sum_{i=0}^{k-1} \pi_i \delta_i = 0\) we have \[v \in E_\lambda \iff \exists\, t \in \mathbb R, \; \forall\, 0 \leq i \leq k-1, \quad \beta_i v_i = 1 - t\frac{\pi_i}{\pi_\lambda}.\] Set \(h_i = 1/\beta_i\) when \(i \notin I_\lambda\). Since \(\pi_i \beta_i = 1 - \pi_i/\pi_\lambda\), we get \(\pi_i h_i = \pi_\lambda(h_i - \pi_i)\), so \[v \in E_\lambda \iff \exists\, t \in \mathbb R, \quad \begin{cases} 1 - t\frac{\pi_i}{\pi_\lambda} = 0 & \text{for } i \in I_\lambda \\ v_i = (1-t)\,h_i + t\,\pi_i & \text{for } i \notin I_\lambda \end{cases} \qquad (\star)\]
Case a. \(I_\lambda = \emptyset\). Set \(h = \sum_{i=0}^{k-1} h_i \delta_i\). Summing \((\star)\) over \(0 \leq i \leq k-1\) and using \(\sum_{i=0}^{k-1} v_i = 1\), \(\sum_{i=0}^{k-1} \pi_i = 1\) gives \[(1-t)\Bigl(\sum_{i=0}^{k-1} h_i - 1\Bigr) = 0 \qquad (\star\star)\] so \((\star\star) \iff t = 1\) or \(\sum_{i=0}^{k-1} h_i = 1\). If \(t = 1\), then \(v_i = \pi_i\) for \(0 \leq i \leq k-1\), so \(v = \sum_{i=0}^{k-1} \pi_i \delta_i = 0\), which is trivial. Therefore \[E_\lambda \neq \{0\} \iff \sum_{i=0}^{k-1} h_i = 1 \iff \sum_{i=0}^{k-1} \frac{1}{\alpha_i - (k+1)\lambda} = 1\] and then \(E_\lambda = \operatorname{span}(h)\), so \(\lambda\) is simple. We have \(F_\lambda = \operatorname{span}(\Sigma^{-1} h)\) with \(\delta_i^\top \Sigma^{-1} h = (h_i - \pi_i)/\pi_i = h_i/\pi_\lambda\) for \(0 \leq i \leq k-1\).
Case b. \(I_\lambda \neq \emptyset\). Then \((\star)\) gives \(t = 1\), so \[v \in E_\lambda \iff v_i = \pi_i \text{ for } i \notin I_\lambda \text{ and } \sum_{i \in I_\lambda} v_i = 1 - \sum_{i \notin I_\lambda} \pi_i.\] Setting \(c_i = v_i - \pi_i\) for \(i \in I_\lambda\), the constraint becomes \(\sum_{i \in I_\lambda} c_i = 0\) and \(v = \sum_{i \in I_\lambda} c_i\,\delta_i\). Hence \(E_\lambda = \bigl\{\sum_{i \in I_\lambda} c_i\,\delta_i | \sum_{i \in I_\lambda} c_i = 0\bigr\}\), so \(E_\lambda \neq \{0\} \iff |I_\lambda| \geq 2\), and then \(\dim E_\lambda = |I_\lambda| - 1\). Since \(x \in F_\lambda \iff \Sigma x \in E_\lambda\) and the barycentric coordinate of \(\Sigma x\) on \(\delta_i\) is \(\pi_i(\delta_i^\top x + 1)\), we get \(x \in F_\lambda \iff \delta_i^\top x = 0\) for \(i \notin I_\lambda\), i.e. \(F_\lambda = \{\delta_i, i \notin I_\lambda\}^\perp\).
Let \(r\) be the number of distinct roots of \(Q\) and \(\operatorname{rad}(Q)\) the unit polynomial of degree \(r\) whose roots are those of \(Q\). Then \(\frac{Q}{\operatorname{rad}(Q)} = \gcd(Q,Q')\) is the unit polynomial of degree \(k-r\) obtained by lowering the multiplicities of the roots of \(Q\) by 1. The Theorem 3 allows us to write \(\chi = \frac{Q}{\operatorname{rad}(Q)} \cdot \frac{H \circ u}{(k+1)^{k-1} u}\) where \(H\) is the unit polynomial of degree \(r-1\) whose roots are the solutions of \(\frac{Q'}{Q} \circ u(\lambda) + 1 = 0\) with multiplicity 1.
Furthermore, we have \(Q + Q' = \gcd(Q,Q') R\) where \(\gcd(Q,R) = 1\) and \(R\) is a unit polynomial of degree \(r\). Since \(Q(0) = (-1)^k \prod_{i=0}^{k-1} \frac1{\pi_i} > 0\) and \(Q'(0) = (-1)^{k-1} \sum_{i=0}^{k-1} \prod_{j \neq i} \frac1{\pi_j} = - \sum_{i=0}^{k-1} \pi_i Q(0) = - Q(0)\), then \(t\) divides \(Q + Q'\), so \(t\) divides \(R\). It suffices to show that \(\frac{R}{t} = H\). Since both polynomials are unital of degree \(r-1\), it suffices to verify that every root of \(H\) is a root of \(\frac{R}{t}\). Now, if \(v\) is a root of \(H\), it satisfies \(Q(v) \neq 0\) and \(\frac{Q'}{Q}(v) + 1 = 0\), so \(Q(v) + Q'(v) = 0\) but \(v \gcd(Q,Q')(v) \neq 0\), so \(\frac{R(v)}v = 0\). The second equality follows from \(Q = \sum_{i=0}^{k-1} (-1)^{k-1-i} t^i e_{k-1-i}(\pi_0^{-1}, \dots, \pi_{k-1}^{-1})\) and from \(e_{k-1-i}(\pi_0^{-1}, \dots, \pi_{k-1}^{-1}) = \frac1{\prod_{i=0}^{k-1} \pi_i} e_i (\pi_0, \dots, \pi_{k-1})\).
Mondon & Becquart — ICS and the discriminant subspace using FOBI