Figure 2: Daily maximum temperature per meteorological station in Occitanie between 1950 and 2025. Three days chosen at random are extracted from the data set.
Total size: 300 stations \(\times\) 65 years \(\times\) 365 days > 4 million observations.
…through distributional time series…
Code
library(dplyr)
library(ggplot2)

# One density per year, one facet per decade
oct |>
  filter(name == "TOULOUSE-BLAGNAC") |>
  plot_funs(tx, color = year) +
  facet_wrap(vars(10 * year %/% 10)) +
  labs(x = "Temperature", y = "Density")
Figure 3: Annual densities of daily maximum temperature for the TOULOUSE-BLAGNAC station, grouped by decade.
…to trend slope densities
Interpretation: using odds ratios defined by Maier et al. (2025).
Figure 6: Empirical estimates of the scatter operators \(\color{#0072b2}{\operatorname{Cov}} [X]\) and \(\color{#d55e00}{\operatorname{Cov}_4} [X]\) for the data set of slope densities of temperature change in Occitanie.
Theory
\(w\)-weighted covariance operator: \[\begin{multline*}\langle \operatorname{Cov}_w [X] x, y \rangle \\
= \mathbb E [ w^2(\| \operatorname{Cov}[X]^{-1/2} (X - \mathbb E X) \|) \\
\langle X - \mathbb EX, x \rangle \langle X - \mathbb EX, y \rangle ]
\end{multline*}\] for \((x,y) \in E^2\).
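The definition above can be sketched empirically in R. This is a minimal illustration, assuming \(E = \mathbb R^p\) with its usual inner product and observations stored as the rows of a matrix; the helper name `cov_w` and the simulated data are hypothetical and not part of the talk. With \(w(d) = d\), the result corresponds to \(\operatorname{Cov}_4\) up to a normalising constant (commonly \(1/(p+2)\), an assumption here since the slides do not state it).

```r
# Hypothetical helper (not from the talk): empirical w-weighted covariance
# of the rows of an n x p matrix X, with E = R^p and its usual inner product.
cov_w <- function(X, w) {
  X <- as.matrix(X)
  n <- nrow(X)
  Xc <- sweep(X, 2, colMeans(X))               # X - E[X], row by row
  S <- crossprod(Xc) / n                       # empirical Cov[X] (1/n version)
  d <- sqrt(rowSums((Xc %*% solve(S)) * Xc))   # ||Cov[X]^{-1/2} (X - E[X])||
  crossprod(w(d) * Xc) / n                     # mean of w(d)^2 (X - EX)(X - EX)^T
}

set.seed(1)
X <- matrix(rnorm(500 * 3), ncol = 3)                  # simulated data, for illustration
cov_plain  <- cov_w(X, function(d) rep(1, length(d)))  # w = 1 gives back Cov[X]
cov_fourth <- cov_w(X, function(d) d)                  # w(d) = d: fourth-moment weights
```

Taking \(w \equiv 1\) recovers the ordinary covariance, which gives a quick sanity check of the helper.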
Figure 10: Mean slope density across Occitanie, shifted by each ICS dual eigendensity.
Theory
Proposition 1 (Reconstruction formula) \[X = \mathbb EX + \sum_{j=1}^p z_j h^*_j\] where \[\begin{aligned}
H^* &= (h^*_j)_{1 \leq j \leq p} \\
&= (\color{#0072b2}{S_1} [X] h_j)_{1 \leq j \leq p}
\end{aligned}\] is the dual basis of \(H=(h_j)_{1 \leq j \leq p}\).
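Proposition 1 can be checked numerically. The sketch below assumes \(E = \mathbb R^p\), takes \(S_1 = \operatorname{Cov}\) and \(S_2 = \operatorname{Cov}_w\) with \(w(d) = d\), and solves the generalised eigenproblem \(S_2 h = \lambda S_1 h\) by whitening with \(S_1^{-1/2}\). The simulated data and all variable names are illustrative assumptions, not the author's implementation.

```r
set.seed(42)
n <- 200; p <- 3
X  <- matrix(rexp(n * p), ncol = p)          # simulated, skewed data
mu <- colMeans(X)
Xc <- sweep(X, 2, mu)                        # centred observations

S1 <- crossprod(Xc) / n                      # first scatter: Cov[X]
d2 <- rowSums((Xc %*% solve(S1)) * Xc)       # squared Mahalanobis norms
S2 <- crossprod(sqrt(d2) * Xc) / n           # second scatter: Cov_w with w(d) = d

# Solve S2 h = lambda S1 h with H^T S1 H = I, via whitening by S1^{-1/2}
e1 <- eigen(S1, symmetric = TRUE)
S1_inv_sqrt <- e1$vectors %*% diag(1 / sqrt(e1$values)) %*% t(e1$vectors)
e  <- eigen(S1_inv_sqrt %*% S2 %*% S1_inv_sqrt, symmetric = TRUE)
H  <- S1_inv_sqrt %*% e$vectors              # columns are the h_j

Z      <- Xc %*% H                           # scores z_j = <X - EX, h_j>
H_star <- S1 %*% H                           # dual basis h*_j = S1 h_j
X_rec  <- sweep(Z %*% t(H_star), 2, mu, "+") # EX + sum_j z_j h*_j

rec_error  <- max(abs(X_rec - X))                     # reconstruction gap
dual_error <- max(abs(t(H) %*% H_star - diag(p)))     # <h_i, h*_j> - delta_ij
```

Both `rec_error` and `dual_error` should be zero up to floating-point error, which reflects the two statements of the proposition: the reconstruction itself and the duality between \(H\) and \(H^*\).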
A link between ICS problems
Proposition 2 Let \((E, \langle \cdot, \cdot \rangle_E) \overset{\varphi}{\rightarrow} (F, \langle \cdot, \cdot \rangle_F)\) be an isometry between two Euclidean spaces of dimension \(p\) and \(\color{#0072b2}{w_1}, \color{#d55e00}{w_2} : \mathbb R^+ \rightarrow \mathbb R\) two measurable functions. For any integrable \(E\)-valued random object \(X\) and for \(\ell \in \{ 1, 2 \}\)\[
\operatorname{Cov}_{w_\ell}^F [\varphi(X)] = \varphi \circ \operatorname{Cov}_{w_\ell}^E [X] \circ \varphi^{-1}
\] Moreover, for any basis \(H = (h_1, \dots, h_p)\) of \(E\) and any \(\Lambda = (\lambda_1 \geq \ldots \geq \lambda_p)\), the following are equivalent:
\((H, \Lambda)\) solves \(\operatorname{ICS} (X, \color{#0072b2}{\operatorname{Cov}_{w_1}^E}, \color{#d55e00}{\operatorname{Cov}_{w_2}^E})\) in the space \(E\);
\((\varphi(H), \Lambda)\) solves \(\operatorname{ICS} (\varphi(X), \color{#0072b2}{\operatorname{Cov}_{w_1}^F}, \color{#d55e00}{\operatorname{Cov}_{w_2}^F})\) in the space \(F\).
Sun Y. and Genton M. G. 2011. “Functional Boxplots.” Journal of Computational and Graphical Statistics 20 (2): 316–34.
Van Den Boogaart K. G., Egozcue J. J. and Pawlowsky-Glahn V. 2014. “Bayes Hilbert Spaces.” Australian & New Zealand Journal of Statistics 56 (2): 171–94.
Appendix
Scatter operators
\((E, \langle \cdot, \cdot \rangle)\) Euclidean space of dimension \(p\).
\(X \in \mathcal M (\Omega, E)\) random variable over \(E\).
\(\mathcal S^{+} (E)\) space of non-negative symmetric operators over \(E\).
\(\mathcal {GS}^{+} (E)\) space of positive symmetric operators over \(E\).
\(S: \mathcal A \subseteq \mathcal M (\Omega, E) \rightarrow \mathcal S^{+} (E)\) is a scatter operator over \(\mathcal A\) if:
Invariance by equality in distribution: \[\forall (X,Y) \in \mathcal A^2, X \sim Y \Rightarrow S[X] = S[Y] \]
It is affine equivariant if: \[\forall A \in \mathcal{GL} (E), \forall b \in E, S[AX+b] = A S[X] A^*\]
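Affine equivariance can be illustrated numerically for the \(w\)-weighted covariance. The sketch below assumes \(E = \mathbb R^p\) with its usual inner product (so that \(A^* = A^\top\)); the helper `cov_w`, the matrix `A`, the shift `b` and the simulated data are all hypothetical choices made for the illustration.

```r
# Hypothetical helper: empirical w-weighted covariance of the rows of X
cov_w <- function(X, w) {
  n  <- nrow(X)
  Xc <- sweep(X, 2, colMeans(X))               # centre the observations
  S  <- crossprod(Xc) / n                      # empirical Cov[X]
  d  <- sqrt(rowSums((Xc %*% solve(S)) * Xc))  # Mahalanobis norms
  crossprod(w(d) * Xc) / n
}

set.seed(2)
n <- 300; p <- 3
X <- matrix(rexp(n * p), ncol = p)                     # simulated data
A <- matrix(c(2, 1, 0, 0, 1, 1, 1, 0, 3), nrow = p)    # invertible (det = 7)
b <- c(-1, 0, 5)
Y <- sweep(X %*% t(A), 2, b, "+")                      # rows are A x_i + b

w <- function(d) d
# S[AX + b] should equal A S[X] A^T up to floating-point error
gap <- max(abs(cov_w(Y, w) - A %*% cov_w(X, w) %*% t(A)))
```

The check works because the Mahalanobis norm entering the weights is itself invariant under invertible affine maps, so only the outer product is transformed.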
Let us decompose \(S_1[X]^{-1} (X - \mathbb EX)\) over the basis \(H\), which is orthonormal in \((E, \langle \cdot, S_1[X] \cdot \rangle)\):
\[\begin{aligned}
S_1[X]^{-1} (X - \mathbb EX) &= \sum_{j=1}^p \langle S_1[X]^{-1} (X - \mathbb EX), S_1[X] h_j \rangle h_j \\
&= \sum_{j=1}^p \langle X - \mathbb EX, h_j \rangle h_j \\
S_1[X]^{-1} (X - \mathbb EX) &= \sum_{j=1}^p z_j h_j
\end{aligned}\]
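The dual basis \(H^*\) of \(H\) is the one that satisfies \(\langle h_i, h^*_j \rangle = \delta_{ij}\) for all \(1 \leq i,j \leq p\), and we know from the definition of ICS that this holds for \((S_1[X] h_j)_{1 \leq j \leq p}\), since \(H\) is orthonormal for \(\langle \cdot, S_1[X] \cdot \rangle\). Applying \(S_1[X]\) to both sides of the decomposition then gives \(X - \mathbb EX = \sum_{j=1}^p z_j S_1[X] h_j = \sum_{j=1}^p z_j h^*_j\).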
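Proof. In ICS, the first step is to centre the data, so without loss of generality \(\mathbb E [X] = 0\). \(E\) is isometric to \(\mathbb R^p\) via the linear isomorphism \[\phi_B: x \mapsto G_B^{1/2} [x]_B\] where \([x]_B\) denotes the coordinates of \(x\) in a basis \(B\) of \(E\) and \(G_B\) is the Gram matrix of \(B\). Writing \(w^2(X)\) as shorthand for \(w^2(\| \operatorname{Cov}[X]^{-1/2} X \|)\), we get for any \((x,y) \in E^2\): \[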
\begin{aligned}
\langle \operatorname{Cov}_w [X] x, y \rangle_E
&= \mathbb E [ w^2(X) \langle X, x \rangle_E \langle X, y \rangle_E ] \\
&= \mathbb E [ w^2([X]_B) \langle G_B^{1/2} [X]_B, G_B^{1/2} [x]_B \rangle \langle G_B^{1/2} [X]_B, G_B^{1/2} [y]_B \rangle ] \\
&= \langle \operatorname{Cov}_w (G_B^{1/2} [X]_B) G_B^{1/2} [x]_B, G_B^{1/2} [y]_B \rangle \\
&= \langle \operatorname{Cov}_w (G_B [X]_B) [x]_B, [y]_B \rangle \\
&= \langle \operatorname{Cov}_w ([X]_B) G_B [x]_B, G_B [y]_B \rangle
\end{aligned}
\]
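The passage from the third to the fourth line is the least immediate one. A possible way to unpack it, assuming that \(G_B\) is the symmetric positive definite Gram matrix of \(B\) and writing \(Y = [X]_B\), \(u = [x]_B\), \(v = [y]_B\) (notation introduced here only for readability), relies on the affine equivariance of \(\operatorname{Cov}_w\), that is \(\operatorname{Cov}_w (AY) = A \operatorname{Cov}_w (Y) A^\top\): \[
\begin{aligned}
\langle \operatorname{Cov}_w (G_B^{1/2} Y) G_B^{1/2} u, G_B^{1/2} v \rangle
&= \langle G_B^{1/2} \operatorname{Cov}_w (Y) G_B^{1/2} G_B^{1/2} u, G_B^{1/2} v \rangle \\
&= \langle G_B \operatorname{Cov}_w (Y) G_B u, v \rangle \\
&= \langle \operatorname{Cov}_w (G_B Y) u, v \rangle
\end{aligned}
\] Starting again from the second line and moving the left factor \(G_B\) to the other side of the inner product, by symmetry of \(G_B\), gives the fifth line \(\langle \operatorname{Cov}_w (Y) G_B u, G_B v \rangle\).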
Bayes space and CB-splines
Sources: Van Den Boogaart, Egozcue, and Pawlowsky-Glahn (2014) and Machalová et al. (2021).
\(P = \frac1{\lambda(I)} \lambda\) uniform probability measure over \(I=[a,b]\).
\(L^2_0 (P) = \{ f \in L^2 (P) | \int_I f \mathrm dP = 0 \}\) is a separable Hilbert space.
Centred log-ratio transform \(\operatorname{clr}: p \in \mathbb R^I \mapsto \log(p) - \int_I \log(p) \mathrm dP\) (if it exists).
Bayes space \(B^2 (P) = \{ \operatorname{clr}^{-1} (\{f\}), f \in L^2_0 (P) \}\) inherits the separable Hilbert space structure of \(L^2_0 (P)\). Each equivalence class contains a density function defined almost everywhere.
B-spline basis \(B=(B_i)_{-q \leq i \leq k}\) of \(\mathcal S_{q+1}^\Delta (I) \subset L^2 (P)\), the subspace of polynomial splines on \(I\) of order \(q+1\) and with knots \(\Delta\).
ZB-spline basis \(Z=(Z_i)_{-q+1 \leq i \leq k-1} = \left(\frac{\mathrm d B_i}{\mathrm dx}\right)_{-q+1 \leq i \leq k-1}\) of \(\mathcal Z_q^\Delta (I) = \mathcal S_q^\Delta (I) \cap L^2_0 (P)\).
CB-spline basis \(C = (C_i)_{-q+1 \leq i \leq k-1} = (\operatorname{clr}^{-1} (Z_i))_{-q+1 \leq i \leq k-1}\) of \(\mathcal C_q^\Delta (I)\), which is a Euclidean subspace of \(B^2 (P)\) of dimension \(p=k+q-1\).
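The centred log-ratio transform and its inverse can be sketched on a discretised density. This is a minimal illustration, assuming \(I = [0, 1]\) and a regular midpoint grid, so that integrals against the uniform probability measure \(P\) are approximated by plain averages; the function names `clr` and `clr_inv`, the grid size and the Beta density are hypothetical choices for the example.

```r
# Regular midpoint grid on I = [a, b]; averages approximate integrals against P
a <- 0; b <- 1; m <- 1000
grid <- a + (seq_len(m) - 0.5) * (b - a) / m

clr <- function(p) log(p) - mean(log(p))   # log(p) minus its P-mean
clr_inv <- function(f) {
  g <- exp(f)
  g / mean(g)                              # representative integrating to 1 against P
}

p <- dbeta(grid, 2, 3)                     # a positive density on (0, 1), for illustration
f <- clr(p)

centred   <- mean(f)                                  # should vanish: f lies in L^2_0(P)
roundtrip <- max(abs(clr_inv(f) - p / mean(p)))       # clr_inv undoes clr
scale_gap <- max(abs(clr(5 * p) - f))                 # clr ignores the scale of p
```

The last quantity illustrates why elements of the Bayes space are equivalence classes: densities that differ by a positive multiplicative constant share the same clr image.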