1. Introduction
1.1 Optimal transport problems
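Consider $N\ge 2$ Polish probability spaces $(X^1,\mu ^1),\ldots, (X^N,\mu ^N)$ and $X=\prod _{i=1}^NX^i$ . Let $c\;:\;X\to [0,\infty ]$ and consider the multi-marginal optimal transport problems
\begin{equation*} \tag{$P$} \inf \left \{ C[\gamma ]\;:\!=\;\int _X c\,d\gamma \;:\; \gamma \in \Pi (\mu ^1,\ldots,\mu ^N) \right \} \end{equation*}
and
\begin{equation*} \tag{$P_\infty $} \inf \left \{ C_\infty [\gamma ]\;:\!=\;\gamma \text{-}\mathop{\textrm{ess\,sup}}\nolimits c \;:\; \gamma \in \Pi (\mu ^1,\ldots,\mu ^N) \right \} \end{equation*}
in the set
\begin{equation*} \Pi (\mu ^1,\ldots,\mu ^N)\;:\!=\;\left \{ \gamma \in \mathcal P(X) \;:\; \pi ^i_\sharp \gamma =\mu ^i \text{ for } i=1,\ldots,N \right \}, \end{equation*}
that is, in the set of couplings or transport plans between the $N$ marginals $\mu ^1,\ldots,\mu ^N$ . We refer to the second problem as the $\sup$ case. The first of these problems is widely encountered in the literature of the last thirty years. The second, although also old, gained popularity only more recently thanks to the applications of optimal transportation in machine learning (see, for example, [Reference Peyré and Cuturi17]).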
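For finitely supported plans, both functionals reduce to elementary computations. The following Python sketch illustrates this; the dictionary representation of a plan and the names `marginal`, `integral_cost` and `sup_cost` are assumptions made for this illustration only.

```python
def marginal(plan, k):
    """k-th marginal of a discrete plan given as {point tuple: mass}."""
    out = {}
    for point, mass in plan.items():
        out[point[k]] = out.get(point[k], 0.0) + mass
    return out


def integral_cost(plan, c):
    """Integral cost: sum of c(point) weighted by the mass of the plan."""
    return sum(mass * c(*point) for point, mass in plan.items())


def sup_cost(plan, c):
    """Sup cost: largest value of c on the points charged by the plan."""
    return max(c(*point) for point, mass in plan.items() if mass > 0)


def c(x, y):
    # Hypothetical two-marginal cost used only for this example.
    return abs(x - y)


# Two couplings of the uniform measure on {0, 1} with itself.
diagonal = {(0, 0): 0.5, (1, 1): 0.5}
product_plan = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
```

Both plans have the same marginals, but the diagonal plan has a smaller cost in both senses, so the product plan is optimal for neither problem.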
Some general results on the multi-marginal optimal transport problem are available in refs. [Reference Carlier10, Reference Kellerer28, Reference Pass30–Reference Rachev and Rüschendorf32], and results for special costs are available, for example, in ref. [Reference Gangbo and Swiech23] for the quadratic cost, with some generalisations in ref. [Reference Heinich27], and in ref. [Reference Carlier and Nazaret12] for the determinant cost. More applications appeared in ref. [Reference Ghoussoub and Moameni24]. Applications of multi-marginal optimal transportation to economics include, for example, the problem of team-matching, which is a generalisation of the classical marriage problem [Reference Carlier and Ekeland11, Reference Chiapporri, McCann and Nesheim13]. Applications to physics are related to quantum chemistry and the strongly interacting regime for particles, which are described in refs. [Reference Seidl33–Reference Seidl, Perdew and Levy35]. By now, there are several papers on the transport theory for the Coulomb cost and some more general repulsive costs; a selection is given by [Reference Buttazzo, De Pascale and Gori-Giorgi8, Reference Colombo, De Pascale and Di Marino14–Reference Cotar, Friesecke and Klüppelberg16, Reference Di Marino, Gerolin, Nenna, Bergounioux, Oudet, Rumpf, Carlier, Champion and Santambrogio20, Reference Friesecke, Mendl, Pass, Cotar and Klüppelberg21].
This paper is concerned with an optimality condition for the problems above, introduced in the next subsection. In particular, we will study the sufficiency of this optimality condition. We will give a new and, in our opinion, easier-to-understand proof of some known results, and we will show that this new approach allows us to extend sufficiency results to a wider setting.
1.2 $\boldsymbol{c}$ -cyclical monotonicity, $\boldsymbol\infty$ - $\boldsymbol{c}$ -cyclical monotonicity and the main theorem
In this context, the $c$ -cyclical monotonicity takes the following form.
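Definition 1.1. We say that a set $\Gamma \subset \prod _{i=1}^NX^i$ is $c$ -cyclically monotone (CM), if for every $k$ -tuple of points $(x^{1,j},\ldots, x^{N,j})_{j=1}^k$ of $\Gamma$ and every $(N-1)$ -tuple of permutations $(\sigma ^2,\ldots,\sigma ^N)$ of the set $\{1,\ldots, k\}$ we have
\begin{equation*} \sum _{j=1}^k c(x^{1,j},x^{2,j},\ldots,x^{N,j}) \le \sum _{j=1}^k c(x^{1,j},x^{2,\sigma ^2(j)},\ldots,x^{N,\sigma ^N(j)}). \end{equation*}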
We also say that $\gamma \in \Pi (\mu ^1,\ldots,\mu ^N)$ is $c$ -cyclically monotone if it is concentrated on a $c$ -cyclically monotone set.
Definition 1.2. We say that a set $\Gamma \subset \prod _{i=1}^NX^i$ is infinitely $c$ -cyclically monotone (ICM), if for every $k$ -tuple of points $(x^{1,j},\ldots, x^{N,j})_{j=1}^k$ of $\Gamma$ and every $(N-1)$ -tuple of permutations $(\sigma ^2,\ldots,\sigma ^N)$ of the set $\{1,\ldots, k\}$ we have
\begin{equation*} \max _{1\le j\le k} c(x^{1,j},x^{2,j},\ldots,x^{N,j}) \le \max _{1\le j\le k} c(x^{1,j},x^{2,\sigma ^2(j)},\ldots,x^{N,\sigma ^N(j)}). \end{equation*}
We also say that a coupling $\gamma \in \Pi (\mu ^1,\ldots,\mu ^N)$ is infinitely $c$ -cyclically monotone if it is concentrated on an ICM set.
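On a finite set, both conditions can be tested by brute force: rearrange all coordinates but the first by permutations and compare the sums (CM) or the maxima (ICM). The sketch below is a minimal illustration under stated assumptions: the names `tuple_is_monotone` and `set_is_monotone` are hypothetical, the small tolerance guards against rounding, and only tuples of distinct points up to a chosen size `max_k` are tested, so a positive answer is a necessary check rather than a proof.

```python
from itertools import combinations, permutations, product


def tuple_is_monotone(points, c, aggregate):
    """Test one k-tuple; aggregate=sum gives CM, aggregate=max gives ICM."""
    k, n_marginals = len(points), len(points[0])
    base = aggregate(c(*p) for p in points)
    # One permutation of {0, ..., k-1} for each coordinate except the first.
    for sigmas in product(permutations(range(k)), repeat=n_marginals - 1):
        rearranged = [
            (points[j][0],)
            + tuple(points[sigmas[i - 1][j]][i] for i in range(1, n_marginals))
            for j in range(k)
        ]
        if base > aggregate(c(*p) for p in rearranged) + 1e-12:
            return False
    return True


def set_is_monotone(gamma, c, aggregate, max_k):
    """Test every tuple of at most max_k distinct points of the finite set gamma."""
    return all(
        tuple_is_monotone(list(points), c, aggregate)
        for k in range(1, max_k + 1)
        for points in combinations(gamma, k)
    )


def quadratic(x, y):
    # Hypothetical two-marginal cost used only for this example.
    return (x - y) ** 2


good = [(0, 0), (1, 1)]  # pairs each point with itself
bad = [(0, 1), (1, 0)]   # swapping the second coordinates lowers the cost
```

With this cost, `good` passes both tests, while `bad` fails both: swapping the second coordinates replaces the values $1,1$ by $0,0$.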
We will use the expression $c$ -cyclically monotone for both conditions above. The main theorem of this paper is the following.
Theorem 1.3. Let $\mu ^i \in \mathcal P(X^i)$ with compact support for $i=1,\ldots,N$ , let $c\;:\;X \to{\mathbb{R}}\cup \{+\infty \}$ be continuous. If $\gamma \in \Pi (\mu ^1,\ldots,\mu ^N)$ is an ICM plan for $c$ , then $\gamma$ is optimal.
Along the way, we will give a new proof of the following known result due, in a more general setting, to Griessler [Reference Griessler25].
Theorem 1.4. Let $\mu ^i \in \mathcal P(X^i)$ with compact support for $i=1,\ldots,N$ , let $c\;:\;X \to{\mathbb{R}}$ be continuous. If $\gamma \in \Pi (\mu ^1,\ldots,\mu ^N)$ is a CM plan for $c$ , then $\gamma$ is optimal.
We now discuss an important characterisation of $c$ -cyclically monotone transport plans. With this aim we define, still according to Griessler [Reference Griessler25],
Definition 1.5. Let $\gamma$ be a positive and finite Borel measure on $X$ . We say that $\gamma$ is finitely optimal if all its finitely supported submeasures are optimal with respect to their marginals. Here by submeasure we mean any probability measure $\alpha$ satisfying $\mathop{\textrm{supp}}\nolimits (\alpha )\subset \mathop{\textrm{supp}}\nolimits (\gamma )$ .
Proposition 1.6. If $\gamma \in \Pi (\mu ^1,\ldots, \mu ^N)$ is CM or ICM, then it is finitely optimal for the problem $(P)$ or $(P_\infty )$ , respectively.
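Lemma 1.7. Let $\alpha =\sum _{i=1}^lm^i\delta _{( x^{1,i},\ldots, x^{N,i})}$ and $\overline \alpha =\sum _{i=1}^{\overline l}\overline{m}^i\delta _{(\overline{x}^{1,i},\ldots,\overline{x}^{N,i})}$ be two discrete measures with positive, integer coefficients and the same marginals. Let us denote by $\tilde l=m^1 + \ldots +m^l$ the number of rows of the following table
\begin{equation*} A\;:\!=\;\begin{pmatrix} x^{1,1} & \cdots & x^{N,1}\\ \vdots & & \vdots \\ x^{1,1} & \cdots & x^{N,1}\\ \vdots & & \vdots \\ x^{1,l} & \cdots & x^{N,l}\\ \vdots & & \vdots \\ x^{1,l} & \cdots & x^{N,l} \end{pmatrix}, \end{equation*}
where the first $m^1$ rows are equal among themselves, the following $m^2$ rows are equal among themselves and so on. Let $\overline A$ be the analogous table associated with $\overline \alpha$ . Then, $\overline A$ has $\tilde l$ rows, and there exist $(N-1)$ permutations $\sigma ^2,\ldots,\sigma ^N$ of the set $\{1,\ldots,\tilde l\}$ such that, with the rows of $A$ re-indexed as $(x^{1,j},\ldots,x^{N,j})$ , $j=1,\ldots,\tilde l$ , and up to a reordering of its rows, $\overline{A}$ is equal to
\begin{equation*} \begin{pmatrix} x^{1,1} & x^{2,\sigma ^2(1)} & \cdots & x^{N,\sigma ^N(1)}\\ \vdots & \vdots & & \vdots \\ x^{1,\tilde l} & x^{2,\sigma ^2(\tilde l)} & \cdots & x^{N,\sigma ^N(\tilde l)} \end{pmatrix}. \end{equation*}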
Proof. For each $k\in \{1,\ldots, N\}$ , the $k$ -th marginal of $\alpha$ is given by the sum of the Dirac masses centred on the points of the $k$ -th column of the table $A$ with multiplicity. Analogously, the $k$ -th marginal of $\overline \alpha$ is given by the sum of the Dirac masses centred on the points of the $k$ -th column of the table $\overline A$ with multiplicity. Since the marginals of $\alpha$ and $\overline \alpha$ are the same, each point $x^{k,i}$ appearing in the $k$ -th marginal must appear in both matrices the same number of times, proving the existence of the bijections $\sigma ^2,\ldots,\sigma ^N$ as required. This also implies that $\overline A$ has $\tilde l$ rows.
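The counting argument of the proof can be turned into a small procedure. The following Python sketch is an illustration only: the names `expand` and `align`, and the encoding of a measure as a dictionary from points to integer masses, are assumptions and not part of the lemma. It expands two measures into their tables, reorders the rows of the second table so that the first columns agree, and then matches each remaining column entry by entry.

```python
from collections import defaultdict


def expand(measure):
    """Table of a measure {point: positive integer mass}, rows with multiplicity."""
    rows = []
    for point, mass in measure.items():
        rows.extend([point] * mass)
    return rows


def align(A, A_bar):
    """Reorder A_bar along the first column of A and return the permutations.

    Returns (rows, sigmas) with rows[j][0] == A[j][0] and, for each column
    k >= 1, rows[j][k] == A[sigmas[k - 1][j]][k]. An IndexError is raised
    if the two tables do not have the same marginals.
    """
    assert len(A) == len(A_bar)
    pool = defaultdict(list)
    for row in A_bar:
        pool[row[0]].append(row)
    rows = [pool[row[0]].pop() for row in A]
    sigmas = []
    for k in range(1, len(A[0])):
        slots = defaultdict(list)
        for i, row in enumerate(A):
            slots[row[k]].append(i)
        # Each entry of column k in rows uses up one equal entry of A.
        sigmas.append([slots[row[k]].pop() for row in rows])
    return rows, sigmas


alpha = {("a", "x"): 2, ("b", "y"): 1}
alpha_bar = {("a", "y"): 1, ("a", "x"): 1, ("b", "x"): 1}
A = expand(alpha)
A_bar, sigmas = align(A, expand(alpha_bar))
```

Here both measures have first marginal $2\delta _a+\delta _b$ and second marginal $2\delta _x+\delta _y$ , so the matching succeeds and `sigmas[0]` is a permutation of the three row indices.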
Proof. (of Proposition 1.6) We fix a finitely supported submeasure $\alpha =\sum _{i=1}^l a_i\delta _{X^i}$ of $\gamma$ . We need to show that $\alpha$ is an optimal coupling of its marginals. To do this, we fix another coupling, $\overline \alpha = \sum _{i=1}^{\overline l} \overline a _i\delta _{\overline X^i}$ , with the same marginals as $\alpha$ . We have to show that
\begin{equation*} \tilde C[\alpha ]\le \tilde C[\overline \alpha ], \end{equation*}
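where $\tilde C$ is either of the two costs under consideration. Let us first assume that the discrete measures $\alpha$ and $\overline \alpha$ have rational coefficients. We consider the measures $M\alpha$ and $M \overline \alpha$ , where $M$ is the product of the denominators of the coefficients of $\alpha$ and $\overline \alpha$ . They are discrete measures having positive, integer coefficients and the same marginals, so we can apply Lemma 1.7 to find permutations $\sigma ^2,\ldots,\sigma ^N$ such that $M \alpha$ and $M \overline \alpha$ have representations $A$ and $\overline A$ , respectively. Denote the rows of $A$ by $(x^{1,j},\ldots,x^{N,j})$ , $j=1,\ldots,\tilde l$ . If $\tilde C=C$ we have, using the $c$ -cyclical monotonicity of $\alpha$ ,
\begin{equation*} M\,C[\alpha ]=\sum _{j=1}^{\tilde l} c(x^{1,j},x^{2,j},\ldots,x^{N,j})\le \sum _{j=1}^{\tilde l} c(x^{1,j},x^{2,\sigma ^2(j)},\ldots,x^{N,\sigma ^N(j)})=M\,C[\overline \alpha ], \end{equation*}
proving the optimality of $\alpha$ . If $\tilde C=C_\infty$ , the conclusion is immediate:
\begin{equation*} C_\infty [\alpha ]=\max _{1\le j\le \tilde l} c(x^{1,j},x^{2,j},\ldots,x^{N,j})\le \max _{1\le j\le \tilde l} c(x^{1,j},x^{2,\sigma ^2(j)},\ldots,x^{N,\sigma ^N(j)})=C_\infty [\overline \alpha ]. \end{equation*}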
Now, assume that $\alpha$ and $\overline \alpha$ have real (not necessarily rational) coefficients,
We show that for all $\varepsilon \gt 0$ there exist two discrete measures
with the same marginals, $q_i, \overline q_i \in \mathbb{Q}$ and
Being concentrated on $X^1,\ldots,X^l$ and $\overline X^1, \ldots \overline X^{\overline l}$ is equivalent to the fact that the vector $\underline{\textbf{a}}\;:\!=\;(a_1, \ldots, a_l, \overline a_1, \ldots, \overline a_{\overline l})$ is a solution of
where $\mathcal A$ is a matrix with coefficients $1, 0, -1$ . Indeed, if we write, for example, the equality between the first two marginals we obtain
so some of the points $\overline x^{1,i}$ must coincide with, for example, $x^{1,1}$ and this gives, for two sets of indices
Since the matrix $\mathcal A$ has integer coefficients
and this allows us to choose $\beta$ and $\overline \beta$ . Since $C[\alpha ]\approx C[\beta ]$ , $C[\overline \alpha ]\approx C[\overline \beta ]$ , $C_\infty [\alpha ]=C_\infty [\beta ]$ and $C_\infty [ \overline \alpha ]=C_\infty [\overline \beta ]$ , we conclude.
2. Essential background and preliminary results
The $c$ -cyclical monotonicity is the most important optimality condition for a transport plan. Giving a satisfactory historical background would require a paper of its own, and we refer the reader to the survey [Reference De Pascale, Kausamo and Wyczesany19]. Originally born in convex analysis as a characterisation of the sub-differentials of convex functions, for $N=2$ it first appeared as an optimality condition in ref. [Reference Knott and Smith29] in an equivalent formulation of Kantorovich’s problem. In that context, which was partly motivated by some models appearing in financial mathematics, the authors started by characterising optimal random variables using $c$ -cyclical monotonicity.
For the quadratic cost $c(x,y)=|x-y|^2$ in ${\mathbb{R}} ^d$ , the necessity of the condition is a basic result. See, for example, Prop. 2.24 of [Reference Villani36]. The classical structure of cyclical monotonicity of optimal plans was mentioned as a possible alternative tool in ref. [Reference Brenier7] and explicitly exploited in ref. [Reference Caffarelli9]. After that, in ref. [Reference Gangbo and McCann22], the authors extended the result to lower semi-continuous cost functions bounded from below. They showed that every finite optimal plan with respect to such a cost lies on a $c$ -cyclically monotone set.
For more general settings there are, essentially, two arguments to prove that the support of the optimal plan must be $c$ -cyclically monotone. The first one uses duality and appears in ref. [Reference Knott and Smith29], while the second one relies on modifying a transport plan that is not $c$ -cyclically monotone and showing that its cost can be improved. The latter technique was introduced in ref. [Reference Abdellaoui and Henich1] and used, for example, in Proposition 2.24 of [Reference Villani36]. Both approaches can be extended to the multi-marginal case with few technical modifications.
To the best of our knowledge, for $N=2$ the most general result was proved in ref. [Reference Beiglböck, Goldstern, Maresch and Schachermayer4], whose authors removed the regularity assumptions on the cost, proving that if $X, \,Y$ are Polish spaces equipped with Borel probability measures $\mu, \nu$ and $c \;:\; X \times Y \to [0, \infty ]$ is a Borel measurable cost function, then every optimal transport plan with finite total cost is $c$ -cyclically monotone.
Concerning the sufficiency of the condition, we stated above Theorem 1.4, which seems to be the most general result available in the case $N\gt 2$ . Much more is known for $N=2$ , and we will comment on this at the end of the paper.
2.1 Lower semi-continuity, compactness and existence of minimisers
Existence of solutions to the optimal transport problems above is usually obtained by the direct method of the Calculus of Variations. Here, we briefly report the tools which we did not find elsewhere or that will be used substantially in our proofs. A useful notion of convergence on the set of transport plans is the tight convergence.
Definition 2.1. Let $X$ be a metric space and let $\gamma _n, \gamma \in \mathcal P (X)$ . We say that $\gamma _n$ converges tightly to $\gamma$ if for all $\phi \in C_b (X)$
\begin{equation*} \int _X \phi \,d\gamma _n \to \int _X \phi \,d\gamma . \end{equation*}
The tight convergence will be denoted by $\stackrel{*}{\rightharpoonup }$ .
Definition 2.2. Let $\Pi$ be a set of Borel probability measures on a metric space $X$ . We say that $\Pi$ is tight (or uniformly tight) if for all $\varepsilon \gt 0$ there exists $K_\varepsilon \subset X$ compact such that
\begin{equation*} \gamma (X\setminus K_\varepsilon )\lt \varepsilon \end{equation*}
for all $\gamma \in \Pi$ .
Theorem 2.3 (Prokhorov). Let $X$ be a complete and separable metric space (Polish space). Then, $\Pi \subset \mathcal P (X)$ is tight if and only if it is pre-compact with respect to the tight convergence.
Remark 2.4.
1. With respect to the tight convergence, the evaluation of measures is lower semi-continuous on open sets and upper semi-continuous on closed sets;
2. If $X$ is complete and separable, then every singleton $\Pi =\{\gamma \}$ is tight.
The following compactness theorem will be used in this paper.
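Theorem 2.5. For $i=1, \ldots, N$ , let $X^i$ be a Polish space. Let $X= X^1 \times \ldots \times X^N$ . Let $\mathcal M^i \subset \mathcal P (X^i)$ be tight for all $i$ . Then, the set
\begin{equation*} \Pi \;:\!=\;\left \{ \gamma \in \mathcal P(X) \;:\; \pi ^i_\sharp \gamma \in \mathcal M^i \text{ for } i=1,\ldots,N \right \} \end{equation*}
is tight.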
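Proof. Let $\varepsilon \gt 0$ . By the tightness of $\mathcal M^i$ , we can fix a compact set $K^i \subset X^i$ such that for all $\mu ^i \in \mathcal M^i$
\begin{equation*} \mu ^i(X^i\setminus K^i)\lt \frac{\varepsilon }{N}. \end{equation*}
Let $K= K^1 \times \ldots \times K^N$ and let $\gamma \in \Pi$ . Since all the marginals $\pi ^i_\sharp \gamma \in \mathcal M ^i$ , and since
\begin{equation*} X\setminus K\subset \bigcup _{i=1}^N (\pi ^i)^{-1}(X^i\setminus K^i), \end{equation*}
one gets
\begin{equation*} \gamma (X\setminus K)\le \sum _{i=1}^N \pi ^i_\sharp \gamma (X^i\setminus K^i)\lt \varepsilon . \end{equation*}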
Corollary 2.6. By Prokhorov’s theorem, a set $\Pi \subset \mathcal P(X)$ as in the theorem above is pre-compact for the tight convergence. This is, in particular, true if $\mathcal M^i=\{\mu ^i\}$ .
If $c$ is lower semi-continuous, then also the functionals $C$ and $C_\infty$ (see problems $(P)$ and $(P_\infty )$ in Subsection 1.1 for the definition) are lower semi-continuous with respect to the tight convergence of measures. The lower semi-continuity of $C$ is a standard result of optimal transport theory (see, e.g. [Reference Villani37] or [Reference Kellerer28] for the multi-marginal case). The next lemma proves the lower semi-continuity of $C_\infty$ .
Lemma 2.7. If the function $c\;:\;X\to{\mathbb{R}}\cup \{+\infty \}$ is lower semi-continuous, then also the functional $C_\infty$ is lower semi-continuous.
Proof. First, we note that, thanks to the lower semi-continuity of $c$ , its $\gamma$ -essential supremum can be written as
\begin{equation*} C_\infty [\gamma ]=\gamma \text{-}\mathop{\textrm{ess\,sup}}\nolimits c=\sup \left \{ c(v) \;:\; v\in \mathop{\textrm{supp}}\nolimits \gamma \right \}. \end{equation*}
Fix $\gamma \in \Pi (\mu ^1,\ldots,\mu ^N)$ and let $(\gamma ^n)_n$ be a sequence converging to $\gamma$ . Now for every vector $v\in \mathop{\textrm{supp}}\nolimits \gamma$ there exists a sequence $v^n=(x^{1,n},\ldots,x^{N,n})\in \prod _{i=1}^NX^i$ such that $v^n\in \mathop{\textrm{supp}}\nolimits \gamma ^n$ for all $n$ and $v^n\to v$ . Moreover,
\begin{equation*} \liminf _n C_\infty [\gamma ^n]\ge \liminf _n c(v^n)\ge c(v). \end{equation*}
Since the above inequality holds for all $v\in \mathop{\textrm{supp}}\nolimits \gamma$ , it also holds for the $\gamma$ -essential supremum, and the claim follows.
The use of compactness and semi-continuity theorems above gives the existence of optimal transport plans for both problems considered here.
2.2 $\boldsymbol\Gamma$ -convergence
A crucial tool that we will use in this paper is $\Gamma$ -convergence. All the details can be found, for instance, in Braides’s book [Reference Braides6] or in the classical book by Dal Maso [Reference Dal Maso18]. In what follows, $(X,d)$ is a metric space or a topological space equipped with a convergence.
Definition 2.8. Let $(F_n)_n$ be a sequence of functions $X \to \bar{\mathbb{R}}$ . We say that $(F_n)_n$ $\Gamma$ -converges to $F$ if for any $x \in X$ we have
• for any sequence $(x^n)_n$ of $X$ converging to $x$
\begin{equation*} \liminf \limits _n F_n(x^n) \geq F(x) \qquad \text {($\Gamma $-liminf inequality);}\end{equation*}
• there exists a sequence $(x^n)_n$ converging to $x$ and such that
\begin{equation*} \limsup \limits _n F_n(x^n) \leq F(x) \qquad \text {($\Gamma $-limsup inequality).} \end{equation*}
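This definition is actually equivalent to the following equalities for any $x \in X$ :
\begin{equation*} F(x) = \inf \left \{ \liminf \limits _n F_n(x^n) \;:\; x^n \to x \right \} = \inf \left \{ \limsup \limits _n F_n(x^n) \;:\; x^n \to x \right \}. \end{equation*}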
The function $x \mapsto \inf \left \{ \liminf \limits _n F_n(x^n) \;:\; x^n \to x \right \}$ is called $\Gamma$ -liminf of the sequence $(F_n)_n$ and the other one its $\Gamma$ -limsup. A useful result is the following (which for instance implies that a constant sequence of functions does not $\Gamma$ -converge to itself in general).
Proposition 2.9. The $\Gamma$ -liminf and the $\Gamma$ -limsup of a sequence of functions $(F_n)_n$ are both lower semi-continuous on $X$ .
The main interest of $\Gamma$ -convergence resides in its consequences in terms of convergence of minima.
Theorem 2.10. Let $(F_n)_n$ be a sequence of functions $X \to \bar{\mathbb{R}}$ and assume that $F_n$ $\Gamma$ -converges to $F$ . Assume moreover that there exists a compact and non-empty subset $K$ of $X$ such that
\begin{equation*} \inf _X F_n = \inf _K F_n \qquad \text{for all } n\in \mathbb{N} \end{equation*}
(we say that $(F_n)_n$ is equi-mildly coercive on $X$ ). Then, $F$ admits a minimum on $X$ and the sequence $(\inf _X F_n)_n$ converges to $\min F$ . Moreover, if $(x_n)_n$ is a sequence of $X$ such that
\begin{equation*} \lim _n F_n(x_n) = \lim _n \inf _X F_n \end{equation*}
and if $(x_{\phi (n)})_n$ is a subsequence of $(x_n)_n$ having a limit $x$ , then $ F(x) = \inf _X F$ .
3. Discretisation of transport plans (Dyadic-type decomposition in Polish spaces)
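Let $\gamma$ be a Borel probability measure on $X=(X^1,d_1)\times \cdots \times (X^N,d_N)$ with marginals $\mu ^1,\ldots,\mu ^N$ . The space $X$ will be equipped with the $\sup$ metric
\begin{equation*} d(x,y)\;:\!=\;\max _{1\le i\le N} d_i(x^i,y^i), \qquad x=(x^1,\ldots,x^N),\ y=(y^1,\ldots,y^N)\in X. \end{equation*}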
Let $\varepsilon _n=\frac 1n$ . Since $\{\mu ^i\}_{i=1}^N$ are Borel probability measures, they are inner regular. Hence for all $n$ , there exist compact sets $K^{1,n}\subset \mathop{\textrm{supp}}\nolimits \mu ^1, K^{2,n}\subset \mathop{\textrm{supp}}\nolimits \mu ^2,\ldots, K^{N,n}\subset \mathop{\textrm{supp}}\nolimits \mu ^N$ such that
for all $k=1,\ldots,N.$ We may assume that, for all $k$ and $n$ , $K^{k,n}\subset K^{k,{n+1}}$ .
We denote $K^{n}\;:\!=\;\prod _{k=1}^N K^{k,n}$ . Since
one gets
The cost $c$ is uniformly continuous on each $K^{n}$ , and for all $n$ , we can fix $\delta _n\in (0,\varepsilon _n)$ such that the sequence ( $\delta _n$ ) is decreasing in $n$ and
Next we fix, for all $n$ , finite Borel partitions for the sets $K^{1,n},\ldots,K^{N,n}$ . We denote these by $\{\tilde B_i^{k,n}\}_{i=1}^{\tilde m^{k,n}}$ , $k=1,\ldots,N$ , and we choose them in such a way that for all $n\in \mathbb{N}$ and $k\in \{1,\ldots,N\}$
for all $i\in \{1,\ldots,\tilde m^{k,n}\}$ .
We form a new, possibly finer, partition $\{B_i^{k,n}\}_{i=1}^{m^{k,n}}$ for each $K^{k,n}$ by intersecting (if the intersection is non-empty) each element $\tilde B_i^{k,n}$ successively first with the set $K^{k,1}$ , then with $K^{k,2}$ , and so on up until intersecting with the set $K^{k,n-1}$ , so that for $j \in \{1, \ldots, n\}$ either $B_i^{k,n} \cap K^{k,j}$ is empty or it is the entire $B_i^{k,n}$ . The products
form a partition of the set $K^n$ with
for all $i\in \{1,\ldots,m^{k,n}\}.$
We denote
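and for all ${\textbf{i}}\;:\!=\;(i_1,\ldots,i_N)\in I^n$ , we use the notation $Q_{\textbf{i}}^n\;:\!=\; B_{i_1}^{1,n}\times \cdots \times B_{i_N}^{N,n}$ . We fix points $z^n _{\textbf{i}}=z_{i_1,\ldots,i_N}^n\in \prod _{k=1}^NB_{i_k}^{k,n}\cap \mathop{\textrm{supp}}\nolimits \gamma$ (i.e. $z^n _{\textbf{i}}\in Q^n_{\textbf{i}}\cap \mathop{\textrm{supp}}\nolimits \gamma$ ). We define
\begin{equation*} \tag{3} \tilde \alpha ^n\;:\!=\;\sum _{{\textbf{i}}\in I^n}\gamma (Q^n_{\textbf{i}})\,\delta _{z^n_{\textbf{i}}}\qquad \text{and}\qquad \alpha ^n\;:\!=\;\frac{\tilde \alpha ^n}{\gamma (K^n)}; \end{equation*}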
since $\tilde \alpha ^n(X)=\gamma (K^n)$ , the measures $\alpha ^n$ are probability measures.
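To make the construction concrete, here is a minimal Python sketch in a simplified setting that we assume only for illustration: $\gamma$ is finitely supported in $[0,1]^N$ , the compact set is the whole cube (so that no renormalisation is needed), and the Borel partitions are replaced by a uniform grid of side $1/n$ . The function name `discretise` and the choice of the first support point met in each cell as the representative $z^n_{\textbf{i}}$ are hypothetical.

```python
import math


def discretise(plan, n):
    """Grid discretisation of a plan {point tuple in [0,1]^N: mass}.

    Each grid cell of side 1/n that meets the support contributes a single
    Dirac mass, carrying the total mass of the cell and located at a point
    of the support lying in that cell.
    """
    cells = {}
    for point, mass in plan.items():
        # Index of the cell containing the point; x = 1 goes to the last cell.
        idx = tuple(min(int(math.floor(x * n)), n - 1) for x in point)
        representative, total = cells.get(idx, (point, 0.0))
        cells[idx] = (representative, total + mass)
    return {rep: total for rep, total in cells.values()}


gamma = {(0.1, 0.1): 0.25, (0.2, 0.3): 0.25, (0.7, 0.8): 0.25, (0.9, 0.6): 0.25}
alpha_2 = discretise(gamma, 2)
```

With $n=2$ the four atoms fall into two cells, so `alpha_2` has two atoms of mass $1/2$ , both located at points of the support of `gamma`. The inclusion of the supports is the feature used later in the proof of Theorem 1.3.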
To each multi-index $\textbf{i}=(i_1,\ldots,i_N)$ and thus to each point $z_{\textbf{i}}^n$ correspond $N$ points
which are the ‘coordinates’ of $z_{\textbf{i}}^n$ in the spaces $X^k$ . The marginals of $\alpha ^n$ are finite sums of Dirac masses located at these points. We denote these marginals by $\mu ^{1,n},\ldots, \mu ^{N,n}$ . More precisely, they can be described as
Proposition 3.1. $\alpha ^n\stackrel{*}{\rightharpoonup }\gamma$ .
Proof. Let $\varepsilon \gt 0$ and $\varphi \in C_b(X)$ . We have to find $n_0\in \mathbb{N}$ such that
Let $M\gt 0$ be such that
We fix $\bar n\in \mathbb{N}$ such that
Since $\varphi \in C_b(X)$ , it is uniformly continuous on the compact set $K^{\bar n}$ ; hence there exists $\delta \gt 0$ such that
Moreover, the decomposition $\mathcal{Q}^n$ has been constructed so that there exists $n_0 \geq \bar{n}$ such that for all $k \in \{1, \ldots, N\}$ and $n \geq n_0$
We start from
and we evaluate separately the two terms on the RHS. For all $n\ge n_0$ , the first term can be estimated as follows: (we recall that, by construction, $K^{\bar n}\subset K^n$ )
Above in $a)$ , we have written
and then estimated the numerator from above by $\frac{\varepsilon }{5M}$ and the term $\gamma (X\setminus K^n)$ of the denominator from below by $\frac 12$ . By construction, since $n\ge \bar n$ , there exists a subset $\bar I^n\subset I^n$ such that
So we write
We simplify the notations for the next few lines and, for all $\textbf{i}\in \bar I^n$ , we denote by $Q\;:\!=\;B_{i_1}^{1,n}\times \cdots \times B_{i_N}^{N,n}$ and by $u_0=z_{i_1,\ldots,i_N}\in Q$ the point in which $\tilde \alpha ^n$ is concentrated. Then for each ‘cube’ $Q$
and in the last passage, we have used the uniform continuity of $\varphi$ on $K^{\bar n}$ . Summing the estimate above over all cubes $Q=B_{i_1}^{1,n}\times \cdots \times B_{i_N}^{N,n}$ , $\textbf{i}\in \bar I^n$ , gives
Combining this estimate with (7) gives us the estimate
Finally, the ‘tail’ term in (6). Using the set $\bar I^n$ defined above one gets
Using this we get
Remark 3.2. If $\mathop{\textrm{supp}}\nolimits \mu ^k$ is compact for $k=1, \ldots, N$ , then the dependence on $n$ of $K^n$ is not needed anymore since one can take $K^n\equiv K\;:\!=\; \mathop{\textrm{supp}}\nolimits \mu ^1 \times \ldots \times \mathop{\textrm{supp}}\nolimits \mu ^N$ . This also simplifies the analytic expressions of $\alpha ^n$ and their marginal measures.
In line with the previous Remark, we prove the following:
Proposition 3.3. If $\mathop{\textrm{supp}}\nolimits \mu ^k$ is compact for $k=1, \ldots, N$ then for all $k,n$ and all $i$
where, we recall, $\mu ^{k,n}$ is defined in (4) above and is the $k$ -th marginal of the discretisation $\alpha ^n$ of $\gamma$ defined in (3).
Proof. Again we prove the formula for the first marginal.
4. Variational approximations and conclusions
In this section, we prove the discrete approximations of the functionals that will be used in the optimality proofs. Given a transport plan $\gamma$ , we have introduced, in the previous section, the dyadic approximation $\{\alpha ^n\}_{n \in \mathbb{N}}$ of $\gamma$ .
4.1 The $\sup$ case
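We define the functionals $\mathcal{F}_n,\mathcal{F}\;:\;\mathcal{P}(X)\to{\mathbb{R}} \cup \{+\infty \}$ by
\begin{equation*} \mathcal{F}_n[\beta ]\;:\!=\;\begin{cases} C_\infty [\beta ] & \text{if } \beta \in \Pi (\mu ^{1,n},\ldots,\mu ^{N,n}),\\ +\infty & \text{otherwise,} \end{cases} \end{equation*}
and
\begin{equation*} \mathcal{F}[\beta ]\;:\!=\;\begin{cases} C_\infty [\beta ] & \text{if } \beta \in \Pi (\mu ^{1},\ldots,\mu ^{N}),\\ +\infty & \text{otherwise.} \end{cases} \end{equation*}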
For the rest of this subsection, we assume that $c$ is continuous and that $\mu ^i$ has compact support for $i=1,\ldots,N.$ We prove the following
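Proposition 4.1. The functionals $\mathcal{F}_n$ are equi-coercive and
\begin{equation*} \mathcal{F}_n \stackrel{\Gamma }{\longrightarrow } \mathcal{F} \end{equation*}
with respect to the tight convergence.
Proof. Let $\beta \in \mathcal{P}(X)$ . We recall that we need to prove the following:
(I) for every sequence $(\beta ^n)_n$ such that $\beta ^n \stackrel{*}{\rightharpoonup } \beta$ , we have $\liminf _n \mathcal{F}_n[\beta ^n]\ge \mathcal{F}[\beta ]$ ;
(II) there exists a sequence $(\beta ^n)_n$ such that $\beta ^n \stackrel{*}{\rightharpoonup } \beta$ and $\limsup _n \mathcal{F}_n[\beta ^n]\le \mathcal{F}[\beta ]$ .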
If $\mathcal F[\beta ]\lt +\infty$ , the $\Gamma$ - $\liminf$ inequality (Condition (I)) follows from the lower semi-continuity of the functional $C_\infty$ . If $\mathcal{F}[\beta ]=+\infty$ , then either $\beta \notin \Pi (\mu ^1,\ldots,\mu ^N)$ or $C_\infty (\beta )=+\infty$ . In the first case, since $\beta ^n \stackrel{*}{\rightharpoonup } \beta$ and $\mu ^{i,n}\stackrel{*}{\rightharpoonup } \mu ^i$ for $i=1, \ldots, N$ , there exists $n_0\in \mathbb{N}$ such that $\beta ^n\notin \Pi (\mu ^{1,n},\ldots,\mu ^{N,n})$ for all $n\ge n_0$ . Hence, $\mathcal{F}_n[\beta ^n]=+\infty$ for all $n\ge n_0$ . If $C_\infty (\beta )=+\infty$ , then let $M\gt 0$ and $\varepsilon \gt 0$ , and let ${\textbf{x}} \in \mathop{\textrm{spt}}\nolimits \beta$ and $r\gt 0$ be such that $B({{\textbf{x}}}, r) \subset \{c\gt M-\varepsilon \}$ . Since the evaluation on open sets is lower semi-continuous with respect to the tight convergence, we have that, for $n$ big enough, $\beta ^n (B({{\textbf{x}}}, r))\gt 0$ , so that $C_\infty (\beta ^n)\gt M-\varepsilon$ and, since $M$ is arbitrary, we conclude.
For the $\Gamma$ - $\limsup$ inequality (Condition (II)), if $\mathcal{F}[\beta ]=+\infty$ , then any sequence with the right marginals and tightly converging to $\beta$ will do. Therefore, we may assume that the measure $\beta$ satisfies $\beta \in \Pi (\mu ^1,\ldots,\mu ^N)$ and $C_\infty [\beta ]\lt +\infty$ . To build the approximants, we use the Borel partitions $\{B_i^{k,n}\}_{i=1}^{m^{k,n}}$ and discrete measures introduced in Section 3. For all $n$ , given a multi-index ${\underline{\textbf{i}}}=(i_1,\ldots,i_N)$ we use, again, the ‘cube’
and set
We then define the measures
We show that $\beta ^n$ has marginals $\mu ^{1,n},\ldots,\mu ^{N,n}$ . For all Borel sets $A\subset X^1$ , we have
where the third equality is due to Proposition 3.3. The computation is analogous for the other marginals.
The sequence $(\beta ^n)$ converges tightly to $\beta$ which can be seen in a manner analogous to the convergence of the sequence $(\alpha ^n)$ to $\gamma$ . It remains to prove that the sequence satisfies the $\Gamma$ - $\limsup$ inequality. We fix $\varepsilon \gt 0$ . It suffices to show that
Since for all $n$ the support of $\beta ^n$ is a finite set, we can fix $u^n\in \mathop{\textrm{supp}}\nolimits \beta ^n$ such that $C_\infty [\beta ^n]=c(u^n)$ . Moreover, for all $n$ there exists $z^n\in \mathop{\textrm{supp}}\nolimits \beta$ such that $d(u^n,z^n)\le \tfrac 12\delta _n$ . Now for all $n$ large enough to satisfy $\varepsilon _n\lt \varepsilon$ , we have
and we are done.
By Corollary 2.6, $\Pi (\mu ^1, \ldots, \mu ^N)\cup \bigcup _n \Pi (\mu ^{1,n}, \ldots, \mu ^{N,n})$ is compact and therefore the equi-coercivity follows.
4.2 The integral case
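We define the functionals $\mathcal{G}_n,\mathcal{G}\;:\;\mathcal{P}(X)\to{\mathbb{R}} \cup \{+\infty \}$ by
\begin{equation*} \mathcal{G}_n[\beta ]\;:\!=\;\begin{cases} C[\beta ] & \text{if } \beta \in \Pi (\mu ^{1,n},\ldots,\mu ^{N,n}),\\ +\infty & \text{otherwise,} \end{cases} \end{equation*}
and
\begin{equation*} \mathcal{G}[\beta ]\;:\!=\;\begin{cases} C[\beta ] & \text{if } \beta \in \Pi (\mu ^{1},\ldots,\mu ^{N}),\\ +\infty & \text{otherwise.} \end{cases} \end{equation*}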
Also for the integral case, we assume that the measures $\mu ^1,\ldots,\mu ^N$ have compact supports and that the cost function $c\;:\;X\to{\mathbb{R}}$ is continuous. We prove the following:
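Proposition 4.2. The functionals $\mathcal{G}_n$ are equi-coercive and
\begin{equation*} \mathcal{G}_n \stackrel{\Gamma }{\longrightarrow } \mathcal{G} \end{equation*}
with respect to the tight convergence.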
Proof. The proof is analogous to that of Proposition 4.1. The only substantial difference is in the proof of the $\Gamma$ - $\limsup$ inequality in the case that the measure $\beta$ belongs to the set $\Pi (\mu ^1,\ldots,\mu ^N)$ . We have to find a sequence $(\beta ^n)$ , weakly ${}^\ast$ converging to $\beta$ and satisfying Condition (II). Let ( $\beta ^n$ ) be the discretisation defined in the proof of Proposition 4.1. Since the supports of the measures $\mu ^1,\ldots,\mu ^N$ are compact, also the set $K\;:\!=\;\mathop{\textrm{spt}}\nolimits \mu ^1\times \cdots \times \mathop{\textrm{spt}}\nolimits \mu ^N$ is compact. Note that for all $n\in \mathbb{N}$ , we have $\mathop{\textrm{spt}}\nolimits \beta ^n\subset K$ . We set $T=\max _{z\in K}c(z)$ . Now the function $c_T\;:\!=\;\min \{c,T\}$ is continuous and bounded on $X$ and by the weak ${}^\ast$ -convergence
from which the $\Gamma$ - $\limsup$ inequality follows.
4.3 Proof of the main theorems and a counterexample
Proof. (of Theorem 1.3) By Proposition 3.1 and Remark 3.2, we can find a sequence $(\alpha ^n)_n$ with finite supports such that $\mathop{\textrm{spt}}\nolimits \alpha ^n \subset \mathop{\textrm{spt}}\nolimits \gamma$ and $\alpha ^n \stackrel{*}{\rightharpoonup } \gamma$ . We define the functionals $\mathcal{F}$ and $\mathcal{F}_n$ of Subsection 4.1 using the marginals of $\gamma$ and $\alpha ^n$ . The plan $\gamma$ is ICM; therefore by Proposition 1.6, it is finitely optimal. This means that each plan $\alpha ^n$ is optimal between its marginals and thus a minimiser of the functional $\mathcal{F}_n$ .
The $\Gamma$ -convergence and equi-coercivity established in Proposition 4.1 imply, by Theorem 2.10, that the minimisers of the functionals $\mathcal{F}_n$ converge, up to subsequences, to a minimiser of $\mathcal F$ . Therefore, since $\alpha ^n\stackrel{*}{\rightharpoonup }\gamma$ , the plan $\gamma$ is optimal for the problem ( $P_\infty$ ).
Proof. (of Theorem 1.4) The proof is the same as that of Theorem 1.3. The $\Gamma$ -convergence is now given by Proposition 4.2.
In ref. [Reference Ambrosio2], Ambrosio and Pratelli give, for the problem ( $P$ ), an example of a lower semi-continuous cost function $c\;:\;X\times X\to [0,\infty ]$ ( $c$ also assumes the value $+\infty$ ), for which there exists a $c$ -cyclically monotone transport plan which is not optimal. After that, it was shown in refs. [Reference Beiglböck, Goldstern, Maresch and Schachermayer4] and [Reference Bianchini and Caravenna5] that, for $N=2$ , it is enough that $c$ is Borel measurable and that the set $\{c=+\infty \}$ has a special structure. Actually, the measure theoretical tools introduced in ref. [Reference Bianchini and Caravenna5] could be applied in even more general settings. We refer the reader to those papers for further details.
The next example, which is a slightly modified version of the example of [Reference Ambrosio2], shows that also in the case of the problem ( $P_\infty$ ) the continuity of the cost may be required, even when the cost assumes only finite values.
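Example 4.3. Let us consider the two-marginal $L^\infty$ -optimal transportation problem with marginals $\mu =\nu =\mathcal{L}|_{[0,1]}$ and the cost function
\begin{equation*} c(x,y)=\begin{cases} 1 & \text{if } x=y,\\ 2 & \text{otherwise.} \end{cases} \end{equation*}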
We fix an irrational number $\alpha$ . We set $T_1=Id_{[0,1]}$ and $T_2\;:\;[0,1]\to [0,1]$ , $T_2(x)=x+\alpha \pmod 1$ . Now $T_1$ is an optimal transportation map for the problem ( $P_\infty$ ) with $C_\infty [T_1]=1$ . Since $C_\infty [T_2]=2$ , $T_2$ cannot be optimal. However, it is ICM.
In fact, if we assume that $T_2$ is not ICM, we would find a minimal $K\in \mathbb{N}$ and a $K$ -tuple of couples $\{x_i,y_i\}_{i=1}^K$ , all belonging to the support of the plan given by $T_2$ , such that
\begin{equation*} \max _{1\le i\le K} c(x_i,y_i)\gt \max _{1\le i\le K} c(x_{i+1},y_i), \end{equation*}
with the convention $x_{K+1}=x_1$ . By the definition of the map $T_2$ , we have $y_i=x_i+\alpha \pmod 1$ for all $i$ . Given the form of $c$ , the only form in which this inequality can hold is $2\gt 1$ . The right-hand side now tells us that $y_i=x_{i+1}$ for all $i$ , that is, $x_{i+1}=x_i+\alpha \pmod 1$ for all $i$ . Summing up now gives us (keeping in mind that $x_{K+1}=x_1$ ) that $x_1=x_1+K\alpha \pmod 1$ , contradicting the irrationality of $\alpha$ .
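The role of irrationality can be seen by contrast with a rational shift, for which the orbit of the rotation closes up and the argument above breaks down. The following Python sketch uses exact rational arithmetic; it assumes, for illustration only, the cost equal to $1$ on the diagonal and $2$ elsewhere, and the names `cost` and `rotation_cycle` are hypothetical.

```python
from fractions import Fraction


def cost(x, y):
    # Assumed cost for this illustration: 1 on the diagonal, 2 elsewhere.
    return 1 if x == y else 2


def rotation_cycle(alpha, start=Fraction(0)):
    """Orbit of start under x -> x + alpha (mod 1), for a rational alpha."""
    xs = [start]
    while True:
        nxt = (xs[-1] + alpha) % 1
        if nxt == start:
            return xs
        xs.append(nxt)


alpha = Fraction(1, 3)                      # rational shift: the orbit closes
xs = rotation_cycle(alpha)                  # x_1, ..., x_K
ys = [(x + alpha) % 1 for x in xs]          # y_i = T_2(x_i)
K = len(xs)
lhs = max(cost(xs[i], ys[i]) for i in range(K))
rhs = max(cost(xs[(i + 1) % K], ys[i]) for i in range(K))
```

For $\alpha =1/3$ the orbit has $K=3$ points and the cyclic rearrangement lowers the maximal cost from $2$ to $1$ , so the plan induced by the rational rotation is not ICM. For irrational $\alpha$ no orbit closes, which is exactly the contradiction used in the example.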
Acknowledgements
Both authors acknowledge the support of GNAMPA-INDAM and ‘Fondi di ricerca di ateneo, ex 60 $\%$ ’ of the University of Firenze.
Funding statement
The research of the first author is part of the project Metodologie innovative per l’analisi di dati a struttura complessa financed by the Fondazione Cassa di Risparmio di Firenze. The second author acknowledges the support of the PRIN (Progetto di ricerca di rilevante interesse nazionale) 2022J4FYNJ, Variational methods for stationary and evolution problems with singularities and interfaces.
Competing interests
There are no competing interests concerning this research.