Optimal transport
The optimal transport (OT) problem has attracted a surge of research interest because it defines a distance between probability distributions, known as the Wasserstein distance. This property enables the distance to be applied to diverse machine learning problems such as generative adversarial networks, graph optimal transport, clustering, and domain adaptation.
The Kantorovich relaxation formulation of the optimal transport (OT) problem is briefly explained as follows. Let $\boldsymbol{a}$ and $\boldsymbol{b}$ be probability or positive weight vectors with $\boldsymbol{a} \in \mathbb{R}_{+}^{n}$ and $\boldsymbol{b} \in \mathbb{R}_{+}^{m}$, respectively. Given two empirical distributions, i.e., discrete measures, $\mu = \sum_{i=1}^{n} a_i \delta_{x_i}$ and $\nu = \sum_{j=1}^{m} b_j \delta_{y_j}$, and the ground cost matrix $\boldsymbol{C} \in \mathbb{R}_{+}^{n \times m}$ between their supports, the problem can be formulated as
\[
\min_{\boldsymbol{T} \in U(\boldsymbol{a}, \boldsymbol{b})} \langle \boldsymbol{T}, \boldsymbol{C} \rangle = \min_{\boldsymbol{T} \in U(\boldsymbol{a}, \boldsymbol{b})} \sum_{i=1}^{n} \sum_{j=1}^{m} T_{ij} C_{ij},
\]
where $\boldsymbol{T}$ represents the transport matrix, and where the domain $U(\boldsymbol{a}, \boldsymbol{b})$ is defined as
\[
U(\boldsymbol{a}, \boldsymbol{b}) = \left\{ \boldsymbol{T} \in \mathbb{R}_{+}^{n \times m} \,\middle|\, \boldsymbol{T} \mathbf{1}_{m} = \boldsymbol{a},\ \boldsymbol{T}^{\top} \mathbf{1}_{n} = \boldsymbol{b} \right\},
\]
where $\boldsymbol{T} \mathbf{1}_{m} = \boldsymbol{a}$ and $\boldsymbol{T}^{\top} \mathbf{1}_{n} = \boldsymbol{b}$ are the marginal constraints. Moreover, we denote the sums of the two vectors respectively as $S_{\boldsymbol{a}}$ and $S_{\boldsymbol{b}}$, i.e., $S_{\boldsymbol{a}} = \sum_{i=1}^{n} a_i$ and $S_{\boldsymbol{b}} = \sum_{j=1}^{m} b_j$. Note that $S_{\boldsymbol{a}}$ is equal to $S_{\boldsymbol{b}}$ in the standard OT formulation.
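The Kantorovich problem above is a finite linear program in the entries of $\boldsymbol{T}$, so a small instance can be solved with a generic LP solver. The following is a minimal, illustrative sketch (not tied to any particular implementation) using `scipy.optimize.linprog`; the toy weights and cost matrix are placeholders, and the variable names `a`, `b`, `C`, `T` simply mirror the symbols above.

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance: n = 3 source points, m = 4 target points (placeholder data).
rng = np.random.default_rng(0)
n, m = 3, 4
a = np.full(n, 1.0 / n)      # source weights a (sum to 1)
b = np.full(m, 1.0 / m)      # target weights b (sum to 1)
C = rng.random((n, m))       # ground cost matrix C in R_+^{n x m}

# Kantorovich LP: minimize <T, C> subject to T 1_m = a, T^T 1_n = b, T >= 0.
# T is flattened row-major, so variable index i*m + j corresponds to T_{ij}.
c = C.ravel()
A_row = np.kron(np.eye(n), np.ones(m))   # row-sum constraints:    T 1_m   = a
A_col = np.kron(np.ones(n), np.eye(m))   # column-sum constraints: T^T 1_n = b
A_eq = np.vstack([A_row, A_col])
b_eq = np.concatenate([a, b])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
T = res.x.reshape(n, m)                  # optimal transport matrix T*
print("OT cost <T*, C> =", float(np.sum(T * C)))
print("marginals recovered:",
      np.allclose(T.sum(axis=1), a), np.allclose(T.sum(axis=0), b))
```

The final checks verify that the recovered plan lies in $U(\boldsymbol{a}, \boldsymbol{b})$; for larger problems, dedicated network-flow or Sinkhorn-type solvers are preferable to a dense LP.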
The obtained OT matrix $\boldsymbol{T}^{\star}$ brings powerful distances as
\[
W_p(\mu, \nu) = \langle \boldsymbol{T}^{\star}, \boldsymbol{C} \rangle^{1/p} = \left( \sum_{i=1}^{n} \sum_{j=1}^{m} T^{\star}_{ij}\, d(x_i, y_j)^{p} \right)^{1/p}, \qquad C_{ij} = d(x_i, y_j)^{p},
\]
where $d$ is a metric on the supports; this is known as the $p$-th order Wasserstein distance. It is used in various fields according to the value of $p$. In particular, the distance is applied to computer vision when $p = 1$ and to clustering when $p = 2$.
When $C_{ij} = d(x_i, y_j)$ is the ground cost matrix, i.e., $p = 1$, we specifically designate the first-order Wasserstein distance as the OT distance.
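As an illustrative companion to the definition above, the following sketch computes $W_1$ and $W_2$ for two small point clouds, assuming the POT library (`ot`) is installed; `ot.dist` builds the ground cost matrix $C_{ij} = d(x_i, y_j)^{p}$ and `ot.emd2` returns the optimal cost $\langle \boldsymbol{T}^{\star}, \boldsymbol{C} \rangle$, of which the $1/p$ power is taken. The point clouds and weights are placeholders.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

rng = np.random.default_rng(0)
n, m = 5, 6
x = rng.normal(size=(n, 2))   # support points of mu (placeholder data)
y = rng.normal(size=(m, 2))   # support points of nu (placeholder data)
a = np.full(n, 1.0 / n)       # uniform weights a
b = np.full(m, 1.0 / m)       # uniform weights b

# p = 1: C_ij = ||x_i - y_j||, so the OT cost itself is W_1 (the OT distance).
C1 = ot.dist(x, y, metric="euclidean")
W1 = ot.emd2(a, b, C1)

# p = 2: C_ij = ||x_i - y_j||^2, and W_2 is the square root of the OT cost.
C2 = ot.dist(x, y, metric="sqeuclidean")
W2 = ot.emd2(a, b, C2) ** 0.5

print(f"W_1 = {W1:.4f}, W_2 = {W2:.4f}")
```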




