Notes on Schrodinger bridges
Notes & Scribbles

Giung Nam, 2026-03-30

FPE and FKF

A controlled Itô process $(\bm{X}_{t}^{\bm{u}} \in \mathbb{R}^{d})_{t \in [0,T]}$ is defined as the solution to the SDE:

$$ \begin{align} \mathrm{d}\bm{X}_{t}^{\bm{u}} &= \left[ \bm{f}(\bm{X}_{t}^{\bm{u}}, t) + \sigma_{t}\bm{u}(\bm{X}_{t}^{\bm{u}}, t) \right] \mathrm{d}t + \sigma_{t}\mathrm{d}\bm{B}_{t}, \end{align} $$

with the reference drift $\bm{f} : \mathbb{R}^{d} \times [0, T] \rightarrow \mathbb{R}^{d}$, the diffusion coefficient $\sigma_{t} : [0, T] \rightarrow \mathbb{R}_{\geq 0}$, the control drift $\bm{u} : \mathbb{R}^{d} \times [0, T] \rightarrow \mathbb{R}^{d}$, and a standard Brownian motion $\bm{B}_{t}$. The Fokker-Planck equation (FPE) describes the deterministic forward-time evolution of the marginal probability densities $p_{t}^{\bm{u}}$ of $\bm{X}_{t}^{\bm{u}}$:

$$ \begin{align}\textstyle \partial_{t} p_{t}^{\bm{u}}(\bm{x}) = -\nabla \cdot \left\{ \left[ \bm{f}(\bm{x}, t) + \sigma_{t} \bm{u}(\bm{x}, t) \right] p_{t}^{\bm{u}}(\bm{x}) \right\} + \frac{\sigma_{t}^{2}}{2} \Delta p_{t}^{\bm{u}}(\bm{x}). \end{align} $$
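As a rough numerical illustration of the SDE and its density evolution, the sketch below simulates the controlled process with the Euler-Maruyama scheme. The function name `simulate_controlled_sde` and the test case (an Ornstein-Uhlenbeck reference drift $\bm{f}(\bm{x}, t) = -\bm{x}$, zero control, $\sigma_{t} = 1$) are assumptions chosen for this example, not part of the notes. For that case the FPE has a Gaussian solution whose variance at time $T$ is $(1 - e^{-2T})/2$ when started from a point mass at the origin, which gives a simple sanity check.

```python
import numpy as np

def simulate_controlled_sde(f, u, sigma, x0, T=1.0, n_steps=200, seed=0):
    """Euler-Maruyama for dX = [f(X,t) + sigma(t) u(X,t)] dt + sigma(t) dB.

    x0 has shape (n_paths, d); returns samples of X_T with the same shape.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        s = sigma(t)
        drift = f(x, t) + s * u(x, t)
        x = x + drift * dt + s * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Hypothetical test case: OU reference drift, no control, unit diffusion.
f = lambda x, t: -x
u = lambda x, t: np.zeros_like(x)
sigma = lambda t: 1.0

x_T = simulate_controlled_sde(f, u, sigma, x0=np.zeros((20000, 1)), T=1.0)

# Variance of the Gaussian FPE solution at T = 1 for this test case.
var_exact = 0.5 * (1.0 - np.exp(-2.0))
print("empirical variance:", x_T.var(), "FPE variance:", var_exact)
```

The empirical variance should agree with the FPE value up to Monte Carlo and time-discretization error; a nonzero `u` can be passed in the same way to see how the control shifts the marginals.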

Conversely, the Feynman-Kac equation (FKE) describes the deterministic backward-time evolution of the expected cost $r^{\bm{u}}$:

$$ \begin{align}\textstyle \partial_{t} r^{\bm{u}}(\bm{x}, t) + \langle \bm{f}(\bm{x}, t) + \sigma_{t} \bm{u}(\bm{x}, t), \nabla r^{\bm{u}}(\bm{x}, t) \rangle + \frac{\sigma_{t}^{2}}{2} \Delta r^{\bm{u}}(\bm{x}, t) - c(\bm{x}, t) r^{\bm{u}}(\bm{x}, t) = 0, \end{align} $$

whose solution, by the Feynman-Kac formula (FKF), is the conditional expectation of future outcomes given the current state:

$$ \begin{align}\textstyle r^{\bm{u}}(\bm{x}, t) = \mathbb{E} \left[ \exp{\left( -\int_{t}^{T} c(\bm{X}_{s}^{\bm{u}}, s) \mathrm{d}s\right)} \Phi(\bm{X}_{T}^{\bm{u}}) \mid \bm{X}_{t}^{\bm{u}} = \bm{x} \right], \end{align} $$

with the terminal condition $r^{\bm{u}}(\bm{x}, T) = \Phi(\bm{x})$, the terminal function $\Phi : \mathbb{R}^{d} \rightarrow \mathbb{R}$, and the running cost $c : \mathbb{R}^{d} \times [0,T] \rightarrow \mathbb{R}$.
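The Feynman-Kac representation can be checked by Monte Carlo. The sketch below is a one-dimensional illustration under assumed choices that do not come from the notes: the helper name `feynman_kac_mc`, zero total drift, $\sigma_{t} = 1$, a constant running cost $c = 0.5$, and $\Phi(x) = x^{2}$. In that case the expectation has the closed form $r(x, t) = e^{-c (T - t)} \left( x^{2} + (T - t) \right)$, which one can verify satisfies the backward equation above.

```python
import numpy as np

def feynman_kac_mc(x, t, T, drift, sigma, c, Phi,
                   n_paths=20000, n_steps=100, seed=0):
    """Monte Carlo estimate of E[exp(-int_t^T c ds) Phi(X_T) | X_t = x] in 1-D.

    `drift` is the total drift f + sigma * u of the controlled SDE.
    """
    rng = np.random.default_rng(seed)
    dt = (T - t) / n_steps
    X = np.full(n_paths, float(x))
    running = np.zeros(n_paths)  # accumulates int_t^T c(X_s, s) ds along each path
    for k in range(n_steps):
        s = t + k * dt
        running += c(X, s) * dt
        noise = rng.standard_normal(n_paths)
        X = X + drift(X, s) * dt + sigma(s) * np.sqrt(dt) * noise
    return np.mean(np.exp(-running) * Phi(X))

# Hypothetical test case with a known closed form.
c0, x_init, T_end = 0.5, 1.0, 1.0
r_mc = feynman_kac_mc(
    x_init, 0.0, T_end,
    drift=lambda x, s: np.zeros_like(x),
    sigma=lambda s: 1.0,
    c=lambda x, s: c0 * np.ones_like(x),
    Phi=lambda x: x**2,
)
r_exact = np.exp(-c0 * T_end) * (x_init**2 + T_end)
print("Monte Carlo:", r_mc, "closed form:", r_exact)
```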

KL divergence

Let $\mathbb{P}^{\tilde{\bm{u}}}$ and $\mathbb{P}^{\bm{u}}$ be path measures on the space of continuous paths $C([0,T];\mathbb{R}^{d})$, induced by SDEs with the same initial distribution, reference drift $\bm{f}$, and diffusion coefficient $\sigma_{t}$, but governed by different controls $\tilde{\bm{u}}$ and $\bm{u}$. Assuming absolute continuity $\mathbb{P}^{\tilde{\bm{u}}} \ll \mathbb{P}^{\bm{u}}$, the KL divergence of $\mathbb{P}^{\tilde{\bm{u}}}$ with respect to $\mathbb{P}^{\bm{u}}$ is given by:

$$ \begin{align} D_{\text{KL}} \left( \mathbb{P}^{\tilde{\bm{u}}} \parallel \mathbb{P}^{\bm{u}} \right) &\textstyle = \mathbb{E}_{\bm{X}_{0:T}^{\tilde{\bm{u}}} \sim \mathbb{P}^{\tilde{\bm{u}}}} \left[ \frac{1}{2} \int_{0}^{T} \left\lVert \tilde{\bm{u}}(\bm{X}_{t}^{\tilde{\bm{u}}}, t) - \bm{u}(\bm{X}_{t}^{\tilde{\bm{u}}}, t) \right\rVert^{2} \mathrm{d}t \right] \\ &\textstyle = \int_{0}^{T} \int_{\mathbb{R}^{d}} \left[ \frac{1}{2} \left\lVert \tilde{\bm{u}}(\bm{x}, t) - \bm{u}(\bm{x}, t) \right\rVert^{2} p_{t}^{\tilde{\bm{u}}}(\bm{x}) \right] \mathrm{d}\bm{x} \mathrm{d}t. \end{align} $$
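A sketch of where this expression comes from, assuming that both processes start from the same initial distribution and that $\tilde{\bm{u}} - \bm{u}$ is regular enough for Girsanov's theorem to apply (for example, under Novikov's condition). The symbol $\tilde{\bm{B}}_{t}$ is notation introduced only for this sketch: it denotes the Brownian motion driving $\bm{X}_{t}^{\tilde{\bm{u}}}$ under $\mathbb{P}^{\tilde{\bm{u}}}$.

$$ \begin{align} \log \frac{\mathrm{d}\mathbb{P}^{\tilde{\bm{u}}}}{\mathrm{d}\mathbb{P}^{\bm{u}}} \left( \bm{X}_{0:T}^{\tilde{\bm{u}}} \right) &\textstyle = \int_{0}^{T} \left\langle \tilde{\bm{u}}(\bm{X}_{t}^{\tilde{\bm{u}}}, t) - \bm{u}(\bm{X}_{t}^{\tilde{\bm{u}}}, t), \mathrm{d}\tilde{\bm{B}}_{t} \right\rangle + \frac{1}{2} \int_{0}^{T} \left\lVert \tilde{\bm{u}}(\bm{X}_{t}^{\tilde{\bm{u}}}, t) - \bm{u}(\bm{X}_{t}^{\tilde{\bm{u}}}, t) \right\rVert^{2} \mathrm{d}t, \\ D_{\text{KL}} \left( \mathbb{P}^{\tilde{\bm{u}}} \parallel \mathbb{P}^{\bm{u}} \right) &\textstyle = \mathbb{E}_{\mathbb{P}^{\tilde{\bm{u}}}} \left[ \log \frac{\mathrm{d}\mathbb{P}^{\tilde{\bm{u}}}}{\mathrm{d}\mathbb{P}^{\bm{u}}} \right]. \end{align} $$

The stochastic integral has zero expectation under $\mathbb{P}^{\tilde{\bm{u}}}$, so only the quadratic term survives. The second form of the KL divergence then follows by exchanging the expectation with the time integral and writing the expectation at each time $t$ against the marginal density $p_{t}^{\tilde{\bm{u}}}$.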

Dynamic SB

The dynamic Schrödinger bridge (SB) formulation seeks the optimal control that minimally perturbs the reference dynamics, as measured by the KL divergence from the path measure $\mathbb{Q}$ of the uncontrolled reference process ($\bm{u} \equiv \bm{0}$), while satisfying the initial and terminal marginal distribution constraints, $\bm{X}_{0} \sim \pi_{0}$ and $\bm{X}_{T} \sim \pi_{T}$:

$$ \begin{align} \inf_{\bm{u}, p_{t}^{\bm{u}}} & \textstyle \quad D_{\text{KL}}\left( \mathbb{P}^{\bm{u}} \parallel \mathbb{Q} \right) = \int_{0}^{T} \int_{\mathbb{R}^{d}} \left[ \frac{1}{2} \left\lVert \bm{u}(\bm{x}, t) \right\rVert^{2} p_{t}^{\bm{u}}(\bm{x}) \right] \mathrm{d}\bm{x} \mathrm{d}t, \\ \text{s.t.} & \textstyle \quad \partial_{t} p_{t}^{\bm{u}}(\bm{x}) = -\nabla \cdot \left\{ \left[ \bm{f}(\bm{x}, t) + \sigma_{t} \bm{u}(\bm{x}, t) \right] p_{t}^{\bm{u}}(\bm{x}) \right\} + \frac{\sigma_{t}^{2}}{2} \Delta p_{t}^{\bm{u}}(\bm{x}), \quad p_{0}^{\bm{u}} = \pi_{0}, \quad p_{T}^{\bm{u}} = \pi_{T}. \end{align} $$

As $\bm{u}$ and $p_{t}^{\bm{u}}$ are coupled by the FPE constraint, we reframe this as an unconstrained optimization problem by introducing a Lagrangian:

$$ \begin{align}\textstyle \mathcal{L}(p_{t}^{\bm{u}}, \bm{u}, \psi_{t}) = \int_{0}^{T} \int_{\mathbb{R}^{d}} \left\{ \frac{1}{2} \left\lVert \bm{u} \right\rVert^{2} p_{t}^{\bm{u}} + \psi_{t} \cdot \left[ \partial_{t}p_{t}^{\bm{u}} + \nabla \cdot \left[ \left( \bm{f} + \sigma_{t}\bm{u} \right) p_{t}^{\bm{u}} \right] - \frac{\sigma_{t}^{2}}{2} \Delta p_{t}^{\bm{u}} \right] \right\} \mathrm{d}\bm{x} \mathrm{d}t, \end{align} $$

which yields the optimal control $\bm{u}^{\ast}$ written in terms of the Lagrange multiplier $\psi_{t} : \mathbb{R}^{d} \rightarrow \mathbb{R}$:

$$ \begin{align} \bm{u}^{\ast}(\bm{x}, t) = \sigma_{t} \nabla \psi_{t}(\bm{x}), \end{align} $$

and the minimizer $(\bm{u}^{\ast}, p_{t}^{\ast})$ as the solution to the HJB-FP system:

$$ \begin{align} \begin{cases} \partial_{t}\psi_{t} = -\frac{\sigma_{t}^{2}}{2} \left\lVert \nabla \psi_{t} \right\rVert^{2} - \langle \nabla \psi_{t}, \bm{f} \rangle - \frac{\sigma_{t}^{2}}{2} \Delta \psi_{t}, \\ \partial_{t}p_{t}^{\ast} = - \nabla \cdot \left[ \left( \bm{f} + \sigma_{t}^{2}\nabla\psi_{t} \right) p_{t}^{\ast} \right] + \frac{\sigma_{t}^{2}}{2} \Delta p_{t}^{\ast}, \end{cases} \quad \text{s.t.} \quad p_{0}^{\ast} = \pi_{0}, \quad p_{T}^{\ast} = \pi_{T}. \end{align} $$
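A sketch of the intermediate step, assuming $p_{t}^{\bm{u}}$ and its gradient decay fast enough at infinity for the spatial boundary terms to vanish. Integrating by parts in $t$ and $\bm{x}$ moves every derivative from $p_{t}^{\bm{u}}$ onto $\psi_{t}$, after which the first-order conditions can be read off pointwise:

$$ \begin{align} \mathcal{L} &\textstyle = \int_{0}^{T} \int_{\mathbb{R}^{d}} p_{t}^{\bm{u}} \left\{ \frac{1}{2} \left\lVert \bm{u} \right\rVert^{2} - \partial_{t}\psi_{t} - \langle \bm{f} + \sigma_{t}\bm{u}, \nabla\psi_{t} \rangle - \frac{\sigma_{t}^{2}}{2} \Delta\psi_{t} \right\} \mathrm{d}\bm{x} \mathrm{d}t + \int_{\mathbb{R}^{d}} \left( \psi_{T} \pi_{T} - \psi_{0} \pi_{0} \right) \mathrm{d}\bm{x}, \\ \frac{\delta \mathcal{L}}{\delta \bm{u}} &\textstyle = p_{t}^{\bm{u}} \left( \bm{u} - \sigma_{t} \nabla\psi_{t} \right) = \bm{0}, \\ \frac{\delta \mathcal{L}}{\delta p_{t}^{\bm{u}}} &\textstyle = \frac{1}{2} \left\lVert \bm{u} \right\rVert^{2} - \partial_{t}\psi_{t} - \langle \bm{f} + \sigma_{t}\bm{u}, \nabla\psi_{t} \rangle - \frac{\sigma_{t}^{2}}{2} \Delta\psi_{t} = 0. \end{align} $$

The first condition gives $\bm{u}^{\ast} = \sigma_{t}\nabla\psi_{t}$ wherever $p_{t}^{\bm{u}} > 0$. Substituting it into the second condition and using $\frac{1}{2} \lVert \sigma_{t}\nabla\psi_{t} \rVert^{2} - \sigma_{t}^{2} \lVert \nabla\psi_{t} \rVert^{2} = -\frac{\sigma_{t}^{2}}{2} \lVert \nabla\psi_{t} \rVert^{2}$ recovers the HJB equation for $\psi_{t}$, while the FP equation is the original constraint evaluated at $\bm{u}^{\ast}$.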

Applying the change of variables $(\psi_{t}, p_{t}^{\ast}) \mapsto (\phi_{t}, \hat{\phi}_{t})$ defined by $\psi_{t}(\bm{x}) = \log{\phi_{t}(\bm{x})}$ and $p_{t}^{\ast}(\bm{x}) = \phi_{t}(\bm{x}) \hat{\phi}_{t}(\bm{x})$, i.e., the Cole-Hopf transformation, turns the non-linear HJB-FP system into a linear system for $(\phi_{t}, \hat{\phi}_{t})$:

$$ \begin{align} \begin{cases} \partial_{t}\phi_{t} = -\langle \nabla \phi_{t}, \bm{f} \rangle - \frac{\sigma_{t}^{2}}{2} \Delta \phi_{t}, \\ \partial_{t}\hat{\phi}_{t} = - \nabla \cdot (\hat{\phi}_{t} \bm{f}) + \frac{\sigma_{t}^{2}}{2} \Delta \hat{\phi}_{t}, \end{cases} \quad \text{s.t.} \quad \pi_{0} = \phi_{0}\hat{\phi}_{0}, \quad \pi_{T} = \phi_{T} \hat{\phi}_{T}. \end{align} $$
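Because the two equations are linear and coupled only through the boundary conditions, the system can be solved numerically by alternating between them. The sketch below is one common way to do this (an iterative proportional fitting, or Sinkhorn-style, fixed-point iteration); it is an illustration rather than a method taken from these notes. It assumes a one-dimensional grid, zero reference drift $\bm{f} = \bm{0}$, constant $\sigma_{t} = 1$, and Gaussian marginals, so that the backward and forward equations reduce to integration against the Gaussian transition kernel `K`. All variable names are hypothetical.

```python
import numpy as np

# Grid and reference transition kernel for dX = sigma dB (f = 0), an assumed toy setting.
x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
sigma, T = 1.0, 1.0
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * sigma**2 * T))
K /= np.sqrt(2 * np.pi * sigma**2 * T)  # K[i, j]: density of X_T = x[j] given X_0 = x[i]

def gaussian_on_grid(grid, mean, std):
    p = np.exp(-0.5 * ((grid - mean) / std) ** 2)
    return p / (p.sum() * dx)  # normalise so the density sums to one on the grid

pi_0 = gaussian_on_grid(x, -1.0, 0.5)
pi_T = gaussian_on_grid(x, 1.0, 0.5)

phi_T = np.ones_like(x)  # arbitrary positive initial guess
for _ in range(100):
    phi_0 = K @ phi_T * dx            # backward equation: phi_0(x) = E[phi_T(X_T) | X_0 = x]
    phi_hat_0 = pi_0 / phi_0          # enforce pi_0 = phi_0 * phi_hat_0
    phi_hat_T = K.T @ phi_hat_0 * dx  # forward equation transports phi_hat_0 to time T
    phi_T = pi_T / phi_hat_T          # enforce pi_T = phi_T * phi_hat_T

# Residuals of the two boundary conditions after the final sweep.
phi_0 = K @ phi_T * dx
err_0 = np.max(np.abs(phi_0 * phi_hat_0 - pi_0))
err_T = np.max(np.abs(phi_T * phi_hat_T - pi_T))
print("boundary residuals:", err_0, err_T)
```

Once $\phi_{t}$ is available, the optimal control follows from the earlier formulas as $\bm{u}^{\ast}(\bm{x}, t) = \sigma_{t} \nabla \log{\phi_{t}(\bm{x})}$.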
