Efficient and Differentiable Reachable Sets for Learning-Enabled Control Systems

SCALE Seminar @ UCB

Akash Harapanahalli

PhD Student

School of Electrical and Computer Engineering, Georgia Tech

2025-09-29

Introduction

Acknowledgements

Dr. Samuel Coogan
Associate Professor
Georgia Institute of Technology

Dr. Saber Jafarpour
Research Assistant Professor
University of Colorado Boulder

Brendan Gould
PhD Student
Georgia Institute of Technology

Motivation

  • Learning-enabled components are becoming increasingly intertwined with our day-to-day lives
  • There are few formal guarantees of safe operation, which is dangerous in safety-critical domains

Robustness Analysis of Neural Networks in the Loop

  • Problem setting: Continuous-time system in feedback with neural network
  • Efficient and scalable certification of neural network controlled systems
  • Trainable conditions: backpropagate information to improve robustness

Image Credit: OpenAI

Reachable Set Computation

Consider the closed-loop system \[ \dot{x} = f(x,w), \] \(x\in\mathbb{R}^{n_x}\), \(w\in\mathbb{R}^{n_w}\).

Problem: Reach-Avoid
Given uncertain initial state \(x_0\in\mathcal{X}_0\) and adversarial disturbances \(w\in\mathcal{W}\), certify that the agent avoids unsafe states \(\mathcal{O}\) for every \(t\), while reaching a goal set \(\mathcal{G}\): \[ \forall t\in[t_0,t_f]\ \mathcal{R}(t;t_0,\mathcal{X}_0,\mathcal{W}) \subseteq \mathcal{O}^\complement, \quad \mathcal{R}(t_f;t_0,\mathcal{X}_0,\mathcal{W}) \subseteq \mathcal{G}. \]

Possible use-cases:

  • If detecting potential collision with \(\mathcal{O}\), swap to a safe backup strategy, or filter the input
  • If not guaranteed to reach \(\mathcal{G}\), resynthesize input trajectory

Existing Methods

Crucial Tradeoffs: how to balance accuracy, efficiency, and guarantees?

Level Set Methods [1]

Finds a function \(g(t,x)\) whose \(0\) sublevel set \(\{x : g(t,x) \leq 0\}\) is the reachable set of the system at time \(t\).

  • Gives exact reachable set
  • Need to solve a partial differential equation. Hard to find true closed-form solutions, scalability concerns
  • Recent works approximate PDE solution using neural networks [2]

Set-Based Methods [3]

Propagates set over-approximations through the system dynamics

  • Guaranteed overapproximating reachable set
  • Simpler set representations suffer from overconservatism from wrapping effects
  • Complicated set representations suffer from algorithmic complexity

In This Talk

Simple theoretical modifications \(\implies\) accuracy benefits for little computational cost

Theory: A general framework for constraint-based reachable set propagation using a controlled parametric embedding system.

  • The number of parameters remains constant
  • Continuous-time system \(\to\) continuous-time embedding system bounding its behavior
  • Amenable to simple bounding strategies like interval analysis
  • New: A hypercontrol input differentially shaping the reachable set

Computation: An efficient implementation using our tool immrax for interval analysis in JAX for Python

  • Just-in-Time (JIT) compilation for efficient runtime computations
  • Automatic differentiation to directly backpropagate safety violations

Differential Geometry: Extensions of these ideas for safety verification of systems evolving on Lie groups and homogeneous manifolds

Parametric Reachable Sets

A theoretical framework for reachable set computation using a controlled dynamical embedding system

The Linear Systems Case: Supporting Hyperplanes

\[ \dot{x} = Ax, \] with initial condition \(x_0\) inside a halfspace \[ \alpha_0(x_0 - {\mathring{x}}_0) \leq y_0, \] \(\alpha_0\in\mathbb{R}^{1\times{n_x}}\), \({\mathring{x}}_0\in\mathbb{R}^{n_x}\), \(y_0\in\mathbb{R}\). How does the halfspace evolve?

If we set \(\dot{{\mathring{x}}} = A{\mathring{x}}\) and \(\dot{\alpha}^T = -A^T\alpha^T\), [4] \[ \tfrac{\mathrm{d}}{\mathrm{d}t} [\alpha (x - {\mathring{x}})] = -\alpha A(x - {\mathring{x}}) + \alpha A(x - {\mathring{x}}) = 0. \] Thus the trajectory of the ODE \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] yields supporting halfspace \(\{x : \alpha(t)(x - {\mathring{x}}(t)) \leq y_0\}\).
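This cancellation is easy to check numerically. The sketch below (pure Python, illustrative names only) uses the rotation generator \(A = \left[\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\right]\), whose matrix exponential has a closed form, flows \(x\), \({\mathring{x}}\), and \(\alpha\) accordingly, and confirms the support value \(\alpha(t)(x(t) - {\mathring{x}}(t))\) stays constant:

```python
import math

# Verify that with the adjoint dynamics da/dt = -a A, the support value
# a(t) (x(t) - xc(t)) is constant along trajectories of xdot = A x.

def expA(t):
    # exp(A t) for the rotation generator A = [[0, 1], [-1, 0]]
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def vecmat(v, M):
    return [sum(v[i] * M[i][j] for i in range(2)) for j in range(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x0  = [1.0, 0.0]    # an uncertain initial state
xc0 = [0.5, -0.5]   # center trajectory initial condition
a0  = [1.0, 2.0]    # initial halfspace normal (row vector)

lhs0 = dot(a0, [x0[i] - xc0[i] for i in range(2)])
for t in [0.0, 0.7, 1.3, 2.9]:
    x  = matvec(expA(t), x0)      # x(t)  = exp(At) x0
    xc = matvec(expA(t), xc0)     # xc(t) = exp(At) xc0
    a  = vecmat(a0, expA(-t))     # a(t)  = a0 exp(-At) solves da/dt = -a A
    lhs = dot(a, [x[i] - xc[i] for i in range(2)])
    assert abs(lhs - lhs0) < 1e-12
```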

The Linear Systems Case: H-Polytopes

\[ \dot{x} = Ax, \] with initial condition \(x_0\) inside an H-rep polytope \[ \alpha_0(x_0 - {\mathring{x}}_0) \leq y_0, \] \(\alpha_0\in\mathbb{R}^{{n_y}\times{n_x}}\), \({\mathring{x}}_0\in\mathbb{R}^{n_x}\), \(y_0\in\mathbb{R}^{n_y}\), \(\leq\) is element-wise. Flow each row of \(\alpha\) using the adjoint dynamics \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] then \[ \alpha(t)(x(t) - {\mathring{x}}(t)) \leq y_0. \]

The Linear Systems Case: Nonlinear Supports

\[ \dot{x} = Ax, \] with initial condition \(x_0\) satisfying \[ h(\alpha_0(x_0 - {\mathring{x}}_0)) \leq y_0. \] Flow \(\alpha\) using the adjoint dynamics \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] then we still get \[ \tfrac{\mathrm{d}}{\mathrm{d}t} h(\alpha(x - {\mathring{x}})) = \tfrac{\partial h}{\partial z} [-\alpha A(x - {\mathring{x}}) + \alpha A(x - {\mathring{x}})] = 0, \] and therefore that \[ h(\alpha(t)(x(t) - {\mathring{x}}(t))) \leq y_0. \]

The Linear Systems Case: Ellipsoids as \(\ell_2\) Normotopes

\[ \dot{x} = Ax, \] with initial condition \(x_0\) satisfying \[ \|\alpha_0(x_0 - {\mathring{x}}_0)\|_2 \leq y_0. \] Flow \(\alpha\) using the adjoint dynamics \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] then \[ \|\alpha(t)(x(t) - {\mathring{x}}(t))\|_2 \leq y_0. \]

  • \(\ell_2\)-norm balls with time-varying shape matrix

Nonlinear Systems: Templated H-Polytopes

\[ \dot{x} = f(x), \] with initial condition \(x_0\) satisfying \[ \alpha_0(x_0 - {\mathring{x}}_0) \leq y_0. \] Fix \(\dot{\alpha} = 0\) and flow offsets \(y\) such that \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}}) \\ \dot{y}_k &\geq \sup_{\substack{x:\alpha(x - {\mathring{x}}) \leq y \\ \alpha_k(x - {\mathring{x}}) = y_k}} \alpha_k (f(x) - f({\mathring{x}})) \end{aligned} \] then \[ \alpha_0(x(t) - {\mathring{x}}(t)) \leq y(t) \]

Connection: Parametric Embedding System

Summary: \(x(t)\in\{x : \alpha(t)(x - {\mathring{x}}(t)) \leq y(t)\}\) if

Linear System: Evolve parameters \(\alpha\) using the adjoint dynamics \[ \begin{aligned} \dot{{\mathring{x}}} &= A{\mathring{x}}, \\ \dot{\alpha} &= -\alpha A \\ \dot{y} &= 0 \end{aligned} \] (this generalizes to nonlinear functions of \(\alpha(x - {\mathring{x}})\))

Nonlinear System: Evolve offsets \(y_k\) using bound of vector field in the direction of row \(\alpha_k\) \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}}) \\ \dot{\alpha} &= 0 \\ \dot{y}_k &\geq \sup_{\substack{x:\alpha(x - {\mathring{x}}) \leq y \\ \alpha_k(x - {\mathring{x}}) = y_k}} \alpha_k (f(x) - f({\mathring{x}})) \end{aligned} \]

Our Idea:

  • Combine \(\alpha\) evolution with offset bounding for nonlinear systems
  • Generalize to any constraint-based set representation

A Quick Preview

We can define a new class of sets as the intersection of \(\alpha\)-parameterized sublevel sets: \[ \{x : g(\alpha,x - {\mathring{x}}) \leq y\} = \bigcap_{k=1}^{{n_y}} \{x : g_k(\alpha,x - {\mathring{x}}) \leq y_k\}, \] for centering point \({\mathring{x}}\in\mathbb{R}^{n_x}\), parameters \(\alpha\in\mathbb{R}^{n_\alpha}\), offsets \(y\in\mathbb{R}^{n_y}\), and nonlinearity \(g:\mathbb{R}^{n_\alpha}\times\mathbb{R}^{n_x}\to\mathbb{R}^{n_y}\).

We will define dynamics on \({\mathring{x}},\alpha,y\) \[ \dot{{\mathring{x}}} = f({\mathring{x}}), \quad \dot{\alpha} = ??, \quad \dot{y} = ??, \] such that every trajectory of the system is contained in the evolving parameterized set: \[ x_0\in \{x : g(\alpha(t_0),x - {\mathring{x}}(t_0)) \leq y(t_0)\} \implies x(t) \in \{x : g(\alpha(t),x - {\mathring{x}}(t)) \leq y(t)\}. \]

Parametopes: General Constraint-Based Set Representation

Definition 1: Parametope
A parametope is the set \[ \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g := \{x\in\mathbb{R}^{n_x}: g(\alpha,x - {\mathring{x}}) \leq y\}, \] where \({\mathring{x}}\in\mathbb{R}^{n_x}\) (center), \(\alpha\in\mathbb{R}^{n_\alpha}\) (parameters), \(y\in\mathbb{R}^{n_y}\) (offset), and \(g:\mathbb{R}^{n_\alpha}\times\mathbb{R}^{n_x}\to\mathbb{R}^{n_y}\) (nonlinearity).

Notationally we denote the \(k\)-th facet \[ \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)^k_g := \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g \cap \{x\in\mathbb{R}^{n_x}: g_k(\alpha,x - {\mathring{x}}) = y_k\} \]

Example: Normotopes
The sublevel set of norm \(\|\cdot\|\) shaped by \(\alpha\in\mathbb{R}^{n\times n}\) \[ \left[\!\left[ {\mathring{x}},\alpha,y \right]\!\right] := \{x\in\mathbb{R}^{n_x}: \|\alpha(x - {\mathring{x}})\| \leq y\}. \]

Example: Polytopes
With \(g(\alpha,x - {\mathring{x}}) = \alpha(x - {\mathring{x}})\), the H-rep polytope \[ \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g = \{x\in\mathbb{R}^{n_x}: \alpha(x - {\mathring{x}}) \leq y\}. \]

Theory: Reachable Sets via Controlled Parametric Embeddings

Define an embedding system, \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}},{\mathring{w}}), \\ \dot{\alpha} &= U(t,{\mathring{x}},\alpha,y) \\ \dot{y} &= E(t,{\mathring{x}},\alpha,y) \end{aligned} \] evolving on \(\mathbb{R}^{n_x}\times\mathbb{R}^{n_\alpha}\times\mathbb{R}^{n_y}\), for some nominal disturbance \({\mathring{w}}(t)\).

  • \(U\) is arbitrary—we call this the hypercontrol
  • \(E\) needs to satisfy an inequality (right)

Theorem: Parametric reachable sets
Let \(({\mathring{x}}(t),\alpha(t),y(t))\) be the trajectory of the embedding system (left). If \(E\) is such that for every \(k\), \[ \begin{gathered} (x,w) \in \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k \times \mathcal{W}\implies \\ \begin{aligned} \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq &\ \frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] \\ &+ \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \end{aligned} \end{gathered} \] and \(\left(\!\left( {\mathring{x}}(t),\alpha(t),y(t) \right)\!\right)_g\) has transverse facets, then for \(t\geq t_0\), \[ \mathcal{R}_f(t;t_0,\left(\!\left( {\mathring{x}}(t_0),\alpha(t_0),y(t_0) \right)\!\right)_g,\mathcal{W}) \subseteq \left(\!\left( {\mathring{x}}(t),\alpha(t),y(t) \right)\!\right)_g. \]

Theorem Intuition: Geometric Interpretation of Terms

\[ \begin{gathered} \forall k, \quad x \in \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k = \{x\in\mathbb{R}^{n_x}: g(\alpha,x - {\mathring{x}}) \leq y,\, g_k(\alpha,x - {\mathring{x}}) = y_k\}, \quad w\in \mathcal{W}\implies \\ \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \end{gathered} \]

Term 1: infinitesimal change in offset due to the infinitesimal change \(U\) in the parameters

Term 2: the component of \(f\) in the direction of the normal vector \(\frac{\partial g_k}{\partial x}\) to the \(k\)-th facet at the point \(x\)

Theorem Intuition: Bound on Facets

Our inequality is reminiscent of some standard results \[ \begin{gathered} \forall k, \quad x \in \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k = \{x\in\mathbb{R}^{n_x}: g(\alpha,x - {\mathring{x}}) \leq y,\, g_k(\alpha,x - {\mathring{x}}) = y_k\}, \quad w\in \mathcal{W}\implies \\ \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \end{gathered} \]

  1. Nagumo’s theorem: \(\mathcal{S}\) is invariant if \(f(x)\) points inside \(\mathcal{S}\) for every \(x\in\partial\mathcal{S}\) (\(\partial\mathcal{S}\) is analogous to \(\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k\))
  2. Kamke condition: monotone if \(a\leq b\), \(a_k = b_k\) implies \(f_k(a) \leq f_k(b)\) (analogous to \(g_k(\alpha,x - {\mathring{x}}) = y_k\))

Statements about the flow become boundary conditions on the system vector field \(f\) in continuous-time

Technical assumption: Require transverse facet intersections, meaning \[ \left\{\frac{\partial g_k}{\partial x}(\alpha,x - {\mathring{x}}) : \forall k \text{ s.t. } g_k(\alpha,x - {\mathring{x}}) = y_k\right\} \] is a linearly independent subset of \((\mathbb{R}^{n_x})^*\).

Theorem Inequality: Computational Bounds

\[ \begin{gathered} \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \sup_{ x\in\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k,\, w\in \mathcal{W}} \left[\frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \right] \end{gathered} \] How can we obtain computationally efficient and guaranteed bounds satisfying this inequality?

  • Use interval analysis as an efficient function bounding tool
  • In many special cases (intervals, polytopes, normotopes), there are simple and accurate forms
  • Linear differential inclusions for first-order dynamics abstraction

immrax: Interval Analysis in JAX

Efficient, Scalable, and Differentiable Implementation using JAX for Python

Interval Analysis

Definition: Order, Intervals
Element-wise Order: \(x,y\in\mathbb{R}^n\), \[ x \leq y \iff \ \forall i,\, x_i \leq y_i \] Interval: \(\mathbb{IR}^n \ni [\underline{x},\overline{x}] = \{x : \underline{x}\leq x \leq \overline{x}\}\subset \mathbb{R}^n\)

Definition: Inclusion Function
Given \(f:\mathbb{R}^n\to\mathbb{R}^m\), \(\mathsf{F}=[\underline{\mathsf{F}},\overline{\mathsf{F}}]:\mathbb{IR}^n\to\mathbb{IR}^m\) is an inclusion function [7] if for every \(x\in[\underline{x},\overline{x}]\in\mathbb{IR}^n\), \[ f(x) \in [\underline{\mathsf{F}}(\underline{x},\overline{x}),\overline{\mathsf{F}}(\underline{x},\overline{x})]. \]

  • Automatable through composition: \(\mathsf{F}= \mathsf{F}_1 \circ \mathsf{F}_2\) is an inclusion function for \(f = f_1\circ f_2\)

Inclusion Primitives

Definition: Minimal Inclusion Function
The minimal inclusion function is \[ \mathsf{F}_i(\underline{x},\overline{x}) = \left[\inf_{x\in[\underline{x},\overline{x}]} f_i(x), \sup_{x\in[\underline{x},\overline{x}]} f_i(x)\right] \]

Example: Scalar Continuous Function
Let \(\mathcal{S}= \{\text{critical points}\}\cap[\underline{x},\overline{x}]\). \[ \mathsf{F}(\underline{x},\overline{x}) = \left[\min_{x\in\mathcal{S}\cup\{\underline{x},\overline{x}\}}f(x),\max_{x\in\mathcal{S}\cup\{\underline{x},\overline{x}\}}f(x)\right], \]

Example: Natural Inclusion Function
Let \(\textsf{SIN}\) be an inclusion function for \(\sin\) and \(\textsf{POW}_2\) be an inclusion function for \((\cdot)^2\). Then \[ \textsf{SIN} \circ \textsf{POW}_2 \] is an inclusion function for \(\sin((\cdot)^2)\).

Compositional strategy:

  1. Define a library of primitives with defined minimal inclusion functions
  2. Break down function into its primitive components
  3. Replace primitive components with inclusion functions
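The compositional strategy can be sketched in a few lines: a tiny library of primitives with minimal inclusion functions, composed to bound \(\sin(x^2)\). (immrax.natif automates exactly this kind of composition; the names below are ours, not the library's.)

```python
import math

def POW2(lo, hi):
    # minimal inclusion function for x -> x**2
    if lo <= 0.0 <= hi:
        return 0.0, max(lo * lo, hi * hi)
    return min(lo * lo, hi * hi), max(lo * lo, hi * hi)

def SIN(lo, hi):
    # minimal inclusion function for sin on [lo, hi]:
    # endpoint values plus any interior critical points pi/2 + k*pi
    cands = [math.sin(lo), math.sin(hi)]
    k = math.ceil((lo - math.pi / 2) / math.pi)
    while math.pi / 2 + k * math.pi <= hi:
        cands.append(math.sin(math.pi / 2 + k * math.pi))
        k += 1
    return min(cands), max(cands)

def F(lo, hi):
    # the composition SIN o POW2 is an inclusion function for sin(x**2)
    return SIN(*POW2(lo, hi))

lo, hi = F(-0.5, 0.5)   # x**2 in [0, 0.25], so sin(x**2) in [0, sin(0.25)]
```

This reproduces the bound \(0 \leq \sin(x^2) \leq \sin(0.25) \approx 0.2474\) that immrax returns later in the talk.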

Basics of JAX: A Numerical Computation Library for Python

Feature: jit(f): Just-in-time Compilation
JIT compiled version of the function

Efficient runtime execution

Feature: vmap(f): Vectorizing/Parallelizing
Vectorized function along specified axis of input array

Dispatch parallel computations to GPU when applicable

Feature: {jvp,vjp,jacfwd,grad}(f): Automatic Differentiation
Forward- or Reverse-mode automatic differentiated function

[Output] is differentiable with respect to [Input]

  • These transformations are arbitrarily composable.

Basics of JAX: A Numerical Computation Library for Python

Example Usage

import jax
import jax.numpy as jnp

def f(x):
  return jnp.sin(x**2)

f_grad = jax.grad(f)
print(f_grad(.5), 2*.5*jnp.cos(.5**2))

> 0.9689124 0.9689124

  • These transformations are arbitrarily composable.

immrax.natif: JAX Transformation for Interval Analysis

  • natif is fully compatible/composable with all other JAX transformations

Definition: immrax.Interval Class
lower and upper attribute for lower and upper bound of interval.

Feature: natif(f): Natural Inclusion Function
Traces f into a JAX program, and interprets primitives using their minimal inclusion functions.

Example: \(f(x) = \sin(x^2)\)
{ lambda ; a:f32[5]. let
    b:f32[5] = integer_pow[y=2] a
    c:f32[5] = sin b
  in (c,) }

immrax.natif: JAX Transformation for Interval Analysis

  • natif is fully compatible/composable with all other JAX transformations

Example Usage

import jax
import jax.numpy as jnp
from immrax import natif, interval

def f(x):
  return jnp.sin(x**2)

F = natif(f)
print(F(interval(-0.5, 0.5)))

> 0.0 <= x <= 0.24740396

A Basic Prototype to Bound the Theorem’s Inequality

\[ \begin{gathered} \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \sup_{ x\in\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k,\, w\in \mathcal{W}} \underbrace{\left[\frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \right]}_{=:\xi_k(x,w)} \end{gathered} \]

  1. Let \([\underline{x},\overline{x}]\) be an interval containing \(\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k\), and \(\mathcal{W}\subseteq [\underline{w},\overline{w}]\)
  2. Use immrax.natif to obtain an inclusion function \(\Xi_k:\mathbb{I}\mathbb{R}^{n_x}\times\mathbb{I}\mathbb{R}^{n_w}\to\mathbb{I}\mathbb{R}^{n_y}\) for \(\xi_k\)
  3. Since \(\xi_k(x,w) \in \Xi_k([\underline{x},\overline{x}],[\underline{w},\overline{w}])\), set \(E_k(t,{\mathring{x}},\alpha,y) = \overline{\Xi}_k([\underline{x},\overline{x}],[\underline{w},\overline{w}])\)
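A minimal one-dimensional sketch of the three steps (hand-rolled interval primitives, illustrative names rather than immrax's API) for \(\dot{x} = -x^3\), \(\dot{\alpha} = 0\), with the interval parametope \(\alpha = (-1,\ 1)^T\) so the set is \([{\mathring{x}} - y_1, {\mathring{x}} + y_2]\):

```python
# Prototype of the three-step bound for the scalar system xdot = -x**3.

def f(x):
    return -x**3

def CUBE(lo, hi):
    # minimal inclusion function for x -> x**3 (monotone increasing)
    return lo**3, hi**3

def E(xc, y):
    # Step 1: an interval [lo, hi] containing every facet of the set
    lo, hi = xc - y[0], xc + y[1]
    # Step 2: an inclusion function for f over [lo, hi] via CUBE
    c_lo, c_hi = CUBE(lo, hi)
    f_lo, f_hi = -c_hi, -c_lo            # f([lo, hi]) = -([lo, hi]**3)
    # Step 3: take the upper bound of xi_k = a_k (f(x) - f(xc)) per facet
    E_lower_facet = -(f_lo - f(xc))      # a_1 = -1: sup of -(f(x) - f(xc))
    E_upper_facet = f_hi - f(xc)         # a_2 = +1: sup of  (f(x) - f(xc))
    return [E_lower_facet, E_upper_facet]

ydots = E(0.0, [1.0, 1.0])   # bounds ydot for the set [-1, 1]
```

In one dimension each facet is a single point, so bounding \(\xi_k\) over the whole set is conservative; restricting the interval to the \(k\)-th facet (as the full theorem allows) gives tighter offsets.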

Example: Control Nudging Using Differentiable Verification

  • Controlled by a feed-forward trajectory \(u(t)\) from MPC on nominal trajectory
  • Input curve \(u(t)\) is differentiable with respect to safety violation

\[ \begin{align*} \dot{p}_x &= v \cos(\phi + \beta(u_2)), \\ \dot{p}_y &= v \sin(\phi + \beta(u_2)), \\ \dot{\phi} &=v\sin(\beta(u_2)), \\ \dot{v} &= u_1, \\ \beta(u_2) &= \arctan \left(\tfrac{1}{2}\tan(u_2)\right). \end{align*} \]

Dynamics Abstraction: Mixed Jacobian Linear Differential Inclusion

Theorem
Let \(X_1\times \cdots\times X_n\subseteq\mathbb{R}^n\) be an interval. Let \([\mathcal{M}]\subseteq\mathbb{R}^{m\times n}\) be an interval matrix satisfying \[ \frac{\partial f_i}{\partial x_j} (X_1,\dots,X_j,{\mathring{x}}_{j+1},\dots,{\mathring{x}}_{n}) \subseteq [\mathcal{M}]_{ij}. \] Then \[ f(x) - f({\mathring{x}}) \in [\mathcal{M}] (x - {\mathring{x}}), \] for every \(x\in X_1\times\cdots\times X_n\).

  • Fully automated using the immrax.mjacM transform
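A numeric check of the mixed-Jacobian theorem (immrax.mjacM automates this; the interval bookkeeping below is hand-rolled for illustration, with the partial-derivative bounds worked out by hand for a simple \(f\)):

```python
import math, random

# f(x1, x2) = (x1*x2, sin(x1) + x2), centered at xc, on the box X1 x X2.
def f(x1, x2):
    return (x1 * x2, math.sin(x1) + x2)

X1, X2 = (0.5, 1.5), (-1.0, 1.0)
xc = (1.0, 0.0)

# [M]_{ij} bounds df_i/dx_j over (X_1, ..., X_j, xc_{j+1}, ..., xc_n):
M = [
    # df1/dx1 = x2 at (X1, xc2) = {0};  df1/dx2 = x1 over (X1, X2) = X1
    [(xc[1], xc[1]), X1],
    # df2/dx1 = cos(x1), decreasing on X1;  df2/dx2 = 1
    [(math.cos(X1[1]), math.cos(X1[0])), (1.0, 1.0)],
]

def imul(iv, s):   # interval times real scalar
    a, b = iv[0] * s, iv[1] * s
    return (min(a, b), max(a, b))

def iadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

random.seed(0)
ok = True
for _ in range(1000):
    x = (random.uniform(*X1), random.uniform(*X2))
    dx = (x[0] - xc[0], x[1] - xc[1])
    df = tuple(a - b for a, b in zip(f(*x), f(*xc)))
    for i in range(2):   # check f(x) - f(xc) in [M](x - xc), row by row
        lo, hi = iadd(imul(M[i][0], dx[0]), imul(M[i][1], dx[1]))
        ok = ok and (lo - 1e-9 <= df[i] <= hi + 1e-9)
```

Note the mixed structure: the \((1,1)\) entry is the degenerate interval \(\{{\mathring{x}}_2\}\) rather than all of \(X_2\), which is exactly where the mixed Jacobian is tighter than a full Jacobian bound.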

Special Cases Addressed in Our Work

Each of the following set representations have been implemented in immrax using the mixed Jacobian linear differential inclusion.

Set Rep. Definition Connections to the Literature (\(\dot{\alpha} = 0\)) Paper
Intervals \(\{x : \underline{x}\leq x \leq \overline{x}\}\) Interval Reachability, Mixed Monotone Embeddings [9]
H-Polytopes \(\{x : \alpha(x - {\mathring{x}}) \leq y\}\) Templated H-Polytopes (bound push through faces) [10], [11], [6]
Normotopes \(\{x : \|\alpha(x - {\mathring{x}})\| \leq y\}\) Norm-Based Contraction Theory (log norms) [12], [13]

Neural Networks in the Loop

Use CROWN [14] to obtain the linear bounds \[ Cx + \underline{d}\leq \pi(x) \leq Cx + \overline{d}. \] Combine with mixed Jacobian LDI [15] to get \[ f(x,\pi(x)) - f({\mathring{x}},\pi({\mathring{x}})) \in \underbrace{([\mathcal{M}^x] + [\mathcal{M}^u]C)}_{\text{first-order interactions}}(x - {\mathring{x}}) + \underbrace{R}_{\text{remainder}} \]
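A small sketch of forming the first-order interaction matrix \([\mathcal{M}^x] + [\mathcal{M}^u]C\) with interval arithmetic; the interval entries and \(C\) below are made-up numbers for illustration (in practice \(C\) comes from CROWN and the interval matrices from the mixed-Jacobian LDI):

```python
def imul(iv, s):          # interval times real scalar
    a, b = iv[0] * s, iv[1] * s
    return (min(a, b), max(a, b))

def iadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

# [Mx]: 2x2 interval matrix, [Mu]: 2x1 interval matrix, C: 1x2 real matrix
Mx = [[(-0.1, 0.1), (0.9, 1.1)],
      [(-1.2, -0.8), (-0.1, 0.1)]]
Mu = [[(0.0, 0.0)],
      [(0.9, 1.1)]]
C = [[-2.0, -0.5]]

# first-order interactions: M = [Mx] + [Mu] C
M = [[iadd(Mx[i][j], imul(Mu[i][0], C[0][j])) for j in range(2)]
     for i in range(2)]
```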

GPU Parallelizable Verification

  • Controlled by \(4\times 100\times 100\times 2\) ReLU network imitating MPC
# Part. CPU (s) GPU (s)
\(1^4 = 1\) \(.0112\) \(.0178\)
\(2^4 = 16\) \(.143\) \(.0207\)
\(3^4 = 81\) \(.627\) \(.0306\)
\(4^4 = 256\) \(1.44\) \(.0489\)
\(5^4 = 625\) \(4.60\) \(.095\)
\(6^4 = 1296\) \(11.1\) \(.198\)

\[ \begin{align*} \dot{p}_x &= v \cos(\phi + \beta(u_2)), \\ \dot{p}_y &= v \sin(\phi + \beta(u_2)), \\ \dot{\phi} &=v\sin(\beta(u_2)), \\ \dot{v} &= u_1, \\ \beta(u_2) &= \arctan \left(\tfrac{1}{2}\tan(u_2)\right). \end{align*} \]

Training Robustly Forward Invariant Neural Networks

Problem: Invariance Training
Train the neural network \(\pi\) so that a prescribed set \(\mathcal{S}\) is robustly forward invariant.

  • Existing methods use sampling-based procedures to certify Lyapunov functions post-training [16]

Proposition
The pointwise embedding system check \[ \dot{{\mathring{x}}} = 0, \quad \dot{\alpha} = 0, \quad E({\mathring{x}},\alpha,y) \leq 0 \] implies \(\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g\) is a forward invariant set.

Proof Sketch: \(\frac{\partial g_k}{\partial x}(\alpha,x - {\mathring{x}})[f(x,w)] \leq E_k({\mathring{x}},\alpha,y) \leq 0\) (barrier-like condition)

Training Robust Neural Networks: Polytope Forward Invariance

  • Embedding output \(E\) is differentiable with respect to neural network parameters \(\pi\)

Trainable Loss \[ \mathcal{L}(\pi) = \sum_{k} (\operatorname{ReLU}(E_k + \varepsilon)) \]

induces

Simple Invariance Condition \[ E({\mathring{x}},\alpha,y) \leq 0 \]


Nonlinear Segway Model [17]

\[ \begin{aligned} \begin{bmatrix} \dot{\phi} \\ \dot{v} \\ \ddot{\phi} \end{bmatrix} = \begin{bmatrix} \dot{\phi} \\ \frac{\cos\phi(-1.8u + 11.5v+9.8\sin\phi)-10.9u+68.4v-1.2\dot{\phi}^2\sin\phi}{\cos\phi-24.7} \\ \frac{(9.3u-58.8v)\cos\phi + 38.6u - 234.5v - \sin\phi(208.3+\dot{\phi}^2\cos\phi)}{\cos^2\phi - 24.7} \end{bmatrix} \end{aligned} \]

  • Trained in \(11.4\) seconds, compared to \(2758\) seconds for sampling-based [16]
  • Robust to \(\pm2\%\) uncertainty on all parameters

Training Robust Neural Networks: Polytope Forward Invariance

  • Embedding output \(E\) is differentiable with respect to neural network parameters \(\pi\)

Trainable Loss \[ \mathcal{L}(\pi) = \sum_{k} (\operatorname{ReLU}(E_k + \varepsilon)) \]

induces

Simple Invariance Condition \[ E({\mathring{x}},\alpha,y) \leq 0 \]


Nonlinear Vehicle Platoon

\[ \dot{p}_j = v_j, \quad \dot{v}_j = \sigma(u_j)(1 + w_j), \] \(\sigma(u)\) (softmax), \(w_j\in[-0.1,0.1]\). \[ u_j = \begin{cases} \pi((x_j,\ x_{j-3} - x_{j},\ x_{j} - x_{j+3})), & j=3k \\ \pi((0,\ x_{j-1} - x_j,\ x_j - x_{j+1})), & \text{ow}, \end{cases} \]

  • Shared policy, limited information
N States Training Time (s)
4 8 6.90
10 20 57.7
16 32 170.
22 44 408.
28 56 1040

Injecting Parameter Dynamics: Adjoint of the Linearization

In the previous slides, we only considered the \(\dot{\alpha} = 0\) case.

Recall that for the linear system \(\dot{x} = Ax\), the adjoint dynamics \[ \dot{\alpha} = -\alpha A \] produced a cancellation leading to \(\dot{y} = 0\) and optimal constraints supporting the true reachable set.

For nonlinear systems \(\dot{x} = f(x)\), a natural candidate for the parameter dynamics is the adjoint of the linearization along the center trajectory: \[ \dot{\alpha} = -\alpha \frac{\partial f}{\partial x} ({\mathring{x}}) \]

  • We can incorporate this “feedback” term directly into the previously discussed embedding systems.

ARCH-COMP AINNCS Benchmarks: Efficiency and Accuracy

Benchmark Instance CORA CROWN-Reach JuliaReach NNV immrax
ACC safe-distance 3.091 2.525 0.638 26.751 0.066
AttitudeControl avoid 3.418 3.485 5.728 — 0.507
NAV robust 1.990 20.902 5.197 405.291 42.384
TORA reach-tanh 3.166 7.486 0.357 63.523 0.020
TORA reach-sigmoid 6.001 5.293 0.458 118.312 0.023

ACC (verified)

TORA reach-sigmoid (verified)

AttitudeControl (verified)

Example: Large-Scale Vehicle Platoon

  • Second-order vehicle platoon, 4 states for each vehicle
  • Lead vehicle feed-forward MPC to navigate around obstacle, followers use nonlinear feedback to track displacement from leading agent
N States Runtime (s)
3 12 0.012
6 24 0.021
9 36 0.072
12 48 0.135
15 60 0.184
18 72 0.244
  • Adding parameter dynamics \(-\alpha \frac{\partial f}{\partial x}({\mathring{x}})\) does not significantly change runtime, but drastically improves performance

With \(\dot{\alpha} = 0\)

With \(\dot{\alpha} = -\alpha \frac{\partial f}{\partial x}({\mathring{x}})\)

Example: Robot Arm

  • Existing strategies: occasionally resynthesize norm using SDP [18]
  • Our approach: differentially update \(\alpha\)

\[ \begin{aligned} \dot{q}_1 &= z_1, \quad \dot{q}_2 = z_2, \\ \dot{z}_1 &= \tfrac{1}{mq_2^2 + ML^2/3}(-2mq_2z_1z_2 - k_{d_1}z_1 + k_{p_1}(u_1 - q_1)), \\ \dot{z}_2 &= q_2 z_1^2 + \tfrac1m (- k_{d_2} z_2 + k_{p_2} (u_2 - q_2)), \end{aligned} \]

Bounding full \([\mathcal{J}]\) LDI, \(\dot{\alpha} = 0\) [18]

Our interval \([\mathcal{M}]\) LDI, \(\dot{\alpha} = 0\)

Our \([\mathcal{M}]\) LDI, \(\dot{\alpha} = -\alpha \frac{\partial f}{\partial x}({\mathring{x}})\)

Example: Reach-iLQR for Van der Pol Oscillator

Minimizing the cost \[ \Phi({\mathring{x}},\alpha,y) := -\log\det(\alpha^T\alpha / y^2) \] minimizes the volume of the set \(\left[\!\left[ {\mathring{x}},\alpha,y \right]\!\right]\).

  • Terminal set volume is differentiable with respect to parameter dynamics \(\dot{\alpha} = U\)

Idea: Use numerical optimal control to synthesize \(\dot{\alpha} = U\). \[ U(t,{\mathring{x}},\alpha,y) = -\alpha \frac{\partial f}{\partial x}({\mathring{x}}) + U_\text{ff}(t) \]

  • Use iterative dynamic programming (iLQR) to directly optimize terminal reachable set volume
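The volume cost is simple to evaluate; a minimal hand-rolled sketch (\(2\times 2\) case) confirming that shrinking the offset \(y\) decreases \(\Phi\):

```python
import math

# Phi(xc, a, y) = -log det(a^T a / y^2) for the l2 normotope
# {x : ||a (x - xc)||_2 <= y}; smaller set => smaller Phi.

def phi(a, y):
    # det(a^T a / y^2) = det(a^T a) / y**(2 n) for an n x n shape matrix
    n = len(a)
    ata = [[sum(a[k][i] * a[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]   # n = 2 here
    return -math.log(det / y ** (2 * n))

a = [[1.0, 0.2], [0.0, 1.0]]
p_half, p_one = phi(a, 0.5), phi(a, 1.0)   # halving y shrinks the ellipsoid
```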

Differential Geometric Extensions

Reachable Sets via Lie Algebra

Motivation: Systems with Lie group states, like a quadrotor with \(3\)D orientation: \[ \dot{R} = R\hat{\omega}. \] Propagating a full subset like \([\underline{R},\overline{R}] \subseteq \mathbb{R}^{3\times 3}\) is overconservative: 9 degrees of uncertainty for a 3-dimensional submanifold.

  • Choosing a local chart and applying an off-the-shelf reachable set tool warps the reachable set geometry

Idea: apply local Lie algebra equivalence.

  1. Build locally equivalent system in Lie algebra
  2. Apply existing reachability techniques
  3. Exponentiate back to the original manifold
  4. Transition between tangent spaces to avoid injectivity issues
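Step 3 can be sketched for \(SO(3)\) with the hat map and Rodrigues' formula (hand-rolled here, not immrax's API), checking that the exponentiated element lands back on the manifold, i.e., \(R^T R = I\):

```python
import math

def hat(w):
    # hat map: R^3 -> so(3)
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def expm_so3(w):
    # Rodrigues: exp(hat(w)) = I + (sin t / t) W + ((1 - cos t) / t^2) W^2
    th = math.sqrt(sum(c * c for c in w))
    W = hat(w)
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    a = math.sin(th) / th if th > 1e-12 else 1.0
    b = (1.0 - math.cos(th)) / th**2 if th > 1e-12 else 0.5
    I = [[float(i == j) for j in range(3)] for i in range(3)]
    return [[I[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)]
            for i in range(3)]

R = expm_so3([0.3, -0.2, 0.5])
# orthogonality check: max entrywise error of R^T R - I
err = max(abs(sum(R[k][i] * R[k][j] for k in range(3)) - (i == j))
          for i in range(3) for j in range(3))
```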

Reachable Sets via Lie Algebra (examples)

Coupled Oscillators

\(SO(3)\) Single Integrator

Contraction Theory on Homogeneous Manifolds

Setting: Nonlinear system \[ \dot{x} = f(x), \] where \(x\in\mathcal{M}\) is a homogeneous Riemannian manifold and \(f\in\mathfrak{X}(\mathcal{M})\) is a vector field.

  • Homogeneous manifold: there is a transitive group action \(\lambda:\mathsf{G}\times\mathcal{M}\to\mathcal{M}\)
  • Invariant metric: the Riemannian metric is invariant under this group action, e.g., \[ \left\langle\hspace{-0.2em}\left\langle v,w \right\rangle\hspace{-0.2em}\right\rangle_{\mathbb{S}^2} = \left\langle\hspace{-0.2em}\left\langle Rv,Rw \right\rangle\hspace{-0.2em}\right\rangle_{\mathbb{S}^2}, \] for the case of \(SO(3)\) acting on \(\mathbb{S}^2\).
  • We provide a simplified logarithmic norm condition for contraction.

Contraction Theory on Homogeneous Manifolds (examples)

Nonexpansive Single Integrator on \(SO(3)\)

Systems cannot contract sets containing the image of a one-parameter subgroup through the action \(\lambda\).

Example: Spheres cannot contract any set containing a great circle.

Conclusion

  1. Controlled parametric embeddings allow for efficient reachable set computations with a hypercontrol dynamically shaping the set
  2. Simple theoretical modifications (e.g., adding \(\dot{\alpha} = -\alpha \frac{\partial f}{\partial x}({\mathring{x}})\)) can drastically improve reachable set estimates for little computational cost
  3. immrax is our implementation of interval analysis using JAX for Python, supporting JIT compilation, automatic differentiation, and GPU parallelization.
  4. [output] is differentiable with respect to [input]—allows for safety filtering, training robust neural network controllers, and directly optimizing the reachable set volume

Thank you for your attention!

References

[1]
S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-Jacobi reachability: A brief overview and recent advances,” in 56th Conference on Decision and Control (CDC), 2017, pp. 2242–2253. doi: 10.1109/CDC.2017.8263977
[2]
S. Bansal and C. J. Tomlin, “Deepreach: A deep learning approach to high-dimensional reachability,” in 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021, pp. 1817–1824.
[3]
M. Althoff, G. Frehse, and A. Girard, “Set propagation techniques for reachability analysis,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 4, no. 1, pp. 369–395, 2021.
[4]
P. Varaiya, “Reach set computation using optimal control,” in Verification of Digital and Hybrid Systems, Springer, 2000, pp. 323–331.
[5]
S. M. Harwood and P. I. Barton, “Efficient polyhedral enclosures for the reachable set of nonlinear control systems,” Mathematics of Control, Signals, and Systems, vol. 28, pp. 1–33, 2016.
[6]
A. Harapanahalli and S. Coogan, “Parametric reachable sets via controlled dynamical embeddings,” arXiv preprint arXiv:2504.06955, 2025.
[7]
L. Jaulin, M. Kieffer, O. Didrit, and E. Walter, Applied interval analysis, 1st Edition. Springer, 2001.
[8]
J. Bradbury et al., “JAX: Composable transformations of Python+NumPy programs.” 2018. Available: http://github.com/jax-ml/jax
[9]
A. Harapanahalli, S. Jafarpour, and S. Coogan, “Immrax: A parallelizable and differentiable toolbox for interval analysis and mixed monotone reachability in JAX,” IFAC-PapersOnLine, vol. 58, no. 11, pp. 75–80, 2024, doi: https://doi.org/10.1016/j.ifacol.2024.07.428. Available: https://www.sciencedirect.com/science/article/pii/S2405896324005275
[10]
A. Harapanahalli and S. Coogan, “Certified robust invariant polytope training in neural controlled odes,” arXiv preprint arXiv:2408.01273, 2024.
[11]
B. Gould, A. Harapanahalli, and S. Coogan, “Automatic and scalable safety verification using interval reachability with subspace sampling,” IEEE Control Systems Letters, 2025.
[12]
A. Harapanahalli and S. Coogan, “A linear differential inclusion for contraction analysis to known trajectories,” arXiv preprint arXiv:2411.11587, 2024.
[13]
A. Harapanahalli and S. Coogan, “Efficient norm-based reachable sets via iterative dynamic programming,” arXiv preprint arXiv:2408.01273, 2025.
[14]
H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, and L. Daniel, “Efficient neural network robustness certification with general activation functions,” Advances in neural information processing systems, vol. 31, 2018.
[15]
S. Jafarpour, A. Harapanahalli, and S. Coogan, “Efficient interaction-aware interval analysis of neural network feedback loops,” IEEE Transactions on Automatic Control, 2024.
[16]
Y. Huang, I. D. J. Rodriguez, H. Zhang, Y. Shi, and Y. Yue, “FI-ODE: Certified and robust forward invariance in neural ODEs,” arXiv.org, 2023.
[17]
T. Gurriet, A. Singletary, J. Reher, L. Ciarletta, E. Feron, and A. Ames, “Towards a framework for realizable safety critical control through active set invariance,” in 2018 ACM/IEEE 9th international conference on cyber-physical systems (ICCPS), IEEE, 2018, pp. 98–106.
[18]
C. Fan, J. Kapinski, X. Jin, and S. Mitra, “Simulation-driven reachability using matrix measures,” ACM Transactions on Embedded Computing Systems (TECS), vol. 17, no. 1, pp. 1–28, 2017.
[19]
A. Harapanahalli and S. Coogan, “Efficient reachable sets on Lie groups using Lie algebra monotonicity and tangent intervals,” in 2024 IEEE 63rd Conference on Decision and Control (CDC), IEEE, 2024, pp. 695–702.
[20]
A. Harapanahalli and S. Coogan, “A global coordinate-free approach to invariant contraction on homogeneous manifolds,” in 2025 American Control Conference (ACC), IEEE, 2025, pp. 4004–4010.
[21]
K. Shen and J. K. Scott, “Rapid and accurate reachability analysis for nonlinear dynamic systems by exploiting model redundancy,” Computers & Chemical Engineering, vol. 106, pp. 596–608, 2017.

Example: Jacobian-based Inclusion Function

Definition (Jacobian-Based Inclusion Function)

Let \(f:\mathbb{R}^n\to\mathbb{R}^m\) be differentiable, with an inclusion function \(\mathsf{J}\) for its Jacobian, i.e., \(\frac{\partial f}{\partial x}(x) \in [\mathsf{J}(\underline{x},\overline{x})]\) for every \(x\in[\underline{x},\overline{x}]\), and let \({\mathring{x}}\in[\underline{x},\overline{x}]\). Then

\[ \begin{align*} \mathsf{F}(\underline{x},\overline{x}) = [\mathsf{J}(\underline{x},\overline{x})]([\underline{x},\overline{x}] - {\mathring{x}}) + f({\mathring{x}}) \end{align*} \]

is a Jacobian-based inclusion function of \(f\).

import jax
import jax.numpy as jnp
from immrax import natif

def jacif(f):
  # Natural inclusion function of the Jacobian of f
  J = natif(jax.jacfwd(f))
  def F(ix, xc):
    # [J]([x]) ([x] - xc) + f(xc)
    return J(ix) @ (ix - xc) + f(xc)
  return F
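As a concrete sanity check, the same construction can be reproduced by hand for a scalar map without the immrax/JAX dependency. The sketch below (hypothetical helper names; interval arithmetic written out explicitly for intervals stored as `(lo, hi)` pairs) bounds \(f(x) = x^2 - 2x\) on \([0,1]\):

```python
def imul(a, b):
    """Interval product [a]*[b] for intervals given as (lo, hi) pairs."""
    ps = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(ps), max(ps))

def iadd(a, b):
    """Interval sum [a]+[b]."""
    return (a[0] + b[0], a[1] + b[1])

def jac_inclusion(f, df_natural, ix, xc):
    """Scalar Jacobian-based bound: [J]([x]) * ([x] - xc) + f(xc)."""
    J = df_natural(ix)                   # interval enclosing f' over [x]
    centered = (ix[0] - xc, ix[1] - xc)  # [x] - xc
    return iadd(imul(J, centered), (f(xc), f(xc)))

# Example: f(x) = x^2 - 2x on [0, 1]. Since f'(x) = 2x - 2 is monotone,
# its natural inclusion on [lo, hi] is simply [2*lo - 2, 2*hi - 2].
f = lambda x: x**2 - 2*x
df = lambda ix: (2*ix[0] - 2, 2*ix[1] - 2)

lo, hi = jac_inclusion(f, df, (0.0, 1.0), 0.5)
# The true range of f on [0, 1] is [-1, 0]; the bound must enclose it.
assert lo <= -1 and hi >= 0
```

Here the bound evaluates to \([-1.75, 0.25]\), a valid (if conservative) enclosure of the true range \([-1, 0]\); the conservatism comes from the interval product of the Jacobian enclosure with the centered box.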

Special Case 1: Intervals (mixed monotonicity)

Let \(g(\alpha,x - {\mathring{x}}) = \alpha (x - {\mathring{x}})\), and \(\alpha = \left[\begin{smallmatrix} -\mathbf{I} \\ \mathbf{I} \end{smallmatrix}\right]\). Then \[ \left(\!\left( {\mathring{x}},\left[\begin{smallmatrix} -\mathbf{I} \\ \mathbf{I} \end{smallmatrix}\right],\left[\begin{smallmatrix} -\underline{y} \\ \overline{y} \end{smallmatrix}\right] \right)\!\right)_g = \{x : \underline{y}\leq x - {\mathring{x}}\leq \overline{y}\} = [{\mathring{x}}+ \underline{y},{\mathring{x}}+ \overline{y}] . \] If we set \(\dot{\alpha} = 0\), the Theorem’s inequality reduces to (for an upper \(k\)-th face) \[ \sup_{\substack{x\in[{\mathring{x}}+ \underline{y},{\mathring{x}}+ \overline{y}] \\ e_k^T(x - {\mathring{x}}) = \overline{y}_k}} e_k^T (f(x) - f({\mathring{x}})) = \sup_{\substack{x\in[{\mathring{x}}+ \underline{y},{\mathring{x}}+ \overline{y}] \\ x_k = {\mathring{x}}_k + \overline{y}_k}} (f_k(x) - f_k({\mathring{x}})) \leq \overline{\mathsf{F}}_k ([{\mathring{x}}+ \underline{y}_{k\leftarrow\overline{y}_k},{\mathring{x}}+ \overline{y}]) - f_k({\mathring{x}}) =: \dot{\overline{y}}_k \]

  • Easily bound \(\dot{\overline{y}}_k\) through evaluations of inclusion function \(\mathsf{F}\) on the “flattened” interval \([{\mathring{x}}+ \underline{y}_{k\leftarrow\overline{y}_k},{\mathring{x}}+ \overline{y}]\)
  • Resulting embedding system \[ \dot{{\mathring{x}}} = f({\mathring{x}}), \quad \dot{\alpha} = 0, \quad \dot{\underline{y}} = \underline{E}({\mathring{x}},\underline{y},\overline{y}), \quad \dot{\overline{y}} = \overline{E}({\mathring{x}},\underline{y},\overline{y}) \] is monotone with respect to the order \((=,=,\geq,\leq)\).
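The face-flattening computation above can be sketched in a few lines. The example below (a minimal NumPy sketch, not the immrax API; `F_linear` and `emb_rhs` are illustrative names) builds the embedding right-hand sides \(\dot{\underline{y}}, \dot{\overline{y}}\) for linear dynamics \(f(x) = Ax\), where the interval enclosure is exact:

```python
import numpy as np

A = np.array([[-2.0, 1.0], [1.0, -2.0]])   # linear dynamics f(x) = A x

def F_linear(lo, hi):
    """Tight interval enclosure of {A x : x in [lo, hi]} (exact for linear maps)."""
    lo_out = A.clip(min=0.0) @ lo + A.clip(max=0.0) @ hi
    hi_out = A.clip(min=0.0) @ hi + A.clip(max=0.0) @ lo
    return lo_out, hi_out

def emb_rhs(xc, ly, uy):
    """Embedding-system right-hand side (dot ly, dot uy), with dot alpha = 0."""
    n = len(xc)
    fc = A @ xc
    dly, duy = np.empty(n), np.empty(n)
    for k in range(n):
        # upper k-th face: flatten component k of the lower corner to uy[k]
        lo = xc + ly; lo[k] = xc[k] + uy[k]
        duy[k] = F_linear(lo, xc + uy)[1][k] - fc[k]
        # lower k-th face: flatten component k of the upper corner to ly[k]
        hi = xc + uy; hi[k] = xc[k] + ly[k]
        dly[k] = F_linear(xc + ly, hi)[0][k] - fc[k]
    return dly, duy
```

For the stable matrix above with \({\mathring{x}} = 0\) and \([\underline{y},\overline{y}] = [-\mathbf{1},\mathbf{1}]\), the lower faces move up and the upper faces move down, so the box contracts, as monotonicity of the embedding system predicts.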

Special Case 2: H-Polytopes (lifting + refinement)

Let \(g(\alpha,x - {\mathring{x}}) = \alpha(x - {\mathring{x}})\), for \(\alpha = \left[\begin{smallmatrix} -H \\ H \end{smallmatrix}\right]\), where \(H\) is a tall full rank matrix. Then \[ \left(\!\left( {\mathring{x}},\left[\begin{smallmatrix} -H \\ H \end{smallmatrix}\right],\left[\begin{smallmatrix} -\underline{y} \\ \overline{y} \end{smallmatrix}\right] \right)\!\right)_g = \{x : \underline{y}\leq H(x - {\mathring{x}}) \leq \overline{y}\}. \] If we let \(\dot{\alpha} = 0\), \(H^+\) be a left inverse of \(H\), and \(\mathcal{H}= \{Hx : x\in\mathbb{R}^{n_x}\}\), the Theorem’s inequality reduces to (for an upper face) \[ \sup_{\substack{x : H(x - {\mathring{x}})\in[\underline{y},\overline{y}] \\ H_k(x - {\mathring{x}}) = \overline{y}_k}} H_k (f(x) - f({\mathring{x}})) = \sup_{\substack{z\in\mathcal{H}\cap[\underline{y},\overline{y}] \\ z_k = \overline{y}_k}} H_k (f(H^+z + {\mathring{x}}) - f({\mathring{x}})) \leq \overline{\mathsf{G}}_k (\mathcal{I}_\mathcal{H}(H{\mathring{x}}+ [\underline{y}_{k\leftarrow\overline{y}_k},\overline{y}])) - H_k f({\mathring{x}}) =: \dot{\overline{y}}_k \]

  • \(\mathsf{G}\) is an inclusion function for the lifted map \[ g(z) = Hf(H^+z) \]
  • \(\mathcal{I}_\mathcal{H}\) is an interval refinement operator [21] satisfying \[ \mathcal{H}\cap[\underline{y},\overline{y}] \subseteq \mathcal{I}_\mathcal{H}([\underline{y},\overline{y}]) \subseteq [\underline{y},\overline{y}] \]
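The refinement idea can be illustrated concretely. Below is a minimal sketch (hypothetical names, not the immrax API) for \(H = \left[\begin{smallmatrix}1&0\\0&1\\1&1\end{smallmatrix}\right]\): every lifted point \(z = Hx\) satisfies the redundancy \(z_2 = z_0 + z_1\), which a forward-backward pass exploits to shrink \([\underline{y},\overline{y}]\) without losing any point of \(\mathcal{H}\cap[\underline{y},\overline{y}]\):

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Hp = np.linalg.pinv(H)          # a left inverse: Hp @ H = I

def refine(lo, hi):
    """One forward-backward pass exploiting z2 = z0 + z1 on range(H)."""
    lo, hi = lo.copy(), hi.copy()
    lo[2] = max(lo[2], lo[0] + lo[1]); hi[2] = min(hi[2], hi[0] + hi[1])
    lo[0] = max(lo[0], lo[2] - hi[1]); hi[0] = min(hi[0], hi[2] - lo[1])
    lo[1] = max(lo[1], lo[2] - hi[0]); hi[1] = min(hi[1], hi[2] - lo[0])
    return lo, hi

lo, hi = refine(np.array([-1.0, -1.0, -10.0]), np.array([1.0, 1.0, 10.0]))
# the third coordinate tightens from [-10, 10] to [-2, 2]
```

Both containments of the refinement property hold by construction: each update only intersects a coordinate interval with a consequence of the subspace constraint.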

Special Case 3: Norms (contraction theory)

Suppose we have a linear differential inclusion \[ f(x,w) - f({\mathring{x}},{\mathring{w}}) \in \operatorname{co}\{M^x_i\}_i (x - {\mathring{x}}) + \operatorname{co}\{M^w_j\}_j (w - {\mathring{w}}), \] then the following is a valid embedding system, \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}},{\mathring{w}}) \\ \dot{\alpha} &= U \\ \dot{y} &= \big(\max_{i} \mu(U\alpha^{-1} + \alpha M^x_i \alpha^{-1}) \big) y + \max_j \|\alpha M^w_j\|_{w\to x} \end{aligned} \]

  • \(\|A\|_{w\to x}\) is the operator norm induced by the norms on the disturbance and state spaces
  • \(\mu\) is the logarithmic norm (usually known in closed-form) \[ \mu(A) = \limsup_{h\searrow0}\tfrac{1}{h} (\|\mathbf{I}+ hA\| - 1) \]
  • \(\dot{\alpha} = 0\) recovers results from norm-based contraction theory
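For the 2-norm, the logarithmic norm has the well-known closed form \(\mu_2(A) = \lambda_{\max}\big(\tfrac{1}{2}(A + A^T)\big)\), which makes the \(\max_i \mu(\cdot)\) term in the embedding system cheap to evaluate. A minimal NumPy sketch:

```python
import numpy as np

def mu2(A):
    """Logarithmic 2-norm: largest eigenvalue of the symmetric part of A."""
    return np.linalg.eigvalsh(0.5 * (A + A.T)).max()

A = np.array([[-2.0, 1.0], [0.0, -3.0]])
# Unlike the matrix norm, the log norm can be negative, certifying contraction.
assert mu2(A) < 0
```

Similar closed forms exist for the 1-norm and \(\infty\)-norm (column/row sums with off-diagonal absolute values), which is why \(\mu\) is described above as usually known in closed form.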

Definition: Normotopes
The sublevel set of the norm \(\|\cdot\|\), shaped by \(\alpha\in\mathbb{R}^{n_x\times n_x}\): \[ \left[\!\left[ {\mathring{x}},\alpha,y \right]\!\right] := \{x\in\mathbb{R}^{n_x}: \|\alpha(x - {\mathring{x}})\| \leq y\}. \]
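Membership in a normotope is a single norm evaluation. The sketch below (illustrative names, not the immrax API) checks the definition for the Euclidean norm, where a diagonal \(\alpha\) yields an axis-aligned ellipse:

```python
import numpy as np

def in_normotope(x, xc, alpha, y, ord=2):
    """Check ||alpha (x - xc)|| <= y for the chosen vector norm."""
    return np.linalg.norm(alpha @ (x - xc), ord) <= y

xc = np.zeros(2)
alpha = np.diag([2.0, 1.0])   # for y = 1: ellipse with semi-axes 1/2 and 1
assert in_normotope(np.array([0.4, 0.5]), xc, alpha, 1.0)
assert not in_normotope(np.array([0.6, 0.0]), xc, alpha, 1.0)
```

The shape matrix \(\alpha\) plays the same role as \(H\) in the polytope case: evolving it (via \(\dot{\alpha} = U\) above) lets the set rotate and stretch with the dynamics while the scalar radius \(y\) tracks its size.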