Efficient and Differentiable Reachable Sets for Learning-Enabled Control Systems

SCALE Seminar @ UCB

Akash Harapanahalli

PhD Student

School of Electrical and Computer Engineering, Georgia Tech

2025-09-29

Introduction

Acknowledgements

Dr. Samuel Coogan
Associate Professor
Georgia Institute of Technology

Dr. Saber Jafarpour
Research Assistant Professor
University of Colorado Boulder

Brendan Gould
PhD Student
Georgia Institute of Technology

Motivation

  • Learning-enabled components are becoming increasingly intertwined with our day-to-day lives
  • There are few formal guarantees of safe operation, which is dangerous in safety-critical domains

Robustness Analysis of Neural Networks in the Loop

  • Problem setting: Continuous-time system in feedback with neural network
  • Efficient and scalable certification of neural network controlled systems
  • Trainable conditions: backpropagate information to improve robustness

Image Credit: OpenAI

Reachable Set Computation

Consider the closed-loop system \[ \dot{x} = f(x,w), \] \(x\in\mathbb{R}^{n_x}\), \(w\in\mathbb{R}^{n_w}\).

Problem: Reach-Avoid
Given uncertain initial state \(x_0\in\mathcal{X}_0\) and adversarial disturbances \(w\in\mathcal{W}\), certify that the agent avoids unsafe states \(\mathcal{O}\) for every \(t\), while reaching a goal set \(\mathcal{G}\): \[ \forall t\in[t_0,t_f]\ \mathcal{R}(t;t_0,\mathcal{X}_0,\mathcal{W}) \subseteq \mathcal{O}^\complement, \quad \mathcal{R}(t_f;t_0,\mathcal{X}_0,\mathcal{W}) \subseteq \mathcal{G}. \]

Possible use-cases:

  • If detecting potential collision with \(\mathcal{O}\), swap to a safe backup strategy, or filter the input
  • If not guaranteed to reach \(\mathcal{G}\), resynthesize input trajectory

Existing Methods

Crucial Tradeoffs: how to balance accuracy, efficiency, and guarantees?

Level Set Methods [1]

Finds a function \(g(t,x)\) whose \(0\) sublevel set \(\{x : g(t,x) \leq 0\}\) is the reachable set of the system at time \(t\).

  • Gives exact reachable set
  • Need to solve a partial differential equation. Hard to find true closed-form solutions, scalability concerns
  • Recent works approximate PDE solution using neural networks [2]

Set-Based Methods [3]

Propagates set over-approximations through the system dynamics

  • Guaranteed overapproximating reachable set
  • Simpler set representations suffer from overconservatism from wrapping effects
  • Complicated set representations suffer from algorithmic complexity

In This Talk

Simple theoretical modifications \(\implies\) accuracy benefits for little computational cost

Theory: A general framework for constraint-based reachable set propagation using a controlled parametric embedding system.

  • The number of parameters remains constant
  • Continuous-time system \(\to\) continuous-time embedding system bounding its behavior
  • Amenable to simple bounding strategies like interval analysis
  • New: A hypercontrol input differentially shaping the reachable set

Computation: An efficient implementation using our tool immrax for interval analysis in JAX for Python

  • Just-in-Time (JIT) compilation for efficient runtime computations
  • Automatic differentiation to directly backpropagate safety violations

Differential Geometry: Extensions of these ideas for safety verification of systems evolving on Lie groups and homogeneous manifolds

Parametric Reachable Sets

A theoretical framework for reachable set computation using a controlled dynamical embedding system

The Linear Systems Case: Supporting Hyperplanes

\[ \dot{x} = Ax, \] with initial condition \(x_0\) inside a halfspace \[ \alpha_0(x_0 - {\mathring{x}}_0) \leq y_0, \] \(\alpha_0\in\mathbb{R}^{1\times{n_x}}\), \({\mathring{x}}_0\in\mathbb{R}^{n_x}\), \(y_0\in\mathbb{R}\). How does the halfspace evolve?

If we set \(\dot{{\mathring{x}}} = A{\mathring{x}}\) and \(\dot{\alpha}^T = -A^T\alpha^T\), [4] \[ \tfrac{\mathrm{d}}{\mathrm{d}t} [\alpha (x - {\mathring{x}})] = -\alpha A(x - {\mathring{x}}) + \alpha A(x - {\mathring{x}}) = 0. \] Thus the trajectory of the ODE \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] yields supporting halfspace \(\{x : \alpha(t)(x - {\mathring{x}}(t)) \leq y_0\}\).
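This cancellation is easy to check numerically. The sketch below (pure Python, illustrative names only) uses the rotation generator \(A = \left[\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\right]\), whose matrix exponential has a closed form, flows \(x\), \({\mathring{x}}\), and \(\alpha\) accordingly, and confirms the support value \(\alpha(t)(x(t) - {\mathring{x}}(t))\) stays constant:

```python
import math

# Verify that with the adjoint dynamics da/dt = -a A, the support value
# a(t) (x(t) - xc(t)) is constant along trajectories of xdot = A x.

def expA(t):
    # exp(A t) for the rotation generator A = [[0, 1], [-1, 0]]
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def vecmat(v, M):
    return [sum(v[i] * M[i][j] for i in range(2)) for j in range(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x0  = [1.0, 0.0]    # an uncertain initial state
xc0 = [0.5, -0.5]   # center trajectory initial condition
a0  = [1.0, 2.0]    # initial halfspace normal (row vector)

lhs0 = dot(a0, [x0[i] - xc0[i] for i in range(2)])
for t in [0.0, 0.7, 1.3, 2.9]:
    x  = matvec(expA(t), x0)      # x(t)  = exp(At) x0
    xc = matvec(expA(t), xc0)     # xc(t) = exp(At) xc0
    a  = vecmat(a0, expA(-t))     # a(t)  = a0 exp(-At) solves da/dt = -a A
    lhs = dot(a, [x[i] - xc[i] for i in range(2)])
    assert abs(lhs - lhs0) < 1e-12
```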

The Linear Systems Case: H-Polytopes

\[ \dot{x} = Ax, \] with initial condition \(x_0\) inside an H-rep polytope \[ \alpha_0(x_0 - {\mathring{x}}_0) \leq y_0, \] \(\alpha_0\in\mathbb{R}^{{n_y}\times{n_x}}\), \({\mathring{x}}_0\in\mathbb{R}^{n_x}\), \(y_0\in\mathbb{R}^{n_y}\), \(\leq\) is element-wise. Flow each row of \(\alpha\) using the adjoint dynamics \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] then \[ \alpha(t)(x(t) - {\mathring{x}}(t)) \leq y_0. \]

The Linear Systems Case: Nonlinear Supports

\[ \dot{x} = Ax, \] with initial condition \(x_0\) satisfying \[ h(\alpha_0(x_0 - {\mathring{x}}_0)) \leq y_0. \] Flow \(\alpha\) using the adjoint dynamics \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] then we still get \[ \tfrac{\mathrm{d}}{\mathrm{d}t} h(\alpha(x - {\mathring{x}})) = \tfrac{\partial h}{\partial z} [-\alpha A(x - {\mathring{x}}) + \alpha A(x - {\mathring{x}})] = 0, \] and therefore that \[ h(\alpha(t)(x(t) - {\mathring{x}}(t))) \leq y_0. \]

The Linear Systems Case: Ellipsoids as \(\ell_2\) Normotopes

\[ \dot{x} = Ax, \] with initial condition \(x_0\) satisfying \[ \|\alpha_0(x_0 - {\mathring{x}}_0)\|_2 \leq y_0. \] Flow \(\alpha\) using the adjoint dynamics \[ \dot{{\mathring{x}}} = A{\mathring{x}}, \ \dot{\alpha} = -\alpha A, \] then \[ \|\alpha(t)(x(t) - {\mathring{x}}(t))\|_2 \leq y_0. \]

  • \(\ell_2\)-norm balls with time-varying shape matrix

Nonlinear Systems: Templated H-Polytopes

\[ \dot{x} = f(x), \] with initial condition \(x_0\) satisfying \[ \alpha_0(x_0 - {\mathring{x}}_0) \leq y_0. \] Fix \(\dot{\alpha} = 0\) and flow offsets \(y\) such that \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}}) \\ \dot{y}_k &\geq \sup_{\substack{x:\alpha(x - {\mathring{x}}) \leq y \\ \alpha_k(x - {\mathring{x}}) = y_k}} \alpha_k (f(x) - f({\mathring{x}})) \end{aligned} \] then \[ \alpha_0(x(t) - {\mathring{x}}(t)) \leq y(t) \]

Connection: Parametric Embedding System

Summary: \(x(t)\in\{x : \alpha(t)(x - {\mathring{x}}(t)) \leq y(t)\}\) if

Linear System: Evolve parameters \(\alpha\) using the adjoint dynamics \[ \begin{aligned} \dot{{\mathring{x}}} &= A{\mathring{x}}, \\ \dot{\alpha} &= -\alpha A \\ \dot{y} &= 0 \end{aligned} \] (this generalizes to nonlinear functions of \(\alpha(x - {\mathring{x}})\))

Nonlinear System: Evolve offsets \(y_k\) using bound of vector field in the direction of row \(\alpha_k\) \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}}) \\ \dot{\alpha} &= 0 \\ \dot{y}_k &\geq \sup_{\substack{x:\alpha(x - {\mathring{x}}) \leq y \\ \alpha_k(x - {\mathring{x}}) = y_k}} \alpha_k (f(x) - f({\mathring{x}})) \end{aligned} \]

Our Idea:

  • Combine \(\alpha\) evolution with offset bounding for nonlinear systems
  • Generalize to any constraint-based set representation

A Quick Preview

We can define a new class of sets as the intersection of \(\alpha\)-parameterized sublevel sets: \[ \{x : g(\alpha,x - {\mathring{x}}) \leq y\} = \bigcap_{k=1}^{{n_y}} \{x : g_k(\alpha,x - {\mathring{x}}) \leq y_k\}, \] for centering point \({\mathring{x}}\in\mathbb{R}^{n_x}\), parameters \(\alpha\in\mathbb{R}^{n_\alpha}\), offsets \(y\in\mathbb{R}^{n_y}\), and nonlinearity \(g:\mathbb{R}^{n_\alpha}\times\mathbb{R}^{n_x}\to\mathbb{R}^{n_y}\).

We will define dynamics on \({\mathring{x}},\alpha,y\) \[ \dot{{\mathring{x}}} = f({\mathring{x}}), \quad \dot{\alpha} = ??, \quad \dot{y} = ??, \] such that every trajectory of the system is contained in the evolving parameterized set: \[ x_0\in \{x : g(\alpha(t_0),x - {\mathring{x}}(t_0)) \leq y(t_0)\} \implies x(t) \in \{x : g(\alpha(t),x - {\mathring{x}}(t)) \leq y(t)\}. \]

Parametopes: General Constraint-Based Set Representation

Definition 1: Parametope
A parametope is the set \[ \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g := \{x\in\mathbb{R}^{n_x}: g(\alpha,x - {\mathring{x}}) \leq y\}, \] where \({\mathring{x}}\in\mathbb{R}^{n_x}\) (center), \(\alpha\in\mathbb{R}^{n_\alpha}\) (parameters), \(y\in\mathbb{R}^{n_y}\) (offset), and \(g:\mathbb{R}^{n_\alpha}\times\mathbb{R}^{n_x}\to\mathbb{R}^{n_y}\) (nonlinearity).

Notationally we denote the \(k\)-th facet \[ \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)^k_g := \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g \cap \{x\in\mathbb{R}^{n_x}: g_k(\alpha,x - {\mathring{x}}) = y_k\} \]

Example: Normotopes
The sublevel set of norm \(\|\cdot\|\) shaped by \(\alpha\in\mathbb{R}^{n\times n}\) \[ \left[\!\left[ {\mathring{x}},\alpha,y \right]\!\right] := \{x\in\mathbb{R}^{n_x}: \|\alpha(x - {\mathring{x}})\| \leq y\}. \]

Example: Polytopes
With \(g(\alpha,x - {\mathring{x}}) = \alpha(x - {\mathring{x}})\), the H-rep polytope \[ \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g = \{x\in\mathbb{R}^{n_x}: \alpha(x - {\mathring{x}}) \leq y\}. \]

Theory: Reachable Sets via Controlled Parametric Embeddings

Define an embedding system, \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}},{\mathring{w}}), \\ \dot{\alpha} &= U(t,{\mathring{x}},\alpha,y) \\ \dot{y} &= E(t,{\mathring{x}},\alpha,y) \end{aligned} \] evolving on \(\mathbb{R}^{n_x}\times\mathbb{R}^{n_\alpha}\times\mathbb{R}^{n_y}\), for some nominal disturbance \({\mathring{w}}(t)\).

  • \(U\) is arbitrary—we call this the hypercontrol
  • \(E\) needs to satisfy an inequality (right)

Theorem: Parametric reachable sets
Let \(({\mathring{x}}(t),\alpha(t),y(t))\) be the trajectory of the embedding system (left). If \(E\) is such that for every \(k\), \[ \begin{gathered} (x,w) \in \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k \times \mathcal{W}\implies \\ \begin{aligned} \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq &\ \frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] \\ &+ \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \end{aligned} \end{gathered} \] and \(\left(\!\left( {\mathring{x}}(t),\alpha(t),y(t) \right)\!\right)_g\) has transverse facets, then for \(t\geq t_0\), \[ \mathcal{R}_f(t;t_0,\left(\!\left( {\mathring{x}}(t_0),\alpha(t_0),y(t_0) \right)\!\right)_g,\mathcal{W}) \subseteq \left(\!\left( {\mathring{x}}(t),\alpha(t),y(t) \right)\!\right)_g. \]

Theorem Intuition: Geometric Interpretation of Terms

\[ \begin{gathered} \forall k, \quad x \in \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k = \{x\in\mathbb{R}^{n_x}: g(\alpha,x - {\mathring{x}}) \leq y,\, g_k(\alpha,x - {\mathring{x}}) = y_k\}, \quad w\in \mathcal{W}\implies \\ \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \end{gathered} \]

Term 1: infinitesimal change in offset due to the infinitesimal change \(U\) in the parameters

Term 2: the component of \(f\) in the direction of the normal vector \(\frac{\partial g_k}{\partial x}\) to the \(k\)-th facet at the point \(x\)

Theorem Intuition: Bound on Facets

Our inequality is reminiscent of some standard results \[ \begin{gathered} \forall k, \quad x \in \left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k = \{x\in\mathbb{R}^{n_x}: g(\alpha,x - {\mathring{x}}) \leq y,\, g_k(\alpha,x - {\mathring{x}}) = y_k\}, \quad w\in \mathcal{W}\implies \\ \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \end{gathered} \]

  1. Nagumo’s theorem: \(\mathcal{S}\) is invariant if \(f(x)\) points inside \(\mathcal{S}\) for every \(x\in\partial\mathcal{S}\) (\(\partial\mathcal{S}\) is analogous to \(\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k\))
  2. Kamke condition: monotone if \(a\leq b\), \(a_k = b_k\) implies \(f_k(a) \leq f_k(b)\) (analogous to \(g_k(\alpha,x - {\mathring{x}}) = y_k\))

Statements about the flow become boundary conditions on the system vector field \(f\) in continuous-time

Technical assumption: Require transverse facet intersections, meaning \[ \left\{\frac{\partial g_k}{\partial x}(\alpha,x - {\mathring{x}}) : \forall k \text{ s.t. } g_k(\alpha,x - {\mathring{x}}) = y_k\right\} \] is a linearly independent subset of \((\mathbb{R}^{n_x})^*\).

Theorem Inequality: Computational Bounds

\[ \begin{gathered} \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \sup_{ x\in\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k,\, w\in \mathcal{W}} \left[\frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \right] \end{gathered} \] How can we obtain computationally efficient and guaranteed bounds satisfying this inequality?

  • Use interval analysis as an efficient function bounding tool
  • In many special cases (intervals, polytopes, normotopes), there are simple and accurate forms
  • Linear differential inclusions for first-order dynamics abstraction

immrax: Interval Analysis in JAX

Efficient, Scalable, and Differentiable Implementation using JAX for Python

Interval Analysis

Definition: Order, Intervals
Element-wise Order: \(x,y\in\mathbb{R}^n\), \[ x \leq y \iff \ \forall i,\, x_i \leq y_i \] Interval: \(\mathbb{IR}^n \ni [\underline{x},\overline{x}] = \{x : \underline{x}\leq x \leq \overline{x}\}\subset \mathbb{R}^n\)

Definition: Inclusion Function
Given \(f:\mathbb{R}^n\to\mathbb{R}^m\), \(\mathsf{F}=[\underline{\mathsf{F}},\overline{\mathsf{F}}]:\mathbb{IR}^n\to\mathbb{IR}^m\) is an inclusion function [7] if for every \(x\in[\underline{x},\overline{x}]\in\mathbb{IR}^n\), \[ f(x) \in [\underline{\mathsf{F}}(\underline{x},\overline{x}),\overline{\mathsf{F}}(\underline{x},\overline{x})]. \]

  • Automatable through composition: \(\mathsf{F}= \mathsf{F}_1 \circ \mathsf{F}_2\) is an inclusion function for \(f = f_1\circ f_2\)

Inclusion Primitives

Definition: Minimal Inclusion Function
The minimal inclusion function is \[ \mathsf{F}_i(\underline{x},\overline{x}) = \left[\inf_{x\in[\underline{x},\overline{x}]} f_i(x), \sup_{x\in[\underline{x},\overline{x}]} f_i(x)\right] \]

Example: Scalar Continuous Function
Let \(\mathcal{S}= \{\text{critical points}\}\cap[\underline{x},\overline{x}]\). \[ \mathsf{F}(\underline{x},\overline{x}) = \left[\min_{x\in\mathcal{S}\cup\{\underline{x},\overline{x}\}}f(x),\max_{x\in\mathcal{S}\cup\{\underline{x},\overline{x}\}}f(x)\right], \]

Example: Natural Inclusion Function
Let \(\textsf{SIN}\) be an inclusion function for \(\sin\) and \(\textsf{POW}_2\) be an inclusion function for \((\cdot)^2\). Then \[ \textsf{SIN} \circ \textsf{POW}_2 \] is an inclusion function for \(\sin((\cdot)^2)\).

Compositional strategy:

  1. Define a library of primitives with defined minimal inclusion functions
  2. Break down function into its primitive components
  3. Replace primitive components with inclusion functions
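The compositional strategy can be sketched in a few lines: a tiny library of primitives with minimal inclusion functions, composed to bound \(\sin(x^2)\). (immrax.natif automates exactly this kind of composition; the names below are ours, not the library's.)

```python
import math

def POW2(lo, hi):
    # minimal inclusion function for x -> x**2
    if lo <= 0.0 <= hi:
        return 0.0, max(lo * lo, hi * hi)
    return min(lo * lo, hi * hi), max(lo * lo, hi * hi)

def SIN(lo, hi):
    # minimal inclusion function for sin on [lo, hi]:
    # endpoint values plus any interior critical points pi/2 + k*pi
    cands = [math.sin(lo), math.sin(hi)]
    k = math.ceil((lo - math.pi / 2) / math.pi)
    while math.pi / 2 + k * math.pi <= hi:
        cands.append(math.sin(math.pi / 2 + k * math.pi))
        k += 1
    return min(cands), max(cands)

def F(lo, hi):
    # the composition SIN o POW2 is an inclusion function for sin(x**2)
    return SIN(*POW2(lo, hi))

lo, hi = F(-0.5, 0.5)   # x**2 in [0, 0.25], so sin(x**2) in [0, sin(0.25)]
```

This reproduces the bound \(0 \leq \sin(x^2) \leq \sin(0.25) \approx 0.2474\) that immrax returns later in the talk.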

Basics of JAX: A Numerical Computation Library for Python

Feature: jit(f): Just-in-time Compilation
JIT compiled version of the function

Efficient runtime execution

Feature: vmap(f): Vectorizing/Parallelizing
Vectorized function along specified axis of input array

Dispatch parallel computations to GPU when applicable

Feature: {jvp,vjp,jacfwd,grad}(f): Automatic Differentiation
Forward- or Reverse-mode automatic differentiated function

[Output] is differentiable with respect to [Input]

  • These transformations are arbitrarily composable.

Basics of JAX: A Numerical Computation Library for Python

Example Usage

import jax
import jax.numpy as jnp

def f(x):
  return jnp.sin(x**2)

f_grad = jax.grad(f)
print(f_grad(.5), 2*.5*jnp.cos(.5**2))

> 0.9689124 0.9689124

  • These transformations are arbitrarily composable.

immrax.natif: JAX Transformation for Interval Analysis

  • natif is fully compatible/composable with all other JAX transformations

Definition: immrax.Interval Class
lower and upper attribute for lower and upper bound of interval.

Feature: natif(f): Natural Inclusion Function
Traces f into a JAX program, and interprets primitives using their minimal inclusion functions.

Example: \(f(x) = \sin(x^2)\)
{ lambda ; a:f32[5]. let
    b:f32[5] = integer_pow[y=2] a
    c:f32[5] = sin b
  in (c,) }

immrax.natif: JAX Transformation for Interval Analysis

  • natif is fully compatible/composable with all other JAX transformations

Example Usage

import jax
import jax.numpy as jnp
from immrax import natif, interval

def f(x):
  return jnp.sin(x**2)

F = natif(f)
print(F(interval(-0.5, 0.5)))

> 0.0 <= x <= 0.24740396

A Basic Prototype to Bound the Theorem’s Inequality

\[ \begin{gathered} \dot{y}_k = E_k(t,{\mathring{x}},\alpha,y) \geq \sup_{ x\in\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k,\, w\in \mathcal{W}} \underbrace{\left[\frac{\partial g_k}{\partial \alpha}(\alpha,x - {\mathring{x}}) [U(t,{\mathring{x}},\alpha,y)] + \frac{\partial g_k}{\partial x} (\alpha,x - {\mathring{x}}) [f(x,w) - f({\mathring{x}},{\mathring{w}})] \right]}_{=:\xi_k(x,w)} \end{gathered} \]

  1. Let \([\underline{x},\overline{x}]\) be an interval containing \(\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g^k\), and \(\mathcal{W}\subseteq [\underline{w},\overline{w}]\)
  2. Use immrax.natif to obtain an inclusion function \(\Xi_k:\mathbb{I}\mathbb{R}^{n_x}\times\mathbb{I}\mathbb{R}^{n_w}\to\mathbb{I}\mathbb{R}^{n_y}\) for \(\xi_k\)
  3. Since \(\xi_k(x,w) \in \Xi_k([\underline{x},\overline{x}],[\underline{w},\overline{w}])\), set \(E_k(t,{\mathring{x}},\alpha,y) = \overline{\Xi}_k([\underline{x},\overline{x}],[\underline{w},\overline{w}])\)
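A minimal one-dimensional sketch of the three steps (hand-rolled interval primitives, illustrative names rather than immrax's API) for \(\dot{x} = -x^3\), \(\dot{\alpha} = 0\), with the interval parametope \(\alpha = (-1,\ 1)^T\) so the set is \([{\mathring{x}} - y_1, {\mathring{x}} + y_2]\):

```python
# Prototype of the three-step bound for the scalar system xdot = -x**3.

def f(x):
    return -x**3

def CUBE(lo, hi):
    # minimal inclusion function for x -> x**3 (monotone increasing)
    return lo**3, hi**3

def E(xc, y):
    # Step 1: an interval [lo, hi] containing every facet of the set
    lo, hi = xc - y[0], xc + y[1]
    # Step 2: an inclusion function for f over [lo, hi] via CUBE
    c_lo, c_hi = CUBE(lo, hi)
    f_lo, f_hi = -c_hi, -c_lo            # f([lo, hi]) = -([lo, hi]**3)
    # Step 3: take the upper bound of xi_k = a_k (f(x) - f(xc)) per facet
    E_lower_facet = -(f_lo - f(xc))      # a_1 = -1: sup of -(f(x) - f(xc))
    E_upper_facet = f_hi - f(xc)         # a_2 = +1: sup of  (f(x) - f(xc))
    return [E_lower_facet, E_upper_facet]

ydots = E(0.0, [1.0, 1.0])   # bounds ydot for the set [-1, 1]
```

In one dimension each facet is a single point, so bounding \(\xi_k\) over the whole set is conservative; restricting the interval to the \(k\)-th facet (as the full theorem allows) gives tighter offsets.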

Example: Control Nudging Using Differentiable Verification

  • Controlled by a feed-forward trajectory \(u(t)\) from MPC on nominal trajectory
  • Input curve \(u(t)\) is differentiable with respect to safety violation

\[ \begin{align*} \dot{p}_x &= v \cos(\phi + \beta(u_2)), \\ \dot{p}_y &= v \sin(\phi + \beta(u_2)), \\ \dot{\phi} &=v\sin(\beta(u_2)), \\ \dot{v} &= u_1, \\ \beta(u_2) &= \arctan \left(\tfrac{1}{2}\tan(u_2)\right). \end{align*} \]

Dynamics Abstraction: Mixed Jacobian Linear Differential Inclusion

Theorem
Let \(X_1\times \cdots\times X_n\subseteq\mathbb{R}^n\) be an interval. Let \([\mathcal{M}]\subseteq\mathbb{R}^{m\times n}\) be an interval matrix satisfying \[ \frac{\partial f_i}{\partial x_j} (X_1,\dots,X_j,{\mathring{x}}_{j+1},\dots,{\mathring{x}}_{n}) \subseteq [\mathcal{M}]_{ij}. \] Then \[ f(x) - f({\mathring{x}}) \in [\mathcal{M}] (x - {\mathring{x}}), \] for every \(x\in X_1\times\cdots\times X_n\).

  • Fully automated using the immrax.mjacM transform
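A numeric check of the mixed-Jacobian theorem (immrax.mjacM automates this; the interval bookkeeping below is hand-rolled for illustration, with the partial-derivative bounds worked out by hand for a simple \(f\)):

```python
import math, random

# f(x1, x2) = (x1*x2, sin(x1) + x2), centered at xc, on the box X1 x X2.
def f(x1, x2):
    return (x1 * x2, math.sin(x1) + x2)

X1, X2 = (0.5, 1.5), (-1.0, 1.0)
xc = (1.0, 0.0)

# [M]_{ij} bounds df_i/dx_j over (X_1, ..., X_j, xc_{j+1}, ..., xc_n):
M = [
    # df1/dx1 = x2 at (X1, xc2) = {0};  df1/dx2 = x1 over (X1, X2) = X1
    [(xc[1], xc[1]), X1],
    # df2/dx1 = cos(x1), decreasing on X1;  df2/dx2 = 1
    [(math.cos(X1[1]), math.cos(X1[0])), (1.0, 1.0)],
]

def imul(iv, s):   # interval times real scalar
    a, b = iv[0] * s, iv[1] * s
    return (min(a, b), max(a, b))

def iadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

random.seed(0)
ok = True
for _ in range(1000):
    x = (random.uniform(*X1), random.uniform(*X2))
    dx = (x[0] - xc[0], x[1] - xc[1])
    df = tuple(a - b for a, b in zip(f(*x), f(*xc)))
    for i in range(2):   # check f(x) - f(xc) in [M](x - xc), row by row
        lo, hi = iadd(imul(M[i][0], dx[0]), imul(M[i][1], dx[1]))
        ok = ok and (lo - 1e-9 <= df[i] <= hi + 1e-9)
```

Note the mixed structure: the \((1,1)\) entry is the degenerate interval \(\{{\mathring{x}}_2\}\) rather than all of \(X_2\), which is exactly where the mixed Jacobian is tighter than a full Jacobian bound.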

Special Cases Addressed in Our Work

Each of the following set representations have been implemented in immrax using the mixed Jacobian linear differential inclusion.

Set Rep. Definition Connections to the Literature (\(\dot{\alpha} = 0\)) Paper
Intervals \(\{x : \underline{x}\leq x \leq \overline{x}\}\) Interval Reachability, Mixed Monotone Embeddings [9]
H-Polytopes \(\{x : \alpha(x - {\mathring{x}}) \leq y\}\) Templated H-Polytopes (bound push through faces) [10], [11], [6]
Normotopes \(\{x : \|\alpha(x - {\mathring{x}})\| \leq y\}\) Norm-Based Contraction Theory (log norms) [12], [13]

Neural Networks in the Loop

Use CROWN [14] to obtain the linear bounds \[ Cx + \underline{d}\leq \pi(x) \leq Cx + \overline{d}. \] Combine with mixed Jacobian LDI [15] to get \[ f(x,\pi(x)) - f({\mathring{x}},\pi({\mathring{x}})) \in \underbrace{([\mathcal{M}^x] + [\mathcal{M}^u]C)}_{\text{first-order interactions}}(x - {\mathring{x}}) + \underbrace{R}_{\text{remainder}} \]
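A small sketch of forming the first-order interaction matrix \([\mathcal{M}^x] + [\mathcal{M}^u]C\) with interval arithmetic; the interval entries and \(C\) below are made-up numbers for illustration (in practice \(C\) comes from CROWN and the interval matrices from the mixed-Jacobian LDI):

```python
def imul(iv, s):          # interval times real scalar
    a, b = iv[0] * s, iv[1] * s
    return (min(a, b), max(a, b))

def iadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

# [Mx]: 2x2 interval matrix, [Mu]: 2x1 interval matrix, C: 1x2 real matrix
Mx = [[(-0.1, 0.1), (0.9, 1.1)],
      [(-1.2, -0.8), (-0.1, 0.1)]]
Mu = [[(0.0, 0.0)],
      [(0.9, 1.1)]]
C = [[-2.0, -0.5]]

# first-order interactions: M = [Mx] + [Mu] C
M = [[iadd(Mx[i][j], imul(Mu[i][0], C[0][j])) for j in range(2)]
     for i in range(2)]
```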

GPU Parallelizable Verification

  • Controlled by \(4\times 100\times 100\times 2\) ReLU network imitating MPC
# Part. CPU (s) GPU (s)
\(1^4 = 1\) \(.0112\) \(.0178\)
\(2^4 = 16\) \(.143\) \(.0207\)
\(3^4 = 81\) \(.627\) \(.0306\)
\(4^4 = 256\) \(1.44\) \(.0489\)
\(5^4 = 625\) \(4.60\) \(.095\)
\(6^4 = 1296\) \(11.1\) \(.198\)

\[ \begin{align*} \dot{p}_x &= v \cos(\phi + \beta(u_2)), \\ \dot{p}_y &= v \sin(\phi + \beta(u_2)), \\ \dot{\phi} &=v\sin(\beta(u_2)), \\ \dot{v} &= u_1, \\ \beta(u_2) &= \arctan \left(\tfrac{1}{2}\tan(u_2)\right). \end{align*} \]

Training Robustly Forward Invariant Neural Networks

Problem: Invariance Training
Train the neural network \(\pi\) so that a prescribed set \(\mathcal{S}\) is robustly forward invariant.

  • Existing methods use sampling-based procedures to certify Lyapunov functions post-training [16]

Proposition
The pointwise embedding system check \[ \dot{{\mathring{x}}} = 0, \quad \dot{\alpha} = 0, \quad E({\mathring{x}},\alpha,y) \leq 0 \] implies \(\left(\!\left( {\mathring{x}},\alpha,y \right)\!\right)_g\) is a forward invariant set.

Proof Sketch: \(\frac{\partial g_k}{\partial x}(\alpha,x - {\mathring{x}})[f(x,w)] \leq E_k({\mathring{x}},\alpha,y) \leq 0\) (barrier-like condition)

Training Robust Neural Networks: Polytope Forward Invariance

  • Embedding output \(E\) is differentiable with respect to neural network parameters \(\pi\)

Trainable Loss \[ \mathcal{L}(\pi) = \sum_{k} (\operatorname{ReLU}(E_k + \varepsilon)) \]

induces

Simple Invariance Condition \[ E({\mathring{x}},\alpha,y) \leq 0 \]


Nonlinear Segway Model [17]

\[ \begin{aligned} \begin{bmatrix} \dot{\phi} \\ \dot{v} \\ \ddot{\phi} \end{bmatrix} = \begin{bmatrix} \dot{\phi} \\ \frac{\cos\phi(-1.8u + 11.5v+9.8\sin\phi)-10.9u+68.4v-1.2\dot{\phi}^2\sin\phi}{\cos\phi-24.7} \\ \frac{(9.3u-58.8v)\cos\phi + 38.6u - 234.5v - \sin\phi(208.3+\dot{\phi}^2\cos\phi)}{\cos^2\phi - 24.7} \end{bmatrix} \end{aligned} \]

  • Trained in \(11.4\) seconds, compared to \(2758\) seconds for sampling-based [16]
  • Robust to \(\pm2\%\) uncertainty on all parameters

Training Robust Neural Networks: Polytope Forward Invariance

  • Embedding output \(E\) is differentiable with respect to neural network parameters \(\pi\)

Trainable Loss \[ \mathcal{L}(\pi) = \sum_{k} (\operatorname{ReLU}(E_k + \varepsilon)) \]

induces

Simple Invariance Condition \[ E({\mathring{x}},\alpha,y) \leq 0 \]


Nonlinear Vehicle Platoon

\[ \dot{p}_j = v_j, \quad \dot{v}_j = \sigma(u_j)(1 + w_j), \] \(\sigma(u)\) (softmax), \(w_j\in[-0.1,0.1]\). \[ u_j = \begin{cases} \pi((x_j,\ x_{j-3} - x_{j},\ x_{j} - x_{j+3})), & j=3k \\ \pi((0,\ x_{j-1} - x_j,\ x_j - x_{j+1})), & \text{ow}, \end{cases} \]

  • Shared policy, limited information
N States Training Time (s)
4 8 6.90
10 20 57.7
16 32 170.
22 44 408.
28 56 1040

Injecting Parameter Dynamics: Adjoint of the Linearization

In the previous slides, we only considered the \(\dot{\alpha} = 0\) case.

Recall that for the linear system \(\dot{x} = Ax\), the adjoint dynamics \[ \dot{\alpha} = -\alpha A \] produced a cancellation leading to \(\dot{y} = 0\) and optimal constraints supporting the true reachable set.

For nonlinear systems \(\dot{x} = f(x)\), a natural candidate for the parameter dynamics is the adjoint of the linearization along the center trajectory: \[ \dot{\alpha} = -\alpha \frac{\partial f}{\partial x} ({\mathring{x}}) \]

  • We can incorporate this “feedback” term directly into the previously discussed embedding systems.

ARCH-COMP AINNCS Benchmarks: Efficiency and Accuracy

Benchmark Instance CORA CROWN-Reach JuliaReach NNV immrax
ACC safe-distance 3.091 2.525 0.638 26.751 0.066
AttitudeControl avoid 3.418 3.485 5.728 — 0.507
NAV robust 1.990 20.902 5.197 405.291 42.384
TORA reach-tanh 3.166 7.486 0.357 63.523 0.020
TORA reach-sigmoid 6.001 5.293 0.458 118.312 0.023

ACC (verified)

TORA reach-sigmoid (verified)

AttitudeControl (verified)

Example: Large-Scale Vehicle Platoon

  • Second-order vehicle platoon, 4 states for each vehicle
  • Lead vehicle feed-forward MPC to navigate around obstacle, followers use nonlinear feedback to track displacement from leading agent
N States Runtime (s)
3 12 0.012
6 24 0.021
9 36 0.072
12 48 0.135
15 60 0.184
18 72 0.244
  • Adding parameter dynamics \(-\alpha \frac{\partial f}{\partial x}({\mathring{x}})\) does not significantly change runtime, but drastically improves performance

With \(\dot{\alpha} = 0\)

With \(\dot{\alpha} = -\alpha \frac{\partial f}{\partial x}({\mathring{x}})\)

Example: Robot Arm

  • Existing strategies: occasionally resynthesize norm using SDP [18]
  • Our approach: differentially update \(\alpha\)

\[ \begin{aligned} \dot{q}_1 &= z_1, \quad \dot{q}_2 = z_2, \\ \dot{z}_1 &= \tfrac{1}{mq_2^2 + ML^2/3}(-2mq_2z_1z_2 - k_{d_1}z_1 + k_{p_1}(u_1 - q_1)), \\ \dot{z}_2 &= q_2 z_1^2 + \tfrac1m (- k_{d_2} z_2 + k_{p_2} (u_2 - q_2)), \end{aligned} \]

Bounding full \([\mathcal{J}]\) LDI, \(\dot{\alpha} = 0\) [18]

Our interval \([\mathcal{M}]\) LDI, \(\dot{\alpha} = 0\)

Our \([\mathcal{M}]\) LDI, \(\dot{\alpha} = -\alpha \frac{\partial f}{\partial x}({\mathring{x}})\)

Example: Reach-iLQR for Van der Pol Oscillator

Minimizing the cost \[ \Phi({\mathring{x}},\alpha,y) := -\log\det(\alpha^T\alpha / y^2) \] minimizes the volume of the set \(\left[\!\left[ {\mathring{x}},\alpha,y \right]\!\right]\).

  • Terminal set volume is differentiable with respect to parameter dynamics \(\dot{\alpha} = U\)

Idea: Use numerical optimal control to synthesize \(\dot{\alpha} = U\). \[ U(t,{\mathring{x}},\alpha,y) = -\alpha \frac{\partial f}{\partial x}({\mathring{x}}) + U_\text{ff}(t) \]

  • Use iterative dynamic programming (iLQR) to directly optimize terminal reachable set volume
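The volume cost is simple to evaluate; a minimal hand-rolled sketch (\(2\times 2\) case) confirming that shrinking the offset \(y\) decreases \(\Phi\):

```python
import math

# Phi(xc, a, y) = -log det(a^T a / y^2) for the l2 normotope
# {x : ||a (x - xc)||_2 <= y}; smaller set => smaller Phi.

def phi(a, y):
    # det(a^T a / y^2) = det(a^T a) / y**(2 n) for an n x n shape matrix
    n = len(a)
    ata = [[sum(a[k][i] * a[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]   # n = 2 here
    return -math.log(det / y ** (2 * n))

a = [[1.0, 0.2], [0.0, 1.0]]
p_half, p_one = phi(a, 0.5), phi(a, 1.0)   # halving y shrinks the ellipsoid
```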

Differential Geometric Extensions

Reachable Sets via Lie Algebra

Motivation: Systems with Lie group states, like a quadrotor with \(3\)D orientation: \[ \dot{R} = R\hat{\omega}. \] Propagating a full subset like \([\underline{R},\overline{R}] \subseteq \mathbb{R}^{3\times 3}\) is overconservative: 9 degrees of uncertainty for a 3-dimensional submanifold.

  • Choosing a local chart and applying an off-the-shelf reachable set tool warps the reachable set geometry

Idea: apply local Lie algebra equivalence.

  1. Build locally equivalent system in Lie algebra
  2. Apply existing reachability techniques
  3. Exponentiate back to the original manifold
  4. Transition between tangent spaces to avoid injectivity issues
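Step 3 can be sketched for \(SO(3)\) with the hat map and Rodrigues' formula (hand-rolled here, not immrax's API), checking that the exponentiated element lands back on the manifold, i.e., \(R^T R = I\):

```python
import math

def hat(w):
    # hat map: R^3 -> so(3)
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def expm_so3(w):
    # Rodrigues: exp(hat(w)) = I + (sin t / t) W + ((1 - cos t) / t^2) W^2
    th = math.sqrt(sum(c * c for c in w))
    W = hat(w)
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    a = math.sin(th) / th if th > 1e-12 else 1.0
    b = (1.0 - math.cos(th)) / th**2 if th > 1e-12 else 0.5
    I = [[float(i == j) for j in range(3)] for i in range(3)]
    return [[I[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)]
            for i in range(3)]

R = expm_so3([0.3, -0.2, 0.5])
# orthogonality check: max entrywise error of R^T R - I
err = max(abs(sum(R[k][i] * R[k][j] for k in range(3)) - (i == j))
          for i in range(3) for j in range(3))
```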

Reachable Sets via Lie Algebra (examples)

Coupled Oscillators

\(SO(3)\) Single Integrator

Contraction Theory on Homogeneous Manifolds

Setting: Nonlinear system \[ \dot{x} = f(x), \] where \(x\in\mathcal{M}\) is a homogeneous Riemannian manifold and \(f\in\mathfrak{X}(\mathcal{M})\) is a vector field.

  • Homogeneous manifold: there is a transitive group action \(\lambda:\mathsf{G}\times\mathcal{M}\to\mathcal{M}\)
  • Invariant metric: the Riemannian metric is invariant under this group action, e.g., \[ \left\langle\hspace{-0.2em}\left\langle v,w \right\rangle\hspace{-0.2em}\right\rangle_{\mathbb{S}^2} = \left\langle\hspace{-0.2em}\left\langle Rv,Rw \right\rangle\hspace{-0.2em}\right\rangle_{\mathbb{S}^2}, \] for the case of \(SO(3)\) acting on \(\mathbb{S}^2\).
  • We provide a simplified logarithmic norm condition for contraction.

Contraction Theory on Homogeneous Manifolds (examples)

Nonexpansive Single Integrator on \(SO(3)\)

Systems cannot contract sets containing the image of a one-parameter subgroup through the action \(\lambda\).

Example: Spheres cannot contract any set containing a great circle.

Conclusion

  1. Controlled parametric embeddings allow for efficient reachable set computations with a hypercontrol dynamically shaping the set
  2. Simple theoretical modifications (e.g., adding \(\dot{\alpha} = -\alpha \frac{\partial f}{\partial x}({\mathring{x}})\)) can drastically improve reachable set estimates for little computational cost
  3. immrax is our implementation of interval analysis using JAX for Python, supporting JIT compilation, automatic differentiation, and GPU parallelization.
  4. [output] is differentiable with respect to [input]—allows for safety filtering, training robust neural network controllers, and directly optimizing the reachable set volume

Thank you for your attention!

References

[1]
S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-Jacobi reachability: A brief overview and recent advances,” in 56th Conference on Decision and Control (CDC), 2017, pp. 2242–2253. doi: 10.1109/CDC.2017.8263977
[2]
S. Bansal and C. J. Tomlin, “Deepreach: A deep learning approach to high-dimensional reachability,” in 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021, pp. 1817–1824.
[3]
M. Althoff, G. Frehse, and A. Girard, “Set propagation techniques for reachability analysis,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 4, no. 1, pp. 369–395, 2021.
[4]
P. Varaiya, “Reach set computation using optimal control,” in Verification of Digital and Hybrid Systems, Springer, 2000, pp. 323–331.
[5]
S. M. Harwood and P. I. Barton, “Efficient polyhedral enclosures for the reachable set of nonlinear control systems,” Mathematics of Control, Signals, and Systems, vol. 28, pp. 1–33, 2016.
[6]
A. Harapanahalli and S. Coogan, “Parametric reachable sets via controlled dynamical embeddings,” arXiv preprint arXiv:2504.06955, 2025.
[7]
L. Jaulin, M. Kieffer, O. Didrit, and E. Walter, Applied interval analysis, 1st Edition. Springer, 2001.
[8]
J. Bradbury et al., “JAX: Composable transformations of Python+NumPy programs.” 2018. Available: http://github.com/jax-ml/jax
[9]
A. Harapanahalli, S. Jafarpour, and S. Coogan, “Immrax: A parallelizable and differentiable toolbox for interval analysis and mixed monotone reachability in JAX,” IFAC-PapersOnLine, vol. 58, no. 11, pp. 75–80, 2024, doi: https://doi.org/10.1016/j.ifacol.2024.07.428. Available: https://www.sciencedirect.com/science/article/pii/S2405896324005275
[10]
A. Harapanahalli and S. Coogan, “Certified robust invariant polytope training in neural controlled odes,” arXiv preprint arXiv:2408.01273, 2024.
[11]
B. Gould, A. Harapanahalli, and S. Coogan, “Automatic and scalable safety verification using interval reachability with subspace sampling,” IEEE Control Systems Letters, 2025.
[12]
A. Harapanahalli and S. Coogan, “A linear differential inclusion for contraction analysis to known trajectories,” arXiv preprint arXiv:2411.11587, 2024.
[13]
A. Harapanahalli and S. Coogan, “Efficient norm-based reachable sets via iterative dynamic programming,” arXiv preprint arXiv:2408.01273, 2025.
[14]
H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, and L. Daniel, “Efficient neural network robustness certification with general activation functions,” Advances in neural information processing systems, vol. 31, 2018.
[15]
S. Jafarpour, A. Harapanahalli, and S. Coogan, “Efficient interaction-aware interval analysis of neural network feedback loops,” IEEE Transactions on Automatic Control, 2024.
[16]
Y. Huang, I. D. J. Rodriguez, H. Zhang, Y. Shi, and Y. Yue, “FI-ODE: Certified and robust forward invariance in neural ODEs,” arXiv.org, 2023.
[17]
T. Gurriet, A. Singletary, J. Reher, L. Ciarletta, E. Feron, and A. Ames, “Towards a framework for realizable safety critical control through active set invariance,” in 2018 ACM/IEEE 9th international conference on cyber-physical systems (ICCPS), IEEE, 2018, pp. 98–106.
[18]
C. Fan, J. Kapinski, X. Jin, and S. Mitra, “Simulation-driven reachability using matrix measures,” ACM Transactions on Embedded Computing Systems (TECS), vol. 17, no. 1, pp. 1–28, 2017.
[19]
A. Harapanahalli and S. Coogan, “Efficient reachable sets on Lie groups using Lie algebra monotonicity and tangent intervals,” in 2024 IEEE 63rd Conference on Decision and Control (CDC), IEEE, 2024, pp. 695–702.
[20]
A. Harapanahalli and S. Coogan, “A global coordinate-free approach to invariant contraction on homogeneous manifolds,” in 2025 American Control Conference (ACC), IEEE, 2025, pp. 4004–4010.
[21]
K. Shen and J. K. Scott, “Rapid and accurate reachability analysis for nonlinear dynamic systems by exploiting model redundancy,” Computers & Chemical Engineering, vol. 106, pp. 596–608, 2017.

Example: Jacobian-based Inclusion Function

Definition (Jacobian-Based Inclusion Function)

Let \(f:\mathbb{R}^n\to\mathbb{R}^m\) be differentiable, with an inclusion function \(\mathsf{J}\) for its Jacobian, i.e., \(\frac{\partial f}{\partial x}(x) \in [\mathsf{J}(\underline{x},\overline{x})]\) for every \(x\in[\underline{x},\overline{x}]\), and let \({\mathring{x}}\in[\underline{x},\overline{x}]\). Then

\[ \begin{align*} \mathsf{F}(\underline{x},\overline{x}) = [\mathsf{J}(\underline{x},\overline{x})]([\underline{x},\overline{x}] - {\mathring{x}}) + f({\mathring{x}}) \end{align*} \]

is a Jacobian-based inclusion function of \(f\).

import jax
import jax.numpy as jnp
from immrax import natif

def jacif(f):
  # Natural inclusion function of the Jacobian of f
  J = natif(jax.jacfwd(f))
  def F(ix, xc):
    # [J]([x]) ([x] - xc) + f(xc)
    return J(ix) @ (ix - xc) + f(xc)
  return F
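As a concrete sanity check, the same construction can be reproduced by hand for a scalar map without the immrax/JAX dependency. The sketch below (hypothetical helper names; interval arithmetic written out explicitly for intervals stored as `(lo, hi)` pairs) bounds \(f(x) = x^2 - 2x\) on \([0,1]\):

```python
def imul(a, b):
    """Interval product [a]*[b] for intervals given as (lo, hi) pairs."""
    ps = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(ps), max(ps))

def iadd(a, b):
    """Interval sum [a]+[b]."""
    return (a[0] + b[0], a[1] + b[1])

def jac_inclusion(f, df_natural, ix, xc):
    """Scalar Jacobian-based bound: [J]([x]) * ([x] - xc) + f(xc)."""
    J = df_natural(ix)                   # interval enclosing f' over [x]
    centered = (ix[0] - xc, ix[1] - xc)  # [x] - xc
    return iadd(imul(J, centered), (f(xc), f(xc)))

# Example: f(x) = x^2 - 2x on [0, 1]. Since f'(x) = 2x - 2 is monotone,
# its natural inclusion on [lo, hi] is simply [2*lo - 2, 2*hi - 2].
f = lambda x: x**2 - 2*x
df = lambda ix: (2*ix[0] - 2, 2*ix[1] - 2)

lo, hi = jac_inclusion(f, df, (0.0, 1.0), 0.5)
# The true range of f on [0, 1] is [-1, 0]; the bound must enclose it.
assert lo <= -1 and hi >= 0
```

Here the bound evaluates to \([-1.75, 0.25]\), a valid (if conservative) enclosure of the true range \([-1, 0]\); the conservatism comes from the interval product of the Jacobian enclosure with the centered box.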

Special Case 1: Intervals (mixed monotonicity)

Let \(g(\alpha,x - {\mathring{x}}) = \alpha (x - {\mathring{x}})\), and \(\alpha = \left[\begin{smallmatrix} -\mathbf{I} \\ \mathbf{I} \end{smallmatrix}\right]\). Then \[ \left(\!\left( {\mathring{x}},\left[\begin{smallmatrix} -\mathbf{I} \\ \mathbf{I} \end{smallmatrix}\right],\left[\begin{smallmatrix} -\underline{y} \\ \overline{y} \end{smallmatrix}\right] \right)\!\right)_g = \{x : \underline{y}\leq x - {\mathring{x}}\leq \overline{y}\} = [{\mathring{x}}+ \underline{y},{\mathring{x}}+ \overline{y}] . \] If we set \(\dot{\alpha} = 0\), the Theorem’s inequality reduces to (for an upper \(k\)-th face) \[ \sup_{\substack{x\in[{\mathring{x}}+ \underline{y},{\mathring{x}}+ \overline{y}] \\ e_k^T(x - {\mathring{x}}) = \overline{y}_k}} e_k^T (f(x) - f({\mathring{x}})) = \sup_{\substack{x\in[{\mathring{x}}+ \underline{y},{\mathring{x}}+ \overline{y}] \\ x_k = {\mathring{x}}_k + \overline{y}_k}} (f_k(x) - f_k({\mathring{x}})) \leq \overline{\mathsf{F}}_k ([{\mathring{x}}+ \underline{y}_{k\leftarrow\overline{y}_k},{\mathring{x}}+ \overline{y}]) - f_k({\mathring{x}}) =: \dot{\overline{y}}_k \]

  • Easily bound \(\dot{\overline{y}}_k\) through evaluations of inclusion function \(\mathsf{F}\) on the “flattened” interval \([{\mathring{x}}+ \underline{y}_{k\leftarrow\overline{y}_k},{\mathring{x}}+ \overline{y}]\)
  • Resulting embedding system \[ \dot{{\mathring{x}}} = f({\mathring{x}}), \quad \dot{\alpha} = 0, \quad \dot{\underline{y}} = \underline{E}({\mathring{x}},\underline{y},\overline{y}), \quad \dot{\overline{y}} = \overline{E}({\mathring{x}},\underline{y},\overline{y}) \] is monotone with respect to the order \((=,=,\geq,\leq)\).
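The face-flattening computation above can be sketched in a few lines. The example below (a minimal NumPy sketch, not the immrax API; `F_linear` and `emb_rhs` are illustrative names) builds the embedding right-hand sides \(\dot{\underline{y}}, \dot{\overline{y}}\) for linear dynamics \(f(x) = Ax\), where the interval enclosure is exact:

```python
import numpy as np

A = np.array([[-2.0, 1.0], [1.0, -2.0]])   # linear dynamics f(x) = A x

def F_linear(lo, hi):
    """Tight interval enclosure of {A x : x in [lo, hi]} (exact for linear maps)."""
    lo_out = A.clip(min=0.0) @ lo + A.clip(max=0.0) @ hi
    hi_out = A.clip(min=0.0) @ hi + A.clip(max=0.0) @ lo
    return lo_out, hi_out

def emb_rhs(xc, ly, uy):
    """Embedding-system right-hand side (dot ly, dot uy), with dot alpha = 0."""
    n = len(xc)
    fc = A @ xc
    dly, duy = np.empty(n), np.empty(n)
    for k in range(n):
        # upper k-th face: flatten component k of the lower corner to uy[k]
        lo = xc + ly; lo[k] = xc[k] + uy[k]
        duy[k] = F_linear(lo, xc + uy)[1][k] - fc[k]
        # lower k-th face: flatten component k of the upper corner to ly[k]
        hi = xc + uy; hi[k] = xc[k] + ly[k]
        dly[k] = F_linear(xc + ly, hi)[0][k] - fc[k]
    return dly, duy
```

For the stable matrix above with \({\mathring{x}} = 0\) and \([\underline{y},\overline{y}] = [-\mathbf{1},\mathbf{1}]\), the lower faces move up and the upper faces move down, so the box contracts, as monotonicity of the embedding system predicts.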

Special Case 2: H-Polytopes (lifting + refinement)

Let \(g(\alpha,x - {\mathring{x}}) = \alpha(x - {\mathring{x}})\), for \(\alpha = \left[\begin{smallmatrix} -H \\ H \end{smallmatrix}\right]\), where \(H\) is a tall full rank matrix. Then \[ \left(\!\left( {\mathring{x}},\left[\begin{smallmatrix} -H \\ H \end{smallmatrix}\right],\left[\begin{smallmatrix} -\underline{y} \\ \overline{y} \end{smallmatrix}\right] \right)\!\right)_g = \{x : \underline{y}\leq H(x - {\mathring{x}}) \leq \overline{y}\}. \] If we let \(\dot{\alpha} = 0\), \(H^+\) be a left inverse of \(H\), and \(\mathcal{H}= \{Hx : x\in\mathbb{R}^{n_x}\}\), the Theorem’s inequality reduces to (for an upper face) \[ \sup_{\substack{x : H(x - {\mathring{x}})\in[\underline{y},\overline{y}] \\ H_k(x - {\mathring{x}}) = \overline{y}_k}} H_k (f(x) - f({\mathring{x}})) = \sup_{\substack{z\in\mathcal{H}\cap[\underline{y},\overline{y}] \\ z_k = \overline{y}_k}} H_k (f(H^+z + {\mathring{x}}) - f({\mathring{x}})) \leq \overline{\mathsf{G}}_k (\mathcal{I}_\mathcal{H}(H{\mathring{x}}+ [\underline{y}_{k\leftarrow\overline{y}_k},\overline{y}])) - H_k f({\mathring{x}}) =: \dot{\overline{y}}_k \]

  • \(\mathsf{G}\) is an inclusion function for the lifted map \[ g(z) = Hf(H^+z) \]
  • \(\mathcal{I}_\mathcal{H}\) is an interval refinement operator [21] satisfying \[ \mathcal{H}\cap[\underline{y},\overline{y}] \subseteq \mathcal{I}_\mathcal{H}([\underline{y},\overline{y}]) \subseteq [\underline{y},\overline{y}] \]
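The refinement idea can be illustrated concretely. Below is a minimal sketch (hypothetical names, not the immrax API) for \(H = \left[\begin{smallmatrix}1&0\\0&1\\1&1\end{smallmatrix}\right]\): every lifted point \(z = Hx\) satisfies the redundancy \(z_2 = z_0 + z_1\), which a forward-backward pass exploits to shrink \([\underline{y},\overline{y}]\) without losing any point of \(\mathcal{H}\cap[\underline{y},\overline{y}]\):

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Hp = np.linalg.pinv(H)          # a left inverse: Hp @ H = I

def refine(lo, hi):
    """One forward-backward pass exploiting z2 = z0 + z1 on range(H)."""
    lo, hi = lo.copy(), hi.copy()
    lo[2] = max(lo[2], lo[0] + lo[1]); hi[2] = min(hi[2], hi[0] + hi[1])
    lo[0] = max(lo[0], lo[2] - hi[1]); hi[0] = min(hi[0], hi[2] - lo[1])
    lo[1] = max(lo[1], lo[2] - hi[0]); hi[1] = min(hi[1], hi[2] - lo[0])
    return lo, hi

lo, hi = refine(np.array([-1.0, -1.0, -10.0]), np.array([1.0, 1.0, 10.0]))
# the third coordinate tightens from [-10, 10] to [-2, 2]
```

Both containments of the refinement property hold by construction: each update only intersects a coordinate interval with a consequence of the subspace constraint.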

Special Case 3: Norms (contraction theory)

Suppose we have a linear differential inclusion \[ f(x,w) - f({\mathring{x}},{\mathring{w}}) \in \operatorname{co}\{M^x_i\}_i (x - {\mathring{x}}) + \operatorname{co}\{M^w_j\}_j (w - {\mathring{w}}), \] then the following is a valid embedding system, \[ \begin{aligned} \dot{{\mathring{x}}} &= f({\mathring{x}},{\mathring{w}}) \\ \dot{\alpha} &= U \\ \dot{y} &= \big(\max_{i} \mu(U\alpha^{-1} + \alpha M^x_i \alpha^{-1}) \big) y + \max_j \|\alpha M^w_j\|_{w\to x} \end{aligned} \]

  • \(\|A\|_{w\to x}\) is the operator norm induced by the norms on the disturbance and state spaces
  • \(\mu\) is the logarithmic norm (usually known in closed-form) \[ \mu(A) = \limsup_{h\searrow0}\tfrac{1}{h} (\|\mathbf{I}+ hA\| - 1) \]
  • \(\dot{\alpha} = 0\) recovers results from norm-based contraction theory
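For the 2-norm, the logarithmic norm has the well-known closed form \(\mu_2(A) = \lambda_{\max}\big(\tfrac{1}{2}(A + A^T)\big)\), which makes the \(\max_i \mu(\cdot)\) term in the embedding system cheap to evaluate. A minimal NumPy sketch:

```python
import numpy as np

def mu2(A):
    """Logarithmic 2-norm: largest eigenvalue of the symmetric part of A."""
    return np.linalg.eigvalsh(0.5 * (A + A.T)).max()

A = np.array([[-2.0, 1.0], [0.0, -3.0]])
# Unlike the matrix norm, the log norm can be negative, certifying contraction.
assert mu2(A) < 0
```

Similar closed forms exist for the 1-norm and \(\infty\)-norm (column/row sums with off-diagonal absolute values), which is why \(\mu\) is described above as usually known in closed form.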

Definition: Normotopes
The sublevel set of the norm \(\|\cdot\|\), shaped by \(\alpha\in\mathbb{R}^{n_x\times n_x}\): \[ \left[\!\left[ {\mathring{x}},\alpha,y \right]\!\right] := \{x\in\mathbb{R}^{n_x}: \|\alpha(x - {\mathring{x}})\| \leq y\}. \]
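Membership in a normotope is a single norm evaluation. The sketch below (illustrative names, not the immrax API) checks the definition for the Euclidean norm, where a diagonal \(\alpha\) yields an axis-aligned ellipse:

```python
import numpy as np

def in_normotope(x, xc, alpha, y, ord=2):
    """Check ||alpha (x - xc)|| <= y for the chosen vector norm."""
    return np.linalg.norm(alpha @ (x - xc), ord) <= y

xc = np.zeros(2)
alpha = np.diag([2.0, 1.0])   # for y = 1: ellipse with semi-axes 1/2 and 1
assert in_normotope(np.array([0.4, 0.5]), xc, alpha, 1.0)
assert not in_normotope(np.array([0.6, 0.0]), xc, alpha, 1.0)
```

The shape matrix \(\alpha\) plays the same role as \(H\) in the polytope case: evolving it (via \(\dot{\alpha} = U\) above) lets the set rotate and stretch with the dynamics while the scalar radius \(y\) tracks its size.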