pyoptmat.ode: integrating ODEs and getting back gradients

Module defining the key objects and functions to integrate ODEs and provide the sensitivity of the results with respect to the model parameters either through backpropogation AD or the adjoint method.

The key functions are pyoptmat.odeint() and pyoptmat.odeint_adjoint(). These functions both integrate a system of ODEs forward in time using one of several methods and have identical function signature. The only difference is that backward passes over the results of pyoptmat.ode.odeint() will use PyTorch backpropogation AD to return the gradients while pyoptmat.ode.odeint_adjoint() will use the adjoint method.

The two different backward pass options are seamless, the end-user can interchange them freely with other PyTorch code stacking on top of the integrated results and obtain gradient information just as if they were using AD.

In all realistic cases explored so far the adjoint method produces faster results with far less memory use and should be preferred versus the AD variant.

The methods currently available to solve ODEs are detailed below, but as a summary:

“forward_euler”: simple forward Euler explicit integration
“backward_euler”: simple backward Euler implicit integration

Both options take a block_size parameter, which sets up vectorized/parallelized time integration. This means the execution device integrates block_size time steps at once, rather than one at a time. This greatly accelerates time integration, particularly on GPUs. The optimal block_size parameter will vary based on your system and set of ODEs, but we highly recommend determining an optimal block size for your system and not leaving it set to the default value of 1.

class pyoptmat.ode.BackwardEulerScheme

Bases: object

Integration with the backward Euler method

accumulate(prev, time, y, a, grad, grad_fn)

Calculate the accumulated value given the results at each time step

Parameters:

prev (tuple of tensors) – previous results
times (tensor) – (ntime+1, nbatch) tensor of times
y (tensor) – (ntime+1, nbatch, nsize) tensor of state
a (tensor) – (ntime+1, nbatch, nsize) tensor of adjoints
grad (tensor) – (ntime+1, nbatch, nsize) tensor of gradient values
grad_fn (function) – function that takes t,y,a and returns the dot product with the model parameters

Returns:

updated gradients

Return type:

next (tuple of tensor)

form_operators(dy, yd, yJ, dt, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)

Form the residual and sparse Jacobian of the batched system

Parameters:

dy (torch.tensor) – (ntime+1, nbatch, nsize) tensor with the increments in y
yd (torch.tensor) – (ntime, nbatch, nsize) tensor giving the ODE rate
yJ (torch.tensor) – (ntime, nbatch, nsize, nsize) tensor giving the derivative of the ODE
dt (torch.tensor) – (ntime, nbatch) tensor giving the time step

Keyword Arguments:

solver (chunktime.BidiagonalOperator) – inverse method

Returns:

residual tensor of shape: (nbatch, ntime * nsize)
J (tensor.tensor): sparse Jacobian tensor of logical: size (nbatch, ntime*nsize, ntime*nsize)

Return type:

R (torch.tensor)

update_adjoint(dt, J, a_prev, grads, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)

Update the adjoint for a block of time

Parameters:

dt (torch.tensor) – block of time steps
J (torch.tensor) – block of Jacobians
a_prev (torch.tensor) – previous adjoint
grads (torch.tensor) – block of gradient values

Returns:

block of updated adjoint values

Return type:

adjoint_block (torch.tensor)

class pyoptmat.ode.FixedGridBlockSolver(func, y0, scheme=<pyoptmat.ode.BackwardEulerScheme object>, block_size=1, rtol=1e-06, atol=1e-08, miter=100, linear_solve_method='direct', direct_solve_method='thomas', direct_solve_min_size=0, adjoint_params=None, guess_type='zero', throw_on_fail=False, **kwargs)

Bases: object

Parent class of solvers which operate on a fixed grid (i.e. non-adaptive methods)

Parameters:

func (function) – function returning the time rate of change and the jacobian
y0 (torch.tensor) – initial condition

Keyword Arguments:

scheme (TimeIntegrationScheme) – time integration scheme, default is backward euler
block_size (int) – target block size
rtol (float) – relative tolerance for Newton’s method
atol (float) – absolute tolerance for Newton’s method
miter (int) – maximum number of Newton iterations
linear_solve_method (str) – method to solve batched sparse Ax = b, options are currently “direct” or “dense”
direct_solve_method (str) – method to use for the direct solver, options are currently “thomas”, “pcr”, or “hybrid”
direct_solve_min_size (int) – minimum PCR block size for the hybrid approach
adjoint_params – parameters to track for the adjoint backward pass
guess_type (string) – strategy for initial guess, options are “zero” and “previous”
throw_on_fail (bool) – if true throw an exception if the implicit solve fails

block_update(t, t_start, y_start, func, y_guess)

Solve a block of times all at once with backward Euler

Parameters:

t (torch.tensor) – block of times
t_start (torch.tensor) – time at start of block
y_start (torch.tensor) – start of block
func (torch.nn.Module) – function to use

integrate(t, cache_adjoint=False)

Main method: actually integrate through the times t.

Parameters:

t (torch.tensor) – timesteps to report results at
n (int) – target time block size

Keyword Arguments:

cache_adjoint (bool) – store the info we’ll need for the adjoint pass, default False

Returns:

integrated results at times t

Return type:

torch.tensor

rewind(output_grad)

Rewind the adjoint to provide the dot product with each output_grad

Parameters:: output_grad (torch.tensor) – dot product tensor

class pyoptmat.ode.ForwardEulerScheme

Bases: object

Integration with the forward Euler method

accumulate(prev, time, y, a, grad, grad_fn)

Calculate the accumulated value given the results at each time step

Parameters:

prev (tuple of tensors) – previous results
times (tensor) – (ntime+1, nbatch) tensor of times
y (tensor) – (ntime+1, nbatch, nsize) tensor of state
a (tensor) – (ntime+1, nbatch, nsize) tensor of adjoints
grad (tensor) – (ntime+1, nbatch, nsize) tensor of gradient values
grad_fn (function) – function that takes t,y,a and returns the dot product with the model parameters

Returns:

updated gradients

Return type:

next (tuple of tensor)

form_operators(dy, yd, yJ, dt, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)

Form the residual and sparse Jacobian of the batched system

Parameters:

dy (torch.tensor) – (ntime+1, nbatch, nsize) tensor with the increments in y
yd (torch.tensor) – (ntime, nbatch, nsize) tensor giving the ODE rate
yJ (torch.tensor) – (ntime, nbatch, nsize, nsize) tensor giving the derivative of the ODE
dt (torch.tensor) – (ntime, nbatch) tensor giving the time step

Keyword Arguments:

solver (chunktime.BidiagonalOperator) – inverse method

Returns:

residual tensor of shape: (nbatch, ntime * nsize)
J (tensor.tensor): sparse Jacobian tensor of logical size: (nbatch, ntime*nsize, ntime*nsize)

Return type:

R (torch.tensor)

update_adjoint(dt, J, a_prev, grads, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)

Update the adjoint for a block of time

Parameters:

dt (torch.tensor) – block of time steps
J (torch.tensor) – block of Jacobians
a_prev (torch.tensor) – previous adjoint
grads (torch.tensor) – block of gradient values

Returns:

block of updated adjoint values

Return type:

adjoint_block (torch.tensor)

class pyoptmat.ode.IntegrateWithAdjoint(*args, **kwargs)

Bases: Function

Wrapper to convince the thing that it needs to use the adjoint sensitivity instead of AD

static backward(ctx, output_grad)

Parameters:

ctx – context object with state
output_grad – grads with which to dot product

static forward(ctx, solver, times, *params)

Parameters:

ctx – context object we can use to stash state
solver – ODE Solver object to use
times – times to provide output for

pyoptmat.ode.odeint(func, y0, times, method='backward-euler', extra_params=None, **kwargs)

Solve the ordinary differential equation defined by the function func in a way that will provide results capable of being differentiated with PyTorch’s AD.

The function func is typically a PyTorch module. It takes the time and state and returns the time rate of change and (optionally) the Jacobian at that time and state. Not all integration methods require the Jacobian, but it’s generally a good idea to provide it make use of the implicit integration methods.

Input variables and initial condition have shape (nbatch, nvar)

Input times have shape (ntime, nbatch)

Output history has shape (ntime, nbatch, nvar)

Parameters:

func (function) – returns tensor defining the derivative y_dot and, optionally, the jacobian as a second return value
y0 (torch.tensor) – initial conditions
times (torch.tensor) – time locations to provide solutions at

Keyword Arguments:

method (string) – integration method, currently “backward-euler” or “forward-euler”
extra_params (list of parameters) – additional parameters that need to be included in the backward pass that are not determinable via introsection of func
scheme (TimeIntegrationScheme) – time integration scheme, default is BackwardEulerScheme
block_size (int) – target block size
rtol (float) – relative tolerance for Newton’s method
atol (float) – absolute tolerance for Newton’s method
miter (int) – maximum number of Newton iterations
linear_solve_method (str) – method to solve batched sparse Ax = b, options are currently “direct” or “dense”
direct_solve_method (str) – method to use for the direct solver, options are currently “thomas”, “pcr”, or “hybrid”
direct_solve_min_size (int) – minimum PCR block size for the hybrid approach
adjoint_params – parameters to track for the adjoint backward pass
guess_type (string) – strategy for initial guess, options are “zero” and “previous”

pyoptmat.ode.odeint_adjoint(func, y0, times, method='backward-euler', extra_params=None, **kwargs)

Solve the ordinary differential equation defined by the function func in a way that will provide gradients using the adjoint trick

The function func is typically a PyTorch module. It takes the time and state and returns the time rate of change and (optionally) the Jacobian at that time and state. Not all integration methods require the Jacobian, but it’s generally a good idea to provide it make use of the implicit integration methods.

Input variables and initial condition have shape (nbatch, nvar)

Input times have shape (ntime, nbatch)

Output history has shape (ntime, nbatch, nvar)

Parameters:

func (function) – returns tensor defining the derivative y_dot and, optionally, the jacobian as a second return value
y0 (torch.tensor) – initial conditions
times (torch.tensor) – time locations to provide solutions at

Keyword Arguments:

method (string) – integration method, currently “backward-euler” or “forward-euler”
extra_params (list of parameters) – additional parameters that need to be included in the backward pass that are not determinable via introsection of func
scheme (TimeIntegrationScheme) – time integration scheme, default is BackwardEulerScheme
block_size (int) – target block size
rtol (float) – relative tolerance for Newton’s method
atol (float) – absolute tolerance for Newton’s method
miter (int) – maximum number of Newton iterations
linear_solve_method (str) – method to solve batched sparse Ax = b, options are currently “direct” or “dense”
direct_solve_method (str) – method to use for the direct solver, options are currently “thomas”, “pcr”, or “hybrid”
direct_solve_min_size (int) – minimum PCR block size for the hybrid approach
adjoint_params – parameters to track for the adjoint backward pass
guess_type (string) – strategy for initial guess, options are “zero” and “previous”