pyoptmat.ode: integrating ODEs and getting back gradients
Module defining the key objects and functions to integrate ODEs and provide the sensitivity of the results with respect to the model parameters either through backpropogation AD or the adjoint method.
The key functions are pyoptmat.odeint() and
pyoptmat.odeint_adjoint(). These functions both
integrate a system of ODEs forward in time using one of several methods
and have identical function signature. The only difference is that
backward passes over the results of pyoptmat.ode.odeint() will
use PyTorch backpropogation AD to return the gradients while
pyoptmat.ode.odeint_adjoint() will use the adjoint method.
The two different backward pass options are seamless, the end-user can interchange them freely with other PyTorch code stacking on top of the integrated results and obtain gradient information just as if they were using AD.
In all realistic cases explored so far the adjoint method produces faster results with far less memory use and should be preferred versus the AD variant.
The methods currently available to solve ODEs are detailed below, but as a summary:
“forward_euler”: simple forward Euler explicit integration
“backward_euler”: simple backward Euler implicit integration
Both options take a block_size parameter, which sets up vectorized/parallelized time integration. This means the execution device integrates block_size time steps at once, rather than one at a time. This greatly accelerates time integration, particularly on GPUs. The optimal block_size parameter will vary based on your system and set of ODEs, but we highly recommend determining an optimal block size for your system and not leaving it set to the default value of 1.
- class pyoptmat.ode.BackwardEulerScheme
Bases:
objectIntegration with the backward Euler method
- accumulate(prev, time, y, a, grad, grad_fn)
Calculate the accumulated value given the results at each time step
- Parameters:
prev (tuple of tensors) – previous results
times (tensor) – (ntime+1, nbatch) tensor of times
y (tensor) – (ntime+1, nbatch, nsize) tensor of state
a (tensor) – (ntime+1, nbatch, nsize) tensor of adjoints
grad (tensor) – (ntime+1, nbatch, nsize) tensor of gradient values
grad_fn (function) – function that takes t,y,a and returns the dot product with the model parameters
- Returns:
updated gradients
- Return type:
next (tuple of tensor)
- form_operators(dy, yd, yJ, dt, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)
Form the residual and sparse Jacobian of the batched system
- Parameters:
dy (torch.tensor) – (ntime+1, nbatch, nsize) tensor with the increments in y
yd (torch.tensor) – (ntime, nbatch, nsize) tensor giving the ODE rate
yJ (torch.tensor) – (ntime, nbatch, nsize, nsize) tensor giving the derivative of the ODE
dt (torch.tensor) – (ntime, nbatch) tensor giving the time step
- Keyword Arguments:
solver (chunktime.BidiagonalOperator) – inverse method
- Returns:
- residual tensor of shape
(nbatch, ntime * nsize)
- J (tensor.tensor): sparse Jacobian tensor of logical
size (nbatch, ntime*nsize, ntime*nsize)
- Return type:
R (torch.tensor)
- update_adjoint(dt, J, a_prev, grads, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)
Update the adjoint for a block of time
- Parameters:
dt (torch.tensor) – block of time steps
J (torch.tensor) – block of Jacobians
a_prev (torch.tensor) – previous adjoint
grads (torch.tensor) – block of gradient values
- Returns:
block of updated adjoint values
- Return type:
adjoint_block (torch.tensor)
- class pyoptmat.ode.FixedGridBlockSolver(func, y0, scheme=<pyoptmat.ode.BackwardEulerScheme object>, block_size=1, rtol=1e-06, atol=1e-08, miter=100, linear_solve_method='direct', direct_solve_method='thomas', direct_solve_min_size=0, adjoint_params=None, guess_type='zero', throw_on_fail=False, **kwargs)
Bases:
objectParent class of solvers which operate on a fixed grid (i.e. non-adaptive methods)
- Parameters:
func (function) – function returning the time rate of change and the jacobian
y0 (torch.tensor) – initial condition
- Keyword Arguments:
scheme (TimeIntegrationScheme) – time integration scheme, default is backward euler
block_size (int) – target block size
rtol (float) – relative tolerance for Newton’s method
atol (float) – absolute tolerance for Newton’s method
miter (int) – maximum number of Newton iterations
linear_solve_method (str) – method to solve batched sparse Ax = b, options are currently “direct” or “dense”
direct_solve_method (str) – method to use for the direct solver, options are currently “thomas”, “pcr”, or “hybrid”
direct_solve_min_size (int) – minimum PCR block size for the hybrid approach
adjoint_params – parameters to track for the adjoint backward pass
guess_type (string) – strategy for initial guess, options are “zero” and “previous”
throw_on_fail (bool) – if true throw an exception if the implicit solve fails
- block_update(t, t_start, y_start, func, y_guess)
Solve a block of times all at once with backward Euler
- Parameters:
t (torch.tensor) – block of times
t_start (torch.tensor) – time at start of block
y_start (torch.tensor) – start of block
func (torch.nn.Module) – function to use
- integrate(t, cache_adjoint=False)
Main method: actually integrate through the times
t.- Parameters:
t (torch.tensor) – timesteps to report results at
n (int) – target time block size
- Keyword Arguments:
cache_adjoint (bool) – store the info we’ll need for the adjoint pass, default False
- Returns:
integrated results at times
t- Return type:
torch.tensor
- rewind(output_grad)
Rewind the adjoint to provide the dot product with each output_grad
- Parameters:
output_grad (torch.tensor) – dot product tensor
- class pyoptmat.ode.ForwardEulerScheme
Bases:
objectIntegration with the forward Euler method
- accumulate(prev, time, y, a, grad, grad_fn)
Calculate the accumulated value given the results at each time step
- Parameters:
prev (tuple of tensors) – previous results
times (tensor) – (ntime+1, nbatch) tensor of times
y (tensor) – (ntime+1, nbatch, nsize) tensor of state
a (tensor) – (ntime+1, nbatch, nsize) tensor of adjoints
grad (tensor) – (ntime+1, nbatch, nsize) tensor of gradient values
grad_fn (function) – function that takes t,y,a and returns the dot product with the model parameters
- Returns:
updated gradients
- Return type:
next (tuple of tensor)
- form_operators(dy, yd, yJ, dt, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)
Form the residual and sparse Jacobian of the batched system
- Parameters:
dy (torch.tensor) – (ntime+1, nbatch, nsize) tensor with the increments in y
yd (torch.tensor) – (ntime, nbatch, nsize) tensor giving the ODE rate
yJ (torch.tensor) – (ntime, nbatch, nsize, nsize) tensor giving the derivative of the ODE
dt (torch.tensor) – (ntime, nbatch) tensor giving the time step
- Keyword Arguments:
solver (chunktime.BidiagonalOperator) – inverse method
- Returns:
- residual tensor of shape
(nbatch, ntime * nsize)
- J (tensor.tensor): sparse Jacobian tensor of logical size
(nbatch, ntime*nsize, ntime*nsize)
- Return type:
R (torch.tensor)
- update_adjoint(dt, J, a_prev, grads, solver=<class 'pyoptmat.chunktime.BidiagonalThomasFactorization'>)
Update the adjoint for a block of time
- Parameters:
dt (torch.tensor) – block of time steps
J (torch.tensor) – block of Jacobians
a_prev (torch.tensor) – previous adjoint
grads (torch.tensor) – block of gradient values
- Returns:
block of updated adjoint values
- Return type:
adjoint_block (torch.tensor)
- class pyoptmat.ode.IntegrateWithAdjoint(*args, **kwargs)
Bases:
FunctionWrapper to convince the thing that it needs to use the adjoint sensitivity instead of AD
- static backward(ctx, output_grad)
- Parameters:
ctx – context object with state
output_grad – grads with which to dot product
- static forward(ctx, solver, times, *params)
- Parameters:
ctx – context object we can use to stash state
solver – ODE Solver object to use
times – times to provide output for
- pyoptmat.ode.odeint(func, y0, times, method='backward-euler', extra_params=None, **kwargs)
Solve the ordinary differential equation defined by the function
funcin a way that will provide results capable of being differentiated with PyTorch’s AD.The function
funcis typically a PyTorch module. It takes the time and state and returns the time rate of change and (optionally) the Jacobian at that time and state. Not all integration methods require the Jacobian, but it’s generally a good idea to provide it make use of the implicit integration methods.Input variables and initial condition have shape
(nbatch, nvar)Input times have shape
(ntime, nbatch)Output history has shape
(ntime, nbatch, nvar)- Parameters:
func (function) – returns tensor defining the derivative y_dot and, optionally, the jacobian as a second return value
y0 (torch.tensor) – initial conditions
times (torch.tensor) – time locations to provide solutions at
- Keyword Arguments:
method (string) – integration method, currently “backward-euler” or “forward-euler”
extra_params (list of parameters) – additional parameters that need to be included in the backward pass that are not determinable via introsection of
funcscheme (TimeIntegrationScheme) – time integration scheme, default is BackwardEulerScheme
block_size (int) – target block size
rtol (float) – relative tolerance for Newton’s method
atol (float) – absolute tolerance for Newton’s method
miter (int) – maximum number of Newton iterations
linear_solve_method (str) – method to solve batched sparse Ax = b, options are currently “direct” or “dense”
direct_solve_method (str) – method to use for the direct solver, options are currently “thomas”, “pcr”, or “hybrid”
direct_solve_min_size (int) – minimum PCR block size for the hybrid approach
adjoint_params – parameters to track for the adjoint backward pass
guess_type (string) – strategy for initial guess, options are “zero” and “previous”
- pyoptmat.ode.odeint_adjoint(func, y0, times, method='backward-euler', extra_params=None, **kwargs)
Solve the ordinary differential equation defined by the function
funcin a way that will provide gradients using the adjoint trickThe function
funcis typically a PyTorch module. It takes the time and state and returns the time rate of change and (optionally) the Jacobian at that time and state. Not all integration methods require the Jacobian, but it’s generally a good idea to provide it make use of the implicit integration methods.Input variables and initial condition have shape
(nbatch, nvar)Input times have shape
(ntime, nbatch)Output history has shape
(ntime, nbatch, nvar)- Parameters:
func (function) – returns tensor defining the derivative y_dot and, optionally, the jacobian as a second return value
y0 (torch.tensor) – initial conditions
times (torch.tensor) – time locations to provide solutions at
- Keyword Arguments:
method (string) – integration method, currently “backward-euler” or “forward-euler”
extra_params (list of parameters) – additional parameters that need to be included in the backward pass that are not determinable via introsection of
funcscheme (TimeIntegrationScheme) – time integration scheme, default is BackwardEulerScheme
block_size (int) – target block size
rtol (float) – relative tolerance for Newton’s method
atol (float) – absolute tolerance for Newton’s method
miter (int) – maximum number of Newton iterations
linear_solve_method (str) – method to solve batched sparse Ax = b, options are currently “direct” or “dense”
direct_solve_method (str) – method to use for the direct solver, options are currently “thomas”, “pcr”, or “hybrid”
direct_solve_min_size (int) – minimum PCR block size for the hybrid approach
adjoint_params – parameters to track for the adjoint backward pass
guess_type (string) – strategy for initial guess, options are “zero” and “previous”