Computation Strategy
Many methods take a computation_strategy kwarg, which describes the checkpointing/splitting strategy.
Checkpointing is a technique in neural networks to reduce the memory consumption during backprop. Usually backprop requires you to save all the intermediate tensors created during the computation. That can be very memory intensive. Checkpointing reduces this memory consumption by dropping some of these intermediate tensors, and recreating them during the backward pass. Thus, while checkpointing can save considerable amounts of memory, it does so by incurring an additional computation cost.
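The trade-off described above can be illustrated with a toy sketch in plain Python. This is not how alan implements checkpointing; the function names, the chain of sin layers, and the segment parameter are all hypothetical and chosen only for illustration. The full version saves the input to every layer, while the checkpointed version saves only the input to each segment and recomputes the rest during the backward pass.

```python
import math

def forward_full(x, n_layers):
    """Apply sin n_layers times, saving the input to every layer."""
    saved = []
    for _ in range(n_layers):
        saved.append(x)
        x = math.sin(x)
    return x, saved

def backward_full(saved):
    """Gradient of the output w.r.t. the input, using the saved inputs."""
    grad = 1.0
    for x in reversed(saved):
        grad *= math.cos(x)  # derivative of sin at the saved input
    return grad

def forward_checkpointed(x, n_layers, segment):
    """Apply sin n_layers times, saving only the input to each segment."""
    checkpoints = []
    for i in range(n_layers):
        if i % segment == 0:
            checkpoints.append(x)
        x = math.sin(x)
    return x, checkpoints

def backward_checkpointed(checkpoints, n_layers, segment):
    """Same gradient, recomputing the dropped inputs segment by segment."""
    grad = 1.0
    recomputed = 0  # number of extra sin evaluations in the backward pass
    for c in reversed(range(len(checkpoints))):
        start = c * segment
        length = min(segment, n_layers - start)
        # Recreate the layer inputs for this segment from its checkpoint.
        x = checkpoints[c]
        inputs = []
        for _ in range(length):
            inputs.append(x)
            x = math.sin(x)
            recomputed += 1
        for x_in in reversed(inputs):
            grad *= math.cos(x_in)
    return grad, recomputed

n_layers, segment = 12, 4
_, saved = forward_full(0.7, n_layers)
_, checkpoints = forward_checkpointed(0.7, n_layers, segment)
grad_full = backward_full(saved)
grad_ckpt, recomputed = backward_checkpointed(checkpoints, n_layers, segment)
```

In this sketch the checkpointed forward pass stores one value per segment rather than one per layer, and the backward pass pays for that with one extra forward evaluation per layer.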
However, for larger models, checkpointing on its own is not sufficient. In these cases, it may be useful to split along a plate, so that we only compute part of the plate at a time (see Split below).
The computation_strategy kwarg can take three values:
- alan.no_checkpoint: all intermediate tensors are saved, potentially requiring lots of memory.
- alan.checkpoint: uses checkpointing to reduce memory consumption.
- alan.Split: uses checkpointing, and further reduces memory consumption by splitting the computation along a plate.
- class alan.Split(platename: str, split_size: int)
A class indicating how to split the computation along a plate. Always used as a computation_strategy=Split(...) keyword argument.
- Parameters:
platename (str) – The name of the plate to split.
split_size (int) – The size of each split. Note that this is the size of an individual split, not the total number of splits. That’s useful because you can set split_size so that the model fits in memory, and it should still fit in memory if the data gets bigger.
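To see why fixing the size of each split (rather than the number of splits) keeps memory bounded, consider the following plain-Python sketch. It is only an illustration of the idea, not alan's implementation: the functions log_lik, total_unsplit, and total_split are hypothetical, and the "plate" is simply a list of independent data points whose log-likelihood terms are summed.

```python
import math

def log_lik(x, mu=0.0):
    """Log-density of a unit-variance Gaussian at x."""
    return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi)

def total_unsplit(data):
    """Sum over the whole plate, holding one term per element at once."""
    terms = [log_lik(x) for x in data]
    return sum(terms), len(terms)

def total_split(data, split_size):
    """Sum over the plate in chunks of at most split_size elements."""
    total = 0.0
    peak = 0       # largest number of terms held at any one time
    n_splits = 0
    for start in range(0, len(data), split_size):
        chunk = data[start:start + split_size]
        terms = [log_lik(x) for x in chunk]
        peak = max(peak, len(terms))
        total += sum(terms)
        n_splits += 1
    return total, peak, n_splits

small = [0.1 * i for i in range(25)]
large = [0.01 * i for i in range(250)]

small_total, small_peak, small_splits = total_split(small, split_size=10)
large_total, large_peak, large_splits = total_split(large, split_size=10)
```

With the same split_size, the larger dataset is processed in more splits, but the number of terms held at once does not grow.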