LSTM layer block
The Long Short-Term Memory (LSTM) Layer block is a recurrent short-term memory that preserves temporal information in the system. Unlike standard feedback loops, the LSTM can preserve information over indefinite time gaps.
| Input ports | 1 or 4 |
LSTMs are used in dynamic settings where precise timing is required.
The standard use of the LSTM, as most often found in the literature, is one or several blocks in parallel followed by a single output layer. In Synapse such a layout would look like this:
The LSTM consists of one input, three gates (not all of which need be present) and one output. In the middle there is a feedback loop (commonly known as the cell) that makes it possible for the LSTM to remember information from one time step to the next. The gates control how much new information is let in, how strongly it is remembered and how much is let out at the other end.
As a note, by removing all gates you can reduce the LSTM to act like the first (processed) tap of a gamma memory with a classic artificial neuron applied to its input.
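To make the role of the feedback loop concrete, here is a minimal sketch of the cell update in isolation; the gate values are assumed to have been computed already, and all names are illustrative rather than Synapse identifiers:

```python
# A sketch of the cell: a feedback loop whose old state is scaled by the
# forget gate and whose new input is scaled by the input gate.
def cell_update(cell_prev, new_input, input_gate, forget_gate):
    return forget_gate * cell_prev + input_gate * new_input
```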
At the gates
Each gate is like a traditional artificial neuron. It has a weighted input consisting of external connections, an optional connection to the current cell state (called a peephole) and an optional bias. It also has a sigmoidal activation function that squashes the gate activation to between 0 and 1. The gate activations are then used to weight the signals of the input, the feedback loop and the output, and the gates are called the input gate, the forget gate and the output gate respectively.
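As a sketch of what a single gate computes (assuming a logistic sigmoid; the function and parameter names are illustrative, not Synapse identifiers):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate_activation(x, w, cell_state, peephole_weight=0.0, bias=0.0):
    # Weighted external input + optional peephole + optional bias,
    # squashed to (0, 1) so it can scale another signal.
    z = np.dot(w, x) + peephole_weight * cell_state + bias
    return sigmoid(z)
```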
Ins and Outs
The input passes through a classical artificial neuron. It has external connections, an optional connection to the current cell state (known as a peephole) and an optional bias. The weighted sum of these inputs is then scaled by an activation function. The output of the activation function is then brought to the mercy of the input gate.
The output is just the cell state weighted by the output gate and then, optionally, biased and squashed by an output activation function. This last bias and activation function is usually not used, which leaves just the gated cell state.
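Putting the pieces together, one time step of a single cell could look like the following sketch. It assumes tanh input squashing, peepholes on all gates and no output bias or output activation (the common configuration described above); all parameter names are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x, c_prev, p):
    # Gates: weighted external input + peephole to the cell state + bias.
    i = sigmoid(np.dot(p["w_i"], x) + p["p_i"] * c_prev + p["b_i"])  # input gate
    f = sigmoid(np.dot(p["w_f"], x) + p["p_f"] * c_prev + p["b_f"])  # forget gate
    # Cell input: a classic artificial neuron, squashed by its activation.
    g = np.tanh(np.dot(p["w_g"], x) + p["b_g"])
    # The feedback loop: remember the old state (forget gate) and let in
    # new information (input gate).
    c = f * c_prev + i * g
    # The output gate's peephole reads the updated cell state.
    o = sigmoid(np.dot(p["w_o"], x) + p["p_o"] * c + p["b_o"])  # output gate
    # Output: the gated cell state, with no extra bias or activation.
    y = o * c
    return y, c
```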
Block, or no block
What has been described above is primarily a single LSTM cell. In practice, cells usually appear in layers or blocks. The difference between the layer and the block approach is that in a cell block all cells share common gates, whereas in a layer every cell has its own gates. If peepholes are in place in the layered approach, all gates have peephole connections to all cells.
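The difference is easiest to see in terms of shapes. In this sketch (values are placeholders, names illustrative), block mode has one gate activation shared by all cells, while layer mode has one activation per cell:

```python
import numpy as np

cell_states = np.array([0.3, -1.2, 0.8, 0.1])   # four cells

# Block mode: one scalar activation per gate, shared by every cell.
block_input_gate = 0.7
gated_block = block_input_gate * cell_states

# Layer mode: each cell has its own gate activation.
layer_input_gate = np.array([0.7, 0.2, 0.9, 0.5])
gated_layer = layer_input_gate * cell_states
```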
So we have a delay at the core of the LSTM. How do we train it? It so happens that the LSTM is constructed to be trained with standard static back-propagation. This is done by calculating partial derivatives forward in time, storing them and then using them with the back-propagated error signal to calculate the necessary updates.
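As a sketch of the idea for a single cell-input weight, assuming the truncated gradient of the original LSTM (the gates are treated as constants with respect to that weight) and tanh input squashing; all names are illustrative:

```python
def update_trace(trace_prev, forget_gate, input_gate, g, x_k):
    # Partial derivative of the cell state w.r.t. one cell-input weight
    # w_k, accumulated forward in time through the forget gate.
    # g is the squashed cell input, x_k the input that w_k is attached to;
    # tanh'(z) = 1 - g**2.
    return forget_gate * trace_prev + input_gate * (1.0 - g**2) * x_k

def weight_gradient(delta_cell, trace):
    # Combine the stored forward derivative with the error signal
    # back-propagated to the cell state at the current time step.
    return delta_cell * trace
```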
The settings can be modified using the settings browser.
|LSTM Layer settings|
In design mode, the block GUI shows the connections currently active and lets the user change which connections to use by clicking on them while using the block tool.
|LSTM layer design GUI|
In training mode the GUI shows the current values of the individual cells. If the layer is in block mode, it will also show the activations of the gates.
|LSTM layer training GUI|
- Training of the LSTM layer block only works with a static control system.