
LSTM layer block

From Piki


The Long Short-Term Memory (LSTM) Layer block is a recurrent short-term memory that preserves temporal information in the system. Unlike standard feedback loops, the LSTM can preserve information over indefinite time gaps.

LSTM Layer
Image:LSTM layer icon.png
Input ports: 1 or 4
Output ports: 1
Deployable: Yes
Weights: Yes
Memory: Yes
Interactive GUI: Yes


Usage

LSTMs are used in dynamic settings where precise timing is required.

The LSTM block is intrinsically a four input port block. It can, however, be set to use the same input for all four ports, and in this way appear as a single port block.

The standard use of the LSTM, mostly found in literature, is that of one or several blocks in parallel followed by a single output layer. In Synapse such a layout would look like this:

Two blocks, three layer LSTM layout


Operation

Structural layout of the LSTM

The LSTM consists of one input, three gates (not all of which need to be present), and one output. In the middle there is a feedback loop (commonly known as the cell) that makes it possible for the LSTM to remember information from one time step to the next. The gates control how much new information is let in, how strongly it is remembered, and how much is let out at the other end.

As a note, by removing all gates you can reduce the LSTM to act like the first (processed) tap of a gamma memory with a classic artificial neuron applied to its input.

At the gates

Each gate is like a traditional artificial neuron. It has a weighted input, consisting of external connections, an optional connection to the current cell state (called a peephole) and an optional bias. It also has a sigmoidal activation function that squashes the gate activation to between 0 and 1. The gate activations are then used to weight the signals of the input, the feedback loop and the output; the corresponding gates are called the input gate, the forget gate and the output gate respectively.
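The gate computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Synapse's actual implementation: the function names and the parameters `peep_weight` and `bias` are hypothetical, and the logistic sigmoid is used as the squashing function.

```python
import math

def logistic(x):
    """Logistic sigmoid: squashes any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def gate_activation(inputs, weights, cell_state=0.0, peep_weight=None, bias=None):
    """Hypothetical gate: weighted external inputs plus optional peephole and bias."""
    net = sum(w * x for w, x in zip(weights, inputs))
    if peep_weight is not None:
        # Optional peephole: the gate "peeks" at the current cell state.
        net += peep_weight * cell_state
    if bias is not None:
        # Optional bias term.
        net += bias
    # A result near 0 means a closed gate, near 1 a wide open gate.
    return logistic(net)
```

A strongly negative bias keeps the gate nearly closed regardless of small inputs, while a strongly positive one keeps it nearly open.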

Ins and Outs

The input passes through a classical artificial neuron. It has external connections, an optional connection to the current cell state (known as a peephole) and an optional bias. The weighted sum of these inputs is then scaled by an activation function. The output of the activation function is then left to the mercy of the input gate.

The output is just the cell state weighted by the output gate and then, optionally, biased and squashed by an output activation function. This last bias and activation function are usually not used, which leaves just the gated cell state.
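Putting the input, the cell and the output together, one time step of a single cell could look roughly like the sketch below. It assumes that the three gate activations have already been computed and are passed in as numbers between 0 and 1, that tanh is the input function, and that the optional input peephole is left out for brevity. All names are hypothetical and chosen for illustration only.

```python
import math

def lstm_cell_step(x, w_in, cell_prev, in_gate, forget_gate, out_gate,
                   in_bias=0.0, out_bias=None, out_squash=None):
    """One time step of a single LSTM cell (illustrative sketch)."""
    # Main input: weighted sum plus bias, squashed by the input function (tanh assumed).
    net_in = sum(w * xi for w, xi in zip(w_in, x)) + in_bias
    squashed_in = math.tanh(net_in)

    # Cell: the forget gate scales what is remembered, the input gate what is let in.
    cell = forget_gate * cell_prev + in_gate * squashed_in

    # Output: the cell state weighted by the output gate.
    output = out_gate * cell
    if out_bias is not None:
        output += out_bias          # optional output bias, usually not used
    if out_squash is not None:
        output = out_squash(output)  # optional output function, usually not used
    return cell, output
```

With the input gate closed (0) and the forget gate open (1), the cell simply carries its previous state forward, which is how information survives across time steps.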

Block, or no block

What has been described above is primarily a single LSTM cell. Usually cells appear in layers or blocks. The difference between the layer and the block approach is that in a cell block all cells share common gates, whereas in a layer every cell has its own gates. If peepholes are in place in the layered approach, all gates have peephole connections to all cells.
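The difference between shared and per-cell gates can be illustrated with a small helper. This is a sketch with hypothetical names: `signals` holds one value per cell, and `gate_values` holds either a single shared activation (block) or one activation per cell (layer).

```python
def apply_gate(signals, gate_values, cell_block):
    """Weight each cell's signal by its gate activation (illustrative sketch)."""
    if cell_block:
        # Cell block: one common gate activation is applied to every cell.
        shared = gate_values[0]
        return [shared * s for s in signals]
    # Layer: every cell is weighted by its own gate activation.
    return [g * s for g, s in zip(gate_values, signals)]
```

For three cells, a block needs only one activation per gate, whereas a layer needs three.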

The return

So we have a delay at the core of the LSTM. How do we train it? As it happens, the LSTM is constructed so that it can be trained with standard static backpropagation. This is done by calculating partial derivatives forward in time, storing them, and then combining them with the backpropagated error signal to calculate the necessary updates.
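The idea of carrying partial derivatives forward in time can be shown on a heavily simplified cell. The sketch below assumes a single scalar input weight `w`, a tanh input function, and gate activations that are held constant, none of which is taken from Synapse itself. The stored derivative of the cell state with respect to `w` is updated at every step alongside the cell, and is later multiplied by the error signal arriving at the output.

```python
import math

def run_cell(w, xs, in_gate=1.0, forget_gate=1.0):
    """Run a simplified cell over a sequence, tracking d(cell)/dw forward in time."""
    cell, dcell_dw = 0.0, 0.0
    for x in xs:
        squashed = math.tanh(w * x)
        # Derivative of tanh(w * x) with respect to w.
        dsquashed_dw = (1.0 - squashed ** 2) * x
        # The stored derivative follows the same recursion as the cell state.
        dcell_dw = forget_gate * dcell_dw + in_gate * dsquashed_dw
        cell = forget_gate * cell + in_gate * squashed
    return cell, dcell_dw

def weight_update(error, out_gate, dcell_dw, learning_rate=0.1):
    """Combine the error signal with the stored derivative.

    Assumes error = target - output and output = out_gate * cell,
    so the returned value is added to the weight.
    """
    return learning_rate * error * out_gate * dcell_dw
```

Because the derivative is accumulated during the forward pass, no unrolling of the network through time is needed when the error arrives.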

Settings

The settings can be modified using the settings browser.


LSTM Layer settings


  • (Layout)
  • Size: Number of cells; this also equals the number of output features.
  • Cell block: True if all cells are to share common gates. False if each cell should have its own gate.
  • Single Port: A quick way to set the same input on all ports. (The main input and the gates will still apply their own set of weights to the common input signal.)
  • Inputs: Number of features of the main input.
  • Input Gate: Number of input features of the input gate.
  • Forget Gate: Number of input features of the forget gate.
  • Output Gate: Number of input features of the output gate.
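As a rough illustration of the Single Port setting, the sketch below routes one input signal to all four ports while each port keeps its own weights. The port names and function names are hypothetical and do not correspond to any Synapse API.

```python
PORTS = ("input", "input_gate", "forget_gate", "output_gate")

def route_ports(main, gate_signals=None, single_port=False):
    """Return the signal seen by each of the four ports (illustrative sketch)."""
    if single_port:
        # Single port: the same input signal is presented to every port.
        return {name: main for name in PORTS}
    # Multi port: each gate receives its own signal.
    signals = {"input": main}
    signals.update(gate_signals)
    return signals

def port_sums(signals, weights):
    """Each port applies its own weights, even when the signal is shared."""
    return {name: sum(w * x for w, x in zip(weights[name], signals[name]))
            for name in signals}
```

Even in single port mode the four weighted sums generally differ, because only the signal is shared, not the weights.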


  • (Propagation)
  • Input: Propagate signals through the main input port on a backward pass when using a propagator other than the block propagator.
  • Input Gate: Propagate signals through the input gate input port on a backward pass when using a propagator other than the block propagator.
  • Forget Gate: Propagate signals through the forget gate input port on a backward pass when using a propagator other than the block propagator.
  • Output Gate: Propagate signals through the output gate input port on a backward pass when using a propagator other than the block propagator.


  • (Squashing)
  • Input Function: The function applied to the weighted main input.
  • Gate Function: The function applied at the gates. The default is a Logistic sigmoid that squashes the signal between 0 and 1. (0 creating a closed gate, and 1 forming a wide open gate.)
  • Output Function: The function applied to the block output.


  • (Connections)
  • Input Bias: Apply a bias to the main input.
  • Input Gate: Apply the input gate.
  • Input Gate Bias: Apply a bias to the input gate input.
  • Input Gate Peep: Allow a peephole connection from the cells to the input gate.
  • Forget Gate: Apply the forget gate.
  • Forget Gate Bias: Apply a bias to the forget gate input.
  • Forget Gate Peep: Allow a peephole connection from the cells to the forget gate.
  • Output Gate: Apply the output gate.
  • Output Gate Bias: Apply a bias to the output gate input.
  • Output Gate Peep: Allow a peephole connection from the cells to the output gate.
  • Output Bias: Apply a bias to the output.
  • Squash Output: Apply the Output Function to the output.


  • (Biases)
  • Input: Initial value of the main input bias. Small random noise is still added to all but the first bias.
  • Input Gate: Initial value of the input gate bias. Small random noise is still added to all but the first bias.
  • Forget Gate: Initial value of the forget gate bias. Small random noise is still added to all but the first bias.
  • Output Gate: Initial value of the output gate bias. Small random noise is still added to all but the first bias.
  • Output: Initial value of the output bias. Small random noise is still added to all but the first bias.
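The bias initialisation described in the list above might be sketched as follows. The noise distribution and its magnitude are not documented here, so the uniform noise and the `noise_scale` parameter are assumptions made purely for illustration.

```python
import random

def init_biases(initial_value, count, noise_scale=0.01, rng=None):
    """Initial biases: the first is exact, the rest get small random noise (assumed uniform)."""
    rng = rng or random.Random()
    biases = [initial_value]  # the first bias gets no noise
    for _ in range(count - 1):
        biases.append(initial_value + rng.uniform(-noise_scale, noise_scale))
    return biases
```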


GUI details

In design mode, the block GUI shows the connections currently active and lets the user change which connections to use by clicking on them while using the block tool.

LSTM layer design GUI
Single port with all gates connected
Multi port with disconnected forget gate

In training mode the GUI shows the current value of the individual cells. If the layer is in block mode, it will also show the activation of the gates.

LSTM layer training GUI
Image:LSTM layer training GUI.png

General advice

  • Training of the LSTM block only works with a static control system.


This page was last modified 09:51, 16 February 2010.