Tutorial 3

In this tutorial we will use Synapse to solve a time series prediction problem.

Before following this tutorial, it is strongly recommended that you complete the previous tutorials if you have not done that already.


The problem

When beer is produced, it brews in large fermentation tanks. Over time, sedimentation layers build up on the bottom of these tanks. This reduces the volume of beer you get, and from time to time the layers need to be removed to unclog the system and restore the tank to full production capacity. In this tutorial we will be looking at data collected from such a tank. Our task will be nothing less than to try to see into the future: to predict the future beer volume produced by the tank.

The data

The data file is located in Synapse\Sample Data\Beer and is called "beer.txt". There is only one feature in the file:

  • Beer - the daily production of beer from the fermentation tank (in liters)

Requirement specification

In the previous tutorials we have been trying to predict one or more output features from a number of input features. The samples were independent input-output examples. They were examples of a static system. In this tutorial our input feature is the same as our output feature with the difference being a shift in time. Previously we were concerned with "How much?" - i.e. the value of a sample. This time we're also asking "When?" - time is involved. We are using past values to predict future ones.

In this beer example, it is of interest to know future production values because they are an indicator of the clogging in the tank. When production levels drop rapidly, you need to clean the tank. It would be useful to get a heads-up before this happens. So in this case the precision of the absolute number of liters produced is less important than the precision of the timing.

As interruptions in beer production are no laughing matter, our goal is to be able to predict beer production levels four days in the future. The focus is on precision in timing when there are large changes. We want to know if production levels are going to start dropping and want to know it four days before it happens.

Preprocessing

Start up Synapse and make sure you are in preprocessing mode. From the Data Unit Manager add a new Data Unit. Load it using the CSV format from the "beer.txt" file located in Synapse\Sample Data\Beer.

Drag-drop the Data Unit to the nearest empty visualizer area and select "Value vs Sample":

Look around in the plot, but don't edit the data yet. A relevant thing to observe is the start of the curve which is quite different from the rest. This is just after the tank is installed and the production output behaves a bit differently before it starts operating normally. We're not interested in this, so we're going to remove that part. Drag the Data Unit to the remaining empty visualizer area. Select "Grid View".

At the bottom of the grid, in the Select box, write "Row < 35". Then click Remove.

Notice how the first 35 samples (0 - 34) disappear and the topmost row is labeled 35. Now click Apply. You should see a 'Select' filter appear on the filter stack and the samples in the grid should be reindexed from zero.
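The effect of this Select filter can be mimicked outside Synapse. The following is only a sketch in plain Python: the helper name `remove_rows` and the sample values are made up for illustration and are not part of Synapse.

```python
def remove_rows(samples, should_remove):
    """Drop every sample whose row index satisfies `should_remove`.

    The result is a new list, so the remaining samples are
    automatically reindexed from zero, in their original order.
    """
    return [value for row, value in enumerate(samples) if not should_remove(row)]


# 40 made-up daily production values (rows 0 - 39).
samples = list(range(100, 140))

# Equivalent of writing "Row < 35" in the Select box and clicking Remove.
filtered = remove_rows(samples, lambda row: row < 35)
print(filtered)  # [135, 136, 137, 138, 139]
```

Because only a leading block of rows is removed, the order and spacing of the remaining samples are untouched, which is what matters for a time series.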


This is all the preprocessing we will do here. Outlier and similar filters are out of the question as they potentially add or remove samples without any regard to order. This would contaminate and damage the timeline.

Design

Go to Design mode. The first thing we are going to do is change the control system. Up until now we have been using Static XProp that disregards the temporal aspect. As we want to build a dynamic system, we need its temporal counterpart, the Dynamic XProp.

Click anywhere on the work area. In the settings browser you should see the settings for the work area. Under the "Control" category, change the "ControlSystem" parameter to "Dynamic XProp" using the drop-down list.

Drag-drop the "CSV" Data Unit on to the work area from the Solution Explorer (under "Resources"). To make things more varied we won't be using any snippets this time. We'll build the system by ourselves from scratch.

Ok, so which blocks do we need? Well, we are going to need the usual Weight Layers and Function Layers - say two of each. Drag-drop them from the component bar to the work area. We are also going to need a Delta Terminator, so feel free to add one. In addition we are going to need a component that we have not used so far: the Gamma Memory component. The Gamma Memory is a recursive short-term memory that remembers past inputs. This is a key component in temporal systems as we want to use past inputs to predict future outputs. Add two Gamma Memory components to the work area. These are the components you should now have on the work area:



In the tool bar select the "Select Tool" (or press F1). We'll now arrange the components in the proper order before linking them together. You can move components simply by dragging them about. You can move one at a time, or several together by selecting them first.

Move around the components and arrange them until you have a layout similar to this:

Switch to the Link tool. (You can press F2 instead of using the toolbar. You can also toggle between Move/Select and Link mode by holding down the space bar.) From the Data Source, drag a link to the Gamma Memory next to it. As mentioned earlier and as the name implies, the Gamma Memory is a memory component - a short-term memory component.


To understand what the Gamma does, we need to understand a simpler structure called a tapped delay line (TDL). If you want to understand it a bit better, read on - otherwise skip this section:


A TDL remembers its past inputs and outputs them simultaneously as separate features. It has a number of taps; each tap corresponds to one delay. The total number of taps is its total memory depth.

Suppose we have the following time series: 1, 2, 3, 4, 5 and a TDL with two taps. This is what happens when we send those five samples through the TDL:

Sample   TDL Input   TDL Output (previous, current)
1        1           0, 1
2        2           1, 2
3        3           2, 3
4        4           3, 4
5        5           4, 5
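
The table can be reproduced with a few lines of Python. This is a minimal sketch of a generic tapped delay line, not Synapse code; the class name `TappedDelayLine` is an assumption made for this example.

```python
from collections import deque


class TappedDelayLine:
    """Remembers the last `taps` inputs and outputs them together."""

    def __init__(self, taps):
        # Start with zeros, matching the zero in the first row of the table.
        self.buffer = deque([0] * taps, maxlen=taps)

    def step(self, value):
        # Appending to a full deque drops the oldest entry automatically.
        self.buffer.append(value)
        return list(self.buffer)  # oldest input first, current input last


tdl = TappedDelayLine(taps=2)
outputs = [tdl.step(x) for x in [1, 2, 3, 4, 5]]
print(outputs)  # [[0, 1], [1, 2], [2, 3], [3, 4], [4, 5]]
```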

So a TDL with two taps outputs two features, one containing its previous input and one containing its current input. This way historical information can be infused into the system. The Gamma Memory is a bit more advanced though. It still has taps, but they are not independent like in the TDL. There is a feedback connection on each tap. In the case of the TDL, the memory depth is restricted by the number of taps. The Gamma on the other hand acts as an infinite impulse response (IIR) filter that in theory has no limit to its depth.

This is a schematic model of the Gamma Memory. z^-1 is a one-sample delay while mu is an interpolation weight:


From a signal analysis perspective, the Gamma Memory can be seen as a recursive low-pass filter. Each output tap gives a more filtered version of the original signal. The Gamma is ideal for adaptive systems as its weight mu, which controls the memory depth, can be adapted using the usual algorithms. One final comparison to the TDL: If we input a sine wave, this is what we get with the two systems:
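
To make the difference from a TDL concrete, here is a minimal sketch of a gamma memory in Python. It assumes the commonly published form of the recursion, x_k(t) = (1 - mu) * x_k(t-1) + mu * x_(k-1)(t-1), with tap 0 carrying the current input; whether Synapse implements exactly this form is an assumption, and the class name `GammaMemory` is made up for the example. The restriction of mu to (0, 1] is a simplification of the sketch.

```python
class GammaMemory:
    """Gamma memory sketch: tap 0 is the current input, and every later
    tap blends its own previous value with its neighbour's previous value."""

    def __init__(self, taps, mu):
        if not 0.0 < mu <= 1.0:
            raise ValueError("this sketch expects 0 < mu <= 1")
        self.mu = mu
        self.state = [0.0] * taps

    def step(self, u):
        prev = self.state
        new = [float(u)]
        for k in range(1, len(prev)):
            # Feedback term (1 - mu) * prev[k] is what a plain TDL lacks.
            new.append((1.0 - self.mu) * prev[k] + self.mu * prev[k - 1])
        self.state = new
        return list(new)


impulse = [1, 0, 0, 0]

# mu = 1 removes the feedback: the second tap is just the input, one step late.
tdl_like = GammaMemory(taps=2, mu=1.0)
print([tdl_like.step(u)[1] for u in impulse])  # [0.0, 1.0, 0.0, 0.0]

# mu = 0.5 smears the impulse out; it decays but never drops to exactly zero.
gamma = GammaMemory(taps=2, mu=0.5)
print([gamma.step(u)[1] for u in impulse])  # [0.0, 0.5, 0.25, 0.125]
```

With mu = 1 the single impulse appears once and is forgotten, as in a TDL. With mu = 0.5 the same two taps keep a fading trace of it, which is the low-pass, long-memory behavior described above.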


In the validation bar we can see that we have a message from the Gamma: "To be useful a kernel tap needs 2 or more taps". So let's fix that. Click on the Gamma Memory to select it. In the settings browser, under "Settings", change the "Taps" setting to 5.

Using the Link tool, connect the Gamma Memory to the Weight Layer next to it. The Weight Layer is our long-term memory. It contains a set of weights that get adapted when we train the system. The weight layers are, in effect, where the knowledge of the system is stored.

In the validation bar you can see more messages, but we'll ignore them for now. Connect the Weight Layer to the Function Layer next to it. The Function Layer's role is to keep the output from the Weight Layer within reasonable levels and to introduce non-linearity into the system, which allows very complex outputs to be constructed.
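
As a rough illustration of how these two layers cooperate, the sketch below models a Weight Layer as a weighted sum and a Function Layer as an element-wise tanh. Both choices are assumptions made for this example (the tutorial does not say which function the Function Layer uses), and the function names and numbers are hypothetical.

```python
import math


def weight_layer(weights, inputs):
    """One weighted sum of all inputs per output (one row of weights each)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def function_layer(values):
    """Squash each value into the range (-1, 1) with tanh (assumed here)."""
    return [math.tanh(v) for v in values]


weights = [[0.5, -0.25],   # weights for output 0
           [1.0, 2.0]]     # weights for output 1
inputs = [2.0, 4.0]

summed = weight_layer(weights, inputs)
print(summed)                  # [0.0, 10.0]
print(function_layer(summed))  # the second value is squashed to just below 1
```

The weighted sum can grow without bound, while the output of the function stays between -1 and 1, and the squashing is what makes the overall mapping non-linear.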

Select the Function Layer. In the settings browser, set its "Inputs" setting to 5.

Connect the Function Layer to the second Gamma Memory. Set the number of taps on the Gamma Memory to 4 and connect it to the second Weight Layer. Connect the Weight Layer to the last Function Layer. From the Function Layer make a link to the green port on the Delta Terminator. From the Data Source drag a link to the blue port on the Delta Terminator.

You should have something like this on your work area:

As we want to predict the values four samples (days) ahead, the input to the system needs to be delayed by four samples relative to the output. Make sure that you have the Link Tool (F2) selected. Select the link between the Data Source and the Gamma Memory. In the Property Browser set the "Z" setting to "-4". This will make sure that all signals that pass through that link get delayed by four samples.

The delay is then shown on the link itself.

This way the input signal is delayed by four samples relative to the output signal. To illustrate, here is a signal entering the link (Input) and the delayed signal leaving it (Output):

Sample   Input   Output
(1)      3       0
(2)      1       0
(3)      4       0
(4)      1       0
(5)      5       3
(6)      0       1
(7)      2       4
(8)      6       1
(9)      5       5
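
The same four-sample delay can be written out in Python. This is only a sketch: the function name `delay` and the zero fill value are assumptions chosen to match the table, not Synapse internals.

```python
def delay(signal, steps, fill=0):
    """Shift `signal` later in time by `steps` samples, padding with `fill`."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return ([fill] * steps + list(signal))[:len(signal)]


link_in = [3, 1, 4, 1, 5, 0, 2, 6, 5]
link_out = delay(link_in, 4)
print(link_out)  # [0, 0, 0, 0, 3, 1, 4, 1, 5]

# From sample 5 onwards, the system sees the value from four days ago
# (first element) while the desired output is today's value (second element).
pairs = list(zip(link_out, link_in))[4:]
print(pairs)  # [(3, 5), (1, 0), (4, 2), (1, 6), (5, 5)]
```

Each pair shows what training on the delayed link amounts to: the system is asked to produce today's value from information that is at least four days old.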

Our system is now complete as far as structure goes. We do however want to add some plots that will be useful during training.

To gain some layout space we will reorder the components slightly and reduce the size of the Delta Terminator. Switch to the Selection Tool and select the Delta Terminator. By dragging its lower edge, reduce its size to about this:

Select the last Weight Layer and Function Layer. Move these down and to the left while positioning the Delta Terminator in the upper right corner.

Below, add a "Merger" component. You will find it in the component bar under "Signal Flow". A Merger, as the name implies, merges signals. It has two or more input ports and one output port. The signals that come on the input ports get merged into a single signal that comes out on the output port. So if you have two signals with one feature each coming in to the Merger, out will come a signal with two features.
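
In code terms, merging is a per-sample concatenation of features. The sketch below assumes each signal is a list of samples and each sample a list of features; the function name `merge` and the numbers are hypothetical.

```python
def merge(*signals):
    """Concatenate the features of several signals, sample by sample."""
    if len(signals) < 2:
        raise ValueError("a merger needs two or more input signals")
    if len({len(s) for s in signals}) != 1:
        raise ValueError("all signals must have the same number of samples")
    return [[feature for sample in samples for feature in sample]
            for samples in zip(*signals)]


desired = [[10.0], [20.0], [30.0]]     # one feature per sample
predicted = [[11.0], [19.0], [31.0]]   # one feature per sample

merged = merge(desired, predicted)
print(merged)  # [[10.0, 11.0], [20.0, 19.0], [30.0, 31.0]]
```

Two one-feature signals go in and one two-feature signal comes out, which is what lets a single plot draw both curves side by side.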

Make a link from the Data Source to the green port on the Merger. This will be our reference signal - the desired output value. From the last Function Layer (our system output) make a link to the blue port on the merger. You should have something like this now:

Now, from the component bar, place two Value/Sample Plots on the work area and resize them to your liking. Link the Merger to both plots. Finally, select the second plot. In the settings browser, under "Buffering", set the "Set" parameter to "Validation".

Now for an explanation: The Merger receives two signals. The first one is the signal from the Data Source. Since it isn't delayed, this is the same as our desired output signal. The other signal is what the system outputs. We want to compare them side by side in a plot. Actually - two plots, one for the training data and one for the validation data. That's why you set the second plot to only show validation data.


This is what you should have on your screen:


Now we are ready to train the system. Go to Training mode.

Training mode

Set the Batch Length to 500 in the Control System Pane. Now, instead of just pressing play, we are going to do something different: We are going to use a Batch Processor.

Batch Processors are a general class of automation methods. They can be used for instance to adapt systems remotely over network or to optimize system structures. We are going to do the latter here.

As you might have noticed, the components have a number of parameters that can significantly alter their function and have an important impact on overall system performance. Some of the Batch Processors allow you to adjust these parameters automatically. This does however come at a cost - to find the optimal parameters, the system needs to be trained many times, which can take considerable time.

Batch Processors can be found in the upper right corner of the training view.

Select "Genetic Optimizer" and click "Enable". You will now see a list of all the components that have optimizable parameters as well as these parameters. Check the "Step.Step Rule" checkbox and the "Taps.Gamma Memory" checkbox:

In the settings browser set the Populations to 10, Generations to 100 and Epochs to 100:

These are the settings we are going to let the Genetic Algorithm optimize. Press the Play button to start the optimization:


As this will take some time to complete, feel free to grab a coffee while the system is training. When the optimization is complete, as indicated by the progress bar, press the "Disable" button to turn control back to the regular control system.
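
To see where the time goes, here is a minimal sketch of a genetic search over two parameters, a step size and a number of taps, using the same population, generation and epoch counts as above. It is not Synapse's Genetic Optimizer: the selection, crossover and mutation rules are generic textbook choices, and `train_and_score` is a hypothetical stand-in that returns a toy error instead of actually training a system.

```python
import random


def train_and_score(step, taps, epochs):
    """Hypothetical stand-in for 'train for `epochs` epochs, return the error'.

    A real run would train the system here. This toy error surface simply
    has its minimum at step=0.1 and taps=6 and ignores `epochs`.
    """
    return (step - 0.1) ** 2 + 0.01 * (taps - 6) ** 2


def genetic_search(population=10, generations=100, epochs=100, seed=0):
    rng = random.Random(seed)
    # Each candidate is a (step size, number of taps) pair.
    candidates = [(rng.uniform(0.001, 1.0), rng.randint(2, 20))
                  for _ in range(population)]
    trainings = 0
    for _ in range(generations):
        # Every candidate is trained and scored in every generation.
        candidates.sort(key=lambda c: train_and_score(c[0], c[1], epochs))
        trainings += len(candidates)
        parents = candidates[:max(2, population // 2)]  # keep the best half
        children = []
        while len(parents) + len(children) < population:
            a, b = rng.sample(parents, 2)
            # Crossover: average the step sizes, take taps from one parent.
            # Mutation: small random nudges, clamped to the allowed ranges.
            step = min(1.0, max(0.001, (a[0] + b[0]) / 2 + rng.gauss(0, 0.02)))
            taps = min(20, max(2, rng.choice([a[1], b[1]]) + rng.choice([-1, 0, 1])))
            children.append((step, taps))
        candidates = parents + children
    best = min(candidates, key=lambda c: train_and_score(c[0], c[1], epochs))
    return best, trainings


best, trainings = genetic_search()
print(best, trainings)  # trainings is 1000: 10 candidates x 100 generations
```

With these settings the sketch performs 1000 separate trainings of 100 epochs each, that is 100,000 epochs in total, compared with the few hundred epochs of a single ordinary training run.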

In the Control System Pane, set Max Epochs to 500 and press Play (this time on the main toolbar). After the training is finished you'll have something similar to this (results may vary significantly as there are many random elements involved in the optimization):


These are half-decent results, but hardly spectacular. Why? Simply because we didn't run the optimization long enough. Given the number of parameters to be adapted, the genetic algorithm (GA) needs far more generations and larger population sizes to be useful in practice.

If you set the number of generations to 100, populations to 20 and epochs to 500 and leave the system adapting overnight, you'll get far better results. Then you can get something like this, which is considerably better:

It is very important to remember that you need to train your system in the regular way after an optimization. The optimizers try to find optimal values for various parameters of the system, but not the weights of the system. To save time it is common to train less during optimization than you would otherwise. For instance, the system in the example picture above was trained for several thousand epochs after optimization.


This is also not too difficult to achieve without using a Batch Processor - if you know what you are doing. If you know the theory and have experience with similar systems, you'll know how to set the parameters to values that give good results.

The moral of the story is that while Batch Processors allow you to optimize system parameters automatically, they are a very computationally expensive alternative to a deeper understanding of the behavior of adaptive systems. They should however not be ruled out entirely as they will from time to time surprise with innovative solutions.

This page was last modified 14:48, 17 August 2010.