Wizard Tutorial
To demonstrate the use of Synapse wizards, we are going to look at a function modeling problem using real-world data. We will build an adaptive system that learns to predict NO2 air pollution.
If you are already familiar with elementary concepts such as samples, features, and input and output data, you can move on to the first real tutorial.
Data
The data that we are going to use is in the Synapse\Sample Data\NO2 directory. It originated in a study by the Norwegian Public Roads Administration that looked at air pollution as an effect of traffic volume and meteorological conditions.
The features (variables) listed in the data set are the following:
- no2 = concentration of NO2 particles
- cars_per_hour = the logarithm of cars per hour on the road
- temp = temperature (°C)
- wind_speed = wind speed (m/s)
- temp_diff = the temperature difference between 25 and 2 m above ground
- wind_dir = wind direction (degrees)
- hod = hour of day
- day = number of days since the first measurement
We wish to make an adaptive system that learns to predict the concentration of NO2 particles given the other available data (i.e. cars per hour, temperature etc). Hence no2 is our output feature, while the others are our input features.
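Outside of Synapse, the split between input and output features can be sketched in a few lines of Python. This is only an illustration: the feature names come from the list above, but the `split_sample` helper and the sample values are made up for this example and have nothing to do with how Synapse stores its data.

```python
# Feature names as listed above; "no2" is the one we want to predict.
FEATURES = ["no2", "cars_per_hour", "temp", "wind_speed",
            "temp_diff", "wind_dir", "hod", "day"]
OUTPUT_FEATURE = "no2"


def split_sample(sample):
    """Split one sample (a dict keyed by feature name) into (inputs, output)."""
    inputs = {name: sample[name] for name in FEATURES if name != OUTPUT_FEATURE}
    output = sample[OUTPUT_FEATURE]
    return inputs, output


# Placeholder values invented for illustration, not taken from no2.txt.
example = dict(zip(FEATURES, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
inputs, output = split_sample(example)
print(sorted(inputs))  # the seven input feature names
print(output)          # the value stored under "no2"
```

Every sample ends up as seven input values and one desired output value, which is exactly the input-output relationship the adaptive system has to learn.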
| Overview |
|---|
|
The wizard
Now let’s start Synapse. You should see the Synapse start page:
| Synapse start page |
|---|
|
If you don’t see it, or have Synapse already started, go under the “Tools” menu and select “Start Page”.
What we want to do is to model a function – NO2 concentration as a function of cars per hour, temperature, wind speed etc. We are trying to build an adaptive system that will model an input-output relationship from our data file. So, click on “Function Modeling” on the start page.
Importing data
Input data
The following window will appear:
| Importing data |
|---|
|
We need to supply Synapse with the data from which our adaptive system will learn. Specifically, here we are asked to import the input data. In this case however both the input features and output features are located in the same place (one text file). So click on “Import data…”. The following window will appear:
| CSV Format |
|---|
|
Here we must select the format that the data uses. In our case it is a plain text file, so we should select “CSV File” and click “Next>”.
| Selecting file |
|---|
|
Click on the “Browse” button and navigate to the no2.txt file in the Synapse\Sample Data\NO2 directory. The “Next” button will become enabled, so click on it:
| Import preview |
|---|
|
We now have the following screen:
| Parsing options |
|---|
|
Here we can select various formatting options. In this case however, we’re in luck and the data file is well-formatted enough for the default options to work. The “Use 15% of data for validation” part is worth taking a closer look at.
Adaptive systems learn from the data we supply them with. But how can we be sure that they will work with other data, data that they haven’t seen?
The answer to that is validation. Instead of using all of the available data for training the system, we set some of it aside and use it later to test the system. This tells us how well the system generalizes, i.e. how well it works on data it hasn’t been trained on. The 15% here refers to how much of the data should be set aside for validation. In this case, we can leave it at the default value.
Click “Finish” to end importing the data. We return to the wizard, and now we have our data:
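The idea of setting data aside can be sketched as follows. This is a minimal illustration of a holdout split, not a description of what Synapse does internally: the `holdout_split` function, the shuffling and the fixed seed are all assumptions made for this example.

```python
import random


def holdout_split(samples, validation_fraction=0.15, seed=0):
    """Return (training, validation) lists; validation gets roughly the given fraction."""
    if not 0.0 <= validation_fraction < 1.0:
        raise ValueError("validation_fraction must be in [0, 1)")
    shuffled = list(samples)
    # Shuffle so the validation set is not just the first rows of the file
    # (an assumption; Synapse may pick its validation samples differently).
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


training, validation = holdout_split(range(100))
print(len(training), len(validation))  # 85 15
```

The system is trained only on `training`. The samples in `validation` are never shown to it during training, so its error on them is an honest estimate of how it behaves on unseen data.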
| Back to wizard |
|---|
|
Click on “Next”:
| Selection of input features |
|---|
|
Here we are asked for the input features of our system. As our file contains both the inputs and outputs of the system, we need to deselect all features that we don’t wish to use as inputs. In this example, we want to remove “no2” from the input list as it is our output. It wouldn’t be a very useful system if we had to input no2 concentration to get no2 concentration in the output, right?
So we deselect the no2 checkbox by clicking on it:
| Selection of output data |
|---|
|
Output data
This time we are asked for the output data. As we saw earlier, our text file contains both inputs and outputs. So it’s enough to click on “Same as input” and then “Next”:
| Selection of the output |
|---|
|
Now we need to select our output features. Fortunately we have only one, “no2”, and it is already selected, so just click “Next”.
| Selection of output features |
|---|
|
Data description
| Data set properties |
|---|
|
The Synapse wizards use fuzzy logic to determine the optimal adaptive system structure for the specific problem. Among the fuzzy variables used for that are sample quantity and feature space. These variables are determined automatically by Synapse but can be overridden manually, something that we won’t do here. So click “Next”:
| Fuzzy problem definition |
|---|
|
Here we can specify some attributes for the problem that will be used by the fuzzy logic system to choose a good structure for the adaptive system.
- Problem complexity refers to how difficult the problem is. This is of course a subjective estimate, but let’s say this: if x is your input and y=2x is your desired output, then the problem is trivial. If the inputs are pictures of faces and the outputs are pictures of how they will look in 20 years, well, then the problem is complex.
- Data completeness refers to how well the possible ranges of the input and output features are covered by the data. This should not be confused with the number of samples. Suppose that we have an “angle” input that can be between 0 and 360 degrees. If we only have data that covers 15 to 30 degrees, the completeness is pretty poor. We may have many samples that cover that range in detail, but that doesn’t help us very much, as a large portion of the range is not covered.
- Data noise levels refer to how much random disturbance there is in the data. Real-world sensors always have certain noise levels, below which the data tends to be useless. Unless the noise is systematic, an adaptive system can’t produce better results than the limit set by the noise levels.
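To make the difference between completeness and sample count concrete, here is a small sketch based on the “angle” example above. The `range_coverage` helper is a made-up, deliberately crude measure (it only looks at the span between the smallest and largest observed value and ignores gaps in between); it is not how Synapse estimates completeness.

```python
def range_coverage(values, lower, upper):
    """Fraction of the possible range [lower, upper] spanned by the observed values."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    observed = [v for v in values if lower <= v <= upper]
    if not observed:
        return 0.0
    return (max(observed) - min(observed)) / (upper - lower)


# 1501 synthetic samples spread evenly between 15 and 30 degrees.
angles = [15 + i / 100 for i in range(1501)]
coverage = range_coverage(angles, 0, 360)
print(f"{len(angles)} samples cover {coverage:.1%} of the range")
```

Even with well over a thousand samples, only about 4% of the 0 to 360 degree range is covered, so the data completeness is poor no matter how dense the samples are.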
Now, let’s take a look at our specific case. Set the sliders so that they look something like this:
| Fuzzy problem definition |
|---|
|
The problem complexity in this case is difficult to estimate, so we’ll leave it at the default value. Data completeness in this case is not all that it could be. While over a year’s worth of data is sampled, so that seasonal variations are included, it is questionable how well the other parameters cover their possible ranges. For instance, the “cars per hour” parameter could surely vary more if different roads were selected for the data collection.
The data noise level is the easiest to estimate: it is fairly high. In our NO2 case we’re talking about chemical sensors that have a fairly high margin of error.
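Why the noise level matters can be shown with a small experiment. The sketch below reuses the trivial y=2x problem from the list above and adds random noise to the measurements; the noise level of 0.5, the sample count and the `rmse` helper are assumptions chosen for this illustration and are unrelated to the NO2 data.

```python
import math
import random


def rmse(predicted, desired):
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(desired):
        raise ValueError("sequences must have the same length")
    squared = [(p - d) ** 2 for p, d in zip(predicted, desired)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)
noise_std = 0.5  # assumed sensor noise, in the same unit as y

xs = [rng.uniform(0.0, 10.0) for _ in range(10000)]
# What the "sensor" reports: the true value 2x plus random noise.
measured = [2.0 * x + rng.gauss(0.0, noise_std) for x in xs]
# What a perfect model of the underlying function would output.
perfect = [2.0 * x for x in xs]

noise_floor = rmse(perfect, measured)
print(f"error of a perfect model: {noise_floor:.2f}")
```

Even a model that knows the true function exactly ends up with an error close to `noise_std` when compared against the noisy measurements. An adaptive system trained on such data cannot be expected to score better than that on new measurements.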
Click on “Finish”. Synapse will now build an appropriate adaptive system and put you in the middle of the “Training” phase:
Training
| Synapse training mode |
|---|
|
Now, press the play button on the training toolbar.
| Training toolbar |
|---|
|
This is what you’ll see now:
| Synapse training |
|---|
|
The lower plot shows system output vs. desired output. The blue line is what the adaptive system thinks the no2 concentration should be, and the green line shows what it actually should have been. After 1000 or so epochs, feel free to click the pause button to stop the training. Congratulations, you have trained your first adaptive system in Synapse!
Postprocessing
As a final step, it might be interesting to actually use the system you just trained.
Click on “Post Processing” in the mode bar.
| Synapse mode bar |
|---|
| |
Click on “Probe” in the bar on the left side of the window.
| Postprocessing |
|---|
|
Press the "Refresh" button in the probe tool bar:
| Probe postprocessor |
|---|
|
On the left side of the probe, you can drag the sliders or type in the text boxes to enter the input values. In the upper right pane of the probe, you will see the live system output (no2 in this case), and in the lower right pane you will see a graph of the changes in the system output.
What now? Well, if you are pleased with the system, you can run the Error analyzer to see how good the system is. Or you might want to deploy the system and sell it to the Norwegian Public Roads Administration. Or, if you are not pleased with its performance, you might want to go to preprocessing and tweak and twist the data into better shape, or to design to tweak and improve the system itself.
The best thing to do, however, is probably to explore the rest of the tutorials, as they show how to use Synapse for real.
See also
- Tutorials - The rest of the tutorials.




















