Error analyzer postprocessor

The error analyzer postprocessor is a tool for testing the statistical validity of a trained adaptive system. It also provides a connection to preprocessing, so that you can use the information gained by training a system to better understand the data.

Usage

The error analyzer is used to test the performance of a trained system. It can show the distribution of the errors as well as determine confidence levels for the output of the system.

It can only be used with systems that use an error metric, such as the delta terminator block.

To start using the Error Analyzer, select the error source (error metric) you wish to use and press "Refresh". Two elements are common to both modes:

  • Error Source: The source for the errors to be analyzed. This should preferably be a block like the Delta Terminator, with system output on one port and desired output on the other.
  • Refresh: Runs analysis and updates the view.

Error distribution mode

Distribution mode of error analyzer

The Error Distribution shows the number of instances as a function of error value, so that you can see how the errors are distributed.

  • Error Metric: The error metric to use to calculate the error.
  • Feature: The selected feature to display analysis for.
  • Set: The set (training/validation) to be displayed.
  • Bins: Number of bins in the histogram. A higher number of bins gives higher resolution, but also fewer samples per bin.
  • Save icon: Saves the plot to an image file.
  • Print icon: Prints the current plot (with preview).
  • Copy icon: Copies the selected data to file.
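
The trade-off described for the Bins setting can be illustrated outside Synapse. The following is a minimal sketch, assuming equal-width bins between the smallest and largest error; the function name `error_histogram` and the sample errors are hypothetical, and Synapse's own binning may differ.

```python
def error_histogram(errors, bins):
    """Count errors per equal-width bin; returns (edges, counts)."""
    lo, hi = min(errors), max(errors)
    # Fall back to a width of 1.0 when all errors are identical.
    width = (hi - lo) / bins or 1.0
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for e in errors:
        # The largest error would index one past the end, so clamp it.
        idx = min(int((e - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


# Hypothetical error values, for illustration only.
errors = [-0.9, -0.4, -0.1, 0.0, 0.1, 0.2, 0.3, 0.8]
for bins in (2, 4):
    edges, counts = error_histogram(errors, bins)
    print(bins, "bins:", counts)
```

With more bins, the same eight samples are spread over more, thinner bars, which is why very high bin counts give noisy plots on small data sets.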

Tools

  • Hand icon: Pan in the plot.
  • Zoom icon: Zoom in the plot. Right-click and select "Original dimensions" to return to normal zoom.
  • Select icon: Select bins in the plot. Hold SHIFT to add to the current selection, or ALT to subtract from it.
  • Select all icon: Select all bins.
  • Deselect all icon: Deselect all.

You can get a context menu by right-clicking anywhere in the plot:

  • Original dimensions: Resets zoom.
  • Show World Coordinates: Displays a tooltip with the coordinates at the mouse cursor.
  • Print: Prints the current plot.


Preprocessing link

The data in the error distribution can be sent to preprocessing. There are two different modes.

Send to Preprocessing

This command is used to analyze individual error samples and see which input samples caused them. Select the desired error samples in the plot. Click on the "Send to Preprocessing" button. You will be prompted to select which data unit you wish to map these samples to.

Select the data unit and you will be transported to preprocessing and asked to select a visualizer. In the visualizer you will now see the input samples that caused the errors that you selected in the error analyzer.

This is an extremely powerful tool that lets you effortlessly find problematic cases where the system has performed poorly. After Synapse maps the errors to input samples you have the full power of preprocessing with the usual hierarchical data mining and statistics tools.
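
Conceptually, the mapping from selected errors back to input samples is an index lookup. The sketch below assumes that each error value is stored at the same position as the input sample that produced it; `samples_in_error_range` and the data are hypothetical and are not part of Synapse.

```python
def samples_in_error_range(errors, low, high):
    """Return the indices of samples whose error lies in [low, high]."""
    return [i for i, e in enumerate(errors) if low <= e <= high]


# Hypothetical data: one input row and one error value per sample.
inputs = [[0.1, 1.0], [0.5, 0.2], [0.9, 0.7], [0.3, 0.3]]
errors = [0.05, -0.80, 0.10, 0.95]

# "Select" the bins that cover absolute errors between 0.5 and 1.0.
selected = samples_in_error_range([abs(e) for e in errors], 0.5, 1.0)
worst_inputs = [inputs[i] for i in selected]
print(selected, worst_inputs)
```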

Export to Data Unit

This command saves the error data into a new data unit, which lets you analyze and map correlations between errors and input data in preprocessing. Click on the "Export to Data Unit" button and you will be transported to preprocessing, where you will see a new data unit containing the errors.

A common action is to then use the join format to join the error data unit with the input data unit to create a third one that is used for analysis.
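
The idea behind such a join can be sketched as follows. This is only an illustration, assuming both data units share a sample identifier; the `sample` key, the row layout and the `join_on` helper are hypothetical, and the join format in Synapse may work differently.

```python
def join_on(key, left, right):
    """Inner join of two lists of dict rows on a shared key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Hypothetical data units, represented as lists of rows.
input_unit = [
    {"sample": 0, "x1": 0.1, "x2": 1.0},
    {"sample": 1, "x1": 0.5, "x2": 0.2},
    {"sample": 2, "x1": 0.9, "x2": 0.7},
]
error_unit = [
    {"sample": 0, "error": 0.05},
    {"sample": 2, "error": 0.10},
]

# The third data unit: inputs paired with their errors.
joined = join_on("sample", input_unit, error_unit)
for row in joined:
    print(row)
```

Rows without a matching error are dropped here; whether that is the desired behaviour depends on the analysis.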

Confidence level mode

Confidence level mode

The confidence level calculations use the mean square error to approximate the variance of the error. The confidence model assumes that the errors are normally distributed, which is the case for the vast majority of error distributions for real-world data. If the error curve in distribution mode looks bell-shaped (with a linear error metric), the assumption is reasonable. The bell shape can be very approximate, as the confidence calculations hold up well even for a very rough distribution.
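
A calculation of this kind can be sketched in a few lines. The example assumes zero-mean, normally distributed errors, so that the mean square error stands in for the variance, and a two-sided interval; the function name `confidence_half_width` is hypothetical, and the exact formula used by Synapse is not documented here.

```python
from math import sqrt
from statistics import NormalDist


def confidence_half_width(errors, level):
    """Half-width of a two-sided interval at the given confidence level.

    Assumes zero-mean, normally distributed errors, so the mean square
    error is used as an approximation of the variance.
    """
    mse = sum(e * e for e in errors) / len(errors)
    # Two-sided quantile of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * sqrt(mse)


# Hypothetical errors with a mean square error of exactly 1.
errors = [1.0, -1.0, 1.0, -1.0]
half_width = confidence_half_width(errors, 0.95)
print(f"95% interval: output +/- {half_width:.2f}")
```

The interval is then the system output plus and minus this half-width, which corresponds to the highest and lowest lines around the output in the plot.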

There are four lines in the plot: system output, desired value and the highest and lowest values within the confidence interval. The gray area shows the region within which the correct answer lies with the chosen confidence level. The confidence interval is written in the title of the plot.

You have the following settings:

  • Confidence Level: A confidence level of, for instance, 95% means that statistically 95% of the samples will fall within the calculated confidence interval. (In the plot above, this means that no more than 5% of the values should be off by more than ±8.6.)
  • Sort icon: Sort by Sample, Target, Error or Output.
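
One way to sanity-check a confidence level is to count how many samples actually fall inside the interval, and to sort the samples by error to see which ones fall outside. The sketch below uses hypothetical data and helper names (`coverage`, `sort_by_error`); the half-width of 8.6 is taken from the example above.

```python
def coverage(outputs, targets, half_width):
    """Fraction of samples whose target lies within output +/- half_width."""
    inside = sum(
        1 for o, t in zip(outputs, targets) if abs(o - t) <= half_width
    )
    return inside / len(outputs)


def sort_by_error(outputs, targets):
    """Sample indices ordered from largest to smallest absolute error."""
    return sorted(
        range(len(outputs)),
        key=lambda i: abs(outputs[i] - targets[i]),
        reverse=True,
    )


# Hypothetical system outputs and desired values.
outputs = [1.0, 2.0, 3.0, 4.0]
targets = [1.5, 2.0, 12.0, 4.2]

print(coverage(outputs, targets, 8.6))
print(sort_by_error(outputs, targets))
```

If the observed fraction is far below the chosen confidence level on a reasonably large set, the normality assumption probably does not hold for that data.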

This page was last modified 02:54, 13 March 2008.