
SOM View visualizer


The SOM View is a Synapse visualizer based on self-organizing maps. It can be used for data mining and clustering, as well as for intuitive visualization of multivariate data.

SOM View
Image:Somview icon.png
Name: SOM View
Can modify data: Yes
Produces filters: Yes
Can show multivariate data: Yes
Supports selection: Yes
Supports clipboard operations: No
Can export data: No
Can export images: No

Operation

The SOM View is based on hexagonal self-organizing maps (SOMs).

One important difference between the SOM View and static visualizers such as the Grid view is that the SOM View is an adaptive component: it needs to be trained. To train it, press the "Train" button in the toolbar.

SOM View interface

Apart from the toolbar and the status bar, the SOM View consists of a number of regions called "maplets". Two maplets are special: "Clusters" and "Unified Distance Matrix". The remaining maplets are component planes, one per feature in the data. The "Clusters" maplet shows the automatic clustering of the data. The "Unified Distance Matrix" maplet shows, for each SOM node, the average distance in feature space between that node and its neighboring nodes.
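As a rough illustration of what a unified distance matrix contains, the sketch below computes the mean feature-space distance from each node to its grid neighbors. The "odd-r" hexagonal layout and the names hex_neighbors and u_matrix are assumptions made for this example; the source does not describe how the SOM View lays out or computes the matrix internally.

```python
import math

def hex_neighbors(row, col, rows, cols):
    """Grid neighbors of a cell in an 'odd-r' offset hexagonal layout (assumed)."""
    shift = 0 if row % 2 == 0 else 1  # odd rows are shifted half a cell to the right
    candidates = [
        (row, col - 1), (row, col + 1),
        (row - 1, col - 1 + shift), (row - 1, col + shift),
        (row + 1, col - 1 + shift), (row + 1, col + shift),
    ]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]

def u_matrix(weights):
    """weights[r][c] is a node's weight vector; returns mean distance to its neighbors."""
    rows, cols = len(weights), len(weights[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [math.dist(weights[r][c], weights[nr][nc])
                     for nr, nc in hex_neighbors(r, c, rows, cols)]
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result
```

High values in such a matrix mark nodes that are far from their neighbors in feature space, which is typically where cluster borders lie.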

Visualization

Each hexagonal cell in a maplet represents a SOM node. Each SOM node is associated with the data points that lie closer to it in feature space than to any other node. Each node has the same dimensionality as the feature space: if the data has three coordinates (say X, Y, Z), the node also has three coordinates. Each component plane maplet shows the node values along one dimension.
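The association between data points and nodes can be sketched as a nearest-node lookup. This is a minimal illustration assuming Euclidean distance; the function names are hypothetical and not part of Synapse.

```python
import math

def best_matching_node(point, nodes):
    """Index of the node whose weight vector is closest to the point."""
    return min(range(len(nodes)), key=lambda i: math.dist(point, nodes[i]))

def assign_points(points, nodes):
    """Map each node index to the indices of the data points closest to it."""
    hits = {i: [] for i in range(len(nodes))}
    for idx, point in enumerate(points):
        hits[best_matching_node(point, nodes)].append(idx)
    return hits

# Two nodes and three data points, all with three coordinates (X, Y, Z).
nodes = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
points = [(1.0, 0.0, 0.0), (9.0, 9.0, 9.0), (0.0, 1.0, 1.0)]
hits = assign_points(points, nodes)
```

Selecting a node in a maplet then amounts to selecting every data point listed for that node.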

When you select nodes in one maplet, the selection is shown in all the other maplets as well. For a practical example, let's take a look at the cow data set covered in tutorial 4.

We are looking here at three variables from the data set (number of cows, number of churches and number of schools):

Image:Somview2.png


The three maplets share the same topological mapping, so a node (and, implicitly, its group of data points) occupies the same position in every maplet. If we select, for instance, the red area in the "Cows" maplet (the area where the "Cows" variable has a high value), the selection appears in all three maplets:

Image:Somview3.png


The same nodes are selected, and hence the same data points. With just a glance at the three maplets we can conclude that where there are a lot of cows, there are also a lot of churches but few schools.
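The reasoning above can be sketched in code: because every component plane is indexed by the same nodes, one set of node indices can be read against each feature. The values below are invented for a four-node map and are not taken from the tutorial data set; summarize_selection is a hypothetical helper.

```python
def summarize_selection(planes, selected_nodes):
    """Mean component-plane value per feature over the selected nodes."""
    return {
        feature: sum(values[i] for i in selected_nodes) / len(selected_nodes)
        for feature, values in planes.items()
    }

# Invented component-plane values, one list entry per SOM node.
planes = {
    "Cows":     [0.9, 0.8, 0.1, 0.2],
    "Churches": [0.7, 0.9, 0.2, 0.1],
    "Schools":  [0.1, 0.2, 0.8, 0.9],
}

# Select the nodes where "Cows" is high, then read the other features there.
selected = {i for i, value in enumerate(planes["Cows"]) if value >= 0.5}
summary = summarize_selection(planes, selected)
```

In this made-up example the selected nodes have a high mean for "Churches" and a low mean for "Schools", mirroring what the maplets show visually.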

A SOM literally gives you a map of your data, which you can use to quickly understand how it all fits together.

Maplet GUI interaction

Image:Somview4.png

The basic interaction is as follows. Left-clicking or left-dragging (pressing the left mouse button and dragging) on SOM nodes selects them. Right-clicking or right-dragging on nodes deselects them. The same principle applies to the maplet spectrum: left-dragging on the spectrum selects the range between the points where you pressed and released the mouse button, and right-dragging deselects a range.

For instance, to select the nodes where HousePrice has high values in the example maplet above, you would press the mouse button on the spectrum just to the right of the yellow region, drag to the right, and release the button at the end of the spectrum range. This selects all nodes with colors between yellow and red, i.e. the nodes where the HousePrice variable has high values.
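A spectrum drag can be thought of as a value-range query on one component plane. The sketch below assumes that a left-drag adds the matching nodes to the current selection and a right-drag removes them; the function names and the HousePrice values are hypothetical.

```python
def nodes_in_range(plane, start, end):
    """Indices of nodes whose value lies between the two drag endpoints."""
    low, high = sorted((start, end))  # the drag may go in either direction
    return {i for i, value in enumerate(plane) if low <= value <= high}

def apply_drag(selection, plane, start, end, button):
    """Left-drag adds the nodes in the range, right-drag removes them."""
    hit = nodes_in_range(plane, start, end)
    return selection | hit if button == "left" else selection - hit

# Invented HousePrice values, one per SOM node.
house_price = [100, 250, 400, 550]
selection = apply_drag(set(), house_price, 300, 600, "left")
narrowed = apply_drag(selection, house_price, 600, 500, "right")
```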

The selection will be reflected in the other maplets. The selection is the same type as any other visualizer selection and you can view it in any other visualizer - including a new SOM View (allowing for hierarchical data exploration).

The number of data points associated with a node is shown as a dot in its hexagon. The size of the dot is approximately proportional to the number of data points associated with the node in question.
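One simple way to obtain such dot sizes is to scale each node's hit count by the largest count on the map. Whether the SOM View scales the radius or the area, and what maximum size it uses, is not stated in the source; the linear radius scaling and the max_radius parameter below are assumptions.

```python
def dot_radii(hit_counts, max_radius=10.0):
    """Dot radius per node, proportional to the number of associated data points."""
    peak = max(hit_counts, default=0)
    if peak == 0:
        return [0.0 for _ in hit_counts]  # no data points anywhere: no dots
    return [max_radius * count / peak for count in hit_counts]
```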

For customization of the maplet GUI, see the "Maplet customization" section.

Control and Customization

SOM control and customization

To train the SOM, press the "Train" button. To change the dimensions of the SOM (the default is 15x15), change the size settings in the toolbar. After changing the SOM size, you need to retrain the SOM. The SOM algorithm can be fine-tuned by changing the parameters under "Options->Advanced Settings"; each setting comes with a brief description of what the parameter does.
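For readers unfamiliar with what training involves, here is a minimal sketch of a generic SOM training loop. It is not the Synapse implementation: it uses a rectangular rather than hexagonal grid for brevity, and the learning-rate and radius schedules, the epochs and seed parameters, and the function name are all assumptions. Only the 15x15 default size comes from the text above.

```python
import math
import random

def train_som(data, rows=15, cols=15, epochs=20, seed=0):
    """Train a small SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every node from a randomly chosen data point.
    nodes = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # visit samples in random order
            frac = step / total_steps
            rate = 0.5 * (1.0 - frac)  # learning rate shrinks over time
            radius = max(1.0, max(rows, cols) / 2.0 * (1.0 - frac))
            # Best matching unit: the node closest to the sample in feature space.
            bmu = min(nodes, key=lambda rc: math.dist(x, nodes[rc]))
            for (r, c), w in nodes.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                pull = rate * math.exp(-grid_d2 / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += pull * (x[k] - w[k])  # move node towards the sample
            step += 1
    return nodes
```

Because the node count is fixed when the grid is created, changing the grid size means starting over with new nodes, which is why a size change requires retraining.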

SOM control settings
Feature configuration

To choose which features are included in training, select "Options->Feature Configuration". Here you can also set a multiplier per feature. The multiplier sets the relative importance of the features for the SOM: a higher multiplier means that the feature is considered more important, so the SOM grid will be ordered more strongly according to that feature. If you uncheck the "Use" checkbox, the feature is not used for training at all. Note that the feature is still visualized with a maplet.
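One plausible way a per-feature multiplier could enter training is through the distance measure, by scaling each feature's difference before it is squared. The source does not say how Synapse applies the multiplier, so the sketch below is an assumption, and weighted_distance is a hypothetical name.

```python
import math

def weighted_distance(a, b, multipliers, use):
    """Euclidean distance with per-feature multipliers; unused features are skipped."""
    total = 0.0
    for x, y, multiplier, used in zip(a, b, multipliers, use):
        if used:  # mirrors an unchecked "Use" checkbox
            total += (multiplier * (x - y)) ** 2
    return math.sqrt(total)
```

With this formulation, doubling a feature's multiplier doubles its contribution to the distance, so differences in that feature dominate the ordering of the grid.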

Clustering control and customization

The SOM View provides automatic clustering by applying a neural gas algorithm to the trained SOM. Automatic clustering runs as soon as the SOM has been trained. The clusters found are shown in the "Clusters" maplet.
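To give an idea of how a neural gas can cluster SOM nodes, the following sketch runs a textbook-style neural gas over a list of node weight vectors. It takes the number of clusters as an input and does not attempt to reproduce how the SOM View chooses that number automatically. The decay schedules, parameter names, and function names are assumptions.

```python
import math
import random

def neural_gas(vectors, k, epochs=30, seed=0):
    """Fit k cluster centers to the vectors with rank-based updates."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(vectors, len(vectors)):
            frac = step / total_steps
            eps = 0.5 * (0.01 / 0.5) ** frac            # step size decays 0.5 -> 0.01
            lam = (k / 2.0) * (0.01 / (k / 2.0)) ** frac  # neighborhood range decays
            # Rank centers by distance to the sample; closer ranks move further.
            ranked = sorted(centers, key=lambda c: math.dist(x, c))
            for rank, center in enumerate(ranked):
                pull = eps * math.exp(-rank / lam)
                for d in range(len(center)):
                    center[d] += pull * (x[d] - center[d])
            step += 1
    return centers

def cluster_of(vector, centers):
    """Index of the center closest to the vector."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))
```

Applied to a trained SOM, vectors would be the node weight vectors, and cluster_of would give the cluster label for each node.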

Clusters maplet

You can manually override the number of clusters by changing the number in the "Clusters" textbox in the toolbar and pressing Enter. To run a new, fully automatic clustering, press the "Auto" button.

The clustering algorithm can be fine-tuned by changing the parameters under "Options->Advanced Settings"; each setting comes with a brief description of what the parameter does.

Meta-clustering options

Maplet customization

Under "Options->Advanced Settings" you can find various parameters for customizing the GUI.

Maplet customization

You can select which types of maplets are to be shown and what color maps are to be used. You can also set the size of the maplet elements as well as limit the total number of maplets to be shown in the visualizer.

Creating a custom filter

You can permanently keep the results of a clustering by creating a filter that is added to the stack. This filter adds a set of new features to the data unit. These features indicate, for each data sample, which cluster the sample belongs to.

To create a filter, select "Options->Apply as Filter". Note that while recursive and hierarchical use is possible, it can lead to meaningless results.
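Conceptually, the effect of such a filter on the data can be sketched as adding cluster-membership columns to each sample. The one-column-per-cluster encoding, the "Cluster N" feature names, and the function name are assumptions made for illustration; the source does not specify how the new features are named or encoded.

```python
def add_cluster_features(samples, sample_clusters, n_clusters):
    """Return copies of the samples with one 0/1 membership feature per cluster."""
    result = []
    for sample, cluster in zip(samples, sample_clusters):
        extended = dict(sample)  # leave the original sample untouched
        for k in range(n_clusters):
            extended[f"Cluster {k + 1}"] = 1 if cluster == k else 0
        result.append(extended)
    return result

# Two invented samples assigned to clusters 0 and 1.
samples = [{"Cows": 120}, {"Cows": 5}]
filtered = add_cluster_features(samples, [0, 1], 2)
```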

This page was last modified 07:57, 19 June 2008.