Adaptive systems development cycle
The canonical basic steps for building an adaptive system are problem definition, data mining, data preparation, model building, model adaptation, model evaluation, and deployment.
In the problem definition step, the initial idea of the problem at hand may well change over the course of developing the adaptive system to solve it. It is therefore good practice to start with a firm picture of the problem, to be vigilant about when and how it changes, and to consider how the new view of the problem affects your initial goals and intentions.
Because adaptive systems are data driven, the system designer needs a good understanding of the data. The goal of the data mining stage is to identify the fields most relevant to the problem and to determine which derived values may be useful. There are limitless ways of visualizing data, but the two most fundamental tools are the X-Y graph, which maps relations between variables, and the histogram, which shows the statistical distribution of the data.
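As a rough illustration of the histogram idea, the following sketch counts values per bin and prints a text-only histogram. The function names, the bin width, and the `ages` sample are invented for this example and do not come from any particular tool.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin of width bin_width."""
    counts = Counter()
    for v in values:
        # Left edge of the bin that contains v.
        left_edge = (v // bin_width) * bin_width
        counts[left_edge] += 1
    return dict(sorted(counts.items()))


def print_histogram(counts):
    """Print one row of '#' marks per bin."""
    for left_edge, n in counts.items():
        print(f"{left_edge:>6} | {'#' * n}")


# Made-up sample; the single value at 90 stands apart from the rest.
ages = [23, 25, 31, 35, 36, 38, 41, 44, 47, 52, 58, 90]
age_counts = histogram(ages, bin_width=10)
print_histogram(age_counts)
```

Even this crude view shows where the bulk of the data lies and that one value sits far from the others, which is the kind of observation the data mining stage is meant to surface.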
In the data preparation stage you select variables and samples, construct new variables, and transform existing ones.
Ideally, you would take all the variables/features you have and use them as inputs. In practice, this doesn’t work very well. One reason is that the time it takes to build a model increases with the number of variables. Another reason is that blindly including columns can lead to incorrect models. Often your knowledge of the problem domain can let you make many of these selections correctly. For example, including ID numbers as predictor variables will at best have no benefit and at worst may reduce the weight of other important variables.
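A minimal sketch of automated variable screening follows, assuming the data is held as a mapping from column name to a list of values. The two heuristics (drop columns where every value is distinct, drop columns that never vary) and all names are assumptions made for illustration. The distinct-value rule suits integer or string columns only, since a continuous measurement can also be distinct in every row, so domain knowledge should have the final say.

```python
def looks_like_identifier(column):
    """Heuristic: every value is distinct, as with ID numbers."""
    return len(set(column)) == len(column)


def is_constant(column):
    """A column with a single repeated value carries no information."""
    return len(set(column)) <= 1


def select_columns(table):
    """Return the names of columns that pass both screening heuristics."""
    return [
        name
        for name, column in table.items()
        if not looks_like_identifier(column) and not is_constant(column)
    ]


# Hypothetical table: column name -> list of values.
table = {
    "customer_id": [1001, 1002, 1003, 1004],   # unique per row
    "country": ["SE", "SE", "SE", "SE"],       # never varies
    "age_group": ["20s", "30s", "30s", "50s"],
    "purchases": [3, 7, 3, 1],
}
kept = select_columns(table)
print(kept)
```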
As with the variables, ideally you would want to make use of all the samples you have. In practice, however, some samples may have to be removed to produce good results. You usually want to discard data points that are clearly outliers. In some cases you might want to keep them in the model, but often they can be ignored based on your understanding of the problem. In other cases you may want to emphasize certain aspects of your data by duplicating samples.
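One common way to flag clear outliers is the interquartile range rule, sketched below. The text does not prescribe a method, so the rule, the multiplier `k=1.5`, and the `readings` values are assumptions chosen for illustration.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep values within k interquartile ranges of the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up sensor readings with one obviously bad value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 55.0]
cleaned = remove_outliers_iqr(readings)
print(cleaned)
```

Whether a flagged point is an error or a rare but real event is a judgment that depends on the problem, so a rule like this is best treated as a prompt for inspection rather than an automatic filter.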
It is often desirable to create additional inputs based on raw data. For instance, when forecasting demographics, using a GDP per capita ratio rather than just GDP and population as separate inputs can yield better results. While in theory many adaptive systems can discover such relationships autonomously, in practice helping the system by incorporating external knowledge can make a big difference.
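The GDP per capita example can be sketched as a derived column. The record layout, the field names, and the figures below are invented for illustration and are not real statistics.

```python
def add_gdp_per_capita(records):
    """Return copies of the records with a derived gdp_per_capita field."""
    enriched = []
    for record in records:
        row = dict(record)  # copy so the raw data stays untouched
        row["gdp_per_capita"] = row["gdp"] / row["population"]
        enriched.append(row)
    return enriched


# Hypothetical countries with equal GDP but different populations.
countries = [
    {"name": "A", "gdp": 500e9, "population": 10e6},
    {"name": "B", "gdp": 500e9, "population": 50e6},
]
enriched = add_gdp_per_capita(countries)
for row in enriched:
    print(row["name"], row["gdp_per_capita"])
```

The two countries are indistinguishable by GDP alone, while the derived ratio separates them directly, which spares the model from having to learn the division itself.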
Often data needs to be transformed in one way or another before it can be put to use in a system. Typical transformations are scaling and normalization, which put the variables in a restricted range, a necessity for efficient and precise numerical computation.
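Two widely used transformations are min-max scaling to the range [0, 1] and standardization to zero mean and unit variance. The sketch below shows both under simple assumptions; the function names and the `incomes` sample are illustrative.

```python
import statistics


def min_max_scale(values):
    """Map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


incomes = [20000, 35000, 50000, 80000]  # made-up values
scaled = min_max_scale(incomes)
z_scores = standardize(incomes)
print(scaled)
print(z_scores)
```

When data is later split for training and testing, the scaling parameters (minimum, maximum, mean, standard deviation) are normally computed on the training part only and then reused unchanged on new data.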
In the model building step, the actual adaptive system is designed and constructed. This is usually an iterative process where many different topologies are tested.
In the model adaptation step, the model is adapted, or trained, to the data. Usually the data is split into two parts: one for training the system and one for testing its performance on unseen data (validation). Validation is a very important step, because before validation there is no way of telling how accurate a model is. Most adaptive systems are, under certain circumstances, prone to overtraining, the equivalent of learning by heart. With validation you can make sure that the model responds well to unseen data.
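The following sketch illustrates a hold-out split and the effect of overtraining. It uses a k-nearest-neighbour regressor purely as a stand-in for an adaptive system; the model choice, the synthetic data (y = 2x plus noise), and every name are assumptions for illustration. With k = 1 the model reproduces its training data exactly, which is learning by heart, so only the error on the held-out part says anything about real accuracy.

```python
import random


def split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda sample: abs(sample[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_squared_error(train, data, k):
    """Mean squared prediction error on data for a model built on train."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


# Synthetic data: a linear trend with Gaussian noise.
noise = random.Random(42)
samples = [(x, 2.0 * x + noise.gauss(0, 1.0)) for x in range(40)]
train, test = split(samples)

for k in (1, 3, 7):
    train_error = mean_squared_error(train, train, k)
    test_error = mean_squared_error(train, test, k)
    print(f"k={k}: training MSE={train_error:.3f}, validation MSE={test_error:.3f}")
```

A training error of zero for k = 1 looks perfect but is meaningless on its own; comparing the validation errors across the candidate settings is what guides the choice.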
Most adaptive systems are statistical by nature, and as such they have to be tested in a statistical context. In the model evaluation phase, statistical properties of the system output, such as confidence intervals, are examined. It is often necessary to go repeatedly through the building, adaptation and evaluation stages to ensure that an objectively good model has been obtained.
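One way to attach a confidence interval to a model's error is the percentile bootstrap, sketched below. The text does not name a specific technique, so the bootstrap, the number of resamples, and the `errors` values (imagined absolute errors on validation samples) are assumptions for illustration.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = [rng.choice(values) for _ in values]
        means.append(statistics.fmean(resample))
    means.sort()
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Made-up absolute errors from a validation run.
errors = [0.8, 1.1, 0.4, 1.6, 0.9, 1.3, 0.7, 1.0, 1.2, 0.5]
lower, upper = bootstrap_ci(errors)
print(f"mean error {statistics.fmean(errors):.2f}, "
      f"95% interval [{lower:.2f}, {upper:.2f}]")
```

A wide interval is a sign that the validation set is too small to rank competing models reliably, which is one reason the build, adapt and evaluate loop often has to be repeated.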
Once the model is completed, it must be deployed to be useful. This means getting it out of the development environment in a form that external software can use.
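As one possible illustration of deployment, the sketch below exports the parameters of a trained model to a JSON file that another program can load. The linear model, the file name, and the helper functions are hypothetical; real deployments depend heavily on the modelling tool and the target environment.

```python
import json
import os
import tempfile


def export_model(params, path):
    """Write the model parameters to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle)


def load_model(path):
    """Read model parameters back from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def predict(params, x):
    """Apply a (hypothetical) linear model y = slope * x + intercept."""
    return params["slope"] * x + params["intercept"]


# Development side: save the parameters of the finished model.
params = {"slope": 2.0, "intercept": 1.0}
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
export_model(params, model_path)

# Consumer side: external software only needs the file and predict().
restored = load_model(model_path)
print(predict(restored, 3.0))
```

A plain, language-neutral format such as JSON keeps the consumer independent of the development environment, at the cost of having to reimplement the prediction function wherever the model is used.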