Genna
Overview
GENNA is a hybrid data mining
algorithm that combines a genetic algorithm with nearest-neighbour
retrieval to provide a powerful modelling tool for classification
and regression tasks. The input to GENNA is a data set consisting
of a set of independent variables and one dependent variable. The
objective is to structure this data so that the dependent variable
can be predicted accurately for a given vector of values of the
independent variables (often referred to as the target exemplar or
case). Both the dependent and the independent variables may be
categorical or numeric.
Given a training data set, GENNA does not learn a model by
induction. Instead, it structures the data into a Corporate Memory.
When a prediction is required for a target case, the memory is
used to retrieve comparable cases, and the prediction is based on
the outcomes of those retrieved comparables. The advantages of this
approach range from cognitive appeal to incremental learning. By
cognitive appeal we mean that most humans approach problem solving
in this manner: retrieving similar previous experiences, adapting
them to the current situation and then solving the current problem
based on solutions that succeeded in the past. Incremental learning
refers to the fact that as new data becomes available, it is added
to the Corporate Memory and can be used immediately to make further
predictions. This is not the case with model-based data mining
approaches, where a model is learned from the data and new data
can only be incorporated after an expensive learning process has
been re-executed.
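The retrieve-and-predict idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not GENNA's implementation: the class name `CaseMemory`, the unweighted mixed-type distance and the default of three neighbours are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CaseMemory:
    """Hypothetical case memory: 'learning' is just storing cases."""
    cases: list = field(default_factory=list)  # (independent values, outcome)

    def add(self, features, outcome):
        # Incremental learning: a new case is usable as soon as it is stored.
        self.cases.append((tuple(features), outcome))

    @staticmethod
    def distance(a, b):
        # Assumed dissimilarity: absolute difference for numeric values,
        # 0/1 mismatch for categorical values.
        total = 0.0
        for x, y in zip(a, b):
            if isinstance(x, (int, float)) and isinstance(y, (int, float)):
                total += abs(x - y)
            else:
                total += 0.0 if x == y else 1.0
        return total

    def predict(self, target, k=3):
        if not self.cases:
            raise ValueError("the memory holds no cases yet")
        # Retrieve the k cases most comparable to the target.
        nearest = sorted(self.cases,
                         key=lambda c: self.distance(c[0], target))[:k]
        outcomes = [outcome for _, outcome in nearest]
        if all(isinstance(o, (int, float)) for o in outcomes):
            return sum(outcomes) / len(outcomes)       # regression: mean
        return Counter(outcomes).most_common(1)[0][0]  # classification: vote


memory = CaseMemory()
memory.add((1.0, "red"), "yes")
memory.add((1.2, "red"), "yes")
memory.add((5.0, "blue"), "no")
print(memory.predict((1.1, "red")))
```

Note that adding a case is a plain append, so no retraining step is needed before the next prediction; the cost is instead paid at retrieval time.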
As the description above shows, the
key to accurate predictions is the comparability (similarity)
index used to retrieve the comparable cases, together with the method
by which the outcomes of the comparable cases are combined
to produce a prediction. Choosing the comparability index
and prediction mechanism is a complex process that can
be viewed as an optimisation problem: minimise
predictive error subject to constraints on the
parameters of the comparability index and prediction mechanism.
GENNA uses a genetic algorithm to perform this optimisation.
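One way to write this optimisation problem down is shown here. The notation is an assumption introduced for illustration and does not come from the GENNA documentation: \theta collects the parameters of the comparability index and prediction mechanism (for example attribute weights and the number of comparables retrieved), \Theta is the set of parameter values allowed by the constraints, \ell is a loss such as absolute error or misclassification, and \hat{y}_{-i} is the prediction for case i made from the memory with that case left out.

```latex
\min_{\theta \in \Theta} \; E(\theta)
  \;=\; \frac{1}{N} \sum_{i=1}^{N}
        \ell\bigl(y_i,\; \hat{y}_{-i}(x_i;\theta)\bigr)
```

The leave-one-out form of the error is only one plausible choice; a held-out validation set would serve the same purpose.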
Genetic algorithms mimic the natural process of evolution
to navigate the search space of all possible solutions
non-exhaustively, aiming to arrive quickly at the global
optimum. Starting with a population of candidate solutions,
an iterative process of evaluation, selection,
crossover and mutation helps the population evolve and
converge towards the global optimum; because many candidates are
searched in parallel, the population is less likely to become
trapped in local optima. Genetic algorithms are known for their
robustness and parallelisable nature, making them well suited
to data mining.
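The evaluation, selection, crossover and mutation loop can be sketched as follows. This is a minimal illustration under stated assumptions rather than GENNA's actual procedure: it tunes one weight per numeric attribute for a weighted nearest-neighbour predictor, scores each candidate by leave-one-out mean absolute error, and uses truncation selection, uniform crossover and Gaussian mutation. The function names `loo_error` and `evolve_weights`, and all default parameter values, are hypothetical.

```python
import random


def loo_error(weights, data, k=3):
    """Leave-one-out mean absolute error of a weighted k-NN predictor."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        others = [case for j, case in enumerate(data) if j != i]
        others.sort(key=lambda c: sum(w * abs(a - b)
                                      for w, a, b in zip(weights, c[0], x)))
        prediction = sum(outcome for _, outcome in others[:k]) / k
        total += abs(prediction - y)
    return total / len(data)


def evolve_weights(data, n_features, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # Initial population: random weight vectors in [0, 1].
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        # Evaluation: lower leave-one-out error means fitter.
        scored = sorted(population, key=lambda w: loo_error(w, data))
        history.append(loo_error(scored[0], data))
        # Selection: the better half survives unchanged (elitism).
        parents = scored[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            # Crossover: each weight is taken from one parent at random.
            child = [rng.choice(pair) for pair in zip(mum, dad)]
            # Mutation: small Gaussian nudge, clamped to the [0, 1] constraint.
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    best = min(population, key=lambda w: loo_error(w, data))
    return best, loo_error(best, data), history


# Toy data: the outcome depends only on the first attribute.
rng = random.Random(1)
data = []
for _ in range(30):
    a, b = rng.random(), rng.random()
    data.append(((a, b), 10 * a))
best, error, history = evolve_weights(data, n_features=2,
                                      pop_size=10, generations=10)
print(best, error)
```

Because the surviving parents are carried over unchanged, the best error never gets worse from one generation to the next. That guarantees steady improvement of the best candidate, but not that the global optimum is reached.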