   Created by finisco
Genna Uniqueness

Ability to use Censored Observations

Most Data Mining techniques tend to ignore the concept of censored observations, assuming that the observed time is the time of occurrence of the event. While this approach may be convenient, it leads to strong biases within the model, as the true distribution of the predicted field could be very different from the observed distribution. GENNA uniquely provides distance metrics and prediction mechanisms that explicitly handle censored observations by combining elements of evidence theory with well-established statistical techniques, such as the Kaplan-Meier estimator and Wilcoxon’s test, in the prediction process.
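
To illustrate why censoring matters, here is a minimal sketch of the standard Kaplan-Meier (product-limit) estimator in plain Python. It is not GENNA's implementation, which the text above does not describe; the function name `kaplan_meier` and the sample data are assumptions made for illustration only.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve as a list of (time, S(time)) pairs.

    durations: time until the event or until censoring, per record
    observed:  True if the event was seen, False if the record was censored
    (Hypothetical helper for illustration; not part of GENNA.)
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)          # records still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all records that share the same time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            # Product-limit step: S(t) = S(previous) * (1 - d / n)
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Censored records leave the risk set without counting as events.
        at_risk -= leaving
    return curve


# Made-up sample: five records, two of them censored (observed=False).
curve = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for time, s in curve:
    print(time, round(s, 3))
```

On this made-up sample, treating every record as an event would put survival beyond time 5 at 1/5 = 0.2, whereas the estimator above gives roughly 0.3, because the censored records still contribute to the risk set up to the time they were last seen.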

Ability to use Categorical and Numeric Attributes through the use of innovative distance metrics
Generally, nearest neighbour algorithms use similarity metrics that are better suited either to categorical attributes or to numeric attributes. Using both types of attributes together introduces biases within the

Ability to (semi-) automatically optimise the similarity metric used for comparable retrieval
GENNA uses innovative similarity metrics that are suitable for both numeric and categorical attributes.
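
GENNA's own metrics are not specified here. As a stand-in, the sketch below uses a well-known mixed-attribute distance, the Heterogeneous Euclidean-Overlap Metric (HEOM), to show how numeric and categorical attributes can share one scale. The function name `heom_distance` and the `numeric_ranges` parameter are assumptions for this example.

```python
import math


def heom_distance(a, b, numeric_ranges):
    """HEOM distance between two records of mixed attributes.

    a, b:           sequences of attribute values in the same order
    numeric_ranges: dict mapping the index of each numeric attribute to its
                    (min, max) over the data; all other indexes are treated
                    as categorical. (Hypothetical signature, not GENNA's API.)
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            lo, hi = numeric_ranges[i]
            # Range-normalised difference keeps numeric terms within [0, 1].
            d = abs(x - y) / (hi - lo) if hi > lo else 0.0
        else:
            # Overlap metric: 0 when the categories match, 1 otherwise.
            d = 0.0 if x == y else 1.0
        total += d * d
    return math.sqrt(total)


# Made-up records: (age, colour), with age assumed to span 20 to 60.
print(heom_distance((30, "red"), (40, "blue"), {0: (20, 60)}))
```

Because every per-attribute term lies between 0 and 1, neither attribute type dominates the total simply because of its units.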

Automatic Indexing of data for Scalability and Speed
One shortfall of the nearest neighbour family of algorithms is that they do not build “compact” models from data for use in predictions, so the speed of the prediction process can suffer as the data volume increases. To alleviate this problem, GENNA automatically indexes the data using clustering techniques to speed up the prediction process.
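
The general idea of a clustering-based index can be sketched as follows. This is an assumption-laden illustration using plain k-means, not a description of GENNA's indexing; `build_index` and `nearest` are hypothetical names, and the search is approximate because only the closest non-empty cluster is scanned.

```python
import math
import random


def _assign(points, centroids):
    """Group points by the index of their closest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        c = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        clusters[c].append(p)
    return clusters


def build_index(points, k, iterations=10, seed=0):
    """Cluster the points with basic k-means and return (centroids, clusters)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        clusters = _assign(points, centroids)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    # Final assignment so the clusters match the final centroids.
    return centroids, _assign(points, centroids)


def nearest(query, centroids, clusters):
    """Approximate nearest neighbour: scan only the closest non-empty cluster."""
    order = sorted(range(len(centroids)), key=lambda j: math.dist(query, centroids[j]))
    for c in order:
        if clusters[c]:
            return min(clusters[c], key=lambda p: math.dist(query, p))
    return None


# Made-up data: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = build_index(points, k=2)
print(nearest((9.5, 9.5), centroids, clusters))
```

A query is compared against the k centroids first and then only against the members of one cluster, so the number of distance computations grows far more slowly than a scan of the full data set. The trade-off is that a true nearest neighbour sitting just across a cluster boundary can be missed.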

Incremental Learning and Introspection
Once a model is built using data mining, an important part of deployment is monitoring the accuracy of the predictions made by the model. Over time, the context in which the model is applied changes, a phenomenon referred to as Concept Drift in the Machine Learning literature. With this shift in context, the model becomes less accurate. Most data mining algorithms would need to be reapplied to the new data, resulting in a new model being built and applied within the new context.

GENNA approaches this problem differently. As new data is collected, whether it represents new observations or feedback from the application of the model, it is incorporated into the current model. When the data consists of new observations, this continuous learning is referred to as Incremental Learning. The incorporation of data on the accuracy of the model’s application, on the other hand, is referred to as Introspection.
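
The two kinds of update can be sketched with a small nearest neighbour case base. How GENNA actually performs either step is not stated in the text, so everything below is an assumption: the class `IncrementalKNN`, its methods, and the down-weighting rule used for feedback are hypothetical and chosen only to make the distinction concrete.

```python
import math


class IncrementalKNN:
    """Toy weighted k-nearest-neighbour regressor (illustrative only)."""

    def __init__(self, k=3):
        self.k = k
        self.cases = []  # each case is [features, target, weight]

    def learn(self, features, target):
        """Incremental learning: add a new observation without rebuilding."""
        self.cases.append([tuple(features), target, 1.0])

    def _neighbours(self, query):
        return sorted(self.cases, key=lambda c: math.dist(query, c[0]))[: self.k]

    def predict(self, query):
        if not self.cases:
            raise ValueError("no cases learned yet")
        neighbours = self._neighbours(query)
        total = sum(c[2] for c in neighbours)
        return sum(c[1] * c[2] for c in neighbours) / total

    def feedback(self, query, actual):
        """Introspection: use the observed outcome to re-weight the cases
        that produced the prediction (assumed rule, not GENNA's)."""
        for case in self._neighbours(query):
            # Cases whose target was far from the actual outcome lose weight.
            case[2] *= 1.0 / (1.0 + abs(case[1] - actual))


model = IncrementalKNN(k=2)
model.learn((0.0,), 10)
model.learn((1.0,), 20)
print(model.predict((0.5,)))    # both cases weighted equally
model.feedback((0.5,), actual=10)
print(model.predict((0.5,)))    # now pulled towards the confirmed outcome
```

In both cases the existing model is adjusted in place: `learn` extends the case base, while `feedback` changes how much existing cases are trusted, and neither requires retraining from scratch.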

©Copyright 2005, Terms & Conditions, Privacy Policy t: +44 2890 278616 f: +44 2890 315196 e: info@corporateintellect.com