   Created by finisco

Course 3: Exploratory Data Analysis
Benefits of Attending the Course
At the end of the course the attendees will:
• Develop data investigation skills and understand how to interpret visualisations of data
• Understand how to interpret descriptive statistics
• Appreciate the pitfalls in the data that must be identified and dealt with before mining
• Understand the various reasons for missing data and how to handle the missing values
• Appreciate data sampling, and how and when it should be used
• Understand the effect of data transformations on the results of data mining
• Appreciate how the choice of mining tool affects the data pre-processing required

Course Content
• The Data Investigative Process
• Basic concepts in exploratory data analysis
• Using cross-tabulations to identify patterns and trends
• Correlating variables
• Data Visualisation and interpretation
• Histograms, scatter-plots, etc.
• Interpreting distributions
• Dealing with complex data structures
• Normalisation
• Data Granularity and Information Content
• Understanding the nature of data
• Descriptive statistics: mean, median, mode, n-tiles, etc.
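
As an illustration of the descriptive statistics listed above, the following is a minimal sketch using only the Python standard library. It is not taken from the course material, and the sample values are made up for the example.

```python
import statistics

# Hypothetical sample values, made up purely for illustration.
values = [2, 3, 3, 5, 7, 8, 12]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value

# n-tiles with n=4 are quartiles: three cut points splitting the data in four.
quartiles = statistics.quantiles(values, n=4)

print(mean, median, mode, quartiles)
```

Comparing the mean with the median is a quick first check for skew: a mean well above the median suggests a long right tail.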

• Handling Noisy Data
• Best Practice on Missing Values
• Dealing with missing values
• Identifying Outliers
• Methods for dealing with Outliers
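
The missing-value and outlier topics above can be sketched as follows. This is one simple approach among many (median imputation and the 1.5 × IQR rule), chosen here for illustration and not necessarily the method taught on the course. The function names and the data are hypothetical.

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Made-up data; None marks a missing value.
raw = [10, 12, None, 11, 13, 95, 12]
filled = impute_median(raw)
outliers = iqr_outliers(filled)

print(filled)
print(outliers)
```

The median is used for imputation because, unlike the mean, it is not pulled towards an extreme value such as the 95 in this sample.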

• Data Transformation
• Vectoring variables/pivoting, e.g. lag functions for time series analysis
• Range and Distribution normalisation
• De-normalising data for analysis
• Generating Summaries
• Transformation logic: log, inverse log, root, cube, etc.
• Effect of Modelling tool
• Correlation
• Factor/principal component analysis
• Combination of variables
• Handling variable data types
• Converting categorical to numeric attributes
• Converting numeric to categorical attributes (binning)
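
A few of the transformations listed above can be sketched in plain Python. This is an illustrative sketch only: the function names, bin edges and labels are assumptions made for the example, and one-hot encoding is just one common way of turning a categorical attribute into numeric ones.

```python
import math

def log_transform(values):
    """Natural-log transform; every value must be greater than zero."""
    return [math.log(v) for v in values]

def min_max(values):
    """Range normalisation to [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def bin_value(v, edges, labels):
    """Numeric to categorical: label of the first bin whose upper edge exceeds v.

    Expects len(labels) == len(edges) + 1; the last label catches the rest.
    """
    for edge, label in zip(edges, labels):
        if v < edge:
            return label
    return labels[-1]

def one_hot(category, categories):
    """Categorical to numeric: one 0/1 indicator per known category."""
    return [1 if category == c else 0 for c in categories]

print(min_max([10, 20, 30]))
print(bin_value(45, edges=[30, 60], labels=["low", "mid", "high"]))
print(one_hot("B", ["A", "B", "C"]))
```

Which of these steps is needed depends on the modelling tool: some tools accept only numeric inputs, while others handle categorical attributes directly.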

• Sampling and Test Details
• Different Sampling methodologies
• Stratification
• Random
• Sample size selection
• Measuring the information content of data
• Validating the sample as a reflection of the underlying population
• Simple tests of significance
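
Stratified random sampling, as listed above, can be sketched as follows, again as a minimal illustration and not as the course's own method. The `stratified_sample` helper, the segment names and the 10% fraction are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction at random from each stratum defined by key."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Made-up population: 80 records in segment "A" and 20 in segment "B".
population = [("A", i) for i in range(80)] + [("B", i) for i in range(20)]
sample = stratified_sample(population, key=lambda r: r[0], fraction=0.1)

# Validate the sample by comparing segment counts with the population.
print(Counter(seg for seg, _ in population))
print(Counter(seg for seg, _ in sample))
```

Because each segment is sampled at the same rate, the sample keeps the 80/20 split of the population, which a small simple random sample would not guarantee.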



Pre-requisites
Basic SQL skills
©Copyright 2005, Terms & Conditions, Privacy Policy t: +44 2890 278616 f: +44 2890 315196 e: info@corporateintellect.com