Benefits of Attending the Course
By the end of the course, attendees will:
• Develop data investigative skills and understand how to interpret visualizations of data
• Understand how to interpret descriptive statistics
• Appreciate the pitfalls within the data that need to be identified and dealt with prior to mining the data
• Understand the various reasons for missing data and how to handle the missing values
• Appreciate data sampling and how and when it should be used
• Understand the effect of data transformations on the results of data mining
• Appreciate how the choice of mining tool affects the specification of the data pre-processing
Course Content
• The Data Investigative Process
• Basic concepts in exploratory data analysis
• Using cross-tabulations to identify patterns and trends
• Correlating variables
• Data Visualisation and interpretation
• Histograms, scatter-plots, etc.
• Interpreting distributions
• Dealing with complex data structures
• Normalisation
• Data Granularity and Information Content
• Understanding the nature of data
• Descriptive Statistics: mean, median, mode, n-tiles, etc.
• Handling Noisy Data
• Best Practice on Missing Values
• Dealing with missing values
• Identifying Outliers
• Methods for dealing with Outliers
• Data Transformation
• Vectoring Variables/Pivoting (e.g. lag functions, time series analysis)
• Range and Distribution normalisation
• De-normalising data for analysis
• Generating Summaries
• Transformation Logic: log, inverse log, root, cube, etc.
• Effect of Modelling tool
• Correlation
• Factor/principal component analysis
• Combination of variables
• Handling variable data types
• Converting categorical to numeric attributes
• Converting numeric to categorical attributes (binning)
• Sampling and Test Details
• Different Sampling methodologies
• Stratification
• Random
• Sample size selection
• Measuring the information content of data
• Validating the sample as a reflection of the underlying population
• Simple tests of significance
Pre-requisites
Basic SQL skills