TRAINING COURSE "PREDICTION in Data Mining"
* How much will this family spend on its vacation?
* How long will this driver keep his car?
* What is a "fair" salary for this position?
* What sales level can be expected for this new store?
These questions, and many others, call for predicting the value of a number in a given circumstance. Answering this type of question is called prediction, or regression.
Data Mining offers many prediction techniques. They differ widely in performance and operational characteristics. This 1 or 2 day training course (see outline below) reviews the most popular prediction techniques, which are available in most Data Mining software.
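To make the idea of prediction concrete, here is a minimal sketch of the simplest technique in the outline, a least squares line fitted to a single predictor, together with its R² quality measure. The function names (`fit_line`, `r_squared`, `predict`) and the store data are hypothetical and chosen only for illustration; they are not taken from the course material or from any particular Data Mining software.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted value for a new observation x."""
    return intercept + slope * x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance of y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(x, slope, intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical data: floor area of existing stores and their yearly sales
    areas = [100, 150, 200, 250, 300]
    sales = [210, 290, 410, 520, 590]

    slope, intercept = fit_line(areas, sales)
    print("R2 of the fit:", r_squared(areas, sales, slope, intercept))
    print("Predicted sales for a 220-unit store:", predict(220, slope, intercept))
```

The sketch answers a question of the "new store" kind with one predictor only. Using several predictors at once, or a non-linear function, is where the other techniques in the outline come in.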
Outline of the course
The general problem of prediction
The "best fitting" function
Why it is difficult to find the best fitting function
Data dispersion
Complexity of the function
Uncertainties about the predictions, generalization capacity
Linear Regression
Least Squares line
Overall model quality, R²
Assumptions and limitations
Linearity
Normality and independence of the errors
Homoscedasticity
Multiple Linear Regression (MLR)
What's "on top" of (simple) Linear Regression?
Choosing the predictors
Variable redundancy and model stability
Multiple and partial correlation
Choosing the predictors
The "curse of dimensionality"
Stepwise methods: ascending, descending, mixed
Hunting down variable redundancy
Neural Networks (Supervised)
What do Neural Networks actually do?
The main types of Neural Networks
The Multilayer Perceptron
RBF networks
Pros and cons of Neural Networks
High prediction accuracy
Tests and interpretability of the parameters