Indicators (Class)

In a database, categorical variables are usually represented by their modalities, which are non-numerical quantities. For example, the variable "Region" might have four modalities : "America", "Europe", "Asia" and "Africa" (top image). But certain algorithms can handle only numerical variables. Categorical variables can then be coded into numerical form as follows :

    1) If the variable has M modalities (here, 4), then M new columns are created.

    2) Each one of these new columns is assigned to one of the modalities of the variable.

    3) For each record in the table, all of these new columns are set to "0", except the one column that corresponds to the modality taken by the record, which is set to "1".

    4) The original column (with the modalities) is erased, or masked (bottom image).

 

 

 

Each of the new columns is now considered as a new numerical variable, whose values can only be "0" or "1". These new variables are called (class) "indicators". Therefore, the original categorical variable is replaced by as many numerical indicators as it has modalities.
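The four coding steps above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `class_indicators` and the sample records are assumptions made for the example, not part of any particular software.

```python
def class_indicators(values):
    """Replace a categorical column by one 0/1 indicator column per modality."""
    # Steps 1 and 2: one new column per modality, in order of first appearance.
    modalities = list(dict.fromkeys(values))
    # Step 3: each record gets 1 in the column of its own modality, 0 elsewhere.
    # Step 4: the original column is simply not part of the returned result.
    return {m: [1 if v == m else 0 for v in values] for m in modalities}


# Hypothetical sample data: five records of the "Region" variable.
region = ["America", "Europe", "Asia", "Africa", "Europe"]
indicators = class_indicators(region)

for modality, column in indicators.items():
    print(modality, column)
```

For every record, exactly one indicator equals 1, so the indicator values of a record always sum to 1.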

 

Class indicators are particularly important in classification, where the dependent variable is, by its very nature, categorical. It can easily be shown that for each new record, the class posterior probabilities are just the values of the regression functions of the indicators. Therefore, the classes are first coded as indicators, then the regression functions of these indicators are estimated. Such techniques as Logistic Regression and Neural Networks operate just this way.
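The link between regression functions and posterior probabilities can be written out in one line. The notation is an assumption made for this sketch: I_k denotes the indicator of class k, C the class of a record, and x its vector of input values. Because I_k takes only the values 0 and 1, its conditional expectation (its regression function) reduces to a probability:

```latex
\begin{align*}
f_k(x) &= \mathbb{E}\left[ I_k \mid X = x \right] \\
       &= 1 \cdot P(I_k = 1 \mid X = x) + 0 \cdot P(I_k = 0 \mid X = x) \\
       &= P(C = k \mid X = x)
\end{align*}
```

Since each record belongs to exactly one class, the regression functions of the indicators sum to 1 at every x.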

 

When there are only two classes, only one class indicator is created, since a second one would simply be its complement. It takes the value "0" or "1" depending on which class the record belongs to.

 

Inertia  (of a cloud of points)

Please see here.


Intercept

Please see here.

 

Interpretability

A Decision Tree generates rules that state facts in business terms : it is interpretable. Under certain conditions, the coefficients of a Linear Regression can be interpreted in terms of the influence that each input variable has on the model's prediction.

 

On the other hand, a Neural Network (supervised or unsupervised), even if it produces excellent results, produces nothing like "rules" : it is not interpretable, and is said to function like a "black box" (although many tools are now available that help the user go beyond a simple prediction).

 

Interpretability is obviously a valuable property for a model. It should nevertheless be kept in mind that "there is no such thing as a free lunch", and that interpretability has a price :

 

        1) Interpretability comes as a by-product of the very simple architecture of certain models. Because of this very simplicity, interpretable models often cannot capture complex relationships in the data, and therefore tend to produce less accurate results (high bias).

 

        2) At the other end of the spectrum, Neural Networks, because of their richer architecture, are both powerful and non-interpretable.

 

In other words, there is a balance to be found by the user between performance and interpretability. Only the nature of the problem can decide on which side the scale should tip.

 
