# Decision Trees

‘All decision tree algorithms have the same structure. Basically, it's a divide and conquer algorithm” very similar to the game of 20 Questions.

1. Take the entire data set as input.

2. Search for a split that maximizes the "separation" of the classes. A split is any test that divides the data in two (e.g. if attribute5 < 10).

3. Apply the split to the input data (the "divide" step) into two parts.

4. Re-apply steps 1 and 2 to each side of the split (the recursive "conquer" step).

5. Stop when you meet some stopping criteria.

6. (Optional) Clean up the tree in case you went too far doing splits (called "pruning")

Where the algorithms differ are:

**Finding a split**: Methods here vary from greedy search (e.g. C4.5) to randomly selecting attributes and split points (e.g. Random forests).

**Purity measure**: Measures here include: Information gain, gain ratio, Gini coefficient, minimum description length, Kullback–Leibler divergence and Chi-squared values.

**Stopping criteria**: Methods here vary from a minimum size, to a particular confidence in prediction, to certain purity criteria.

**Pruning method: **Methods here include no pruning, reduced-error pruning, and in ensemble cases such as bagging, out-of-bag error pruning.’

Courtesy of Waleed Kadous at Quora