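Imagine we have data tuples $(x_i, y_i)$, $i = 1, \dots, n$, where the response $y_i$ is continuous. For a tree $T$, the sum of squared errors is

$$S = \sum_{c \in \mathrm{leaves}(T)} \sum_{i \in c} (y_i - m_c)^2 = \sum_{c \in \mathrm{leaves}(T)} n_c V_c,$$

where $m_c = \frac{1}{n_c} \sum_{i \in c} y_i$ is the mean response of the points in leaf $c$, $n_c$ is the number of points in that leaf, and $V_c$ is the within-leaf variance of leaf $c$.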
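As a small illustration of this quantity, the sketch below computes the sum of squared deviations from the leaf mean, leaf by leaf, and adds them up. The function names `leaf_sse` and `tree_sse` and the toy data are assumptions made for this example, not part of the original text.

```python
def leaf_sse(ys):
    """Sum of squared deviations of one leaf's responses from the leaf mean."""
    m = sum(ys) / len(ys)  # leaf mean
    return sum((y - m) ** 2 for y in ys)


def tree_sse(leaves):
    """Total S for a tree, given the responses grouped by leaf."""
    return sum(leaf_sse(ys) for ys in leaves)


# Two hypothetical leaves: means 2.0 and 11.0
leaves = [[1.0, 2.0, 3.0], [10.0, 12.0]]
print(tree_sse(leaves))  # 2.0 + 2.0 = 4.0
```

Each leaf's term equals its size times its (population) variance: the first leaf contributes 3 × 2/3 = 2 and the second 2 × 1 = 2.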

The basic algorithm is as follows:

  1. Start with a single node containing all points. Calculate the node mean $m_c$ and $S$.
  2. If all the points in the node have the same values for all the independent variables, stop. Otherwise, search over all possible binary splits and find the one that minimizes $S$. If the resulting decrease in $S$ is smaller than a chosen threshold, or a new node would contain fewer than a chosen minimum number of points, stop. Otherwise, take the split, creating two new nodes.
  3. In each new node, start over with step 1.
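
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `grow`, `best_split`, `min_decrease`, and `min_leaf` are hypothetical, splits take the form "feature j <= observed value", and the exhaustive search makes no attempt at efficiency.

```python
def sse(ys):
    """Sum of squared deviations from the mean of ys."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(X, y):
    """Try every feature and observed value; return (S, feature, threshold) or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Using every value except the largest keeps both sides non-empty.
        for t in values[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            total = sse(left) + sse(right)
            if best is None or total < best[0]:
                best = (total, j, t)
    return best


def grow(X, y, min_decrease=1e-9, min_leaf=1):
    """Recursively grow a regression tree as nested dicts."""
    # Step 1: compute the node mean and its sum of squared errors.
    node = {"mean": sum(y) / len(y), "sse": sse(y)}
    # Step 2: stop if no split exists (all rows identical in every feature).
    split = best_split(X, y)
    if split is None:
        return node
    total, j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    # Stop if the improvement is too small or a child would be too small.
    if node["sse"] - total < min_decrease or min(len(left), len(right)) < min_leaf:
        return node
    # Step 3: take the split and repeat in each new node.
    node["feature"], node["threshold"] = j, t
    node["left"] = grow([X[i] for i in left], [y[i] for i in left], min_decrease, min_leaf)
    node["right"] = grow([X[i] for i in right], [y[i] for i in right], min_decrease, min_leaf)
    return node


X = [[1.0], [2.0], [10.0], [11.0]]
y = [1.0, 1.0, 5.0, 5.0]
tree = grow(X, y)
print(tree["threshold"], tree["left"]["mean"], tree["right"]["mean"])  # 2.0 1.0 5.0
```

On this toy data the root splits at 2.0, after which both children have zero error and no further split clears the threshold, so they become leaves.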