A Decision Tree is a supervised learning model used for both classification and regression tasks. It classifies data instances through a sequence of attribute tests, offering good interpretability and often competitive accuracy. The model follows an inductive inference approach, learning patterns from observed examples and applying them to unseen data.
Features:
- Represents knowledge in a tree-like structure.
- Used for solving complex decision-making problems.
- Can handle both categorical and continuous target variables.
- A function f(x) is learned from the input dataset X to predict the target class.
Structure of a Decision Tree:
- Root Node: The top-most node that represents the first decision.
- Internal Nodes / Decision Nodes: Represent tests/conditions on attributes.
- Branches: Outcomes of test conditions, connecting nodes.
- Leaf Nodes / Terminal Nodes: Represent final decisions or class labels.
Each path from root to a leaf represents a decision rule (a conjunction of attribute tests). The full tree represents a disjunction of rules (logical OR of multiple paths).
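For example, a small tree can be written as nested conditions: each root-to-leaf path reads as one conjunctive rule, and the whole tree is the disjunction (OR) of those rules. The attributes (outlook, humidity, windy) and labels below are invented purely for this sketch.

```python
def classify(instance):
    # Each root-to-leaf path is a conjunction of attribute tests (one rule);
    # the set of all paths forms a disjunction of such rules.
    if instance["outlook"] == "sunny":          # root node test
        if instance["humidity"] == "high":      # internal (decision) node test
            return "don't play"                 # leaf node
        return "play"                           # leaf node
    elif instance["outlook"] == "rainy":
        if instance["windy"]:
            return "don't play"
        return "play"
    return "play"                               # outlook == "overcast"

# One rule read off a single path:
# IF outlook = sunny AND humidity = high THEN don't play
print(classify({"outlook": "sunny", "humidity": "high", "windy": False}))
```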
Process of Decision Tree Learning:
1. Tree Building (Learning Phase):
- Starts from the root.
- At each node, select the best attribute to split on using metrics such as Information Gain or the Gini Index.
- Continue recursively until all data instances are classified or no more splits are possible.
2. Classification (Inference Phase):
- Given a test instance, start from the root and follow the branches by evaluating the conditions.
- Reach a leaf node which contains the predicted class label.
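These two phases map directly onto a typical library workflow. The sketch below uses scikit-learn's DecisionTreeClassifier on the Iris dataset purely as an illustration; the criterion and max_depth values are arbitrary choices for this example, not prescribed settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learning phase: the tree is grown by repeatedly choosing the best split
# (information gain is used here via criterion="entropy").
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Inference phase: each test instance is routed from the root to a leaf,
# and the leaf's class label is returned as the prediction.
print(clf.predict(X_test[:5]))
print(export_text(clf))  # textual view of the learned tests and leaves
```

Limiting max_depth here also illustrates one of the stopping criteria discussed later.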
Advantages of Decision Trees:
- Easy to visualize and interpret.
- Simple and intuitive – no need for domain expertise.
- Can handle both categorical and continuous variables.
- Capable of modeling non-linear relationships.
- Fast training and prediction.
Disadvantages of Decision Trees:
- May overfit if not pruned properly.
- Unstable – small changes in data can lead to a different tree.
- Handling missing values or continuous attributes can be difficult.
- Struggles with multiple-output classifications.
- Learning an optimal tree is NP-complete (computationally hard).
General Algorithm for Decision Tree Construction:
Steps of the Decision Tree Algorithm:
1. Select the Best Attribute
- Use an attribute selection measure (e.g., Information Gain, Gini Index) to identify the best attribute from the training dataset; a sketch of computing information gain follows these steps.
- Place it at the root node of the tree.
2. Split the Dataset
- Divide the training dataset into subsets based on the values of the selected attribute.
- Each branch from the root corresponds to a possible value of that attribute.
3. Repeat Recursively
- Apply steps 1 and 2 recursively to each branch (subset).
- Continue until a stopping condition is satisfied.
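To make the attribute-selection step concrete, here is a minimal sketch of computing entropy and information gain for a categorical attribute; the function names, the toy dataset, and the attribute indices are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels; 0 when the node is pure."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    """Entropy of the parent node minus the weighted entropy of its children."""
    parent = entropy(labels)
    children = 0.0
    for v in set(row[attribute_index] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attribute_index] == v]
        children += len(subset) / len(labels) * entropy(subset)
    return parent - children

# Toy dataset: attribute 0 = outlook, attribute 1 = windy
rows = [("sunny", True), ("sunny", False), ("rainy", True), ("overcast", False)]
labels = ["no", "yes", "no", "yes"]
print(information_gain(rows, labels, 0))  # gain from splitting on outlook
print(information_gain(rows, labels, 1))  # gain from splitting on windy
```

The attribute with the highest gain is placed at the current node; in this toy data, splitting on windy separates the classes perfectly (gain 1.0), so it would be chosen over outlook (gain 0.5).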
Stopping Criteria (When to Stop Splitting):
- Pure Node / Homogeneous Data:
- All data instances in the node belong to the same class.
- In this case, entropy = 0, and the node becomes a leaf node.
- Minimum Node Size Reached:
- If the number of instances in a node falls below a defined threshold (e.g., 0.25% to 1% of the total dataset), the node is converted into a leaf.
- Maximum Tree Depth Reached:
- If the maximum allowed depth of the tree is reached, further splitting is stopped, and the current node becomes a leaf.
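Putting the steps and stopping criteria together, a minimal recursive builder might look like the sketch below. It assumes the entropy and information_gain helpers from the earlier sketch are defined in the same file, and the min_size and max_depth defaults are arbitrary illustrative values.

```python
from collections import Counter

def majority_class(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, depth=0, min_size=2, max_depth=5):
    # Uses information_gain() from the earlier sketch (assumed defined above).
    # Stopping criteria: pure node, minimum node size, or maximum depth reached.
    if len(set(labels)) == 1 or len(labels) <= min_size or depth >= max_depth:
        return {"leaf": True, "label": majority_class(labels)}

    # Step 1: select the best attribute by information gain.
    best = max(range(len(rows[0])), key=lambda a: information_gain(rows, labels, a))

    # Step 2: split the dataset, one branch per value of the chosen attribute.
    node = {"leaf": False, "attribute": best, "branches": {}}
    for v in set(row[best] for row in rows):
        subset = [(r, lab) for r, lab in zip(rows, labels) if r[best] == v]
        sub_rows, sub_labels = zip(*subset)
        # Step 3: recurse on each subset until a stopping condition is met.
        node["branches"][v] = build_tree(list(sub_rows), list(sub_labels),
                                         depth + 1, min_size, max_depth)
    return node
```

Library implementations add refinements this sketch omits, such as handling continuous attributes, missing values, and post-pruning to reduce overfitting.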