C4.5 is an improved version of ID3 that addresses several of its limitations, such as the restriction to discrete attributes and the lack of pruning.
Features of C4.5:
- Handles both discrete and continuous attributes.
- Supports missing values (represented as `?`; ignored in calculations).
- Performs post-pruning to reduce overfitting.
- Uses Gain Ratio instead of just Information Gain.
- Robust for large and noisy datasets.
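
To make the handling of continuous attributes and missing values more concrete, here is a minimal sketch of how a numeric attribute can be turned into a binary split. The function names `entropy` and `best_threshold` and the sample scores are made up for illustration. The sketch takes midpoints between neighbouring values as candidate thresholds and scores them by information gain, which is a simplification and not necessarily how a full C4.5 implementation picks its threshold.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, info_gain) for the best split `value <= threshold`.

    Rows whose value is "?" (missing) are skipped, mirroring the
    "ignored in calculations" rule in a simplified way.
    """
    known = sorted((v, y) for v, y in zip(values, labels) if v != "?")
    ys = [y for _, y in known]
    base = entropy(ys)
    best = (None, 0.0)
    for i in range(1, len(known)):
        lo, hi = known[i - 1][0], known[i][0]
        if lo == hi:
            continue  # equal neighbours give no boundary to split on
        left, right = ys[:i], ys[i:]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        gain = base - remainder
        if gain > best[1]:
            best = ((lo + hi) / 2, gain)  # assumption: midpoint as threshold
    return best


# Hypothetical scores and outcomes, for illustration only.
scores = [60, 70, 75, 85, 90, 95]
offers = ["no", "no", "no", "yes", "yes", "yes"]
print(best_threshold(scores, offers))  # (80.0, 1.0)
```

Once a threshold is chosen, the attribute behaves like a two-valued discrete attribute (`<= threshold` and `> threshold`), so the rest of the tree construction does not need to change.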

Algorithm: C4.5 Decision Tree Construction
Input: Training dataset T
Output: A Decision Tree
Steps:
- Calculate `Entropy_Info` for the dataset.
- For each attribute `A`:
  - Calculate `Info_Gain(A)`
  - Calculate `Split_Info(A)`
  - Calculate `Gain_Ratio(A)`
- Choose the attribute with the highest Gain Ratio.
- Use it as the root node and split the dataset based on its values.
- Recursively repeat on each subset with the remaining attributes until one of the following holds:
  - All instances belong to a single class.
  - No attributes remain to split on.
  - A leaf node is reached.
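
The steps above can be sketched in a few lines of Python. This is an illustrative sketch only: it covers discrete attributes and leaves out continuous attributes, missing values and post-pruning. The helper names (`entropy`, `gain_ratio`, `build_tree`), the nested-dict tree representation and the four-row dataset are assumptions made for this example, not part of any library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy_Info: Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Gain_Ratio(A) = Info_Gain(A) / Split_Info(A) for a discrete attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    total = len(rows)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    split_info = -sum(len(g) / total * log2(len(g) / total) for g in groups.values())
    # An attribute with a single value has Split_Info 0 and cannot split anything.
    return info_gain / split_info if split_info > 0 else 0.0


def build_tree(rows, attrs, target):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # single class, or no attributes left to split on
    best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    if gain_ratio(rows, best, target) <= 0:
        return majority  # no attribute improves on the current node
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target)
    return {best: branches}


# Hypothetical training data, for illustration only.
rows = [
    {"applicant_id": "a1", "cgpa": "high", "communication": "good", "offer": "yes"},
    {"applicant_id": "a2", "cgpa": "high", "communication": "poor", "offer": "yes"},
    {"applicant_id": "a3", "cgpa": "low", "communication": "good", "offer": "no"},
    {"applicant_id": "a4", "cgpa": "low", "communication": "poor", "offer": "no"},
]
attrs = ["applicant_id", "cgpa", "communication"]
print({a: gain_ratio(rows, a, "offer") for a in attrs})
# {'applicant_id': 0.5, 'cgpa': 1.0, 'communication': 0.0}
print(build_tree(rows, attrs, "offer"))
# {'cgpa': {'high': 'yes', 'low': 'no'}}
```

In this toy dataset both `applicant_id` and `cgpa` have an information gain of 1.0, but `applicant_id` splits the data into four single-row groups, so its Split_Info is 2 and its gain ratio drops to 0.5. This is the effect Gain Ratio is meant to have: attributes with many distinct values are penalised.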
Predict Job Offer Using C4.5








