Make use of entropy and information gain to discover the root node for the decision tree using the ID3 algorithm

Applying ID3 Algorithm to Find Root Node

Training Dataset

S.No.  CGPA  Interactivity  Practical Knowledge  Communication Skills  Job Offer
1      ≥9    Yes            Very good            Good                  Yes
2      ≥8    No             Good                 Moderate              Yes
3      ≥9    No             Average              Poor                  No
4      <8    No             Average              Good                  No
5      ≥8    Yes            Good                 Moderate              Yes
6      ≥9    Yes            Good                 Moderate              Yes
7      <8    Yes            Good                 Poor                  No
8      ≥9    No             Very good            Good                  Yes
9      ≥8    Yes            Good                 Good                  Yes
10     ≥8    Yes            Average              Good                  Yes
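
Every later step depends on counting how many Yes and No outcomes fall under each attribute value, so it can help to tally the table programmatically before doing any arithmetic. The following is a minimal Python sketch, assuming the rows are transcribed by hand as tuples. The column identifiers, the function names `class_counts` and `split_counts`, and the ASCII spellings `>=9`, `>=8`, `<8` are choices made for this sketch, not part of the original table.

```python
from collections import Counter, defaultdict

# Column identifiers are illustrative; the last column is the class label.
COLUMNS = ("cgpa", "interactivity", "practical_knowledge",
           "communication_skills", "job_offer")

# Rows transcribed from the training table (S.No. 1 to 10, in order).
ROWS = [
    (">=9", "Yes", "Very good", "Good",     "Yes"),
    (">=8", "No",  "Good",      "Moderate", "Yes"),
    (">=9", "No",  "Average",   "Poor",     "No"),
    ("<8",  "No",  "Average",   "Good",     "No"),
    (">=8", "Yes", "Good",      "Moderate", "Yes"),
    (">=9", "Yes", "Good",      "Moderate", "Yes"),
    ("<8",  "Yes", "Good",      "Poor",     "No"),
    (">=9", "No",  "Very good", "Good",     "Yes"),
    (">=8", "Yes", "Good",      "Good",     "Yes"),
    (">=8", "Yes", "Average",   "Good",     "Yes"),
]


def class_counts(rows):
    """Count how many rows fall into each Job Offer class."""
    return Counter(row[-1] for row in rows)


def split_counts(rows, column_index):
    """For one attribute, count Job Offer classes within each attribute value."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[column_index]][row[-1]] += 1
    return {value: dict(c) for value, c in counts.items()}


if __name__ == "__main__":
    print("Class counts:", dict(class_counts(ROWS)))
    for index, name in enumerate(COLUMNS[:-1]):
        print(name, split_counts(ROWS, index))
```

Running the script prints the class totals and, for each attribute, the per-value Yes/No tallies that feed the entropy calculations.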

Step 1: Entropy of the Dataset

Total examples = 10, Positive (Job Offer = Yes) = 7, Negative (Job Offer = No) = 3

Entropy(S) = -p₊ log₂(p₊) - p₋ log₂(p₋)
           = -0.7 log₂(0.7) - 0.3 log₂(0.3)
           ≈ 0.7 * 0.5146 + 0.3 * 1.7370
           ≈ 0.8813 bits

Step 2: Calculate Information Gain

Gain(S, A) = Entropy(S) - Σᵥ (|Sᵥ| / |S|) * Entropy(Sᵥ), where Sᵥ is the subset of S in which attribute A takes the value v.

Attribute: Interactivity

  • Yes (6): 5 Yes, 1 No → Entropy = 0.6500
  • No (4): 2 Yes, 2 No → Entropy = 1.0000
Gain = 0.8813 - [0.6 * 0.6500 + 0.4 * 1.0000]
     = 0.8813 - 0.7900
     = 0.0913

Attribute: CGPA

  • ≥9 (4): 3 Yes, 1 No → Entropy = 0.8113
  • ≥8 (4): 4 Yes, 0 No → Entropy = 0
  • <8 (2): 0 Yes, 2 No → Entropy = 0
Gain = 0.8813 - [0.4 * 0.8113 + 0.4 * 0 + 0.2 * 0]
     = 0.8813 - 0.3245
     = 0.5568

Attribute: Practical Knowledge

  • Very good (2): 2 Yes, 0 No → Entropy = 0
  • Good (5): 4 Yes, 1 No → Entropy = 0.7219
  • Average (3): 1 Yes, 2 No → Entropy = 0.9183
Gain = 0.8813 - [0.2 * 0 + 0.5 * 0.7219 + 0.3 * 0.9183]
     = 0.8813 - (0 + 0.3610 + 0.2755)
     = 0.2448

Attribute: Communication Skills

  • Good (5): 4 Yes, 1 No → Entropy = 0.7219
  • Moderate (3): 3 Yes, 0 No → Entropy = 0
  • Poor (2): 0 Yes, 2 No → Entropy = 0
Gain = 0.8813 - [0.5 * 0.7219 + 0.3 * 0 + 0.2 * 0]
     = 0.8813 - 0.3610
     = 0.5203

Step 3: Information Gain Summary

Attribute             Information Gain
Interactivity         0.0913
CGPA                  0.5568 ✅
Practical Knowledge   0.2448
Communication Skills  0.5203

✅ Conclusion:

CGPA has the highest information gain (0.5568, narrowly ahead of Communication Skills at 0.5203) and is therefore selected as the root node of the decision tree by the ID3 algorithm.
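
The three steps above can be checked mechanically. The following is a minimal Python sketch of ID3's root selection only (not the full recursive tree construction), assuming the training table is transcribed as tuples with Job Offer as the last field. The function names `entropy`, `information_gain`, and `choose_root`, and the ASCII spellings `>=9`, `>=8`, `<8`, are choices made for this sketch.

```python
import math
from collections import Counter

# Attribute names in table order; the class label (Job Offer) is the last field.
ATTRIBUTES = ("CGPA", "Interactivity", "Practical Knowledge", "Communication Skills")

# Rows transcribed from the training table (S.No. 1 to 10, in order).
ROWS = [
    (">=9", "Yes", "Very good", "Good",     "Yes"),
    (">=8", "No",  "Good",      "Moderate", "Yes"),
    (">=9", "No",  "Average",   "Poor",     "No"),
    ("<8",  "No",  "Average",   "Good",     "No"),
    (">=8", "Yes", "Good",      "Moderate", "Yes"),
    (">=9", "Yes", "Good",      "Moderate", "Yes"),
    ("<8",  "Yes", "Good",      "Poor",     "No"),
    (">=9", "No",  "Very good", "Good",     "Yes"),
    (">=8", "Yes", "Good",      "Good",     "Yes"),
    (">=8", "Yes", "Average",   "Good",     "Yes"),
]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(rows, column_index):
    """Entropy of the whole set minus the weighted entropy after splitting."""
    labels = [row[-1] for row in rows]
    subsets = {}
    for row in rows:
        subsets.setdefault(row[column_index], []).append(row[-1])
    remainder = sum(len(subset) / len(rows) * entropy(subset)
                    for subset in subsets.values())
    return entropy(labels) - remainder


def choose_root(rows, attributes):
    """Return the attribute with the largest information gain, plus all gains."""
    gains = {name: information_gain(rows, index)
             for index, name in enumerate(attributes)}
    return max(gains, key=gains.get), gains


if __name__ == "__main__":
    root, gains = choose_root(ROWS, ATTRIBUTES)
    for name, gain in gains.items():
        print(f"{name}: {gain:.4f}")
    print("Root node:", root)
```

Because the script works from the raw rows rather than from hand-tallied counts, it also serves as a check on the per-value Yes/No counts used in Step 2. Small differences in the last decimal place relative to a hand calculation come from rounding intermediate values.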
