5 a] Explain feature selection algorithms and selection criterion.
Feature Selection Algorithms and Selection Criteria
Feature Selection Algorithms:
1. Filter Method:
- Filters rank features by simple metrics such as correlation with the outcome variable.
- They offer a quick view of individual predictive power but may miss redundancy and feature interactions (see the filter sketch after this list).
2. Wrapper Method:
- Wrapper feature selection explores subsets of features seeking to optimize model performance.
- It involves forward selection, backward elimination, and a combined approach to adjust feature sets within regression models.
- Selecting an Algorithm in Wrapper Method:
- Stepwise regression, spanning forward selection, backward elimination, and a combined approach, adjusts the feature set to optimize model performance against predefined selection criteria (a forward-selection sketch follows this list).
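As a minimal sketch of the filter method (the DataFrame X, target y, and the helper name filter_select are illustrative assumptions, not from these notes), features can be ranked by absolute correlation with the outcome:

```python
# Filter method sketch: rank features by |Pearson correlation| with the target.
import numpy as np
import pandas as pd

def filter_select(X: pd.DataFrame, y: pd.Series, k: int = 5) -> list[str]:
    """Return the k features most correlated with the outcome variable."""
    scores = X.corrwith(y).abs()          # per-column correlation with y
    return scores.nlargest(k).index.tolist()

# Tiny synthetic check: only f0 and f3 actually drive y.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 6)), columns=[f"f{i}" for i in range(6)])
y = 2 * X["f0"] - X["f3"] + rng.normal(size=100)
print(filter_select(X, y, k=2))           # typically ['f0', 'f3']
```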
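And for the wrapper method, a hedged sketch of greedy forward selection driven by AIC, reusing the hypothetical X and y above (forward_select is an assumed name; the model fit comes from statsmodels). Backward elimination runs the mirror-image loop, starting from all features and dropping the least useful one; the combined approach alternates the two moves.

```python
# Wrapper method sketch: greedy forward selection minimizing AIC.
import statsmodels.api as sm

def forward_select(X, y):
    selected, remaining = [], list(X.columns)
    best_aic = float("inf")
    while remaining:
        # Score every one-feature extension of the current subset.
        trials = [(sm.OLS(y, sm.add_constant(X[selected + [f]])).fit().aic, f)
                  for f in remaining]
        aic, best = min(trials)
        if aic >= best_aic:               # no candidate improves AIC: stop
            break
        best_aic = aic
        selected.append(best)
        remaining.remove(best)
    return selected

print(forward_select(X, y))               # typically recovers ['f0', 'f3']
```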
Selection Criteria:
1. R-squared:
- Represents the proportion of variance explained by the model.
2. p-values:
- Give, for each coefficient, the probability of seeing an estimate at least as extreme if the true coefficient were zero; small p-values support keeping the feature.
3. AIC (Akaike Information Criterion):
- AIC = 2k - 2 ln(L), where k is the number of parameters and L the maximized likelihood; the model with the smallest AIC is preferred.
4. BIC (Bayesian Information Criterion):
- BIC = k ln(n) - 2 ln(L), adding the number of observations n to the penalty; like AIC, it is minimized during model selection.
5. Entropy:
- Measures disorder or impurity in a dataset; lower entropy after a split means purer subsets.
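Assuming the same hypothetical X and y as in the sketches above, a fitted statsmodels OLS result exposes the first four criteria as attributes, and Shannon entropy can be computed directly; this is a sketch, not the only route:

```python
# Selection-criteria sketch: read the metrics off one fitted regression.
import numpy as np
import statsmodels.api as sm

fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.rsquared)    # R-squared: proportion of variance explained
print(fit.pvalues)     # per-coefficient p-values (null: coefficient = 0)
print(fit.aic)         # AIC = 2k - 2 ln(L); minimized
print(fit.bic)         # BIC = k ln(n) - 2 ln(L); minimized

def entropy(labels):
    """Shannon entropy H = -sum(p * log2(p)) of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

print(entropy(["a", "a", "b", "b"]))   # two equally likely classes: 1.0
```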
Wrapper Method – Combined Approach:
- Blends forward selection and backward elimination, iteratively adding and removing features based on their significance and impact on model fit, to balance relevance against redundancy.
Decision Trees in Embedded Methods:
- Decision trees are used in machine learning and statistics to make decisions from data by modeling relationships between variables (see the sketch below).
- They consist of internal nodes, branches, and leaf nodes, representing feature tests, test outcomes, and predictions, respectively.
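As one concrete illustration of an embedded method (scikit-learn and the iris data are assumptions of this sketch, not part of the notes), a tree grown with the entropy criterion ranks features as a by-product of fitting:

```python
# Embedded method sketch: a decision tree scores features while it trains.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(data.data, data.target)

# Importances sum to 1; near-zero values flag features the tree never split on.
for name, score in zip(data.feature_names, tree.feature_importances_):
    print(f"{name}: {score:.3f}")
```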