In real-world projects we usually have a lot of variables (features). Some of them carry the same information (like age and date of birth), and some, like firstName and LastName, add no value during model building. We need to remove such variables, and this process is called feature selection. Let’s take… Continue reading “Feature selection”
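As a quick illustration of the idea above, here is a minimal sketch in plain Python that drops identifier-like and redundant columns. The helper `drop_features`, the column names `dateOfBirth` and `income`, and the sample rows are all hypothetical, made up only for this example; in a real project the choice of columns to drop would come from your own analysis.

```python
def drop_features(rows, to_drop):
    """Return a copy of each row without the columns listed in to_drop."""
    to_drop = set(to_drop)
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]


# Hypothetical sample data: names carry no signal for the model,
# and dateOfBirth repeats the information already present in age.
rows = [
    {"firstName": "Asha", "LastName": "Rao", "age": 34,
     "dateOfBirth": "1990-02-11", "income": 52000},
    {"firstName": "Ravi", "LastName": "Kumar", "age": 41,
     "dateOfBirth": "1983-07-30", "income": 61000},
]

selected = drop_features(rows, ["firstName", "LastName", "dateOfBirth"])
print(selected)
```

Only `age` and `income` remain in each row, which is the kind of reduced feature set we would pass on to model building.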
Tag Archives: chi square
Chi Square
We know correlation is used to check the relationship between two continuous variables. We also need some mechanism to check the relationship between two categorical variables, and that is the Chi-Square test. Steps to check the relationship between two categorical variables: define the hypothesis, define alpha, find the degrees of freedom, define the rule, calculate the… Continue reading “Chi Square”
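The steps listed above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `chi_square_statistic` and the observed counts are hypothetical, the null hypothesis is that the two variables are independent, and the critical value 3.841 is the standard chi-square table value for alpha = 0.05 with 1 degree of freedom.

```python
def chi_square_statistic(observed):
    """Return (statistic, degrees_of_freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = two groups, columns = (Yes, No).
observed = [[30, 10],
            [20, 40]]

ALPHA = 0.05            # significance level
CRITICAL_VALUE = 3.841  # table value for alpha = 0.05, dof = 1

statistic, dof = chi_square_statistic(observed)

# Rule: reject the null hypothesis (independence) if the statistic
# exceeds the critical value.
reject_null = statistic > CRITICAL_VALUE
print(statistic, dof, reject_null)
```

For a table with a different shape, the degrees of freedom change, so the critical value has to be looked up again for that `dof`.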
How and when does the Decision tree stop splitting?
By default, splitting stops when the tree reaches 100% purity, that is, when each child (subset) node is homogeneous and contains a single class (for example, all Yes or all No). This leads to an overfitting problem. Put simply, when my algorithm has learned everything from my training data, it will… Continue reading “How and when does the Decision tree stop splitting?”
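To make the purity rule above concrete, here is a minimal sketch of a stopping check. The names `gini` and `should_stop` are hypothetical helpers for this example, Gini impurity is just one possible way to measure purity, and the optional `max_depth` limit is an assumption added to show how one might stop earlier than full purity; it is not taken from the post.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 0.0 means the node holds a single class."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def should_stop(labels, depth, max_depth=None):
    """Decide whether a node should stop splitting."""
    # Default behaviour: stop only when the node is 100% pure.
    if gini(labels) == 0.0:
        return True
    # Assumed extra rule: stop early once a depth limit is reached.
    if max_depth is not None and depth >= max_depth:
        return True
    return False


print(should_stop(["Yes", "Yes", "Yes"], depth=3))        # pure node
print(should_stop(["Yes", "No"], depth=1))                # mixed node
print(should_stop(["Yes", "No"], depth=2, max_depth=2))   # depth limit hit
```

Without a limit such as `max_depth`, the check only returns True for pure nodes, which is exactly the situation where the tree keeps growing until it has memorised the training data.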