Rank correlation describes how, when two variables are ranked, a change in the rank of one goes with the same, a positive, or a negative change in the rank of the other when we compare them across two points. Don’t worry if this is still unclear; we will find the Kendall rank correlation using the dataset below. We are trying to see if there is any correlation if size of… Continue reading “Kendall Rank Correlation”
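As a rough illustration of the idea in this excerpt, here is a minimal sketch of Kendall's tau computed by counting concordant and discordant pairs. The function name `kendall_tau` and the two rank lists are hypothetical and are not the dataset from the post; the sketch assumes there are no tied ranks.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau for paired values, assuming no ties (hypothetical helper)."""
    concordant = 0
    discordant = 0
    # Compare every pair of observations once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up ranks, for illustration only.
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 4, 3, 5]
tau = kendall_tau(ranks_a, ranks_b)
print(tau)
```

A value near 1 means the two rankings mostly agree, a value near -1 means they mostly disagree, and a value near 0 means no consistent relationship.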
Author Archives: viswateja3
Chi Square
We know correlation is used to check the relation between two continuous variables. We should also have a mechanism to check the relation between two categorical variables, and that is the Chi-Square test. Steps to check the relation between two categorical variables: define the hypothesis, define alpha, find the degrees of freedom, define the rule, calculate the… Continue reading “Chi Square”
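The steps listed in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the post's own worked example: the 2x2 table of counts is made up, the helper name `chi_square_statistic` is hypothetical, and the final step is assumed to be the usual chi-square test statistic. The critical value 3.841 is the standard table value for 1 degree of freedom at alpha = 0.05.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows and columns are two categorical variables.
observed = [[30, 10],
            [20, 40]]

ALPHA = 0.05            # significance level
CRITICAL_VALUE = 3.841  # chi-square table value for dof = 1, alpha = 0.05

chi2, dof = chi_square_statistic(observed)
# Rule: reject the null hypothesis (independence) if chi2 exceeds the critical value.
reject_null = chi2 > CRITICAL_VALUE
print(chi2, dof, reject_null)
```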
Linear discriminant analysis
Before we talk about linear discriminant analysis, we will take a quick look at the disadvantages of logistic regression. Two-class problems: logistic regression is intended for two-class or binary classification problems. It can be extended for multiclass classification, but is rarely used for this purpose. Unstable with well-separated classes: logistic regression can become unstable when the classes are well… Continue reading “Linear discriminant analysis”
Bootstrap Sample
Before we discuss the bootstrap sample, read about Sampling With Replacement and Sampling Without Replacement. A bootstrap sample is a random sample drawn with replacement. Bootstrapping is a resampling method that uses sampling with replacement: it generates N samples, and each sample is the same size as the original dataset. Let’s… Continue reading “Bootstrap Sample”
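A minimal sketch of drawing bootstrap samples, assuming a small made-up dataset. The helper name `bootstrap_samples` and the `seed` parameter are hypothetical; `random.choices` from the standard library performs the sampling with replacement.

```python
import random


def bootstrap_samples(data, n_samples, seed=None):
    """Draw n_samples bootstrap samples, each the same size as data."""
    rng = random.Random(seed)
    # choices() samples with replacement, so values can repeat within a sample.
    return [rng.choices(data, k=len(data)) for _ in range(n_samples)]


# Made-up data, for illustration only.
data = [2, 4, 6, 8, 10]
samples = bootstrap_samples(data, n_samples=3, seed=0)
for sample in samples:
    print(sample)
```

Because sampling is with replacement, a given value may appear several times in one sample while another value is left out entirely.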
Random Forest
Before we discuss bagging and random forest, we have to understand the bootstrap sample. Bagging, also called bootstrap aggregation, usually gives better accuracy than a single decision tree because it reduces variance. Bagging is easy to follow once you know how a decision tree and a bootstrap sample work. It uses a greedy search with split criteria like entropy, Gini,… Continue reading “Random Forest”
Similarity
Suppose we have 10 students in a class and we want to find which students are similar. How do we find this? Maybe based on height, color, marks scored by subject, overall score, and so on. Based on these common points, we can say students A and B are similar in… Continue reading “Similarity”
K-Nearest Neighbor
K-Nearest Neighbor (KNN) is a non-parametric supervised learning algorithm. There is no separate model-building step on the training data; it is instance-based learning. We can use KNN for both classification and regression problems. One thing I like about KNN is that we need to pass only one hyperparameter, K (the number of nearest… Continue reading “K-Nearest Neighbor”
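To make the instance-based idea concrete, here is a minimal sketch of KNN classification. It assumes Euclidean distance and a simple majority vote, which the excerpt does not specify; the function name `knn_predict` and the points and labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify query by majority vote among its k nearest training points."""
    # There is no training step: the stored data itself is the "model".
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D points in two classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(points, labels, query=(2.0, 2.0), k=3)
print(prediction)
```

For regression, the same neighbors could be used with the vote replaced by an average of their target values.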
Spearman Rank Correlation
Rank correlation describes how, when two variables are ranked, a change in the rank of one goes with the same, a positive, or a negative change in the rank of the other when we compare them across two points. Don’t worry if this is still unclear; we will find the Kendall rank correlation using the dataset below. We are trying to see if there is any correlation if size of… Continue reading “Spearman Rank Correlation”
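For the Spearman coefficient named in this post's title, a minimal sketch using the standard rank-difference formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), might look like this. It assumes the inputs are already ranks with no ties; the function name `spearman_rho` and the rank lists are hypothetical, not the post's dataset.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho from two lists of ranks, assuming no tied ranks."""
    n = len(rank_x)
    # d is the difference between the two ranks of each observation.
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Made-up ranks, for illustration only.
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]
rho = spearman_rho(rank_x, rank_y)
print(rho)
```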
Probability Distribution Function
It is divided into two parts: the Discrete Probability Distribution Function and the Continuous Probability Distribution Function. Discrete Probability Distribution Function: discrete means a variable that takes values from a finite or countable set. For example, when we roll a die we get a value from 1 to 6. When we roll a die, the probability of getting… Continue reading “Probability Distribution Function”
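The die example can be written out as a small discrete distribution. This sketch assumes a fair six-sided die, so each face has probability 1/6; the variable names are illustrative only.

```python
from fractions import Fraction

# Fair six-sided die (assumption): every face is equally likely.
faces = range(1, 7)
pmf = {face: Fraction(1, 6) for face in faces}

# The probabilities of a discrete distribution must sum to 1.
total_probability = sum(pmf.values())

# Probability of an event is the sum over the outcomes it contains.
p_even = sum(pmf[face] for face in faces if face % 2 == 0)

print(total_probability, p_even)
```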
Uniform Distribution
A uniform distribution, sometimes also known as a rectangular distribution, is a distribution that has constant probability over its range. It is defined by two parameters, a and b, where a is the minimum and b is the maximum. Let's solve the problem below. What is the probability that x > 20, P(x > 20), when x is uniformly distributed with parameters a=10… Continue reading “Uniform Distribution”
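For a continuous uniform distribution on [a, b], the tail probability is P(X > x) = (b - x) / (b - a) when x lies between a and b. The sketch below illustrates that formula. The excerpt is cut off before giving b, so the value b = 40 used here is purely a hypothetical placeholder and not the post's value; the function name is also made up.

```python
def uniform_tail_probability(x, a, b):
    """P(X > x) for X uniformly distributed on [a, b]."""
    if b <= a:
        raise ValueError("b must be greater than a")
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    # The density is constant, so probability is proportional to length.
    return (b - x) / (b - a)


# a = 10 comes from the excerpt; b = 40 is an assumed value for illustration.
p = uniform_tail_probability(20, a=10, b=40)
print(p)
```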