Chi Square

We know correlation is used to check the relationship between two continuous variables. We also need a mechanism to check the relationship between two categorical variables, and that is the chi-square test.

Steps to check the relation between two categorical variables:

  1. Define hypothesis
  2. Define alpha
  3. Find out the degrees of freedom
  4. Define the rule
  5. Calculate the test results
  6. Calculate the final result

Don’t get confused by the above steps. If they are not clear, read the articles related to hypothesis testing and distributions.

As per the steps above, we first need to define the hypotheses.

Step 1) Define Hypothesis:

H0 = Null hypothesis = There is no relationship between flight status and weather

H1 = Alternative hypothesis = There is a relationship between flight status and weather

Step 2) Define Alpha:

The alpha level, also called the significance level, is the cutoff used to decide whether or not to reject the null hypothesis. For more information, read this article.

In most tests the alpha level is set to 0.05, that is, 5%.

Step 3) Identify Degrees of freedom:

To identify the degrees of freedom, we need to build the frequency table. It counts how many times each flight status (for example, delayed) occurs under each weather condition (rainy, sunny and overcast), for all possible combinations.

                                Fig 1
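A frequency table like this can be built by simply counting pairs. Here is a minimal sketch, assuming the raw data is a list of (flight status, weather) records. The records below and the "On time" label are made up for illustration; they are not the data behind Fig 1.

```python
from collections import Counter

# Hypothetical raw records: one (flight_status, weather) pair per flight.
records = [
    ("Delayed", "Rainy"), ("Delayed", "Rainy"), ("On time", "Rainy"),
    ("On time", "Sunny"), ("On time", "Sunny"), ("Delayed", "Sunny"),
    ("Delayed", "Overcast"), ("On time", "Overcast"), ("On time", "Overcast"),
]

statuses = ["Delayed", "On time"]          # rows of the table
weathers = ["Rainy", "Sunny", "Overcast"]  # columns of the table

# Count how many times each (status, weather) combination occurs.
counts = Counter(records)

# Arrange the counts as a 2 x 3 table: one row per status, one column per weather.
frequency_table = [[counts[(s, w)] for w in weathers] for s in statuses]

for status, row in zip(statuses, frequency_table):
    print(status, row)
```

Each row of the printed table is one flight status and each column is one weather condition.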

Degrees of freedom = (number of rows - 1) * (number of columns - 1)

                   = (2 - 1) * (3 - 1) = 2

Here the frequency table has 2 rows (flight status) and 3 columns (weather), so the degrees of freedom = 2.
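The same calculation can be done from the shape of the table. This is a small sketch; the function name `degrees_of_freedom` and the placeholder table are assumptions for illustration only.

```python
def degrees_of_freedom(table):
    """Return (rows - 1) * (columns - 1) for a rectangular frequency table."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)

# Placeholder counts; only the 2 x 3 shape matters for this step.
table = [[0, 0, 0],
         [0, 0, 0]]

print(degrees_of_freedom(table))  # 2
```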

Step 4) Define the criteria:

Now define the criteria, or rule, using the chi-square table, which is available here. For our reference, a screenshot of the chi-square table is included.

We know the degrees of freedom (2) and the alpha value (0.05). Looking up the entry for this pair in the chi-square table gives a critical value of 5.991.

So we can reject the null hypothesis if the chi-square value is greater than 5.991.
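If you don't have the table at hand, the critical value for this particular case can be checked with a few lines of code. The sketch below relies on a shortcut that only works for 2 degrees of freedom, where the chi-square distribution has a simple closed form. For other degrees of freedom a statistics library would be the usual choice, for example `scipy.stats.chi2.ppf(1 - alpha, df)`.

```python
import math

alpha = 0.05  # significance level from Step 2

# For exactly 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
# so the critical value solves exp(-x / 2) = alpha.
# This shortcut is specific to df = 2 and does not hold for other values.
critical_value = -2 * math.log(alpha)

print(round(critical_value, 3))  # 5.991
```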

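Step 5) Calculate the statistical value:

For each cell of the frequency table, the expected value is:

Expected value = (row total * column total) / grand total

Let's understand the above formula.

For (Delayed, Rainy), the expected value is the total number of delayed flights multiplied by the total number of flights in rainy weather, divided by the total number of flights. In the same way we need to calculate the value for every possible combination: (Delayed, Rainy), (Delayed, Sunny), (Delayed, Overcast) and so on.

We can calculate these values from the frequency table shown in Fig 1.

Calculating this for all possible scenarios gives what is also called the expected result.

Our frequency table is also called the actual (observed) result.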
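Here is a minimal sketch of the expected-value step, using the standard rule for this test: expected value = row total * column total / grand total. The counts in `observed` are hypothetical and are not the values from Fig 1; the function name is also just for illustration.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical actual counts: rows are flight statuses, columns are weather types.
observed = [[12, 6, 12],
            [8, 24, 18]]

expected = expected_counts(observed)
for row in expected:
    print(row)
# [7.5, 11.25, 11.25]
# [12.5, 18.75, 18.75]
```

Notice that the expected table keeps the same row and column totals as the actual table; only the way the counts are spread across the cells changes.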

Step 6) Calculating the final result:

To calculate the final result we will use the formula below.

Chi-square = sum over all cells of (Actual - Expected)^2 / Expected

If we substitute the actual and expected values for all cells, the final value is 8.5.

In Step 4 we found the critical chi-square value to be 5.991, and our final result is 8.5. Since 8.5 > 5.991, we can reject the null hypothesis, which means there is a relationship between flight status and weather.
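The last two steps can be sketched in code as well. The counts below are hypothetical, so the statistic will not match the 8.5 from this article; the point is only to show the sum of (actual - expected)^2 / expected over all cells and the comparison with the critical value 5.991.

```python
def chi_square_statistic(observed):
    """Sum of (actual - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, actual in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (actual - expected) ** 2 / expected
    return statistic

# Hypothetical actual counts: rows are flight statuses, columns are weather types.
observed = [[12, 6, 12],
            [8, 24, 18]]

critical_value = 5.991  # df = 2, alpha = 0.05 (from Step 4)

statistic = chi_square_statistic(observed)
reject_null = statistic > critical_value

print(round(statistic, 2), reject_null)  # 8.32 True
```

In practice, a library routine such as `scipy.stats.chi2_contingency` does the whole calculation in one call and also returns a p-value.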

Published by viswateja3
