Wednesday, May 26, 2010

Correlation

Correlation is the relationship between two variables.
In other words, if a change in the value of one variable is accompanied by a change in the value of another variable, the two variables are said to be correlated.
Ex : - Height and weight, Price and demand, Rainfall and crop.

Types of Correlation : -

1. +ve correlation : - If a change in the values of one variable in one direction is accompanied by a change in the values of another variable in the same direction, the two variables are said to have +ve correlation or Direct correlation. In other words, if variable X increases (or decreases), variable Y also tends to increase (or decrease), i.e., both variables change in the same direction.
Ex: - Rainfall and crop, Investment and profit, Year and Population (in India).

2. -ve correlation : - If a change in the values of one variable in one direction is accompanied by a change in the values of another variable in the opposite direction, the two variables are said to have -ve correlation or Indirect or Diverse or Inverse correlation. In other words, if variable X increases (or decreases), variable Y tends to decrease (or increase), i.e., the two variables change in opposite directions.
Ex: - Price and Demand.

3. Perfect +ve Correlation : - When two variables move proportionately in the same direction, i.e., an increase in the values of one variable leads to a proportionate increase in the values of the other variable, the correlation is Perfect +ve correlation.

4. Perfect -ve correlation : - When two variables move proportionately in opposite directions, i.e., an increase in the values of one variable leads to a proportionate decrease in the values of the other variable, the correlation is Perfect -ve correlation.

5. Uncorrelation : - If a change in the value of one variable is not accompanied by any change in the value of another variable, the two variables are said to be uncorrelated. In other words, if one variable increases or decreases and the other variable shows no corresponding change, the relation is uncorrelation.

Measurement of Correlation : -

The following are the important methods of measuring correlation in bi-variate data.

I. Scatter Diagram Method
II. Karl Pearson's coefficient of correlation
III. Spearman's Rank Correlation Coefficient
IV. Concurrent Deviation Method
V. Method of Least Squares

We now discuss these methods.
I. Scatter Diagram Method : - The diagrammatic representation of bi-variate data is called a scatter diagram. It is also called a Dot diagram. Take one variable on the X-axis and the other variable on the Y-axis, and plot all the points (xi, yi) on the XY-plane. The collection of points is called a Cluster. The cluster may run in

i) an upward direction, which indicates +ve correlation.
ii) a downward direction, which indicates -ve correlation.
iii) an upward direction along a straight line, which indicates Perfect +ve correlation.
iv) a downward direction along a straight line, which indicates Perfect -ve correlation.
v) neither an upward nor a downward direction, which indicates uncorrelation.

Properties :
1. It is a simple and non-mathematical method for studying correlation.
2. It is easy to understand and easy to interpret.
3. It gives us an idea about the nature of the correlation between two variables.
4. It does not give us the degree of correlation.
5. It gives us the direction of the correlation.
6. When the number of observations is very large, this method becomes tedious and complicated.
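
To make the plotting step concrete, here is a minimal Python sketch that draws a rough text-only scatter diagram. The function name ascii_scatter, the grid size and the height/weight figures are illustrative assumptions, not part of the method itself.

```python
def ascii_scatter(xs, ys, width=20, height=10):
    """Return a text scatter diagram with one '*' per (x, y) pair."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Fall back to a span of 1 when all values on an axis are equal,
    # so that the scaling below never divides by zero.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # The first printed line is the top of the diagram,
        # so flip the row index to put large y values at the top.
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)


# Illustrative (made-up) data: heights in cm, weights in kg.
heights = [150, 155, 160, 165, 170, 175]
weights = [50, 54, 59, 63, 68, 72]
print(ascii_scatter(heights, weights))
```

With data like these the stars run from the lower left to the upper right, which is the upward cluster of case i). As property 4 says, the picture shows the direction but not the degree of correlation.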

II. Karl Pearson's coefficient of correlation : - The measure of the linear relationship between two variables is called the Karl Pearson (British biometrician) correlation coefficient. It is also called the Pearsonian coefficient of correlation or the Product moment correlation coefficient. It is denoted by 'r' or 'r(x,y)'. It is based on the Covariance of X and Y, the Variance of X and the Variance of Y.
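
As a rough illustration of how r is built from the covariance and the two variances, the sketch below uses the usual definition r = Cov(x,y) / Sqrt(Var(x) Var(y)). The function name pearson_r and the sample numbers are assumptions made for this example only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed as Cov(x, y) / sqrt(Var(x) * Var(y))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances all use the same divisor n, which cancels in r.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one of the series is constant")
    return cov_xy / sqrt(var_x * var_y)


# Made-up data lying exactly on the line y = 2x.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # 1.0
```

Because the same divisor appears in the covariance and in both variances, using n or n - 1 gives the same value of r.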

Properties : -
1. The Karl Pearson correlation coefficient always lies between -1 and +1.
2. r(x,y) = r(y,x)
3. r(x,y) = r(u,v), where u = (x-a)/h and v = (y-b)/k, a and b are origins and h and k are scales (h, k > 0). So the correlation coefficient is independent of change of origin and scale.
4. If X and Y are two independent random variables then r(x,y) = 0.
But the converse need not be true, i.e., if r(x,y) = 0 then X and Y need not be independent (they may be dependent or independent).
5. The Karl Pearson correlation coefficient attempts to determine the degree of association between two variables, i.e., how closely the two variables are inter-related.
6. It is used when the data are quantitative.
7. The sign of 'r' is the same as the sign of Cov(x,y).
8. If r > 0 then X and Y have +ve correlation.
If r < 0 then X and Y have -ve correlation.
If r = +1 then X and Y have perfect +ve correlation.
If r = -1 then X and Y have perfect -ve correlation.
If r = 0 then X and Y are uncorrelated (not necessarily independent; see property 4).
9. It measures both the degree and the direction of correlation.
10. If the relationship between the two variables X and Y is
i) Y = a + b X and b > 0 then r = +1
ii) Y = a + b X and b < 0 then r = -1
11. It is a pure number and is a relative measure of association between two variables.
12. It can be used to measure the correlation for individual series as well as for grouped data.
13. Coefficient of Determination : - The square of the correlation coefficient, r^2, is the coefficient of determination. It shows what proportion of the total variation in the dependent variable is explained by the independent variable, i.e., the degree of dependence of the dependent variable on the independent variable concerned. It is also called the index of determination, i.e., r^2 = Explained variation / Total variation.
14. Coefficient of Non-Determination : - It shows the lack of dependence of the dependent variable on the given independent variable, i.e., 1 - r^2 = Unexplained variation / Total variation.
15. Probable Error : - It is a measure of the reliability and dependability of the value of 'r'. It is calculated as P.E(r) = 0.6745 (1 - r^2) / Sqrt(n), where n is the number of pairs of observations, and it helps in interpreting the value of r. Let the sample correlation coefficient be r and the population correlation coefficient be ρ. If ρ is not known, it can be estimated using the P.E. as (r - P.E(r)) < ρ < (r + P.E(r)), where (r - P.E(r)) and (r + P.E(r)) are called the lower and upper limits. That is, the value of ρ is expected to lie between these two limits.
Comparison :
i) If r < P.E(r) then there is no evidence of correlation and r is not significant.
ii) If r >= 6 P.E(r) then correlation exists and r is significant.
iii) If r = P.E(r) then there is correlation.
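
The probable error arithmetic can be sketched in a few lines of Python. The function names, the use of the absolute value of r in the comparison, and the label given to values that fall between the two thresholds are assumptions of this sketch; r = 0.8 with n = 25 is a made-up example.

```python
from math import sqrt


def probable_error(r, n):
    """P.E.(r) = 0.6745 * (1 - r^2) / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of pairs")
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    return 0.6745 * (1 - r ** 2) / sqrt(n)


def population_limits(r, n):
    """Lower and upper limits r - P.E.(r) and r + P.E.(r) for rho."""
    pe = probable_error(r, n)
    return r - pe, r + pe


def compare_with_pe(r, n):
    """Apply comparisons i) and ii); other cases are only labelled."""
    pe = probable_error(r, n)
    if abs(r) < pe:          # assumption: compare the size of r, ignoring its sign
        return "not significant"
    if abs(r) >= 6 * pe:
        return "significant"
    return "between the two thresholds"  # not modelled further in this sketch


r, n = 0.8, 25                    # made-up sample values
print(probable_error(r, n))       # about 0.0486
print(population_limits(r, n))    # roughly (0.751, 0.849)
print(r ** 2)                     # coefficient of determination r^2
print(compare_with_pe(r, n))      # significant
```

Here 6 P.E(r) is about 0.29, well below r = 0.8, so comparison ii) applies.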


III. Spearman's rank correlation coefficient : -

This method was developed by Prof. Charles Edward Spearman in 1904. It is also called Rank Correlation. It helps in determining the coefficient of correlation between the ranks of individuals in two attributes. The relation between two qualitative series, measured through their ranks, is called rank correlation.

The formula for rank correlation is ρ = 1 - (6 ∑ di^2) / (n(n^2 - 1)), where di = R1 - R2, R1 and R2 are the ranks of the Xi's and Yi's respectively, and n is the number of pairs of observations.

Order of Assigning the Ranks : The ranks may be assigned to the different values in either ascending or descending order. Whichever order is chosen, the same order of ranking must be followed for both variables.

Tied Ranks : In some situations the same value (identical item) is repeated more than once in a series. Such items are given the same rank, and the ranks are called Tied Ranks. Each identical item is given the average of the ranks the items would have received had they been different. This introduces some error into the formula, which can be overcome by adding the correction factor m(m^2-1)/12 to ∑di^2 for each group of tied items, where m is the number of identical items in the group.
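
The ranking rules and the tie correction can be put together in a short Python sketch. The function names are hypothetical, ranks are assigned in descending order (largest value gets rank 1) as one of the two permitted choices, and the correction factor is added to ∑di^2 once for every group of tied values in either series.

```python
from collections import Counter


def average_ranks(values):
    """Rank in descending order; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of m identical values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_rho(xs, ys):
    """rho = 1 - 6 * (sum of d^2 + tie corrections) / (n(n^2 - 1))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    sum_d2 += tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


print(average_ranks([10, 20, 20, 30]))                   # [4.0, 2.5, 2.5, 1.0]
print(spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))    # -1.0
```

In the first call the two 20s would have taken ranks 2 and 3, so each receives 2.5. In the second call the two rankings are exactly reversed, which gives ρ = -1.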

IV. Concurrent Deviation Method : - The correlation coefficient obtained by the concurrent deviation method is r =

2 comments:

  1. Could you please post the derivation of the formula of concurrent deviation method... It would be of great help

  2. r = ± Square root of ((2c - m) / m)

    where m = number of pairs of deviations compared (one less than the number of observations)
    c = number of concurrent deviations, i.e., the number of positive signs in the product-of-deviations column
