Wednesday, February 26, 2014
Measure of Central Tendency
We will discuss the measures of central tendency. A measure of central tendency is also called an 'average'. An average gives a central value of the given data. It gives us an idea about the whole data.
Types of Averages:
1. Arithmetic Mean (A.M.) 2. Median 3. Mode 4. Geometric Mean (G.M.) 5. Harmonic Mean (H.M.)
A.M.: It is denoted by Me and is defined for quantitative data. The mean is a stable measure.
Properties of Mean:
1. The sum of the deviations of a set of values from their mean is zero.
2. The sum of the squares of the deviations of a set of values is minimum when the deviations are taken from the mean.
3. Combined mean exists: if the Ist group has mean x̄1 and n1 observations and the IInd group has mean x̄2 and n2 observations, then the combined mean = (n1 x̄1 + n2 x̄2) / (n1 + n2).
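The three properties of the mean can be checked numerically. The following is a minimal Python sketch; the sample values and the helper names combined_mean and sum_sq_dev are illustrative assumptions, not part of the original notes.

```python
from statistics import mean

def combined_mean(n1, mean1, n2, mean2):
    """Combined mean of two groups, computed from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

def sum_sq_dev(data, a):
    """Sum of squared deviations of the data from an arbitrary value a."""
    return sum((x - a) ** 2 for x in data)

# Hypothetical sample data, chosen only for illustration.
group1 = [2, 4, 6, 8]
group2 = [10, 20]

m1 = mean(group1)

# Property 1: the deviations from the mean sum to zero.
sum_of_deviations = sum(x - m1 for x in group1)

# Property 2: the sum of squared deviations is smaller about the mean
# than about any other value (here compared with m1 + 1).
about_mean = sum_sq_dev(group1, m1)
about_other = sum_sq_dev(group1, m1 + 1)

# Property 3: the combined mean equals the mean of the pooled data.
pooled = combined_mean(len(group1), m1, len(group2), mean(group2))

print(sum_of_deviations, about_mean, about_other, pooled)
```

The combined mean needs only the group sizes and group means, which is why the mean is said to be capable of further algebraic treatment.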
Uses:
1. Mean is simple to understand and easy to calculate.
2. It is rigidly defined and so its value is always definite.
3. Its computation is based on all the observations.
4. It is also capable of being handled algebraically, i.e. its further algebraic treatment is possible.
5. Its value is least affected by sampling fluctuations.
Demerits:
1. The value of A.M. is highly affected by extreme values of the variable.
2. A.M. cannot be calculated if even a single observation in the series is missing.
3. A.M. can be a value that does not exist in the series.
4. A.M. cannot be determined if the data contain open-end classes.
Median: It is denoted by Md. The median is the value of the middle item of the arranged data and divides the series into two equal parts: the number of items less than Md and the number of items greater than Md are the same. The median is a positional average.
Individual Series:-
Step i) Arrange the data in ascending or descending order.
Step ii) Identify the number of values N.
If N is odd then Md = the ((N+1)/2)th term.
If N is even then Md = the average of the (N/2)th and ((N/2) + 1)th terms.
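The steps for an individual series can be sketched in Python as follows. The function name median_individual is a hypothetical helper, and for even N the sketch takes the average of the two middle terms, which is the usual convention.

```python
def median_individual(values):
    """Median of an individual (ungrouped) series."""
    ordered = sorted(values)      # Step i: arrange in ascending order
    n = len(ordered)              # Step ii: number of values N
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        # N odd: the (N+1)/2-th term, which is index mid when counting from 0
        return ordered[mid]
    # N even: average of the N/2-th and (N/2 + 1)-th terms
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_individual([7, 3, 9]))        # odd number of values
print(median_individual([4, 1, 3, 2]))     # even number of values
```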
Discrete Series: -
Step i) Arrange the xi’s in ascending order
Step ii) Find less than cumulative frequencies (L.C.F).
Step iii) Identify the L.C.F. just greater than N/2; the corresponding xi is the median.
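For a discrete series, the cumulative-frequency steps might look like this in Python. The function name median_discrete and the frequency table are assumptions made for illustration.

```python
def median_discrete(values, freqs):
    """Median of a discrete frequency distribution using less-than cumulative frequencies."""
    pairs = sorted(zip(values, freqs))    # Step i: arrange the xi's in ascending order
    total = sum(f for _, f in pairs)      # N = total frequency
    cumulative = 0
    for x, f in pairs:
        cumulative += f                   # Step ii: running less-than cumulative frequency
        if cumulative > total / 2:        # Step iii: first L.C.F. greater than N/2
            return x
    raise ValueError("total frequency must be positive")

# Hypothetical data: N = 24, N/2 = 12, L.C.F. = 3, 8, 16, 22, 24.
xs = [1, 2, 3, 4, 5]
fs = [3, 5, 8, 6, 2]
print(median_discrete(xs, fs))
```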
Continuous Series: -
Step i) Arrange the class intervals in ascending order.
Step ii) Find less than cumulative frequencies (L.C.F).
Step iii) Identify L.C.F just greater than N/2, the corresponding class is Median class.
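Once the median class is located, the median itself is usually obtained by interpolating inside that class. The sketch below assumes the standard textbook interpolation formula Md = l + ((N/2 – c.f.) / f) × h, where l is the lower limit of the median class, c.f. the cumulative frequency before it, f its frequency and h its width. The formula, the function name and the data are assumptions, since the notes do not state them.

```python
def median_continuous(classes, freqs):
    """Median of a continuous series.

    classes: list of (lower, upper) class limits in ascending order.
    freqs:   matching list of class frequencies.
    """
    total = sum(freqs)                    # N
    cumulative = 0                        # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f > total / 2:    # L.C.F. just greater than N/2: median class
            width = upper - lower
            return lower + (total / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("total frequency must be positive")

# Hypothetical data: N = 30, N/2 = 15, L.C.F. = 5, 13, 25, 30.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]
md = median_continuous(classes, freqs)
print(md)
```

Here the median class is 20 to 30, so the median is 20 + (15 – 13) / 12 × 10.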
Uses:
1. It is easy to calculate and understand. In some cases it can be located simply by inspection.
2. It is a truly defined average as it is the central position of the given data
3. Median can be defined for qualitative data.
4. It is not affected by extreme observations.
5. The value of median can also be located graphically.
6. Median is especially useful in the case of open-end distribution.
Demerits:
1. For locating median, data have to be arranged in ascending or descending order which is quite tedious for a long series of observations.
2. Its determination is not based on all the observations.
3. It is not capable of further algebraic treatment.
4. Its value is affected by fluctuations of sampling.
5. In a continuous series, median is calculated by using an interpolation formula.
Mode: It is the value which possesses the maximum frequency, i.e. the value of the variable which occurs most frequently in the distribution. The mode is the modal value. It is denoted by Mo.
I.S:- Mode is the value which has the maximum frequency in the data set.
D.S:- Mode is the value of xi which corresponds to the maximum frequency.
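C.S:- Step 1: First we identify the modal class, i.e. the class with the highest frequency.
Step 2: Mo = l + [(f – f1) / (2f – f1 – f2)] × h
Where, l is the lower limit of the modal class,
f is the frequency of the modal class,
f1, f2 are the frequencies of the classes preceding and succeeding the modal class,
h is the width of the modal class.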
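A short Python sketch of the two steps for a continuous series follows. It assumes the usual grouped-data formula Mo = l + (f – f1) / (2f – f1 – f2) × h with h the width of the modal class. The function name and the frequency table are hypothetical.

```python
def mode_continuous(classes, freqs):
    """Mode of a continuous series with (lower, upper) classes of equal width."""
    # Step 1: modal class = class with the highest frequency (first one if tied).
    i = max(range(len(freqs)), key=lambda k: freqs[k])
    lower, upper = classes[i]
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                 # preceding class frequency
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # succeeding class frequency
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when f = f1 = f2")
    # Step 2: interpolate inside the modal class.
    return lower + (f - f1) / denominator * (upper - lower)

# Hypothetical data: modal class 20-30 with f = 12, f1 = 8, f2 = 5.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]
mo = mode_continuous(classes, freqs)
print(mo)
```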
Uses:
1. It is comparatively easy to understand.
2. It is the simplest descriptive measure of average.
3. It is not affected by extreme items. It can be obtained even if the extreme values are not given.
4. It can be determined for open-end distributions.
5. Mode has been defined as the most typical value of a distribution. Therefore, it is a useful average for many practical situations, such as, average size of shoe, average price of a commodity, the average type of dress, average wages and so on.
Demerits:
1. It is not precisely defined.
2. It is not based on all the observations.
3. It is not capable of being handled algebraically as its value is not based on all the observations.
4. The mode does not exist in many cases, while there may be more than one mode in other cases, i.e. it is not useful as an average in such situations.
5. The value of mode is significantly affected by the size of the class-interval which is the basis of grouping the frequencies.
6. If the data contain a single mode, the distribution is called unimodal.
If the data contain two modes, it is called a bimodal distribution.
If the data contain more than two modes, it is called a multimodal distribution.
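Geometric mean: - It is denoted by G.
I.S:- G = (x1 × x2 × … × xn)^(1/n)
D.S:- G = (x1^f1 × x2^f2 × … × xn^fn)^(1/N), where N = f1 + f2 + … + fn
C.S:- The same formula as for D.S., with xi taken as the mid-value of the ith class.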
Merits:
1. G.M. is highly useful in averaging ratios, percentages and rates of increase between two periods.
2. G.M. is important in the construction of index numbers.
3. It is based on all observations.
4. It is rigidly defined.
5. It is less affected by the extreme values.
6. It is useful in studying economic and social data.
Demerits:
1. It is difficult to understand.
2. Its calculation is difficult for persons without a mathematical background.
3. It has restricted applications.
4. If any one of the observations is zero then G.M. is zero.
Harmonic mean: It is denoted by H.
I.S:- H.M. is the reciprocal of the average of the reciprocals of the observations, i.e. H = n / (1/x1 + 1/x2 + … + 1/xn).
Merits:
1. It is rigidly defined.
2. Its computation is based on all the observations.
3. It is capable of further algebraic treatment.
4. It is also not affected much by sampling fluctuations.
5. It is very useful for measuring average relative changes in certain types of rates or ratios.
Demerits:
1. It is not easy to understand.
2. It is rather complicated to calculate.
3. It gives more weight to small observation and thus may lead to fallacious results. However, in view of this property, the harmonic mean is more useful when more weights are to be given to smaller observations.
4. If any one of the observations is zero then H.M. cannot be computed.
Relation between Mean, Median and Mode:
i) If the data is symmetric then Mean = Median = Mode.
ii) If the data is moderately asymmetric (not symmetric) then, approximately, Mode = 3 Median – 2 Mean.
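Relation between A.M., G.M. and H.M.:
i) If the observations are all the same then A.M. = G.M. = H.M.
If the observations are positive and not all the same then A.M. > G.M. > H.M.
Generally, A.M. ≥ G.M. ≥ H.M.
ii) Two numbers: If a and b are two positive numbers then A.M. = (a + b) / 2, G.M. = √(ab) and H.M. = 2ab / (a + b), so that (G.M.)² = A.M. × H.M.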
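The relation between the three means can be checked numerically for positive observations. The helper names below and the numbers 4 and 9 are illustrative assumptions. For two positive numbers the sketch also checks the standard identity (G.M.)² = A.M. × H.M.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # n-th root of the product, computed through logarithms (needs all x > 0).
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # Reciprocal of the average of the reciprocals (needs all x != 0).
    return len(xs) / sum(1 / x for x in xs)

a, b = 4, 9                      # hypothetical pair of positive numbers
am = arithmetic_mean([a, b])     # (4 + 9) / 2
gm = geometric_mean([a, b])      # sqrt(4 * 9)
hm = harmonic_mean([a, b])       # 2 * 4 * 9 / (4 + 9)

print(am, gm, hm)
print(gm ** 2, am * hm)          # the two products agree up to rounding
```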
Measure of Dispersion
Dispersion measures the extent to which the items vary from some central value. Dispersion is also called scatter, spread or variation.
Measure of Dispersion: It expresses quantitatively the degree of variation but not the direction of variation. It is called a 2nd degree average. The measures are
1. Range 2. Quartile Deviation 3. Mean Deviation 4. Variance and Standard Deviation
1. RANGE: It is denoted by R.
I.S.: R = Max(xi) – Min(xi)
D.S.: R = Max(xi) – Min(xi)
C.S.: R = Upper bound of last class – Lower bound of first class
Merits:
1. It is simple to understand.
2. It is easy to calculate and quickly provides a broad picture of the scatter in the data.
Demerits:
1. Its computation is not based on all the observations.
2. It is affected by extreme items.
3. It is very much influenced by sampling fluctuations.
4. Range is a crude measure of dispersion. It does not tell us about the variation in the observations relative to the average.
5. Range cannot be calculated for open-end classes.
2. Quartile Deviation: It is denoted by Q.D. It is also called the semi-interquartile range.
Q.D. = (Q3 – Q1) / 2
Quartiles: The whole data can be divided into 4 equal parts by using 3 quartiles, namely Q1, Q2 and Q3.
Merits:
1. Q.D. is simple to calculate.
2. Q.D. is easy to understand.
3. Q.D. is useful for measuring variations in open-end classes.
4. It is also very useful when extreme items are likely to affect the analysis.
Demerits:
1. The calculation of Q.D. is not based on all observations.
2. It is not possible to give it further algebraic treatment.
3. It is very much affected by sampling fluctuations.
3. Mean Deviation: It is denoted by M.D. It is the average of the absolute deviations of the observations from a central value (mean, median or mode).
Merits:
1. M.D. is simple to calculate.
2. M.D. is easy to understand.
3. Its computation is based on all the observations.
4. It is less affected by extreme observations than the standard deviation.
Demerits:
1. In the calculation of mean deviation, the algebraic signs of the deviations are ignored, therefore its definition is non-algebraic. In view of this, mean deviation is not used in further statistical calculations.
2. It is not a well defined measure, as any of the central values can be used in its computation. Moreover, the mean deviations calculated from different averages (mean, median or mode) will not be the same.
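4. Variance: It is the mean of the squared deviations of the observations from their arithmetic mean.
Standard deviation: It is the positive square root of the variance. The concept of S.D. was first introduced by Karl Pearson. It is one of the most popular and important measures of dispersion. It satisfies most of the properties of a good measure of dispersion.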
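Properties:
- Combined Variance: If two groups have n1 and n2 observations, means x̄1 and x̄2, and variances σ1² and σ2², then the combined variance = [n1(σ1² + d1²) + n2(σ2² + d2²)] / (n1 + n2), where d1 = x̄1 – x̄, d2 = x̄2 – x̄ and x̄ is the combined mean.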
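The variance, the standard deviation and the combined variance of two groups can be sketched in Python as follows. The sketch assumes the population form of the variance (divisor n) and the standard combined-variance formula [n1(σ1² + d1²) + n2(σ2² + d2²)] / (n1 + n2), where d1 and d2 are the differences between each group mean and the combined mean. The function names and data are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Population variance: mean of squared deviations from the mean (divisor n)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def std_dev(xs):
    """Standard deviation: positive square root of the variance."""
    return math.sqrt(variance(xs))

def combined_variance(n1, m1, v1, n2, m2, v2):
    """Variance of two pooled groups from their sizes, means and variances."""
    m = (n1 * m1 + n2 * m2) / (n1 + n2)     # combined mean
    d1, d2 = m1 - m, m2 - m                 # shifts of the group means
    return (n1 * (v1 + d1 ** 2) + n2 * (v2 + d2 ** 2)) / (n1 + n2)

# Hypothetical sample data.
group1 = [2, 4, 6, 8]
group2 = [10, 20]

cv = combined_variance(len(group1), mean(group1), variance(group1),
                       len(group2), mean(group2), variance(group2))
print(variance(group1), std_dev(group1), cv)
```

The combined value agrees with the variance computed directly from the pooled observations, which illustrates why the S.D. is said to be amenable to further algebraic treatment.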
Merits:
1. It is rigidly defined.
2. Its computation is based on all the observations.
3. It is amenable to further algebraic treatment, which makes it the most important and widely used measure of dispersion. For example, the S.D. is used in computing skewness, correlation etc. It is an important statistical measure in sampling theory.
4. Among all the measures of dispersion, it is least affected by sampling fluctuations.
Demerits:
1. S.D. is comparatively difficult to calculate.
2. It gives greater weight to extreme observations.
3. It is an absolute measure of dispersion and cannot be used for comparing the variability of two or more distributions expressed in different units.
Relative Measure of Dispersion:
1. Coefficient of Variation = C.V. = (S.D. / Mean) × 100
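Because the coefficient of variation is unit-free, it can compare series measured in different units. The sketch below applies C.V. = (S.D. / Mean) × 100 to the height and weight figures used in the correlation example of these notes. The function name is hypothetical, and the population S.D. (divisor n) is an assumption.

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """C.V. in percent: population standard deviation relative to the mean."""
    m = mean(data)
    if m == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return pstdev(data) / m * 100

height_cm = [165, 150, 155, 170, 163, 162]
weight_kg = [40, 45, 50, 60, 48, 46]

cv_height = coefficient_of_variation(height_cm)
cv_weight = coefficient_of_variation(weight_kg)
print(round(cv_height, 2), round(cv_weight, 2))
```

The series with the larger C.V. is the more variable one, here the weights, even though the two series are in different units.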
Correlation Analysis
Bi-variate distribution: A distribution in which two variables are observed together on each unit.
X : x1, x2, …, xn
Y : y1, y2, …, yn
Ex: The bivariate distribution of the height and weight of the students is
Height cm : 165 150 155 170 163 162
Weight kg : 40 45 50 60 48 46
Def: The relation between two variables is called correlation. In other words, if a change in one variable is accompanied by a change in the other variable, the variables are said to be correlated.
Ex: i) Height and Weight ii) Price and Demand iii) Yield and Rainfall
Types of correlation
1. Positive correlation: If the two variables deviate/change in the same direction, i.e., if an increase (or decrease) in one variable results in a corresponding increase (or decrease) in the other variable, the correlation is said to be direct or positive.
Ex: Income and Expenditure, Height and Weight
2. Negative correlation: If the two variables deviate/change in opposite directions, i.e., if an increase (or decrease) in one variable results in a corresponding decrease (or increase) in the other variable, the correlation is said to be inverse or negative.
Ex: Price and Demand, Volume and Pressure of a perfect gas
3. Perfect +ve correlation: Both variables change in the same direction and in the same proportion.
4. Perfect –ve correlation: Both variables change in opposite directions and in the same proportion.
5. Uncorrelation: If a change in one variable does not affect the other variable, the variables are said to be uncorrelated.
Measurement of correlation:
1. Scatter diagram: If one variable X is plotted on the X-axis and the other one is plotted on the Y-axis, then each paired observation gives one point on the graph. The diagram of dots so obtained is called a scatter diagram.
This diagram gives an immediate and fairly good picture of how the two variables are mutually related. If the points are very dense, i.e. close to one another, the two variables are highly correlated. If the points are widely scattered, the correlation is poor or weak. It gives only the direction of the correlation but not the degree of correlation.
2. Karl Pearson correlation coefficient: It is denoted by rXY. It is also called the 'product moment correlation coefficient'. It gives us both the nature and the degree of correlation.
Properties:
i. K.p.c.c. measures the linear relationship between two variables.
ii. The value of k.p.c.c. always lies between -1 and +1, i.e. -1 ≤ rXY ≤ +1.
iii. If rXY > 0 then +ve correlation. If rXY < 0 then –ve correlation.
If rXY = +1 then perfect +ve correlation. If rXY = -1 then perfect –ve correlation.
If rXY = 0 then uncorrelated.
iv. rXY = rYX
v. Let the new variables be U = (X – a) / h and V = (Y – b) / k with h, k > 0; then rUV = rXY, i.e. the coefficient is independent of change of origin and scale.
vi. If X and Y are two independent random variables then rXY = 0, but the converse need not be true.
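Several of these properties can be checked on the height and weight data from the example earlier in the notes. The sketch assumes the usual product-moment formula r = Σ(x – x̄)(y – ȳ) / √[Σ(x – x̄)² × Σ(y – ȳ)²]. The function name pearson_r and the constants used for the change of origin and scale are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson product-moment correlation coefficient."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a series is constant")
    return sxy / math.sqrt(sxx * syy)

height = [165, 150, 155, 170, 163, 162]
weight = [40, 45, 50, 60, 48, 46]

r = pearson_r(height, weight)

# Symmetry: r is the same whichever variable is called X.
r_swapped = pearson_r(weight, height)

# Change of origin and scale (arbitrary constants, positive divisors).
u = [(x - 160) / 5 for x in height]
v = [(y - 45) / 2 for y in weight]
r_uv = pearson_r(u, v)

# Zero correlation without independence: Y is fully determined by X.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r_dependent = pearson_r(xs, ys)

print(round(r, 4), round(r_swapped, 4), round(r_uv, 4), r_dependent)
```

The last case shows why r = 0 does not imply independence: r only captures the linear part of a relationship.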
Merits:
1. It is the most important and popular method for measuring the relationship between two variables. It gives a precise, quantitative value that indicates the degree of relationship existing between two variables.
2. It measures the direction as well as the degree of the relation between two variables.
Demerits:
1. The value of the coefficient is affected by extreme items.
2. Its computational procedure is difficult as compared to other methods.
3. The coefficient's value lies between -1 and +1, therefore its value needs careful interpretation.
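3. Spearman rank correlation: It is denoted by ρ.
Properties:
i. The value of S.r.c.c. always lies between -1 and +1, i.e. -1 ≤ ρ ≤ +1.
ii. S.r.c.c. gives us both the nature and the degree of correlation.
iii. Repeated Ranks: If some values among the xi's or yi's are equal, each of them is given the same rank, namely the average of the ranks they would otherwise occupy. This introduces an error in the formula, which is rectified by adding a correction factor (C.F.) of m(m² – 1) / 12 to Σd² for each group of m tied values. The S.R.C.C. formula becomes ρ = 1 – 6[Σd² + C.F.] / [n(n² – 1)].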
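A minimal Python sketch of the rank correlation, including repeated ranks, is given below. It assumes the usual formula ρ = 1 – 6(Σd² + C.F.) / (n(n² – 1)), where d is the difference between paired ranks and C.F. adds m(m² – 1) / 12 for every group of m tied values in either series. The function names are hypothetical, and the data are the height and weight figures from the example earlier in the notes.

```python
from collections import Counter

def average_ranks(values):
    """Ascending ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def correction_factor(values):
    """Sum of m(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def spearman_rho(xs, ys):
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    cf = correction_factor(xs) + correction_factor(ys)
    return 1 - 6 * (sum_d2 + cf) / (n * (n * n - 1))

height = [165, 150, 155, 170, 163, 162]
weight = [40, 45, 50, 60, 48, 46]
rho = spearman_rho(height, weight)        # no ties here, so C.F. = 0
print(round(rho, 4))
print(average_ranks([10, 20, 20, 30]))    # the two 20s share rank 2.5
```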
Merits:
1. The method is simple and easily understandable as compared to Karl Pearson's method.
2. The method is especially useful if the data is qualitative.
3. The method can be applied to irregular data also, as it does not assume that the data should be normal.
Demerits:
1. The method can be applied to ungrouped data only.
2. The ranking procedure involved in this method ignores the actual magnitude of the data and as such, the results obtained are only approximate.
3. The computation procedure becomes difficult as the number of paired observations increases.
Probable Error: It is a measure of the reliability of the correlation coefficient.
If r < P.E.(r), there is no evidence of correlation, i.e. r is not significant.
If r > 6 P.E.(r), the correlation is significant.
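The two rules can be sketched in Python as follows. The notes do not state how P.E.(r) is computed, so the sketch assumes the standard textbook formula P.E.(r) = 0.6745 × (1 – r²) / √n, where n is the number of paired observations. The function names and the label "inconclusive" for values between the two limits are assumptions.

```python
import math

def probable_error(r, n):
    """Probable error of r for n paired observations (assumed textbook formula)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)

def interpret(r, n):
    """Apply the two rules to the magnitude of r."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive"

# Hypothetical values of r and n, chosen only for illustration.
print(probable_error(0.9, 25), interpret(0.9, 25))
print(probable_error(0.05, 9), interpret(0.05, 9))
print(interpret(0.39, 6))
```

With only a few pairs the probable error is large, so a moderate r often falls between the two limits and no firm conclusion can be drawn.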