Measure of Central Position (grouped data)
To describe a set of quantitative data, it is necessary to define numerical measures that summarize its essential characteristics. Any measure that indicates the center of a set of data arranged in order of magnitude is called a measure of central position, or measure of central tendency. The most commonly used measures of central position are the mean, median, and mode.
Mean
Observe the following achievement scores of pupils in mathematics: 18, 19, 20, 21, 22, 23, 24, 25, 26 and 75. If you add all the scores and divide the sum by the number of pupils, the mean is 273 ÷ 10 = 27.3. This figure is not a representative value, since every score is less than 27.3 except that of the pupil who obtained 75. The example illustrates one property of the mean: "the mean is strongly influenced by extreme values."
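The effect of the extreme score can be checked with a short Python sketch. It uses the standard `statistics` module; the variable names are only illustrative. It compares the mean of all ten scores with the mean of the same scores when 75 is left out.

```python
from statistics import mean

# Achievement scores from the example above.
scores = [18, 19, 20, 21, 22, 23, 24, 25, 26, 75]

mean_all = mean(scores)               # includes the extreme score 75
mean_without_75 = mean(scores[:-1])   # same scores with 75 left out

print(mean_all)
print(mean_without_75)
```

Dropping the single score of 75 moves the mean from 27.3 down to 22, which sits in the middle of the remaining scores.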
For grouped data, the mean can be computed using either the long method or the coded deviation method (short method). For this discussion we will focus on the coded deviation method.
In the coded deviation method, the original observations are converted to coded deviations (d'). First, choose an assumed mean $\bar{x}_a$. Any reasonable class midpoint in the distribution will do, but generally the midpoint of the class with the highest frequency is taken. Each class then receives a coded deviation d' that counts how many class intervals its midpoint lies from the assumed mean. We use this equation,
$$\bar{x} = \bar{x}_a + \left(\frac{\sum fd'}{N}\right) i$$
where $\bar{x}_a$ is the assumed mean, $\sum fd'$ is the sum of the products of each frequency and its coded deviation, N is the total number of observations, and i is the length of the class interval. To illustrate this, see the example below:
Class Interval | x | f | d' | fd'
118 – 126 | 122 | 3 | -3 | -9
127 – 135 | 131 | 5 | -2 | -10
136 – 144 | 140 | 9 | -1 | -9
145 – 153 | 149 | 12 | 0 | 0
154 – 162 | 158 | 5 | +1 | +5
163 – 171 | 167 | 4 | +2 | +8
172 – 180 | 176 | 2 | +3 | +6
Total | | N = 40 | | Σfd' = -9
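With the table's values, the coded deviation method gives 149 + (-9 ÷ 40)(9) = 149 - 2.025 = 146.975. The Python sketch below repeats that arithmetic. It assumes the assumed mean is 149 (the midpoint of the class with the highest frequency) and a class length of 9. The variable names are illustrative, and the long method is included only as a cross-check.

```python
# Class midpoints (x) and frequencies (f) from the table above.
midpoints = [122, 131, 140, 149, 158, 167, 176]
frequencies = [3, 5, 9, 12, 5, 4, 2]

assumed_mean = 149   # midpoint of the class with the highest frequency
class_width = 9      # e.g. 118 to 126 covers nine whole-number values

# Coded deviation d' = (x - assumed mean) / i for each class.
coded_deviations = [(x - assumed_mean) // class_width for x in midpoints]

n = sum(frequencies)
sum_fd = sum(f * d for f, d in zip(frequencies, coded_deviations))

# Mean = assumed mean + (sum of fd' / N) * i
coded_mean = assumed_mean + (sum_fd / n) * class_width

# Long method for comparison: sum of f * x divided by N.
long_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

print(coded_deviations, n, sum_fd)
print(coded_mean, long_mean)
```

Both methods agree, because coding the deviations only rescales the midpoints before averaging and the formula undoes that rescaling.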
Median
Another measure of central position is the median. Observe the following distributions:
a) 2, 3, 8, 10, 16, 17, 18
b) 2, 3, 8, 10, 16, 17
What is the middle value of a and of b? If your answer is 10 for a and 9 for b, you are correct. The values 10 and 9 are the medians of sets a and b, respectively. This example leads to the description of the median as the middle measurement, item, or value in a set of measurements arranged in increasing or decreasing order. For set a the median is easily identified, since the number of items is odd. Set b has an even number of items, so there are two middle values (8 and 10). To get the median for set b, add these two values and divide the sum by 2. Thus, (8+10)÷2=9.
Moreover, the median is a positional measure. It depends on the position of the items in the ordered distribution, not on how large or small the outlying items are. For example, in the distribution 2, 4, 5, 6, 7, 15, 37, the median is 6 despite the two deviant values (15 and 37). This means that the median is not affected by extreme values. Because of this, the median can be considered an appropriate measure if you do not want extreme values to influence the average.
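The three ungrouped examples can be verified with Python's standard `statistics.median`, which sorts the data and averages the two middle values when the count is even. This is a minimal sketch. The variable names are illustrative, and the last list (with 37 replaced by 370) is an added illustration that is not part of the original examples.

```python
from statistics import median

set_a = [2, 3, 8, 10, 16, 17, 18]     # odd number of items
set_b = [2, 3, 8, 10, 16, 17]         # even number of items
with_outliers = [2, 4, 5, 6, 7, 15, 37]

median_a = median(set_a)
median_b = median(set_b)              # average of the two middle values 8 and 10
median_outliers = median(with_outliers)

# Making the largest value even more extreme leaves the median unchanged.
median_more_extreme = median([2, 4, 5, 6, 7, 15, 370])

print(median_a, median_b, median_outliers, median_more_extreme)
```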
For grouped data, that is, when data are given in frequency distribution form, we first determine the class interval that contains the N/2th case. This is the class that holds the value dividing the distribution into two equal parts, with half of the scores above the median and half below it. The table below presents the frequency distribution of 38 scores, so N/2 = 38/2 = 19. Counting from the lowest scores, the less-than cumulative frequency (<cf) first reaches 19 in the interval 60-64, where <cf = 23. Counting from the highest scores, the greater-than cumulative frequency (>cf) first reaches 19 in the same interval, where >cf = 25.
SCORES | F | <cf | >cf
40-44 | 1 | 1 | 38
45-49 | 1 | 2 | 37
50-54 | 4 | 6 | 36
55-59 | 7 | 13 | 32
60-64 | 10 | 23 | 25
65-69 | 9 | 32 | 15
70-74 | 3 | 35 | 6
75-79 | 1 | 36 | 3
80-84 | 2 | 38 | 2
Total | N=38 | |
If the median is taken from above, that is, using the >cf column, we look for the class in which the N/2 = 19th case falls. This is done by accumulating the frequencies upward from the bottom of the table (from the highest scores) until the 19th case is reached. In our example, the 19th case lies in the interval 60-64, whose boundaries are 59.5 and 64.5, thus the following equation can be used:
$$\text{Median} = U - \left(\frac{\frac{N}{2} - F_{ub}}{f_m}\right) i$$
where U refers to the upper boundary of the class in which the median lies; N/2 is half of the total number of observations; Fub is the sum of all frequencies above the upper boundary; fm is the frequency of the median class; and i is the length of the interval. Here U = 64.5, Fub = 15, fm = 10, and i = 5. Thus, the median is:
$$\text{Median} = 64.5 - \left(\frac{19 - 15}{10}\right)(5) = 64.5 - 2 = 62.5$$
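If we take the median from below, we use the <cf column, that is, we accumulate the frequencies from the top of the table downward (from the lowest scores). We follow the same procedure as for the median from above, only we use this equation:
$$\text{Median} = L + \left(\frac{\frac{N}{2} - F_{lb}}{f_m}\right) i$$
where L is the lower boundary of the median class; N/2 is half of the total number of observations; Flb is the sum of all frequencies below the lower boundary; fm is the frequency of the median class; and i is the length of the interval. Here L = 59.5, Flb = 13, fm = 10, and i = 5. Thus, the median from below is:
$$\text{Median} = 59.5 + \left(\frac{19 - 13}{10}\right)(5) = 59.5 + 3 = 62.5$$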
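As a check that both approaches give the same value, here is a minimal Python sketch applied to the 38 scores. It assumes the usual interpolation forms, Median = L + ((N/2 - Flb) / fm) i from below and Median = U - ((N/2 - Fub) / fm) i from above, and class boundaries that lie 0.5 beyond the class limits. The variable names are illustrative.

```python
# Frequency distribution of the 38 scores, lowest class first.
lower_limits = [40, 45, 50, 55, 60, 65, 70, 75, 80]
frequencies = [1, 1, 4, 7, 10, 9, 3, 1, 2]
class_width = 5

n = sum(frequencies)
half = n / 2

# Locate the median class: the first class whose <cf reaches N/2.
cumulative = 0
for index, f in enumerate(frequencies):
    if cumulative + f >= half:
        break
    cumulative += f

f_median = frequencies[index]
lower_boundary = lower_limits[index] - 0.5      # L
upper_boundary = lower_boundary + class_width   # U

freq_below = cumulative                  # Flb: frequencies below L
freq_above = n - cumulative - f_median   # Fub: frequencies above U

median_from_below = lower_boundary + ((half - freq_below) / f_median) * class_width
median_from_above = upper_boundary - ((half - freq_above) / f_median) * class_width

print(lower_boundary, upper_boundary, freq_below, freq_above)
print(median_from_below, median_from_above)
```

The two results coincide because Flb + fm + Fub = N, so moving 6 of the 10 cases up from 59.5 is the same as moving 4 of the 10 cases down from 64.5.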
Mode
The mode, on the other hand, is the simplest measure of central position, simplest in the sense that it can be easily identified. In ungrouped data, the item that occurs most often is the mode. Which is the mode in this set of scores: 17, 18, 18, 20, 21? The score occurring most often in the set is 18, so 18 is the mode. A distribution with one mode is said to be unimodal, while a distribution with two or more modes is said to be multimodal. Below are samples showing unimodal and multimodal distributions:
Sample 1 | Sample 2 | Sample 3
21 | 21 | 16
20 | 21 | 14
19 | 19 | 16
19 | 19 | 15
19 | 17 | 14
17 | 16 | 15
15 | 15 | 17
Sample 1 is an example of a unimodal distribution, where 19 is the item occurring most often. Samples 2 and 3 are examples of multimodal distributions, where more than one item shares the highest frequency. What are the modes in samples 2 and 3? For sample 2 the modes are 21 and 19, while in sample 3 the modes are 16, 15 and 14.
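The modes of the three samples can be confirmed with `statistics.multimode` from Python's standard library (available from Python 3.8), which returns every value that shares the highest count. This is only a sketch with illustrative variable names.

```python
from statistics import multimode

sample_1 = [21, 20, 19, 19, 19, 17, 15]
sample_2 = [21, 21, 19, 19, 17, 16, 15]
sample_3 = [16, 14, 16, 15, 14, 15, 17]

# multimode lists the modes in the order they first appear in the data,
# so the results are sorted here for easier reading.
modes_1 = sorted(multimode(sample_1))
modes_2 = sorted(multimode(sample_2))
modes_3 = sorted(multimode(sample_3))

print(modes_1, modes_2, modes_3)
```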
When data are grouped, the mode is defined as the midpoint of the interval containing the largest number of cases. A more refined modal value can also be computed from grouped data using this equation:
$$\text{Mode} = L_{mo} + \left(\frac{f_{mo} - f_1}{2f_{mo} - f_1 - f_2}\right) i$$
where Lmo refers to the lower boundary of the modal class (the class interval with the highest frequency); fmo is the frequency of the modal class; f1 is the frequency of the class just above the modal class in the table (the preceding class when the classes are listed in ascending order); f2 is the frequency of the class just below the modal class (the succeeding class); and i refers to the length of the class interval.
Let's find the mode using the same data.
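The worked computation is not shown here, so the Python sketch below rests on two assumptions: that "the same data" means the frequency distribution of 38 scores used for the median, and that the modal value follows the common interpolation form Mode = Lmo + ((fmo - f1) / (2 fmo - f1 - f2)) i, with f1 and f2 taken from the classes just before and just after the modal class. The variable names are illustrative.

```python
# Assumed data: the frequency distribution of 38 scores, lowest class first.
lower_limits = [40, 45, 50, 55, 60, 65, 70, 75, 80]
frequencies = [1, 1, 4, 7, 10, 9, 3, 1, 2]
class_width = 5

# Modal class: the class with the highest frequency.
modal_index = frequencies.index(max(frequencies))
f_mo = frequencies[modal_index]

# Neighbouring frequencies; a missing neighbour is treated as 0 (an assumption).
f_1 = frequencies[modal_index - 1] if modal_index > 0 else 0
f_2 = frequencies[modal_index + 1] if modal_index < len(frequencies) - 1 else 0

lower_boundary = lower_limits[modal_index] - 0.5   # Lmo

# Crude mode: midpoint of the modal class.
crude_mode = lower_boundary + class_width / 2

# Interpolated mode under the assumed formula.
mode = lower_boundary + ((f_mo - f_1) / (2 * f_mo - f_1 - f_2)) * class_width

print(lower_limits[modal_index], f_mo, f_1, f_2)
print(crude_mode, mode)
```

Under these assumptions the modal class is 60-64 with fmo = 10, f1 = 7 and f2 = 9, giving a midpoint of 62 and an interpolated mode of 59.5 + (3 ÷ 4)(5) = 63.25.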