There are many ways of summarizing the data you have gathered. One, in particular, is the Frequency Distribution Table (FDT). The frequency distribution is a useful summary of most kinds of data. It sorts observations into categories and describes how often observations fall into each category. In simple terms, a frequency distribution is a tabular arrangement of data by classes or categories together with their corresponding class frequencies. Data presented in the form of a frequency distribution are called grouped data.
In constructing a frequency distribution table, consideration must be given to the number of classes (k) to be used and the class size (c) of the intervals to be employed. Below is a technique for constructing a frequency distribution.
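Step 1: Get the range (R) by subtracting the lowest score (LS) from the highest score (HS).
- example: HS=119; LS=35 ---> R=HS-LS ---> R=119-35=84 ---> R=84
Step 2: Get the number of classes (k) using k = 1 + 3.322 log n (n is the number of observations; log is the base-10 logarithm), which is rounded off to the next higher integer.
- example: n=50
- k=1+3.322 log n
- k=1+3.322 log 50
- k=1+3.322 (1.70)
- k=1+5.6474
- k=6.6474 ≈ 7
Step 3: Get the class size (c) by dividing the range by the number of classes: c = R/k.
- example: c=84/7=12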
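The three steps above can be sketched in Python. This is only an illustration: the function name `class_plan` is hypothetical, and rounding the class size up to a whole number is an assumption for cases where R/k does not divide evenly (the worked example divides exactly).

```python
import math

def class_plan(highest, lowest, n):
    """Return (range, number of classes, class size) for a frequency table."""
    data_range = highest - lowest                # Step 1: R = HS - LS
    k = math.ceil(1 + 3.322 * math.log10(n))     # Step 2: round up to the next integer
    class_size = math.ceil(data_range / k)       # Step 3: c = R / k (rounding up is an assumption)
    return data_range, k, class_size

print(class_plan(119, 35, 50))  # (84, 7, 12)
```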
CLASS FREQUENCY. This refers to the number of observations belonging to a class interval, or the number of items within a category (Pagoso, 1986). To illustrate, consider the following scores of ten pupils in a competitive test: 15, 15, 15, 18, 18, 19, 22, 22, 24, 24.
Scores frequency
15 3
18 2
19 1
22 2
24 2
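As a quick check, the same tally can be produced with Python's standard `collections.Counter`. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from collections import Counter

scores = [15, 15, 15, 18, 18, 19, 22, 22, 24, 24]

# Count how many times each score occurs.
frequency = Counter(scores)

for score in sorted(frequency):
    print(score, frequency[score])
```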
Class frequencies can be arranged in different forms. This can be done using cumulative frequency and relative frequency.
- CUMULATIVE FREQUENCY. This is a tabular arrangement of data by class intervals whose frequencies are cumulated. There are two kinds of cumulative frequency (cf). The first is the “less than” cumulative frequency (<cf): for each class interval, it is the number of observations that are less than the upper class boundary (Cb) of that interval.
Example:
Ci f <cf Cb
15-16 3 3 14.5-16.5
17-18 2 5 16.5-18.5
19-20 1 6 18.5-20.5
21-22 2 8 20.5-22.5
23-24 2 10 22.5-24.5
Each number in the <cf column is interpreted as follows: 3 items are less than 16.5, 5 items are less than 18.5, and so on.
On the other hand, the “greater than” cumulative frequency (>cf) is, for each class interval, the number of observations that are greater than the lower class boundary of that interval.
Example:
Ci f >cf Cb
15-16 3 10 14.5-16.5
17-18 2 7 16.5-18.5
19-20 1 5 18.5-20.5
21-22 2 4 20.5-22.5
23-24 2 2 22.5-24.5
Each number in the >cf column is interpreted as follows: 10 items are greater than 14.5, 7 items are greater than 16.5, and so on.
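Both cumulative columns can be derived from the class frequencies alone. The sketch below uses `itertools.accumulate` from the standard library; the variable names are illustrative, not part of the original lesson.

```python
from itertools import accumulate

# Class frequencies for the intervals 15-16, 17-18, 19-20, 21-22, 23-24.
frequencies = [3, 2, 1, 2, 2]

# "Less than" cf: running total from the first class downward.
less_than_cf = list(accumulate(frequencies))

# "Greater than" cf: running total from the last class upward.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)     # [3, 5, 6, 8, 10]
print(greater_than_cf)  # [10, 7, 5, 4, 2]
```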
- RELATIVE FREQUENCY. This is a tabular arrangement of the data showing the proportion, in percent, of each class frequency to the total frequency. It is obtained by dividing the class frequency by the total frequency and multiplying the result by 100.
Example:
Ci f rf (%)
15-16 3 30
17-18 2 20
19-20 1 10
21-22 2 20
23-24 2 20
Thus, if we have a class frequency of 3, the relative frequency is 3/10 or 30%.
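The relative frequency column can be reproduced with a short Python sketch. The variable names are illustrative; multiplying before dividing is a choice made here so that the percentages come out as exact values.

```python
# Class frequencies for the intervals 15-16, 17-18, 19-20, 21-22, 23-24.
frequencies = [3, 2, 1, 2, 2]
total = sum(frequencies)

# Relative frequency in percent: (class frequency / total frequency) * 100.
relative_pct = [100 * f / total for f in frequencies]

print(relative_pct)  # [30.0, 20.0, 10.0, 20.0, 20.0]
```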
CLASS MARK. This is the midpoint of a class interval. It can be obtained by adding the lower limit and the upper limit and dividing the resulting sum by 2.
Example: in the interval 75-79, the lower limit is 75 and the upper limit is 79, which gives a class mark of 77, thus: x = (lower limit + upper limit) / 2 = (75+79)/2 = 154/2 = 77
CLASS BOUNDARY. This refers to the true limits of a class interval, where the lower class boundary [Li] is computed by subtracting ½ unit from the lower class limit, while the upper class boundary [Ui] is obtained by adding ½ unit to the upper class limit. To show this concept, let’s use the interval 75-79 again. In this interval, the lower limit is 75 and the upper limit is 79. To get the lower boundary and the upper boundary, we simply compute:
- Lower boundary = 75-0.5=74.5
- Upper boundary = 79+0.5=79.5
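The class mark and class boundary computations can be sketched as two small Python functions. The function names and the `unit` parameter are hypothetical; `unit=1` assumes the scores are recorded as whole numbers, as in the example.

```python
def class_mark(lower, upper):
    """Midpoint of a class interval."""
    return (lower + upper) / 2

def class_boundaries(lower, upper, unit=1):
    """True limits: half a unit below the lower limit and above the upper limit."""
    half = unit / 2
    return lower - half, upper + half

print(class_mark(75, 79))        # 77.0
print(class_boundaries(75, 79))  # (74.5, 79.5)
```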
Class Interval | f | c (class mark) | Class Boundaries | <cf | >cf
75-79 | 2 | 77 | 74.5-79.5 | 2 | 30
80-84 | 14 | 82 | 79.5-84.5 | 16 | 28
85-89 | 14 | 87 | 84.5-89.5 | 30 | 14
Measure of Central Position: Mean (grouped data)
To describe a set of quantitative data, it is necessary to define numerical measures that describe essential characteristics of the data. Any measure indicating the center of a set of data, arranged in order of magnitude, is called a measure of central position or measure of central tendency. The most commonly used measures of central position are the mean, median, and mode.
Mean
Observe the following achievement scores of pupils in mathematics: 18, 19, 20, 21, 22, 23, 24, 25, 26 and 75. If you add all the scores and divide by the number of pupils, the mean is 273/10 = 27.3. This figure is not a representative value, since every score is less than 27.3 except for the score of 75. The example illustrates one property of the mean: "the mean is strongly influenced by extreme values."
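A short Python sketch makes the effect of the extreme score visible by computing the mean with and without it. Dropping the score of 75 is done here only for illustration and is not part of the original example.

```python
scores = [18, 19, 20, 21, 22, 23, 24, 25, 26, 75]

# Mean of all ten scores, including the extreme value 75.
mean_all = sum(scores) / len(scores)

# Mean of the nine remaining scores when 75 is left out.
without_extreme = scores[:-1]
mean_without = sum(without_extreme) / len(without_extreme)

print(mean_all)      # 27.3
print(mean_without)  # 22.0
```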
For grouped data, the mean can be computed using the long method or the coded deviation method (short method).
For the long method, we use this equation:
x̄ = Σfx / N, where x̄ is the mean, f is the frequency, x is the class mark, and N is the total frequency (the total number of observations or cases).
Example:
Class Interval | f | x | fx
118 – 126 | 3 | 122 | 366
127 – 135 | 5 | 131 | 655
136 – 144 | 9 | 140 | 1260
145 – 153 | 12 | 149 | 1788
154 – 162 | 5 | 158 | 790
163 – 171 | 4 | 167 | 668
172 – 180 | 2 | 176 | 352
Total | N = 40 | | Σfx = 5,879
Thus, mean = Σfx / N = 5,879 / 40 = 146.975.
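The long-method computation above can be sketched in Python. The list name `classes` and its tuple layout (lower limit, upper limit, frequency) are choices made for this illustration.

```python
# (lower limit, upper limit, frequency) for each class interval in the example.
classes = [
    (118, 126, 3), (127, 135, 5), (136, 144, 9), (145, 153, 12),
    (154, 162, 5), (163, 171, 4), (172, 180, 2),
]

total_fx = 0
n = 0
for lower, upper, f in classes:
    x = (lower + upper) / 2   # class mark
    total_fx += f * x         # accumulate the fx column
    n += f                    # accumulate N

grouped_mean = total_fx / n
print(n, total_fx, grouped_mean)
```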
For the coded deviation method, the original observations are converted to coded deviations (d’). Here, you choose an assumed mean (AM). Any reasonable class mark in the distribution will do, but generally the class mark of the class with the highest frequency is taken. We use this equation:
x̄ = AM + (Σfd’ / N) · i, where AM is the assumed mean, Σfd’ is the sum of the products of the frequencies and the unit coded deviations, N is the total number of observations, and i is the class size (the width of each class interval). Below is an example:
Class Interval | x | f | d’ | fd’
118 – 126 | 122 | 3 | -3 | -9
127 – 135 | 131 | 5 | -2 | -10
136 – 144 | 140 | 9 | -1 | -9
145 – 153 | 149 | 12 | 0 | 0
154 – 162 | 158 | 5 | +1 | +5
163 – 171 | 167 | 4 | +2 | +8
172 – 180 | 176 | 2 | +3 | +6
Total | | N = 40 | | Σfd’ = -9
With an assumed mean of 149 and a class size of 9, the mean is 149 + (-9/40)(9) = 149 - 2.025 = 146.975, the same result as the long method.
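The coded deviation method can be sketched in Python as follows. The variable names are illustrative, and taking the class mark of the highest-frequency class as the assumed mean follows the usual convention rather than a strict rule.

```python
class_marks = [122, 131, 140, 149, 158, 167, 176]
frequencies = [3, 5, 9, 12, 5, 4, 2]
class_size = 9  # width of each class interval, e.g. 127 - 118

# Assumed mean: class mark of the class with the highest frequency (149 here).
assumed_mean = class_marks[frequencies.index(max(frequencies))]

# Unit coded deviation d' = (x - assumed mean) / class size.
coded = [(x - assumed_mean) // class_size for x in class_marks]

n = sum(frequencies)
sum_fd = sum(f * d for f, d in zip(frequencies, coded))

coded_mean = assumed_mean + (sum_fd / n) * class_size
print(coded, sum_fd, coded_mean)
```

Any other class mark could serve as the assumed mean; the resulting mean is the same, only the d’ and fd’ columns change.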