Monday, July 8, 2013

Constructed-response Test (Supply Type Test)

1. Objective Supply Type Test: Completion & Short Answer

The item types falling under this format require learners to construct a response to a question or directive. The sub-types, however, differ in the structure of the response needed to answer the item. The two variants are the completion type (also known as gap-filling) and the short answer type.

A. Completion Type or Gap-Filling. The item structure in this type of test consists of a stimulus that defines the question or problem, and a response that defines what is to be provided or constructed by the learner. An incomplete statement with a blank is often used as the stimulus, and the response is a word, symbol, numeral, or phrase constructed to complete the statement. Fill-in-the-blank and cloze tests fall under this format.

Fill in the Blank Test. This is a test format where students are given a statement with a blank and are required to fill it in with the most appropriate answer. It mostly taps the remembering level of Bloom’s Taxonomy, although a well-designed question can test higher-order thinking. The item consists of a stem and a blank where the students write the correct answer (see Table 1). It is suggested that an item in this format contain no more than two (2) blanks.

Table 1: Sample Completion Type Item Structure


Advantages:
  • allows for a wide sampling of content.
  • minimizes the chance of students guessing the answer.
  • requires students to think of the correct, plausible answer, as opposed to choosing from multiple possible answers.
Disadvantages:
  • questions are usually recall-based.
  • marking can become time-consuming, as there may be multiple unique answers that are all potentially correct.
  • it is difficult to create a fill-in-the-blank question that admits only one answer unless relevant clues are provided.
  • providing an answer key for students to choose from can help eliminate multiple answer possibilities, but limits the thinking process involved in answering the question.
Cloze Test. This test format requires the student to fill several gaps (generally more than two blanks) in a discourse, depending on the learning outcome. It is also called the "cloze deletion test". The format is an exercise, test, or assessment consisting of a portion of text with certain words removed (cloze text), where the teacher asks the participant to restore the missing words. Language teachers often use this form for integrative testing, where more than one type of skill (e.g. vocabulary and comprehension) is needed to fill in the gaps. Cloze tests require students to understand context and vocabulary to identify the correct words that belong in the deleted portions of a text (see Table 2).

Table 2: Sample Cloze Test

Source: De Guzman & Adamos, 2015

The Development of the Cloze Test
Research indicates that teachers at many elementary schools require their students to read books and materials that they often struggle to read. This condition is largely based on the graded system, which assumes that all children learn all things at virtually the same time. It seems imperative that teachers choose materials that match students' reading skills. To accomplish this, the first task is to determine the appropriateness of reading materials for various students. To some extent, the standardized achievement tests offered at least once a school year in most school systems provide such information. However, the results of such tests do not provide a reliable index of reading success in various materials. The reasons for this are:
  1. Achievement tests are based on limited samples; they cannot predict achievement accurately in specific materials that draw on varied concepts, sentence patterns, etc. 
  2. Achievement tests are most reliable in the middle ranges of achievement. They often mislead in measuring the achievement of those in the lower reading ranges. 
Because standardized tests cannot accurately determine the suitability of given reading materials, many reading authorities suggest informal tests of the involved materials. The best test of reading skill relies on the student's ability or inability to read the given material. Thus, if a sixth-grade teacher wishes to find out which students can read and comprehend the sixth-grade geography text, the teacher must:
  1. Direct each student to read a specified portion of the text.
  2. Direct the student to demonstrate some degree of understanding. A student can do this by answering questions about the selection.
This method of testing materials is generally called "informal reading inventory testing." In most instances, the label is equated with the task of finding pupils' reading levels by asking them to read a series of increasingly difficult selections (followed by comprehension questions). Students in the earlier stages of reading development read the various materials both orally and silently, while higher-level students read silently before answering the questions. Although potentially valuable, "informal reading inventory testing" involves many qualitative decisions on the part of the teacher, such as:
  1. Oral Reading
    • What counts as an oral reading error?
    • What is the maximum number of oral reading errors that can be permitted? 
    • How fluent should oral reading be? 
    • How do you determine fluency? 
  2. Silent Reading 
    • What is a reasonable amount of time to read the given selection? 
  3. Comprehension 
    • What are the most important elements that the student should remember about the selection? 
    • To what extent are the questions relevant to the main elements of the selection? 
The quality of the decisions above depends upon very sophisticated judgments. In fact, the judgments can be so sophisticated that reading experts suggest teachers may make completely inappropriate judgments if they use the prevailing error marking systems. At this point, the question many teachers ask is, "If teachers cannot depend upon achievement tests or their own observations to determine the suitability of reading materials for different children, what, then, can they use?"

Two very different answers have emerged. Several diagnostic reading test authors have developed tests intended to predict the proper instructional level of texts more accurately, and have presented data to indicate that their special instruments predict better than achievement tests. The other answer is the "cloze technique" procedure as developed by John Bormuth (1967).

In the "Cloze Test Procedure," the teacher instructs students to restore omitted words (usually every fifth word) in a reading passage. Based on reviewing students' restored words from the text passages, the teacher can determine a more accurate level of comprehension.

Wilson L. Taylor introduced the term "cloze procedure" in 1953 and thoroughly researched the value of closure tasks as predictors of reading comprehension. Basic to the procedure is the idea of closure wherein the reader must use the surrounding context to restore omitted words. Comprehension of the total unit and its available parts (including the emerging cloze write-ins) is essential to the task.

To use the Cloze Test Procedure to score material, follow this protocol:
  1. Administration
    • Omit every 5th word, replacing it with a blank space for the student to write in the answer. 
    • Instruct students to write only one word in each blank and to try to fill in every blank. 
    • Guessing is encouraged. 
    • Advise students that you will not count misspellings as errors. 
  2. Scoring: 
    • In most instances, the exact word must be restored. 
    • Misspellings are counted as correct when the response is deemed correct in a meaningful sense. 
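
Expressed in code, the exact-word scoring in the protocol above reduces to comparing each response with the deleted word and reporting the percentage restored. A minimal Python sketch, assuming the responses and answer key are already aligned lists (the case-insensitive comparison is an illustrative stand-in; judging misspellings is still left to the teacher):

  def score_cloze(responses, answer_key):
      """Return the percentage of blanks restored with the exact deleted word.

      Comparison is case-insensitive; deciding whether a misspelling is
      "correct in a meaningful sense" remains a human judgment.
      """
      correct = sum(1 for given, expected in zip(responses, answer_key)
                    if given.strip().lower() == expected.strip().lower())
      return 100.0 * correct / len(answer_key)


  # A student restored three of four deleted words exactly.
  print(score_cloze(["unit", "parts", "cloze", "tasks"],
                    ["unit", "parts", "cloze", "task"]))  # 75.0
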
Research validating the effectiveness of the Cloze Test as a measure of readability and comprehension is interesting for two reasons: (1) the ways in which reading comprehension is scored, and (2) the almost universal finding of high correlations between cloze scores and other prediction instruments.

Initially, Taylor (1953) compared cloze score rankings of passages of varying difficulty with readability rankings of the same passages by two common readability formulas, the Dale-Chall and the Flesch. The passages were similarly rank-ordered by each technique, and the Cloze Test scored the readability of very difficult passages more accurately than either formula.
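
For reference, the Flesch Reading Ease score that Taylor used as a comparison point is computed as 206.835 − 1.015 × (words ÷ sentences) − 84.6 × (syllables ÷ words), with higher scores indicating easier text. The Python sketch below uses a crude vowel-group syllable counter, an assumption made for illustration; published implementations rely on dictionaries or better heuristics:

  import re


  def count_syllables(word):
      # Crude heuristic: one syllable per group of consecutive vowels.
      return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


  def flesch_reading_ease(text):
      """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
      sentences = max(1, len(re.findall(r"[.!?]+", text)))
      words = re.findall(r"[A-Za-z']+", text)
      n_words = max(1, len(words))
      syllables = sum(count_syllables(w) for w in words)
      return (206.835
              - 1.015 * (n_words / sentences)
              - 84.6 * (syllables / n_words))


  print(round(flesch_reading_ease("The cat sat on the mat. It was happy."), 1))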

Guidelines in the construction of Completion Items
  1. There should only be one correct response to complete a statement. This contributes to efficiency in scoring, since a key to correction can easily be prepared in advance when there is only one expected response. The incomplete statement must be worded carefully to avoid having more than one correct answer. An exception to this rule is when you are testing for verbal creativity, where giving diverse but acceptable responses is desirable; this, however, should be explicitly stated in the test instructions. For instance, the more synonyms students can give for the word costly, like expensive, exorbitant, or pricey, the more points they can earn (see Table 2). Objective scoring will also likely have to be modified. As another example, item (a) in Table 1 is worded in a way that may be open to more than one acceptable answer, such as square, rectangle, or parallelogram. To eliminate the other terms, it can be worded this way: "A quadrilateral with four equal sides is called _____."
  2. The blank should be placed at the end or towards the end of the incomplete statement. This provides the reader with adequate context before reaching the blank, and consequently avoids confusion.
  3. Avoid providing unintended clues to the correct answer. The validity of a student's score is jeopardized when he/she answers an item correctly without really knowing the correct response; the score may then represent a different ability from the one intended to be measured. This happens when a student who does not know the answer arrives at one through unintended grammatical clues, e.g. the presence of the indefinite article "a" or "an" before the blank, suggesting a response that starts with a consonant or a vowel, respectively.
B. Short Answer Test. Short-answer questions are open-ended questions that require students to create an answer. Instead of supplying words to complete statements, relatively short answers are constructed as direct answers to questions. This format is commonly used in examinations to assess basic knowledge and understanding (low cognitive levels - declarative and procedural knowledge) of a topic before more in-depth assessment questions are asked. Short-answer questions do not have a generic structure. Questions may require answers such as completing a sentence, supplying a missing word, short descriptive or qualitative answers, or diagrams with explanations. The answer is usually short, from one word to a few lines, and students may often answer in bullet form. The item statement in this format is generally in the interrogative form (see Table 3).

Table 3: Sample Short Answer Test

Guidelines in constructing a good short answer question:
Writing short-answer items similarly follows the guidelines for writing completion items. Below are guidelines suggested by McMillan (2007) and CETL (2009).
  1. State the item so that only one answer is correct.
  2. State the item so that the required answer is brief. Requiring a long response would not be necessary and it can limit the number of items students can answer within the allotted period of time.
  3. Do not use questions verbatim from textbooks and other instructional materials. This puts students who are not familiar with the materials at an undue disadvantage, since the item becomes a memory test rather than a measure of understanding.
  4. Designate the units required for the answer. This matters when the constructed response requires a definite unit to be considered correct; without designating the unit, a response may be marked wrong because of a differing mindset.
    • Example:
      • Poor: How much does the food caterer charge? (This item could be answered in different ways: cost per head, per dish, per plate, or as a full package.)
      • Improved: How much does the food caterer charge per head?
  5. State the item succinctly, with words students understand. This is true for all types of tests. The validity of a classroom-based test is at risk when students cannot answer correctly, not because they do not know the answer, but because of the messy wording of the question.
    • Example:
      • Poor: As viewed by creatures from the earth, when does the blood moon appear in the evening?
      • Improved: When does a blood moon appear?
  6. Design short-answer items that are appropriate assessments of the learning objective.
  7. Make sure the content of the short answer question measures knowledge appropriate to the desired learning goal.
  8. Express the questions in clear wording and language appropriate to the student population.
  9. Ensure that the item clearly specifies how the question should be answered (e.g. should students answer briefly and concisely, using a single word or short phrase? Does the question provide a specific number of blanks to fill?).
  10. Write the instructions clearly so as to specify the desired knowledge and specificity of response 
  11. Set the questions explicitly and precisely.
  12. Direct questions are better than those that require completing sentences.
  13. For numerical answers, let the students know whether they will receive marks for showing partial work (process-based) or only for the results (product-based); also indicate the importance of units.
  14. Let the students know what your marking style is: is bullet-point format acceptable, or does the answer have to be in essay form?
  15. Prepare a structured marking sheet; allocate marks or part-marks for acceptable answer(s).
  16. Be prepared to accept other equally acceptable answers, some of which you may not have predicted.
Common Points of Completion and Short Answer
  • Appropriate for assessing learning outcomes involving knowledge and simple understanding
  • Capable of assessing both declarative and procedural knowledge
  • Both are easy and simple to construct
  • Both are objectively scored since a key to correction can be prepared in advance
  • Both need an ample number of items to assess a learning outcome. A single completion or short-answer item is not sufficient to test mastery of a competency.
Advantages:
  • relatively fast to mark and can be marked by different assessors, as long as the questions are set in such a way that all alternative answers can be considered by the assessors.
  • relatively easy to set compared to many assessment methods.
  • can be used as part of both formative and summative assessment; because the structure of short answer questions is very similar to that of examination questions, students are more familiar with the practice and feel less anxious.
  • unlike MCQs, there is no guessing; students must supply an answer.
Disadvantages:
  • only suitable for questions that can be answered with short responses. Because SAQs are open-ended, students are free to answer any way they choose, so it is very important that the assessor is clear about the type of answer expected when setting the questions; a carelessly worded question can lead to difficulties in grading.
  • typically used for assessing knowledge only; students may often memorize short answer questions by rote learning. If assessors wish to use short answer questions to assess deeper learning, careful attention to (and much practice with) appropriate questions is required.
  • accuracy of assessment may be influenced by handwriting/spelling skills

2. Non-Objective Supply Type Test: Essay

This is a free-response test question. Unlike completion and short-answer items, which are highly structured to elicit one short correct answer, essay items are less structured, allowing students to organize their responses freely and use their own writing style to answer the question. The format measures students' abilities to organize, integrate, and synthesize their knowledge, to use information to solve problems, and to be original or innovative in their approaches to problem situations. It can take the form of a composition test or a definition-illustration test. This format, therefore, is appropriate for testing deep understanding and reasoning. Some of the thinking processes involved in answering essay questions are comparison, induction, deduction, abstracting, analyzing perspectives, decision-making, problem-solving, constructing support, and experimental inquiry (Marzano et al., 1993). These actually involve higher-order thinking skills.

Fourteen types of abilities can be measured by essay items (Stecklein, as cited by Santos, 2007):
  • comparison between two or more things
  • the development and defense of an opinion
  • questions of cause and effect
  • explanation of meanings
  • summarizing of information in a designated area
  • analysis
  • knowledge of relationship
  • illustrations of rules, principles, procedures, and applications
  • applications of rules, laws, and principles to new situations
  • criticisms of the adequacy, relevance, or correctness of a concept, idea, or information
  • formulation of new questions and problems
  • reorganization of facts
  • discrimination between objects, concepts, or events
  • inferential thinking
Types of Essay
  1. A restricted response question usually limits both the content and the response. The content is usually restricted by the scope of the topic to be discussed, and limitations on the form of the response are generally indicated in the question. Another way of restricting responses in essay tests is to base the questions on specific problems.
    • Examples:
      • Write a life sketch of Mahatma Gandhi in 100 words.
      • State any five definitions of education.
  2. An extended response question places no restriction on the student as to the points he/she will discuss or the type of organization he/she will use. Teachers frame such questions to give students the maximum possible freedom to determine the nature and scope of the response, provided it remains related to the topic and is completed within a stipulated time frame.
    • Examples:
      • Global warming is the next key to disaster. Explain.
      • Do children need to go to school? Support your answer.
Guidelines in constructing an essay test, as suggested by Miller, Linn & Gronlund (2009):
  1. Restrict the use of essay questions to those learning outcomes that cannot be measured satisfactorily by objective items. Objective items cannot measure such important skills as the ability to organize, integrate, and synthesize ideas while showing one's creativity in writing style. The essay format encourages and challenges students to engage in higher-order thinking skills instead of simple rote memorization of facts and remembering of inconsequential details.
  2. Construct questions that will call forth the skills specified in the learning standards. A review of learning standards in school curricula will show that they range from knowledge to deep understanding. The performance standards require learners to demonstrate the application of principles, analysis of experimental findings, evaluation of results, and creation of new knowledge, and these are explicitly stated in terms of the expected outcomes at every grade level. The essay questions constructed should then require students to demonstrate these thinking processes.
  3. Phrase the question so that the student's task is clearly defined. Restricted-response essay questions in particular state the specific task to be done in writing. As much as possible, all students should interpret the question in the same way, according to what the teacher expects through the specifications in the question.
  4. Indicate an approximate time limit for each question. This should especially be considered when the test combines objective and non-objective formats, such as when essay questions are included. Knowing how much time is allotted to each question helps students budget their time so they do not spend it all on the first question and consequently miss the others.
  5. Avoid the use of optional questions. Some teachers have the practice of allowing students to select one or two essay questions from a set of five. Disadvantages of this practice include not being able to use the same basis for reporting test results, and students being able to prepare through memorization for the questions they will likely choose.
  6. Plan what mental processes are to be tested before writing the test (students' analytical skills? knowledge? ability to synthesize?).
  7. Use essay questions to test the students' ability to organize information
  8. Use keywords to phrase your essay questions (example: compare, explain, predict...)
  9. Focus your essay question on only one issue at a time
  10. Inform test takers that questions will be graded on the strength of the evidence, presentation, and organization of thoughts on an issue, and not on the basis of the position taken on the issue.
References:
  1. AR@HKU (2009). Types of assessment methods. Available online at http://ar.cetl.hku.hk/am_saq.htm
  2. De Guzman, E., & Adamos, J. (2015). Assessment of learning 1. Quezon City: Adriana Publishing Co., Inc.
  3. McMillan, J. (2007). Classroom assessment: Principles and practice for effective standards-based instruction (4th ed.). USA: Pearson Education, Inc.
  4. Miller, M., Linn, R., & Gronlund, N. (2009). Measurement and assessment in teaching (10th ed.). New Jersey: Pearson Education, Inc.
  5. Santos, R. G. (2007). Assessment of learning 1. Quezon City: Lorimar Publishing.
  6. https://sites.google.com/site/creatingtestquestions/home/4-fill-in-the-blank