When you are trying to evaluate students, numbers help. You need some objective ways to measure progress. You can’t run a classroom, a school, or a school system on feelings and opinions. But numbers can also be misleading. Here are a couple of examples:
Readability Measures
If you go to Intervention Central and type a passage into the CBM Reading Fluency Passage Generator, it will give you several different readability measurements for the same passage. Some of these measurements use their own numerical scales, but most of them use a grade level number. For example, the number 2.2 represents an early second grade reading level, the number 3.8 represents a late third grade level, etc.
If you submit the first full-text reading passage in the DIBELS First Grade Benchmark Assessment, entitled “Spring Is Coming,” you will find that the Spache readability score is 2.37, indicating a solid second grade reading level. The Dale-Chall score is 1, indicating early first grade; the Coleman-Liau index indicates a 1.8 grade level; the Fog Index indicates a third grade level at 3.3; and the Automated Readability Index is actually negative. (If a low number means easy to read and a higher number means harder, what is negative readability?) So what is the real grade level of this reading passage?
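The negative score is less mysterious once you see how these formulas work. They are simple arithmetic over counts of letters, words, and sentences, and nothing in the arithmetic prevents a result below zero. Below is a sketch of two of the formulas mentioned above in their commonly published forms; real tools may tokenize text differently, so treat the numbers as illustrative, and note that the sample counts are invented, not taken from “Spring Is Coming.”

```python
def coleman_liau(letters, words, sentences):
    """Coleman-Liau index, in its commonly published form:
    based on average letters and sentences per 100 words."""
    L = letters / words * 100      # average letters per 100 words
    S = sentences / words * 100    # average sentences per 100 words
    return 0.0588 * L - 0.296 * S - 15.8

def ari(characters, words, sentences):
    """Automated Readability Index, in its commonly published form."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

# Invented counts for a passage of very short words (3 letters each)
# and very short sentences (5 words each). Both ratios are small, so
# the constant -21.43 dominates and the result is below zero: that is
# all a "negative readability" score means.
print(ari(characters=300, words=100, sentences=20))
```

In other words, a negative score is just the formula reporting “easier than the easiest text I was calibrated on,” not a measurement of anything real about the passage.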
The answer is complicated – too complicated to give completely here. A scientifically realistic grade level for this passage would involve testing many thousands of students and then quantifying their responses to the reading passage, not scoring the passage itself. Doing this scientific “norm referencing” for every passage in every assessment tool would be impossible. The value of grade level readability measurements is not their accuracy but the ability they give you to compare two or more different pieces of text. It is not very important whether you accept Coleman-Liau and reject Dale-Chall or vice versa – or whether you prefer some other measurements. It IS important that you compare one Coleman-Liau score to another Coleman-Liau score, Dale-Chall to Dale-Chall, etc. We like to take four or five measures and then average them. Then we compare one passage to another using the average of the same measures.
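The averaging procedure above can be sketched in a few lines. The scores below are placeholders invented for illustration, not real measurements of any particular passage; the point is only that both passages are scored with the same formulas in the same order, so the comparison between averages is fair even though no single average is “the” grade level.

```python
def average_grade_level(scores):
    """Average a handful of grade-level readability scores for one passage."""
    return sum(scores) / len(scores)

# Hypothetical grade-level scores for two passages from the same four
# formulas (say Spache, Dale-Chall, Coleman-Liau, Fog), in the same order:
passage_a = [2.4, 1.0, 1.8, 3.3]
passage_b = [2.9, 1.6, 2.5, 3.8]

# Neither average is an absolute truth, but passage_b is measurably
# harder than passage_a by every formula, and the averages agree.
print(average_grade_level(passage_a))
print(average_grade_level(passage_b))
```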
In this manner you can look at passages from a “balanced literacy” basal, or from a children’s literature book used in a guided reading class, or from the heavily controlled reading material in a structured phonics program. Some kinds of students may thrive in one of these approaches and struggle in another. That does not mean they are not all making progress. If everyone ends up reading at roughly a sixth grade level (or above) in sixth grade, does it really matter how they got there? In order to assess a student’s forward progress, however, you will need to look at more than just one or two numbers.
Multiple Dimensions
Another reason for not taking any one measurement too seriously is the fact that reading is a multi-dimensional process, whereas most short-term assessments (less than a year between measurements) only measure one dimension at a time. Progress in one area may not mean as much as you think. An excellent example of this point occurred when a teacher we knew was doing her master’s research. She compared the growth of two separate but similar groups of students using the Nonsense Word Fluency section of the DIBELS Second Grade Benchmark Assessment. One group received phonics instruction in the Stevenson Program and one did not. The Stevenson students in fact made more progress and sustained it more effectively (on average). However, the numbers that the DIBELS instrument generated did not indicate very dramatic growth for either group. What was dramatic was some of the information we gathered when we looked at the raw data. The Stevenson students finished with a 95.9% accuracy rate (averaging all students with all phonemes) and the non-Stevenson pupils finished with a 65.3% accuracy rate. Any reading specialist knows that reading real words with 65% accuracy is completely inadequate.
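The distinction hiding in that raw data – speed versus accuracy – is easy to make concrete. The sketch below uses two invented students, not the teacher’s actual dataset, and the scoring functions are simplifications, not the DIBELS scoring rules: a correct-per-minute fluency score can be identical for two readers whose accuracy rates are worlds apart.

```python
def fluency_score(correct_phonemes, minutes):
    """Correct phonemes per minute: a speed measure."""
    return correct_phonemes / minutes

def accuracy_rate(correct_phonemes, attempted_phonemes):
    """Fraction of attempted phonemes read correctly: an accuracy measure."""
    return correct_phonemes / attempted_phonemes

# Two hypothetical students with identical fluency but very different accuracy:
fast_and_sloppy = {"correct": 50, "attempted": 77, "minutes": 1}
slow_and_careful = {"correct": 50, "attempted": 52, "minutes": 1}

for student in (fast_and_sloppy, slow_and_careful):
    speed = fluency_score(student["correct"], student["minutes"])
    acc = accuracy_rate(student["correct"], student["attempted"])
    # Both print a fluency of 50, but accuracy near 65% vs. near 96%.
    print(speed, round(acc * 100, 1))
```

A single fluency number would rank these two students as identical; only the second dimension reveals the difference that matters to a reading specialist.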
This is not meant to be a criticism of the DIBELS assessment, which is very useful. The instrument was designed to measure speed, not accuracy. But think about the danger of putting too much emphasis on a single number. Accuracy, fluency, comprehension and more – if you tried to evaluate progress carefully in all these dimensions all the time, you would have no time to teach. Therefore, it is important to know what the numbers you are looking at really mean – and it is important to look beyond the numbers as well.