Teacher Evaluation Pitfalls

By W. James Popham

Despite the current clamor to evaluate teachers’ effectiveness on the basis of their students’ test scores, no evidence currently exists to show that the tests intended for use in such evaluations are up to the job. Put simply, there is no proof -- none at all -- that these tests can accurately distinguish between well-taught and badly taught students. Let’s see why.

Almost every field of endeavor has its fundamental precepts, namely, its collection of experience-honed rules and insights that govern worker conduct. Lawyers, for example, recognize that the outcome of any courtroom conflict will be determined by earlier rulings from higher courts. Similarly, electricians invariably try to repair malfunctioning equipment by first “checking the power source” to see if electricity is actually reaching it.

In educational testing, the most important precept -- by a country mile -- emanates from the concept of assessment validity. What the validity principle means, in simple language, is that, if any test’s scores are to be interpreted in a particular way, then adequate evidence must be on hand to support the accuracy (that is, the validity) of this score interpretation.

To illustrate, because college admissions exams such as the SAT and ACT are employed to predict high school students’ subsequent success in college, developers of those exams collect evidence showing that students’ SAT and ACT scores are, in fact, predictive of college grades. The validity of those two exams’ score-based interpretations has, over the years, been confirmed by a two-ton truckload of validity evidence.

Let’s turn, now, to the test-based evaluation of teachers. The primary interpretation to be made from test performances is that “high-scoring students were well taught,” while “low-scoring students were not well taught.” Yet, even though recent federal financial incentives have enticed many state officials to demand test-based evaluations of teachers, absolutely no evidence exists that the tests to be used in such evaluations are capable of differentiating between effectively and ineffectively taught students. The use of students’ test scores to evaluate teachers, you see, runs counter to the most important commandment of educational testing -- the need for sufficient validity evidence.

