THE RUBRIC for the very first standardized test that Todd Farley scored seemed simple: one or zero. If the fourth-grade student provided just one example of bicycle safety in a drawing - wearing a helmet, both hands on the handlebars or stopping at a red light - he'd get a one. No examples - zero.
But for Farley, author of Making the Grades: My Misadventures in the Standardized Testing Industry, it wasn't that simple. The student had indeed included one example: the rider in the drawing was wearing a helmet. He was also doing an Evel Knievel-like leap over a chasm spewing flames. Baffled, Farley consulted his supervisor, who told him that the helmet alone was enough to indicate that the child understood the basics of bicycle safety. Score: One.
In his 15-year career, Farley encountered many answers that did not quite fit the rigid rubrics. One high school girl who wrote a beautifully moving and well-constructed essay about "A Special Place" could rate only a three out of four because her piece did not include the words "a special place." Farley also cites a number of questionable practices by the testing company, including hiring scorers not fluent in English, requiring workers to mark one essay every two minutes for eight hours a day and doing little cross-checking of scores.
Since the passage of the controversial No Child Left Behind Act in 2002, standardized tests have become the cornerstone of educational evaluation. They are now the chief determining factor in deciding the fates of students, teachers, principals, schools and entire school districts. The fact that these tests were never designed for those purposes has not prevented school "reformers" and politicians from increasingly mandating their use. Recently, the New York Times followed the example of the Los Angeles Times in its decision to publish ratings of schoolteachers based on the "Value-Added Method (VAM)." This widely criticized and unproven method posits that a teacher's effectiveness, or lack thereof, can be determined by a highly complicated algorithm that measures changes in students' test scores over time. As Linda Darling-Hammond, a member of President Obama's transition team on education policy, points out, many other conditions such as "home and community support, individual students' needs, health issues and attendance, prior teachers and schooling, summer learning and the specific test used" are not factored into this equation.
There are a number of other flaws in the Value-Added Method. Only reading and math teachers can be judged by test data because only those subjects are formally tested. As reported in the Huffington Post, Tennessee's school districts solved this problem by averaging their reading and math teachers' scores and assigning that same score to all the other teachers in the building. Thus physical-education teachers, social-studies teachers, music teachers and others were automatically assigned a rating that had nothing to do with the subject they taught or how they taught it. Yet their careers and livelihoods now depend on that one rating.
In addition, this method assumes that reading and math skills can be improved only by the teachers of those particular subjects. Of course, students employ reading skills in most other subjects, and science cannot be learned without incorporating math. Should the math and science teachers' scores be combined? What if the math teacher has twenty years' experience but the science teacher only two? Does the Value-Added algorithm have room for one more adaptation?
Why has standardized testing so quickly become the accepted method of evaluating teachers and schools, taking precedence over more thoughtful (and time-consuming) practices such as classroom observations and peer review? The yearly testing requirements of both No Child Left Behind and President Obama's Race to the Top have made testing a very profitable industry, particularly for companies such as Harcourt, CTB McGraw-Hill and Riverside Publishing, which together write 96 percent of the standardized tests used in this country. Estimates place the value of the testing market anywhere from $400 million to $700 million. According to PBS's "Frontline," Pearson NCS (where Todd Farley was employed) has made millions of dollars in profits since 2002 by monopolizing the market in scoring. These same companies make additional profits by selling the test-prep materials now used by districts to bolster their scores.
Before No Child Left Behind, data collected from standardized tests were used mainly to update curricula and revise standards. Now those data have become a weapon used to fire teachers and to deliver public schools deemed "failing" to for-profit companies and charters.
There are no data that show that testing improves student learning. Schools under intense pressure to raise test scores have had to eliminate music, art, library science, civics and other electives to make room for scripted test-preparation classes. Schools where test scores have improved are schools where, as early as kindergarten, children are taught how to fill in bubbles.
As noted writer and historian Diane Ravitch said recently: "If we continue to have more years of multiple-choice standardized testing, we will squeeze out every last drop of creativity, originality, innovation and critical thinking - the very attributes needed for the 21st century."