MARCH 2, 2003
Testing Times
In this era of MCAS and other high-stakes school tests, the fate of your child is increasingly in the hands of the industry that designs, produces, and grades them.
By Charles P. Pierce, Globe Staff
There are four doors in sets of two above the loading dock. The questions go out through the doors on the right, and the answers come back through the doors on the left - just so nobody ever mixes up the questions with the answers. Through these doors, in just the months from January through March every year, 28 million pieces of paper will pass. There are test preparation materials and test booklets, and test sheets that go out blank and return filled in with the answers that have come to measure so much more than what a student knows. This warehouse is where the story of Major Duncan MacGregor and his bottle begins, and this is where it ends. However, an awful lot happens in between. It seems that, in the 1700s, Major MacGregor's ship, a British transport named the Kent, sank in a storm. Despairing of his life, the major scribbled an account of the wreck, secured it in a bottle, and threw the bottle into the sea. Nine years later, a household servant found the bottle on the beach and returned it to the major, who had survived the sinking of the Kent and was living in Barbados. This marvelous piece of serendipity recommended itself to an author named Dorothy B. Francis, who, in 1990, included it in her book Drift Bottles in History and Folklore. In turn, the book came to the attention of a content specialist at Harcourt Educational Measurement, which is the place with the four doors in sets of two above the loading dock, an office complex tucked into a Texas swale. Major MacGregor's bottle was a story before it became part of a book. Now, the story will be broken into fractals so that the major and his bottle can become items that could determine the educational fate of your child. In the professional jargon of the testing industry, an "item" is a question. It can be a multiple-choice question in mathematics or an open-ended assignment in English composition. 
Major MacGregor's bottle becomes an item in the language arts section of a customized standards-based test called the Massachusetts Comprehensive Assessment System, which will be given to fourth-graders in every school in Massachusetts. The story of Major MacGregor's bottle will produce six multiple-choice items for the MCAS. First, though, the major and his bottle are composed by an item-writer, and, after some tinkering by a content editor, the items derived from the story are sent along to another content specialist in language arts. They are then outsourced to retired teachers, who send them back with comments. (In more complex areas such as math and science, Harcourt occasionally will run its items past university professors.) The items then are placed before focus groups made up of schoolchildren. Their answers help further refine the items. Because Major MacGregor's bottle is being used in the MCAS, the items are sent to Massachusetts. A panel of eight teachers from around the state looks at how the items fit with the standards upon which the MCAS is based. Some of the items will be accepted. Others will be sent back to Texas to be refined. Eventually, the items will be tucked into an operational MCAS without the knowledge of the students taking the test. If there are obvious anomalies between the actual test questions and the items being tried out about the major and his bottle - if, for example, students doing well generally on the MCAS do poorly on the new items - then the items are sent back again for further refinement. Finally, down a secret corridor in the Harcourt complex where no visitors are allowed, the story of Major MacGregor's bottle is printed up in a booklet with a spiffy little drawing of the bottle drifting at sea. The booklet is put together with an MCAS test, which is boxed up with a lot of other MCAS tests. 
The major's bottle goes out the doors on the right - adrift again, this time as a question on a test, lost this time on a roiling sea of controversy.
However, we are also a country that devises ways to produce the numbers we're drunk on while remaining, as author Nicholas Lemann once put it as regards the SATs, "fundamentally unruly" as regards their application. In the field of education, our voracious appetite for scores we can use to rank our children and our schools runs headlong into the historically altruistic American view that every child should have an education. Overseas, where higher education is not a right, tests are supposed to be a Darwinian winnowing of the student population. Here, it seems, we've come to demand increased testing at least partly so we can spend a great deal of time arguing over what the results mean and deploring the fact that the tests exist at all. Drive along Bulverde Road in San Antonio, through the dips and curves where Texas begins to pile itself into the Hill Country, and you might miss the offices of Harcourt Educational Measurement. There's a modest sign and a long driveway. The building is unmarked and modern, set away from the road. The complex is so nondescript on the landscape that the odd longhorn steer occasionally comes wandering down one of the paths. It doesn't look like it, but this blank-faced building is a landmark in the universe of the scoreboard. "It has always been a question of focus," explains Margie Jorgensen, Harcourt's vice president of test development. "We had to get our arms around what was valued in the country. It's very important that we, as a society, understand how schools are working, so that we can identify the best practices." Harcourt is the leader in the development and sale of standards-based assessment tests for the nation's public schools in a country in which the thirst for numbers and their meaning seems insatiable. Indeed, Harcourt was born as a medium for testing. In 1921, a psychologist named James Cattell and two of his graduate students formed The Psychological Corp. 
in New York as a vehicle for the administration of the new psychological tests then coming into vogue. In 1970, The Psychological Corp. was purchased by Harcourt Brace Jovanovich, a textbook company that had branched out into educational testing. The combined firm moved to San Antonio 16 years later, just as the boom for standardized testing was beginning to build. Ironically enough, the movement toward high-stakes standards-based testing began just as colleges were moving away from a reliance on the SATs for admissions. Out in the states, however, there was a gathering drive for "accountability" in the public schools in which "accountability" would be measured by performance on tests. Often, this drive was led by the local business community, which would accept new taxes as long as the state promised businesses accountability in the money that was going to the public schools. This was the case in Arkansas under then-governor Bill Clinton. "It appealed to the prevalent ideology of the marketplace," explains Bob Schaeffer, director of public education for FairTest, a Cambridge-based advocacy group opposed to high-stakes testing. "The 'invisible hand' needs a measuring tool, and that tool is standardized testing." (It's probably helpful to point out here that the controversy over these tests extends even to the jargon. Those people in favor of a testing program refer to the tests as "standards-based," not "standardized." The former implies a kind of tailored flexibility, while the latter summons up a uniform and Procrustean approach.) Harcourt always produced what it calls catalog product - generic tests such as the old Stanford series familiar to previous generations of schoolchildren. Now, though, with states clamoring for tests specifically tailored to new local standards, the company is perfectly positioned to take advantage of a huge new market. 
According to a July 2001 study by the Congressional Research Service, every state except Iowa does some sort of student assessment based on testing, and that same 2001 study quotes a study from the Pew Center on the States that estimates the costs of testing to the states to be more than $422 million. Harcourt now sells custom products to 49 of the 50 states, and, over the past decade, the customized tests have been where the money is. In the first nine months of 2000, Harcourt's profits increased 38 percent, making the company so attractive that, in July 2001, it was sold for $4.45 billion to Reed Elsevier, a massive Dutch information technology conglomerate that bought Newton-based Cahners Publishing, among its other properties. It is through its customized tests that Harcourt gets involved in the various disputes in the various states about high-stakes testing. "In sort of simple terms," Jorgensen said, "if we build something that doesn't sell, we've failed. We have to figure out what our customers want and need, and we have to build to that." Harcourt's explosive growth - indeed, the growth of the entire testing industry - has given the opponents of high-stakes testing an entirely new area of concern. They worry that the companies have become so profitable so quickly that corners are being cut. The Dayton Daily News ran a long series in 2000 that reported that Ohio's English composition tests - the most controversial of all standards-based tests because of the subjective nature of what good writing is - were being scored by unqualified people in a North Carolina strip mall to save money. Last year, NCS Pearson - with which Harcourt shares a subcontracting agreement on some jobs - was forced into a substantial settlement when a scoring error on a statewide mathematics test resulted in erroneous results being sent to 47,000 Minnesota high school students, 8,000 of whom were incorrectly told that they had failed. 
A judge ruled that NCS had let quality standards lapse partly to cut costs. "At first, the company argued that it should be liable only for damages done to the students," explains Walt Haney, a professor of education at Boston College and a vigorous critic of test-based education. "But [the lawyers for the students] found evidence that NCS, in order to boost profits, had been cutting corners, had no quality control in place, and had made six or seven errors in other states. And the only reason we found out about it was because this one kid's parents wouldn't let it go." The customized tests changed Harcourt. With the states creating their own standards in all fields, and with the stakes those states increasingly placed upon those standards, the company became a highly specialized place. It also became a secure facility, as structured and regimented as a military compound. The people developing customized tests for a particular state aren't allowed to discuss their work unless a representative of that state is present to give permission. Even the warehouse is rigidly segregated. The custom tests for the various states are stored on huge blue iron shelves of their own in large cardboard boxes marked, "Vermont - Math," and "DC - Social studies." There are long hallways marked SECURE AREA - NO ADMITTANCE. Visitors to the facility are redirected if they even chance to walk behind a Harcourt employee who has part of a customized test up on the screen. "Sorry, no way," said Joyce McDonald, the head of Harcourt's Performance Assessment Scoring Center, gently tugging the elbow of a visitor who turned right instead of left. "You don't get to go there." The demand for its customized tests also required Harcourt to become even more of a full-service company; tests are created at Harcourt, and, since 1988, they come back to Harcourt to be scored. 
And almost all of Harcourt's content specialists began their careers in the classroom elsewhere, and they brim with mission when you talk to them. "When I was teaching," said Patricia Pederson, a Harcourt content specialist in social studies who previously taught in Texas, as well as overseas in India and Kuwait, "I was never afraid of how my students would do on a standardized test. Because I knew I taught a good solid course, and when I first started teaching, we did not have national standards of any kind." Pederson's optimism is typical of the atmosphere at Harcourt. For example, the phrase "teaching to the test" is a loaded one these days. To many critics, it means that a diverse curriculum is being abandoned in order to teach children what they need to pass a standardized - oops, standards-based - test. However, at Harcourt, it is a phrase alive with possibility - shorthand for an educational program wrapped snug and warm within the seamless garment of Harcourt products. "That's part of the re-education about standards-based testing," said Harcourt's Jorgensen. "We mean to have the teacher teach that content standard, and the test measures how an excellent teacher can teach that content standard, then teaching to the test means teaching to the standards." In fact, Harcourt echoes with a heady faith and good humor that wouldn't have been out of place during the giddy early days at NASA. You'd hardly know inside the complex that Harcourt - and the other testing companies that dominate the field - is embroiled whether it likes it or not in what is currently the most volatile and widespread conflict over the future of public education. "Do we get involved in the policy debates?" said Jorgensen. "Some of us have the opportunity to speak to state boards of education, and we have lobbyists and communications specialists, of course. That's about the biggest involvement that we have. We're just little worker bees here."
The students open their booklets. They read the story of Major MacGregor and his bottle. They look over the questions, one by one. Question 3 looks interesting: What is the MAIN reason that it took nine years for Major MacGregor's bottle to be found by his servant?
a) It had sunk in the water.
Well, the bottle didn't sink - Duh! - so a and c are out. The students shift in their chairs. D is sorta right - you can't find something if nobody sees it. But b is mentioned right there in the text, so b it is.
He works in an elementary school west of Boston, a school in which the students are engaged and with which the parents are both involved and happy. He also works in the universe of the scoreboard - where Major MacGregor and his wandering bottle can determine what a child has learned and, because of that, possibly where the child may learn it and from whom the child will learn it and even where that child may live. "It's definitely the feeling of the people in the field that they are being watched and constantly told that they are not good enough," the principal said. "If you listen to the rhetoric, the whole theme is we're going to help you, because you don't know how to do it. We seem to be moving toward a society of test-takers. Of course, we've always been a society of test-takers but not high-stakes test-takers. That's the change now." At this end of the production line that begins at the great warehouse in Texas - out through the right doors and back in through the left - the bright optimism of the Harcourt compound gets lost. It almost seems as if the people there are like munitions workers, happy and industrious at the task of making things that get sent beyond their walls and are then used by angry people to lay siege to one another. "We have to assume, because there's a lot of scrutiny, that the scrutiny will act as a safety net," explains Harcourt vice president Jorgensen. "I believe that educators are professionals. We want every consumer to understand the value and the limitations of every test." However, out in the states, the profitable testing industry's products - and, increasingly, the industry's profits - have become the focus of frequently angry debate. There are accusations of bias and questions of emphasis. There is study upon study: Walt Haney of Boston College authored one that appeared to show that field-tested questions answered correctly by too many students often never made it onto the actual MCAS exam. And there are lawsuits. 
One was filed last fall challenging the MCAS requirement for high school graduation. One measure of the passions involved can be drawn from the case two years ago of a Georgia state trooper who was sent all the way to Vermont to interrogate a woman in connection with the theft of a fourth-grade math exam. The trooper didn't leave until the woman's husband threatened to have him arrested for trespassing. At the root of this heretofore inchoate rebellion seems to be a feeling of powerlessness, a general notion that the future of schoolchildren - which is always to say, The Future of My Child - has been handed over to a faceless coalition of politicians, psychometricians, and corporate bean-counters. High-stakes testing is premised upon the assumption that everyone else involved in education - up to and including (shh!) students and their parents - has failed so dismally that a kind of exam ex machina is needed to sort the whole mess out. Both sides are dug in. Both sides have grown stubborn. Both sides have their arguments and their supporting anecdotes. What is undeniable is that testing has a life outside the classroom. Realtors in Massachusetts use MCAS results to sell homes in high-scoring communities, and the principal of a high-scoring school knows that every year, when the MCAS scores hit the newspapers, he will have a clutch of phone calls immediately thereafter from parents seeking to move their children into his school from one that didn't score as well on the exam. Perhaps the vitriol and ill-feeling were to be expected since the move to standards-based testing was predicated on a profound distrust in the schools and in the teachers who worked in them. This is so plain that nobody on the pro-testing side of the argument bothers to deny it. "We would never have education reform if there wasn't [a lack of faith]," said James A. Peyser, chairman of the Massachusetts Board of Education. 
"Certainly, there was a lack of faith that local school districts would be able or willing to implement standards without consequences." The debate, focused as it is on the MCAS tests, is no less intense here than it is anywhere else. Having been in the middle of the debate for more than a decade, state education commissioner David P. Driscoll declines even to temper his sarcasm at this point. "When the strength of our system comes across, the rhetoric gets toned down," he said. "In Lawrence, we had nine kids who had been in the country for less than two years, and six of them passed the MCAS. If they can, what excuse does a kid from Weymouth have? Their reason for not passing is their own effort." However, there are signs that the debate is shifting. The first came in a recent study by Arizona State University showing that a regimen of high-stakes testing did not improve academic achievement in the long term. The study also showed that children who scored well on standards-based tests from kindergarten through high school fared less well when they later took the SAT and ACT as they prepared to go to college. While the study was dismissed in some quarters because it was funded partly by teachers' unions, which historically are opposed to high-stakes tests, the attention given to the study was a blow to the confidence that many states have placed in such tests. "The whole debate has been a political debate," explains Audrey Amrein, the study's author. "Even our local superintendent here said that he won't even open the first page of the study. That's so discouraging. I mean, even if you don't change your policies, at least let it inform your thinking about them." Other circumstances, however, may force the states to take an even closer look at how deeply committed they have become to high-stakes tests as a measure of educational progress. 
Most states are in the middle of a horrific budget crunch, and the costs of standardized testing are increasingly being scrutinized in a worsening economic context. There are already stirrings that the requirements of the MCAS program - which the state spends $50 million annually simply to publicize - may be too rigid in a time when cuts in local aid will further weaken the ability of schools to prepare students for a test on which depends, among other things, their high school diplomas. This is an issue around which the disparate elements of the antitesting movement might easily coalesce, especially given what is quickly coming down the road from the federal government. Next year, the states will confront the requirement of No Child Left Behind, the Bush administration's signature education overhaul package. Among its other requirements, the act mandates that every child be tested in English and math from grades 3 through 12. The first complaints came from governors, who argue that while there is plenty of money in the law to conduct the tests, there's very little in it to pay for the teaching that would lead up to it, leaving the states with a whopping unfunded federal mandate in increasingly hard economic times. Even Republican governors - like Louisiana's Mike Foster, who took his case to the president himself - are complaining. As a Congressional Research Service report states: "If the amount authorized . . . for FY2002 were appropriated . . . this would meet a significant share of additional [testing] costs. However, most states would face increased costs not only for test development, but also for test administration, maintenance etc." Further, No Child Left Behind may require those same governors to abandon state programs that they've nurtured over the past decade. As a result, the governors say, even schools that have improved will find themselves described as "inadequate" under the federal guidelines. 
So far, only five current state programs have been conditionally accepted under the new guidelines. Massachusetts' is one of them. "Left to its own devices," said education commissioner Driscoll, "the law would seem to drive every school into 'needs improvement,' and then into 'inadequate,' and finally into 'reconstitution.' You're only as strong as your lowest performing group, and there's a feeling that you're being set up to fail. The fatalism in that - that we're gaining and we're still not doing our jobs - that's what bothers people." There's more than a little irony to this. For years, the push for education reform through high-stakes testing has been driven by a lack of faith. Now, the newly mandated federal tests will judge the reformers in the states with as cold and jaundiced an eye as that with which the reformers once judged the schools.
Harcourt will be busier as the No Child Left Behind initiative works itself up to speed. (The company is already quite familiar with the Bush approach to education, having been lucratively involved throughout the 1990s in developing the learning products for the Texas state assessment system on which the federal initiative is modeled.) For example, the provisions of the new mandate will require that every child in Massachusetts take what will essentially be an MCAS test in English and in mathematics every year from the third grade until the 12th grade, and this will happen in every state. Harcourt will be busier than it ever has been. Characteristically, this is seen not as the crisis it may be in the various states but as a daunting challenge to be vigorously overcome. "If you only teach what's on the test, you're not serving your students well," said Jorgensen, who is in charge of test development at Harcourt. "With the social and political and cultural policy that has defined your state's standards, you get great clarity within any geographic region of what a teacher should be doing." Here, within the compound, the angry shouts and whispered frustrations do not penetrate. The demonstrations and the marches and the countermarches and the lawsuits die in the scrubland outside the walls. "I would say, if there was one thing I'd want people to know, is that we really care about the kids," said Elaine Grainger, another Harcourt senior psychometrician. "What we do matters - if we do something that results in an error, that can have consequences. "If you believe in public education, and we do, and certainly I do, you know that part of it being public is that everybody has a voice in it, and, insofar as our tests work to the advantage and promotion of making sure that everyone has the opportunity to get an education, I'm there with it." 
And all the conflict stays outside, a conflict rooted in the kind of profound pessimism that's as clumsy and unwelcome here as the wandering longhorns are, when they turn up in the parking lot.
The scorer probably has a college degree and might even have a master's. The scorer's profile is on file at Harcourt along with thousands of profiles in the company's database. The scorer has been trained on sample questions from Harcourt's vast array of generic tests, and now the scorer sits at a long table in a very large room, ducts and pipes crisscrossed on the ceiling. It looks like 40 or 50 people waiting at the Registry to get their learner's permits. In the spring, when the real scoring crunch comes as most of the states test all of their students, the room will be jammed and a second shift will be added, usually made up of local teachers padding their incomes working nights. It's during those times that the scorer must be most careful. It's not as if there aren't disasters. The scorer does not want anything like that Minnesota thing on his watch. A lot is riding on the scorers: the high school diplomas of thousands of students the scorer will never know, funding decisions on thousands of school districts the scorer will never visit, the careers of thousands of teachers the scorer will never meet. The scorer's profile shows him to be gifted in language arts, so he gets the story of Major MacGregor and his bottle, now broken into items and reproduced thousands of times and answered over and over again. The scorer tallies all of the bubbles, and the story of Major MacGregor, now full of new and serious purpose, gets folded into one individual score that gets folded into many others. They are transmitted electronically to the publishing center, where men and women in maroon company vests bustle and hurry to get the results collated and dispatched. The results all get loaded again through the doors on the right onto one of the waiting trucks. The driver heads north toward Massachusetts. Laden with all the answers, the other trucks go off in every direction, to all parts of the new universe of the scoreboard.
Charles P. Pierce is a member of the Globe Magazine staff. This story ran in the Boston Globe Magazine on 3/2/2003.