The tide, it appears, may finally be starting to turn. After a generation of test-centered accountability for teachers, the state of Maine has passed a law that removes a requirement that standardized test scores be used to evaluate teachers. I suspect more states will follow, if for no other reason than that all educational trends eventually fall out of favor when we realize the old ways maybe weren’t so terrible after all.
No matter what other states do, the question will remain: How do we fairly evaluate the performance of teachers? There is no easy answer, and it’s largely because there are two competing beliefs about how to identify good (and bad) teaching.
I read two articles in the last couple of days that illustrate the tension at the center of teacher evaluations. The first was written by Alfie Kohn way back in 2008, but its message is often repeated today. In “It’s Not What We Teach; It’s What They Learn,” Kohn asserts that “what we do doesn’t matter nearly as much as how kids experience what we do.” He provides a number of examples, explaining that it doesn’t matter what an adult intends if a child interprets the adult’s words or actions differently. Kohn writes:
“We may think we’re emphasizing the importance of punctuality by issuing a detention for being late, or that we’re making a statement about the need to be respectful when we suspend a student for yelling an obscenity, or that we’re supporting the value of certain behaviors when we offer a reward for engaging in them.
But what if the student who’s being punished or rewarded doesn’t see it that way? What if his or her response is, “That’s not fair!” or “Next time I won’t get caught” or “I guess when you have more power you can make other people suffer if they don’t do what you want” or “If they have to reward me for x, then x must be something I wouldn’t want to do.”
We protest that the student has it all wrong, that the intervention really is fair, the consequence is justified, the reward system makes perfect sense. But if the student doesn’t share our view, then what we did cannot possibly have the intended effect. Results don’t follow from behaviors but from the meaning attached to behaviors.”
It follows, then, that a teacher’s intention to teach effectively doesn’t matter if students don’t learn anything. A teacher who says, “I taught a great lesson but the kids just didn’t get it,” is, in Kohn’s view, making an incoherent statement. There can be no teaching without learning.
So given that a teacher’s practices are irrelevant if students do not learn as a result of them, it makes sense to design a teacher evaluation system that looks only at student achievement. To people with this view, nothing else should matter.
But that’s not how teacher evaluation systems work. Although we say things like, “It’s not about the teaching; it’s about the learning,” our actions betray our purported beliefs. We want to see teachers in action, and we think we can judge their abilities irrespective of how students actually do in their classes. We want to evaluate teachers based on how they perform their jobs and not on how their students perform theirs.
Every evaluation system I know of includes observations by a supervisor. Frequently, the observations carry more weight than student performance (in my state, it’s 75% observations and 25% student achievement data). And principals aren’t watching students; they’re watching teachers. Their checklists require them to. Marzano’s Teacher Evaluation Model includes four domains and 60 elements. Every single one uses teacher-centered language. Marzano requires supervisors to evaluate teachers on what they do, not on how their students do.
That’s because most of us recognize that it takes two to tango and that a teacher can only truly control one of the dancers. An effective classroom has an effective teacher, but it also has willing learners, a point Jody Stallings makes in this article for the Moultrie News, which serves as a rebuttal to Kohn’s perspective.
Responding to a parent who espouses the belief that we should judge teachers on how their students perform, Stallings writes, “Teachers should indeed be held accountable for teaching their students. But that’s not what you’re demanding. You’re demanding that students learn, and that’s a very different issue.”
Stallings argues that we should judge teachers not on how their students do, but on how teachers perform their jobs. In other words, he sees no incoherence in the statement, “I taught a great lesson but some of the kids didn’t learn.” That, I believe Stallings would argue, is perfectly possible. It’s also, I can say as someone who’s taught a lot of lessons (some great, some not), almost always the result.
Stallings asks his readers to consider a reluctant eater:
“Have you ever tried to make a child eat something he didn’t want to eat? That’s what teaching unwilling learners is like. The reality is unless they have an appetite, you can set an entire banquet in front of them and it will go untouched. The problem is that we are slipping into a world where we don’t judge teachers by the banquet they prepare but by the appetites of the children at the table.”
To extend his analogy, evaluating teachers based solely on how their teaching is received by their learners would be like evaluating a gourmet chef based solely on how diners receive his or her ginger-glazed mahi-mahi. It might be the best in the world, but some diners won’t be hungry. Some will hate seafood. Some might be allergic to an ingredient. And some just prefer cheeseburgers.
Stallings’s point, and it’s a legitimate one, is that you can’t judge teachers only on what their students learn because students, like diners, are different. A chef has total control over the dish, but no control over the people who eat it. A masterful teacher in one school may get horrible results in a different one, not because she’s a bad teacher, but because she is trying to teach students who are less willing, and sometimes less able, to learn.
This is a problem.
Because if teaching isn’t about learning, then what’s it about? And if we want to design a system to evaluate teachers, shouldn’t such a system, almost by definition, take student performance into account?
But if we are going to consider student performance, how much impact can we realistically expect teachers to have on students, given that students are very different?
How much should student performance matter, and does it matter the same amount for these students over here compared to those over there?
To further complicate matters, how many students have to “fail” before we label their teacher a failure? How many have to “succeed” for teachers to be effective?
Because here is something all teachers understand: The results are almost always a mixed bag.
I’ve taught for nineteen years now, and I have probably never taught a lesson where every single student hit the learning target (if I have, the target was likely too easy or students already knew the content). I have also never taught a lesson where zero students demonstrated understanding.
Results are never uniform, which suggests that it’s not the teacher who is the ultimate determinant; it is the student. And if that is true, then how can we fairly measure a teacher’s effectiveness by looking at work she does not do?
I do not have an answer, but I suspect it lies somewhere in the middle. I don’t want to be judged solely on how other people perform (especially when those other people are easily distracted by a bee in the classroom or the crooked look a classmate gave them 40 minutes ago during lunch), but I do recognize that in order to claim you’ve taught somebody, there must be evidence that they learned. That said, I resent anyone who attempts to evaluate my teaching by looking at a spreadsheet instead of stepping into my classroom. For me, it’s what students learn, but it’s also how teachers teach.
What do you think? If you could design a teacher evaluation system from scratch, what would it look like? What would its purpose be? How much should student performance matter? Share your thoughts in the comments.