Responding to Bill Gates’ Destructive Model of Teacher Evaluation

By Anthony Cody.

On October 7th, Bill Gates gave a speech providing a comprehensive review of his Foundation’s K-12 education strategy. This arena is their number one domestic priority, and a place where they have invested billions of dollars over the past 15 years. As we approach the end of the Obama administration, which has served as their close ally, we might expect some serious reflection on lessons learned.

The Common Core is struggling along on continual life support from Gates and the Department of Ed; the charter school illusion is less sustainable every day, and in his own state of Washington, the charter school ballot measure Gates helped get passed was recently declared unconstitutional. In spite of these setbacks, serious reflection was not in evidence. While Gates acknowledged some missteps in the rollout of the Common Core – blamed on their “naivete” – the emphasis was on advances made, with a pledge to continue pushing in the same direction.

Their confidence notwithstanding, I think their direction is severely flawed in several respects. In the first place, they continue to believe that test scores and the Common Core “define excellence.” In the second, they have made a fatal error in attempting to embed teacher professional growth into an evaluative framework – harming both.

[Please note that the text provided by the Gates Foundation here is not the same as the speech delivered by Gates. I took the time to transcribe what he said, and will focus my response on that transcript, posted here, rather than the “official” text.]

Gates opened with a model provided to him by Washington’s “teacher of the year,” Lyon Terry, and used it throughout his talk. According to Gates, Terry draws for his fourth grade students an upwardly tilted “learning line” that continues through life. Gates then suggests that the Foundation’s work is all about helping teachers move up that learning line, together with colleagues, so as to be more effective.

Central to this has been an emphasis on what Gates calls “high impact” teacher evaluation.

Here is how Gates describes the rationale for this work:

We decided to focus on what goes on inside the classroom, and focus on the teaching profession and how we could facilitate improvement there. The evidence is very strong about the importance of an effective teacher. If you take two classrooms from within the same school, and you have a teacher in one classroom who’s in the top quartile – not at the top, but just in the top 25%, and another teacher who’s at the top of the bottom quartile, the 75%, and you look at their students’ achievement over the course of the year, their scores will be ten percent different by the end of the year. And that’s a very dramatic difference. If you go three years in a row having that top line, you would completely close the income inequity of learning in the entire country. And so making sure there’s more of those top quartile teachers, and that we’ve moved people up from that bottom quartile, that, if it’s done at scale, can have dramatic effects.

This has become known as the “Three great teachers in a row myth.” Here is some background that reveals how misguided this idea is. From Diane Ravitch:

Over a short period of time, this assertion became an urban myth among journalists and policy wonks in Washington, something that “everyone knew.” This particular urban myth fed a fantasy that schools serving poor children might be able to construct a teaching corps made up exclusively of superstar teachers, the ones who produced large gains year after year.

This is akin to saying that baseball teams should consist only of players who hit over .300 and pitchers who win at least twenty games every season; after all, such players exist, so why should not such teams exist. The fact that no such team exists should give pause to those who believe that almost every teacher in almost every school in almost every district might be a superstar if only school leaders could fire at will.

The teacher was everything; that was the new mantra of economists and bottom-line school reformers.

Gary Rubinstein goes into even greater detail regarding the research here.

As we have learned over the past few years, VAM (value-added model) scores are highly unstable. Many of the teachers in the top quartile one year will be in the bottom the next, because students are not assigned randomly, and VAM systems do not adequately measure learning. In particular, students who are in special education, gifted, or English learners often fail to grow at predicted rates.

But the more fundamental issue here is that, for all the talk of “multiple measures,” the effect Gates and his Foundation have focused on is test score growth. That is the operational definition of a good or great teacher. Many of the most valuable things we teach are not captured by tests, and we corrupt the very foundation of our work when learning is reduced to these scores.

Gates then cites several places around the country and picks out random data points that suggest improvement. Carol Burris provides some deeper details here. The place he focuses on the most regarding teacher evaluation is Denver. He said:

For example, Denver uses a measure that combines teacher observation, student perception surveys, and evidence on how much the students are learning. It’s not just a system for taking the teachers and sorting them into groups. It’s a framework for moving up the learning line together. The principal visits the class, discusses it with the teacher, and decides where the teacher stands. If they’re not satisfied there are resources that can be brought in to get additional coaching from fellow teachers. And so there are clear paths for growth that are laid out.

The approach in Denver is advanced compared to what most teachers get in the rest of the United States. But most other places around the world, from China to the Netherlands, have been doing this thing for the past decade. And in that past decade, when the US was one of the leading school systems in the entire world, at teaching kids math, at teaching them to read and write – we have fallen behind. We are now in 14th place. And almost all of those systems, that are not only above us, but used to be below us, they’ve moved ahead, a key element they have is how they help their teachers improve.

This is a truly remarkable claim. According to Gates, most of the rest of the world is leaving us in the dust because of the way we evaluate our teachers. The only data point from Denver that Gates provides is an increase from 16% to 24% in the number of students scoring 21 or higher on the ACT test. This column by former Denver school board member Jeannie Kaplan provides a more sober assessment. A few of the statistics she cites:

ACT scores (the bar by which “reformers” cite college readiness) have remained stagnant, with a slight drop to 18.3 in 2015. Twenty-one is the number generally cited for college readiness; 26 is the average score needed to enter CU Boulder.
The DPS graduation rate is 62.8 percent. Colorado’s rate is 77.3 percent. The last available remediation rate stands at 52.4 percent.
Achievement gap increases since 2005, based on economics (free and reduced-lunch students and paying students): reading, 7 percentage points, from 29 to 36; math, 14 percentage points, from 20 to 34; and writing, 9 percentage points, from 27 to 36.

Now let’s shift to the main strategy Gates is offering. He states:

One problem we see in the teaching profession is that too often, teachers have to move up completely on their own. They don’t get the feedback or tools they need to improve their practice. So they move up slowly, or not at all.

In fact, Gates uses the word “feedback” 14 times in this talk. And who could be against more feedback? But here is the trick. Gates envisions this feedback taking place in an evaluative framework. Teacher-to-teacher collaboration absent this framework is apparently invisible in the Gates model.

Gates claims that teachers “crave” the sort of feedback that these evaluation systems provide. But here is what I have heard from teachers in Denver. One wrote me to say:

My outcomes were good, but I can tell you they use the process to get the most experienced teachers out of the district. Observations are biased. Teachers that speak out and question what is going on go through severe harassment including extra work to defend what they are doing in the classroom. The student survey is very personal asking about family things and a few teacher/climate things.

Sean Black, who also teaches in Denver, told me:

The LEAP system has defined a very limited definition of teaching, and it applies to all teachers, kindergarten, AP physics, music, and physical education just the same, all with emphasis on the language domain. Furthermore, student behaviors impact scores heavily, thus the teacher is penalized by students who fail to take opportunities to learn. Worst is that the scores are summative from the moment the observer begins the observation. The scores are all averaged in the teacher evaluation.

Chalkbeat reports that teacher turnover in Denver is a shocking 22% per year, and principal turnover is even higher, especially in high poverty schools. This does not suggest a happy work environment for teachers or administrators.

Gates reveals a few glimpses of reality here and there. He said:

In a recent study researchers at Harvard gave the teachers video cameras and allowed them to record as many of their lessons as they wanted. They could choose which ones they sent to the principal for him to look over and for them to discuss. And what we saw was that when you put the teachers in charge of deciding which lessons to seek feedback on, it redefines the power dynamic between them and the principal. The teachers get to lead the discussion and focus on what they think they need to improve.

This understanding that power dynamics affect people’s willingness to engage in these processes is a useful one. He goes on to say:

We have many systems today that are viewed negatively because they are mainly about hiring and firing. They are not a tool for this learning. If we don’t get that balance right, the whole evaluation system does not strengthen teaching, it actually inhibits it. So you get cases where teachers would prefer to have no feedback at all, which was the system a decade ago most of them worked in.

Every teacher has a right to ask about these evaluations, “Is it designed to help me get better?”

This may be a backhanded acknowledgement of some of the resistance their model has run into. School districts across the country have been struggling with all sorts of variations on the model Gates has advocated — after the use of test scores was made one of the requirements for Race to the Top and NCLB waivers. Recently, the Gates Foundation pulled the last $20 million from a $100 million grant to the Hillsborough schools in Florida. One roadblock encountered was that “peer coaches” came to be seen as bureaucrats by teachers they were assigned to work with. Now Gates seems to be saying that their model is fine, but the recipe must be just right for it to work. If things go haywire, then it was not done just right. But I think there is a more fundamental problem.

I am going to make a rather strong departure from the Gates approach.

It is not only wrong because the systems have defined excellence in terms of test scores. It is wrong about the proper role of teacher evaluation in relationship to teachers’ professional growth. Teacher evaluation systems are NOT primarily about helping people get better. Nor should they be.

Just as Gates has defined student tests as the critical lever to increase learning for them, and sought to use such tests to rank and sort students, he has defined teacher evaluation as the lever to increase teacher growth, and likewise sought to rank teachers according to their “effectiveness.” But teacher evaluation has never been an important avenue for professional growth, and there are good reasons why it is poorly suited for this task.

Teacher evaluations are directly connected to significant, even career-ending consequences. Especially under the policies advocated by the Gates Foundation, and coerced into effect by the Department of Education, evaluations are used to determine if one gets respect or is pushed into remediation – or even fired – as an “ineffective” teacher. Talk about high stakes!

Let’s think about the way we grow as individuals in a genuinely collaborative environment. In my dialogue with the Gates Foundation several years ago, I drew their attention to the Teacher Inquiry approach being used at New Highland Academy in Oakland. The teachers there worked with the Mills Teacher Scholars program, and defined questions about their teaching practice to investigate. They were unhappy with the level of reading comprehension their students were showing, so they investigated and developed some strategies to apply. They met monthly and observed and collected data in one another’s classrooms. They made all the key decisions about what changes to make. They worked closely with one another, and gave each other critical feedback about their teaching. This did not need to be part of their evaluation process, and in my view, putting it in an evaluative framework would have destroyed the trusting environment this work needs to unfold.

Real reflection and growth is a risky thing. It requires seeing our own weaknesses, and even drawing attention to them with others so as to get their insights and help. What person in their right mind would draw attention to their own weaknesses in the context of a “high impact” evaluation system? The places where I have seen teachers take real risks and grow have been in Critical Friends groups, through Lesson Study, through the National Board process, and through the Mills Teacher Scholars. These practices all have one thing in common. They require deep trust between the people doing the work. That trust comes from knowing we are all supporting one another to be the best we can be. It is destroyed by an evaluative context.

Teacher evaluation ought to serve as a safety valve, an emergency brake, to handle the process by which teachers enter and leave a school. A relatively lean observation process and other feedback administrators receive should give them adequate information to understand if someone is basically competent before they are given tenure. If they are NOT competent, then that incompetence should be documented, and the person should be offered remediation or counseled out – ultimately terminated if they cannot do the job. But if they ARE competent, they should be provided the time and freedom to develop themselves in collaboration with their colleagues. The evaluation process is not the means for this development to take place, nor should we use the evaluation process to micromanage teacher growth.

Remember the lessons from Daniel Pink’s book Drive, explaining the latest research on human motivation? We are intrinsically motivated to pursue mastery in what we do. So we WANT to get better at our craft. We are driven by a sense of purpose – teachers even more than most professions. We are teaching because we want to give opportunities and knowledge to others. That is our highest calling. But don’t forget the third great motivator: autonomy. We are intrinsically motivated when we know that we are in charge of our work, when we are in charge of our own growth. Remove that, and you wreck everything.

When teachers are handed long lists of teaching standards and subjected to observations that demand evidence of them all, they are robbed of autonomy. When teachers are given detailed curriculum standards and timelines, and external benchmark tests – even scripted curriculum – autonomy is destroyed. And when the evaluation system consumes our concept of collaboration and growth, that also destroys our most important autonomy of all – the ability to guide our own professional learning. This in turn destroys motivation and morale. I believe this is a major reason our schools are suffering from devastating turnover rates, even in Denver, the system that Bill Gates extols as the model of reform.

This is what the research Daniel Pink shares would predict. Replace autonomy with carrots and sticks and the result is that intrinsic motivation is destroyed, and performance is actually worse.

When Gates speaks about the Common Core we get a clearer picture of his model for teaching and learning. He says:

We do need to have a system that defines excellence, and that system needs to be very thoughtfully designed, things like how do you take all your reading and writing experiences across your different classes and really see what you’re missing so that different teachers can engage on your deficits.

This is a system focused on identifying deficits, rather than strengths. And this mentality is applied to students and teachers alike. Common Core tests “define excellence” for students and tell some 70% of them that they are deficient. “High impact” teacher evaluations define excellence for teachers and likewise focus on pointing out their deficits. Just as a constant stress on test scores undermines student growth, a stress on evaluation undermines teacher growth. It is no wonder that many teachers are rejecting rather than rejoicing at this constant “feedback.”

Teacher professional growth is not served well by being embedded in an evaluative framework. It is best served when teachers have significant latitude to chart their own paths as individuals, and as school staffs. Administrators can help lead this process, as can outsiders like the Mills Teacher Scholars. But the real work must be done by teachers, who are intellectually and spiritually engaged with this endeavor. That engagement is not derived from the coercion inherent in the evaluation process. It is unleashed by inspiring leadership – and that comes best from teachers themselves.

Note: For more analysis of the work of Bill Gates and his foundation, see my book, The Educator and the Oligarch: A Teacher Challenges the Gates Foundation.

Comments

howardat58 October 20, 2015 at 9:12 am

“We decided to focus on what goes on inside the classroom, and focus on the teaching profession and how we could facilitate improvement there.”
So Gates ACTUALLY went into a SCHOOL ?????

tultican October 21, 2015 at 12:51 pm

I really like your position on teacher evaluation. It is spot on.

Daphne Winders October 23, 2015 at 7:47 pm

This subject is much more complex than the ideas represented here. Bill needs to get at least a Bachelor’s Degree in Education before he has the right to inform education policy.
