Tuesday, October 30, 2012

Why Zeros for Late Homework are Stupid -- a general theory

In a presentation today, the really-awesome Sean Stalling mentioned offhandedly that the common policy of automatic zeros for late homework assignments is devastating to kids' grades, discouraging, and just dumb.  (My words, his sentiments, I think.)  But not everyone already believes this now-obvious-to-me idea--I didn't always believe it either--so here's why it should be obvious.

1.  Even one or two zeros have a really devastating effect on a student's grade, especially on a traditional 90-80-70-60 scale, and even more so if you're using one of those idiotic "honors" 94-88-80-75-65 scales.  (These scales are idiotic, because performance on more complex tasks--the kinds you'd want students to be doing in honors courses--is harder to replicate, so you should if anything have a looser grading scale that allows kids to show excellence sporadically and still get good grades.  For example, classes for math majors at the college level not infrequently use the "sup norm": your grade is primarily based on the highest score you get on a test, rather than the average; lower scores are basically ignored.  But I digress.)  Consider a student with a consistent 90% average on 9 assignments (low A):  a single zero drops that student to an 81% (low B), and even if the student gets 100's on every assignment thereafter, it will take 9 consecutive 100's to bring that grade back up to an A.  Put differently, on the 90-80-70-60 scale, a kid who gets a single zero needs 95's on EIGHTEEN other assignments to get an A.
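
If you'd rather check that arithmetic than take my word for it, here is a minimal sketch in Python.  It assumes every assignment is weighted equally and that an A starts at exactly 90%; the function names are made up for illustration and don't come from any real gradebook software.

```python
def average(scores):
    """Plain unweighted mean of a list of percentage scores."""
    return sum(scores) / len(scores)


def perfect_scores_needed(scores, target=90):
    """Count the consecutive 100s needed to bring the average back up to target."""
    scores = list(scores)  # copy, so the caller's list is left alone
    extra = 0
    while average(scores) < target:
        scores.append(100)
        extra += 1
    return extra


before = [90] * 9          # nine assignments at 90% (assumed equal weights)
after_zero = before + [0]  # one late assignment recorded as a zero

print(average(before))                    # 90.0
print(average(after_zero))                # 81.0
print(perfect_scores_needed(after_zero))  # 9
print(average([95] * 18 + [0]))           # 90.0
```

Change the 0 to a 50 or a 60 and run it again: the hole is much shallower, which is really the whole point.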

It's depressing that so few teachers using this system understand this.  I mean, are they blind?

2.  Students DO understand this, and so after a couple of zeros, they correctly conclude that there's really no point in trying anymore.  This outcome is bad for everyone, because it means that the students stop making any attempt to learn any of the material, and just sit around disrupting your class.  You as a teacher lose both your carrot and your stick.

3.  This system, when applied to homework, is even stupider, because the only point of homework is to (a) develop the ability to work independently and (b) develop knowledge of the material.  If the student is not doing homework, a system that very quickly tells him or her not to bother to do any more homework is obviously not going to develop his or her ability to work independently, and is obviously not developing his or her knowledge of the material.  

4.  In fact, when students explain why they skipped the homework, I consistently hear three reasons:  (i) I can't do it or don't see the point; (ii) I can do it easily and so I don't see the point; (iii) I have too much other homework to do.  In case (i), the student is telling us that he/she can't actually work independently on the assignment, or that there's no obvious reason to do so.  So a zero penalizes the child for not doing something he or she perceives as either undoable or pointless.  In case (ii), the child is saying that while he or she could work independently, the assignment is not actually going to develop his or her knowledge of the material.  So giving a zero feels to me like you're mad at the child for uncovering the secret that your assignments are actually irrelevant to his or her learning process, whereas instead you should be saying something like "good metacognition; what was something interesting you were thinking about?"  In case (iii), the issue is not that the kid doesn't recognize the importance of the assignment, but that he or she has too much other stuff to do to actually finish this thing that he/she agrees is important.  So giving a zero penalizes the child for a problem not of his or her own creation, and frankly, makes you part of the problem instead of part of the solution.  (Note that your work has been given a lower priority than other work, usually--in my experience--other work that the student felt was more relevant to his/her learning, or more within his/her grasp, or had more obvious positive/negative consequences.)

5.  Therefore, you as a teacher should do everything in your power to keep students from getting zeros:  give them second chances to do assignments, let them make work up in front of you, let them show proficiency in alternative ways, etc.  In particular, you should avoid as much as possible policies that give automatic zeros for any but the worst behaviors.

6.  Finally, giving an automatic zero for a late assignment is just stupid, because what you're telling the kid is that doing the work one single day after it was due is totally useless.  But what kind of teacher assigns work that is meaningless past a 24-hour expiration date?  This is supposed to be cognitive development, not milk left out on the counter.  Of course, there are occasionally assignments that really need to get done by a specific time in order to set something up for class.  But then that can be communicated directly, outside of the code of grades.  "I really need you to do those coin flips tonight, because tomorrow we're going to aggregate our data."  "I really need you to practice these derivatives tonight, because tomorrow we're going to work on applications."  "It's really important that you do the assigned reading every night, because otherwise you'll have nothing much to say in our discussion of the texts the next day, and you won't even really understand what the rest of us are arguing about."  And then you make the assignments short and meaningful; in the last case, for example, you can assign a short response paper rather than an outline or ... 

If the homework is essentially skills practice, and the skills are important, then the kid will still be well-served by doing the practice a day or two later.  In fact, if you ask more questions instead of giving the kid the zero, you might find out that the kid didn't feel intellectually able to tackle the material when he/she got home: you're penalizing the kid for being a slow learner.  

7.  Finally finally, it's important to remember how much relying on work done at home for learning privileges kids who are already privileged:  kids who don't have to work to help their families pay rent, kids who don't have to watch small children (brothers/sisters/cousins) so that other family members can work to pay rent, kids who have a quiet and reasonably conflict-free space in which to work, kids who have parents or other family members whom they can ask for help, kids who have consistent access to the internet or other non-parental sources of instructional support.  Yes, it's important that kids do learn to do work outside a supervised environment, and yes, it's often difficult if not impossible to get everything covered and practiced in the time allotted.  But remember that every time you rely on homework as a part of the learning process, you're giving more advantages to the kids who already have the most, and throwing up another barrier to the success of disadvantaged kids, which is really the opposite of what public school is supposed to be about.

OK.  That's off my chest.  But I'll sign off with one last h/t to Sean, who is really awesome.  When asked by an audience member why we should assign grades in a way that allows kids multiple opportunities when "in real life, you have to get it right the first time," Sean gave the courageous--and totally true--answer that in real life, you almost never have to get it right the first time.  Most of us have made LOTS AND LOTS of mistakes in our jobs without getting fired--often, without even being yelled at.  And even the "exceptions" Sean cited--surgeons and airplane pilots--are not really exceptions:  they just practiced, under supervision, in training, getting it wrong lots and lots of times in simulations (sewing cadavers, practicing takeoffs in a flight simulator or with someone else sharing controls) until their accuracy rate improved to an acceptable value for "real life."  This is learning, peoples, not the Spanish Inquisition.

Monday, October 1, 2012

What Could Go Wrong with Value-Added Metrics?

In my last post, I explained what a value-added metric is.  Simply put, a value-added metric combines three things:
  1. Data taken before and after some intervention,
  2. A model that uses pre-intervention data, possibly along with other factors, to predict the post-intervention data, and
  3. An interpretation of any differences between the post-intervention data and the model.
In the last post, the data were heights of trees, the intervention was a fertilizer treatment, and the model was the linear model based on the data from the unfertilized trees.  In the case where the treated trees grew more than the model predicted, the interpretation is that the fertilizer was effective.  In a value-added metric for teaching, the data are test scores at the beginning and end of the year.  The model predicts end-of-year gains for "typical" students.  The interpretation is typically that the differences between actual and predicted results are a measure of teacher quality.
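
Here is a toy sketch of those three pieces in Python.  Everything in it is an assumption made for illustration: the scores are invented, the teachers "A" and "B" are hypothetical, and the model is a plain straight-line fit like the one in the tree example.  Real value-added systems typically use more elaborate statistical models, so read this as the idea, not as anyone's actual formula.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def value_added(students, slope, intercept):
    """Average (actual - predicted) end-of-year score for each teacher."""
    residuals = {}
    for teacher, pre, post in students:
        predicted = slope * pre + intercept
        residuals.setdefault(teacher, []).append(post - predicted)
    return {t: sum(r) / len(r) for t, r in residuals.items()}


# 1. Data: invented (beginning-of-year, end-of-year) scores for "typical" students.
reference = [(40, 50), (50, 60), (60, 70), (70, 80)]

# 2. Model: predict the end-of-year score from the beginning-of-year score.
slope, intercept = fit_line([pre for pre, _ in reference],
                            [post for _, post in reference])

# 3. Interpretation: each teacher's average gap between actual and predicted.
students = [("A", 45, 60), ("A", 65, 80), ("B", 45, 50), ("B", 65, 70)]
scores = value_added(students, slope, intercept)
print(scores)  # teacher A comes out 5 points above the model, teacher B 5 below
```

Notice that step 3 is the only place where "teacher quality" enters, and it enters purely as an interpretation of the leftover difference.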

There's been lots of misinformation about value-added metrics; before we deal with what's wrong with this scheme, we need to make sure that we're not spouting half-baked criticisms that make us all sound ignorant.

Half-Baked Objection 1:  It's not fair to penalize teachers whose students don't end the year at grade level when those kids start the year behind grade level.
The VAM doesn't simply score students based on their end-of-year scores, but looks for growth from the beginning of the year to the end of the year.  So if a group of students starts 5th grade reading at the 3rd grade level, and finishes the year reading at the 4th grade level, the teacher is supposed to get credit for a year of growth.
Half-Baked Objection 2: Students aren't plants, and teachers aren't fertilizer.
Of course they aren't.  But by itself, this objection says "You can't measure anything." And while measuring teachers badly hurts the profession, claiming that what we do can't be measured doesn't help either.
*     *     *     *     *
What is it reasonable to expect of a measure of teacher quality?  Let's establish a few criteria:

  1. Longitudinal Consistency.  Teachers change over time, but not necessarily that much in any given year.  So unless we have evidence that a teacher is taking substantial steps to improve his or her practice, or strong evidence that something has come unhinged, we would expect teacher scores to stay roughly the same from one year to the next.  If teacher scores fluctuate wildly, that casts doubt on whether the score is really measuring something that the teacher is doing.
  2. External Validity.  There are research-based strategies for exemplary teaching; that is, people have actually compiled lists that describe what teachers need to do to be effective.  One such model is Charlotte Danielson's Framework for Teaching, but it's not the only one.  Because these strategies are themselves validated by research demonstrating their impact on student learning, we would expect that, in general, teachers who are doing the things on these lists would score highly on the value-added metric, and that teachers who are not doing these things would score poorly.  Of course, there's no canonical list that we need treat as gospel: it's possible that, over time, our views of what constitutes good teaching will evolve, and that this evolution will be informed by results of a metric system.
  3. Fairness.  We don't want our measurement system to treat one group of teachers differently from another, and it should be mostly immune to sabotage or "gaming" by malevolent or savvy administrators and teachers.
  4. Appropriate Incentives.  Peter Drucker's maxim "What is measured, improves," has a corollary:  make sure you measure the things that you want to improve.  In an era when almost any fact can be Googled, when the phrase "21st Century Skills" has gone from a war cry to a banality, we need to be careful that our metric creates incentives for teachers to teach the skills, concepts, and habits that we want kids to learn.  We also want to ensure that the metric doesn't create perverse incentives for teachers to skip over crucial content, revert to large-scale rote memorization, or avoid teaching certain students.  For example, the current NCLB regime has the well-documented "Bubble Effect":  it's to a teacher's advantage to concentrate on those students who are near the proficiency borderline, to the exclusion of students who are so far from proficiency that a single year's work is unlikely to make the difference.
There are probably lots of other criteria we could use, but this list makes a fair start.  The next question is: how well do current systems measure up?