Tuesday, October 30, 2012

Why Zeros for Late Homework are Stupid -- a general theory

In a presentation today, the really-awesome Sean Stalling mentioned offhandedly that the common policy of automatic zeros for late homework assignments is devastating to kids' grades, discouraging, and just dumb.  (My words, his sentiments, I think.)  But not everyone already believes this now-obvious-to-me idea--I didn't always believe it either--so here's why it should be obvious.

1.  Even one or two zeros have a really devastating effect on a student's grade, especially on a traditional 90-80-70-60 scale--and it's even worse if people are using one of those idiotic "honors" 94-88-80-75-65 scales.  (These scales are idiotic because performance on more complex tasks--the kinds you'd want students to be doing in honors courses--is harder to replicate, so you should if anything have a looser grading scale that allows kids to show excellence sporadically and still get good grades.  For example, classes for math majors at the college level not-infrequently use the "sup norm": your grade is based primarily on the highest score you get on a test, rather than the average; lower scores are basically ignored.  But I digress.)  For example, a student with a consistent 90% average on 9 assignments (a low A) drops to a low B with a single zero, and even if the student gets 100's on every assignment thereafter, it will take 9 consecutive 100's to bring that grade back up to an A.  Put differently, on the 90-80-70-60 scale, a kid needs 95's on EIGHTEEN assignments to get an A if he or she gets a single zero.
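If you don't believe the arithmetic, here's a minimal sketch you can run yourself (hypothetical scores; the 90-80-70-60 cutoffs are the only assumption):

    def average(scores):
        return sum(scores) / len(scores)

    # Nine 90's (a low A on the 90-80-70-60 scale), then a single zero.
    scores = [90] * 9
    print(average(scores))            # 90.0 -> just barely an A
    scores.append(0)
    print(average(scores))            # 81.0 -> a low B

    # It takes nine consecutive 100's to climb back to an A...
    scores += [100] * 9
    print(average(scores))            # 90.0 -> an A again, 9 assignments later

    # ...and eighteen 95's can absorb exactly one zero, and no more.
    print(average([95] * 18 + [0]))   # 90.0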

It's depressing that so few teachers using this system understand this.  I mean, are they blind?

2.  Students DO understand this, and so after a couple of zeros, they correctly conclude that there's really no point in trying anymore.  This outcome is bad for everyone, because it means that the students stop making any attempt to learn any of the material, and just sit around disrupting your class.  You as a teacher lose both your carrot and your stick.

3.  This system, when applied to homework, is even stupider, because the only point of homework is to (a) develop the ability to work independently and (b) develop knowledge of the material.  If the student is not doing homework, a system that very quickly tells him or her not to bother to do any more homework is obviously not going to develop his or her ability to work independently, and is obviously not developing his or her knowledge of the material.  

4.  In fact, if you think about why students might skip doing homework, I consistently hear three reasons:  (i) I can't do it or don't see the point; (ii) I can do it easily and so I don't see the point; (iii) I have too much other homework to do.  In case (i), the student is telling us that he/she can't actually work independently on the assignment, or that there's no obvious reason to do so.  So a zero penalizes the child for not doing something he or she perceives as either undoable or pointless.  In case (ii), the child is saying that while he or she could work independently, the assignment is not actually going to develop his or her knowledge of the material.  So giving a zero feels to me like you're mad at the child for uncovering the secret that your assignments are actually irrelevant to their learning process, whereas instead you should be saying something like "good metacognition; what was something interesting you were thinking about?"  In case (iii), the issue is not that the kid doesn't recognize the importance of the assignment, but that he or she has too much other stuff to do to actually get this thing done, even though he/she agrees it's important.  So giving a zero penalizes the child for a problem not of his or her own creation, and frankly, makes you part of the problem instead of part of the solution.  (Note that your work has been given a lower priority than other work--usually, in my experience, work that the student felt was more relevant to his/her learning, or more within his/her grasp, or with more obvious positive/negative consequences.)

5.  Therefore, you as a teacher should do everything in your power to avoid students getting zeros:  give them second chances to do assignments, let them make up work in front of you, let them show proficiency in alternative ways, etc.  In particular, you should avoid as much as possible policies that give automatic zeros for any but the worst behaviors.

6.  Finally, giving an automatic zero for a late assignment is just stupid, because what you're telling the kid is that doing the work one single day after it was due is totally useless.  But what kind of teacher assigns work that is meaningless past a 24-hour expiration date?  This is supposed to be cognitive development, not milk left out on the counter.  Of course, there are occasionally assignments that really need to get done by a specific time in order to set something up for class.  But then that can be communicated directly, outside of the code of grades.  "I really need you to do those coin flips tonight, because tomorrow we're going to aggregate our data."  "I really need you to practice these derivatives tonight, because tomorrow we're going to work on applications."  "It's really important that you do the assigned reading every night, because otherwise you'll have nothing much to say in our discussion of the texts the next day, and you won't even really understand what the rest of us are arguing about."  And then you make the assignments short and meaningful; in the last case, for example, you can assign a short response paper rather than an outline or ... 

If the homework is essentially skills practice, and the skills are important, then the kid will still be well-served by doing the practice a day or two later.  In fact, if you ask more questions instead of giving the kid the zero, you might find out that the kid didn't feel intellectually able to tackle the material when he/she got home: you're penalizing the kid for being a slow learner.  

7.  Finally finally, it's important to remember how much relying on work done at home for learning privileges kids who are already privileged:  kids who don't have to work to help their families pay rent, kids who don't have to watch small children (brothers/sisters/cousins) so that other family members can work to pay rent, kids who have a quiet and reasonably conflict-free space in which to work, kids who have parents or other family members whom they can ask for help, kids who have consistent access to the internet or other non-parental sources of instructional support.  Yes, it's important that kids do learn to do work outside a supervised environment, and yes, it's often difficult if not impossible to get everything covered and practiced in the time allotted.  But remember that every time you rely on homework as a part of the learning process, you're giving more advantages to the kids who already have the most, and throwing up another barrier to the success of disadvantaged kids, which is really the opposite of what public school is supposed to be about.

OK.  That's off my chest.  But I'll sign off with one last h/t to Sean, who is really awesome.  When asked by an audience member why we should assign grades in a way that allows kids multiple opportunities when "in real life, you have to get it right the first time," Sean gave the courageous--and totally true--answer that in real life, you almost never have to get it right the first time.  Most of us have made LOTS AND LOTS of mistakes in our jobs without getting fired--often, without being yelled at.  And even the "exceptions" Sean cited--surgeons and airplane pilots--are not really exceptions:  they just practiced, under supervision, in training, getting it wrong lots and lots of times in simulations (sewing cadavers, practicing takeoffs in a flight simulator or with someone else sharing controls) until their accuracy rate improved to an acceptable value for "real life."  This is learning, people, not the Spanish Inquisition.

Monday, October 1, 2012

What Could Go Wrong with Value-Added Metrics?

In my last post, I explained what a value-added metric is.  Simply put, a value-added metric combines three things:
  1. Data taken before and after some intervention;
  2. A model that uses pre-intervention data, possibly along with other factors, to predict the post-intervention data; and
  3. An interpretation of any differences between the post-intervention data and the model.
In the last post, the data were heights of plants; the intervention was a fertilizer treatment, and the model was the linear model based on the data from the unfertilized plants.  In the case where the treated plants grew more than the model predicted, the interpretation is that the fertilizer was effective.  In a value-added metric for teaching, the data are test scores at the beginning and end of the year.  The model predicts end-of-year gains for "typical" students.  The interpretation is typically that the differences between actual and predicted results are a measure of teacher quality.
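To show how small the core computation really is, here's a minimal sketch with hypothetical numbers and a deliberately trivial model--just to make the three parts visible:

    pre  = [3.1, 4.0, 2.5]   # 1. data: beginning-of-year scores (grade levels)
    post = [4.3, 4.8, 3.9]   #    ...and end-of-year scores

    def predict(before):
        # 2. model: a "typical" student gains 1.0 grade level per year
        return before + 1.0

    # 3. interpretation: positive differences get read as "value added"
    diffs = [actual - predict(before) for before, actual in zip(pre, post)]
    print([round(d, 2) for d in diffs])       # [0.2, -0.2, 0.4]
    print(round(sum(diffs) / len(diffs), 2))  # mean "value added": 0.13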

There's been lots of misinformation about value-added metrics; before we deal with what's wrong with this scheme, we need to make sure that we're not spouting half-baked criticisms that make us all sound ignorant.

Half-Baked Objection 1:  It's not fair to penalize teachers whose students don't end the year at grade level when those kids start the year behind grade level.
The VAM doesn't simply score students based on their end-of-year scores, but looks for growth from the beginning of the year to the end of the year.  So if a group of students starts 5th grade reading at the 3rd grade level, and finishes the year reading at the 4th grade level, the teacher is supposed to get credit for a year of growth.
Half-Baked Objection 2: Students aren't plants, and teachers aren't fertilizer.
Of course they aren't.  But by itself, this objection says "You can't measure anything." And while measuring teachers badly hurts the profession, claiming that what we do can't be measured doesn't help either.
*     *     *     *     *
What is it reasonable to expect of a measure of teacher quality?  Let's establish a few criteria:

  1. Longitudinal Consistency.  Teachers change over time, but not necessarily that much in any given year.  So unless we have evidence that a teacher is taking substantial steps to improve his or her practice, or strong evidence that something has come unhinged, we would expect teacher scores to stay roughly the same from one year to the next.  If teacher scores fluctuate wildly, that casts doubt on whether the score is really measuring something that the teacher is doing.
  2. External Validity.  There are research-based strategies for exemplary teaching; that is, people have actually compiled lists that describe what teachers need to do to be effective.  One such model is Charlotte Danielson's Framework for Teaching, but it's not the only one.  Because these strategies are themselves validated by research demonstrating their impact on student learning, we would expect that, in general, teachers who are doing the things on these lists would score highly on the value-added metric, and that teachers who are not doing these things would score poorly.  Of course, there's no canonical list that we need treat as gospel: it's possible that, over time, our views of what constitutes good teaching will evolve, and that this evolution will be informed by the results of a metric system.
  3. Fairness.  We don't want our measurement system to treat one group of teachers differently from another, and it should be mostly immune to sabotage or "gaming" by malevolent or savvy administrators and teachers.
  4. Appropriate Incentives.  Peter Drucker's maxim "What is measured, improves" has a corollary:  make sure you measure the things that you want to improve.  In an era when almost any fact can be Googled, when the phrase "21st Century Skills" has gone from a war cry to a banality, we need to be careful that our metric creates incentives for teachers to teach the skills, concepts, and habits that we want kids to learn.  We also want to ensure that the metric doesn't create perverse incentives for teachers to skip over crucial content, revert to large-scale rote memorization, or avoid teaching certain students.  For example, the current NCLB regime has the well-documented "Bubble Effect":  it's to a teacher's advantage to concentrate on those students who are near the proficiency borderline, to the exclusion of students who are so far from proficiency that a single year's work is unlikely to make the difference.
There are probably lots of other criteria we could use, but this list makes a fair start.  The next question is: how well do current systems measure up?

Thursday, September 20, 2012

What the heck is a Value-Added Metric?

In conversations in and around Chicago these last two strike-filled weeks, one item has captured center stage:  the use of value-added metrics in teacher evaluations.  As I've been part of these discussions, I've noticed that both opponents and proponents have something in common:  they really don't know what a value-added metric is.

You can tell these people by how they argue about the metric.  For example, "If the kids start out the year behind, how is it fair to penalize the teacher for the fact that they end the year behind?"  (It wouldn't be, but the "added" part of the value-added metric means that the metric is trying to describe change, not just absolute performance at the end of the year.)  Or "If the class starts out at 80% and then ends at 85%, the teacher's responsible for the other 5%." (The value-added model doesn't compare average scores directly, which is good, because we would hope that students grow over the year anyway.)

I'm no fan of value-added metrics in teacher evaluations, but we won't get anywhere arguing about them if we don't even know what we're arguing about.  So this post is a sort of crib sheet for teachers and education people who haven't gotten totally immersed in the statistics and psychometrics stuff.

To make this description go, we'll apply it to a situation where a VAM might actually be useful.  Say you want to determine whether a particular fertilizer treatment makes plants grow faster.  If this were your fourth-grade science fair project, you'd just take two groups of plants, compute the mean height of each group, and then treat one group with fertilizer.  At the end of the experiment, you'd compute the mean heights again.  If the fertilized plants have a higher mean height, then the fertilizer works.  Right?

Wrong.  Computing means of groups doesn't tell you much about what happens to the individual plants.  For example, in the (totally cooked-up-to-make-this-point) dataset below, at the start of the experiment, the two groups of plants have mean heights of 3.89 cm (fertilized) and 4.05 cm (unfertilized).  At the end of the experiment, the fertilized plants have mean height 5.97 cm, while the unfertilized plants have mean height 6.05 cm.  So the fertilized plants have it, by a whisker: unfertilized plants grew an average of 2 cm, while fertilized plants grew by 2.08 cm.  Problem is, in my model, all I did was assume that the below-average-height fertilized plants grew 4 cm, while the above-average-height fertilized plants didn't grow at all.  All unfertilized plants grew by 2 cm.
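Here's that trap in runnable form.  The numbers below are my own made-up stand-ins (the post's exact dataset isn't reproduced here), cooked by the same rule: below-average fertilized plants grow 4 cm, above-average fertilized plants don't grow at all, and every unfertilized plant grows 2 cm.

    fert_start   = [2.0, 3.0, 5.0, 6.0]
    fert_end     = [6.0, 7.0, 5.0, 6.0]   # +4 cm below the mean, +0 cm above it
    unfert_start = [2.0, 3.0, 5.0, 6.0]
    unfert_end   = [4.0, 5.0, 7.0, 8.0]   # +2 cm across the board

    def mean(xs):
        return sum(xs) / len(xs)

    # Comparing group means says "no difference at all"...
    print(mean(fert_end), mean(unfert_end))    # 6.0 6.0

    # ...but per-plant growth tells a very different story.
    print([e - s for s, e in zip(fert_start, fert_end)])      # [4.0, 4.0, 0.0, 0.0]
    print([e - s for s, e in zip(unfert_start, unfert_end)])  # [2.0, 2.0, 2.0, 2.0]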

We can see these differences clearly in the scatterplots below, comparing fertilized (left) with unfertilized (right).  In both graphs, the blue line is y = x, representing no growth at all.  Points above the blue line represent plants that grew; points on the line represent plants that didn't grow, and points below the line (of which there aren't any) would represent plants that actually shrank.


In these graphs, the amount of growth is just the vertical distance from a point to the "no growth" blue line.  Stats types often graph those distances separately, and give them a special name:  "residuals".  In a residual plot, the y-value is the difference (actual value - predicted value):

These residual plots make the situation clear:  the "fertilizer" leaves half the plants worse off than they would have been without fertilizer.  Simple means don't tell this story.

The other problem with comparing means is that data never fall on nice straight lines: as a representation of possible reality, the made-up dataset above is garbage.  A much more realistic set of data for the unfertilized plants (not even worrying about the rest of the experiment) might look like this--but again, I made this one up too:

Heights of Unfertilized Plants

Again, the blue line is y = x:  if a point is on this line, it represents a plant whose final and initial heights are the same, i.e., it didn't grow at all.  Points above the line represent plants that grew; the further the point is from y = x, the more it grew.  Not all plants grew the same amount, but there is a clear upward trend, and it seems like plants that started out taller grew more.  So we can draw a best-fit line:

Here, the slope of the trendline, 1.27, suggests the average plant grew 27% over the experiment.  But not every plant grew exactly 27%.  If we look at the residuals, we see some "noise":


The fact that these points are not exactly on the 0 line tells us that there are other factors at work, but the fact that they seem randomly distributed about the line suggests that there's no systematic factor (affecting all the plants) that our model is missing.  In fact, if we calculate the mean residual for this dataset, we get -0.06, suggesting that the average plant is within 0.1 cm of the height predicted by the model.  (Whether this difference is significant or not is a whole nother story....)
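For the curious, here's a sketch of the fit-then-look-at-residuals step.  The data are made up by me with numpy (the post's dataset isn't included), and the line is forced through the origin to match the "grew 27%" reading of the slope:

    import numpy as np

    rng = np.random.default_rng(0)
    start = rng.uniform(2.0, 8.0, 30)                # initial heights (cm)
    final = 1.27 * start + rng.normal(0.0, 1.0, 30)  # ~27% growth plus noise

    # Least-squares slope for a line through the origin: sum(xy) / sum(x^2).
    slope = (start * final).sum() / (start ** 2).sum()
    residuals = final - slope * start

    print(round(slope, 2))             # close to 1.27
    print(round(residuals.mean(), 2))  # close to 0, but not exactly 0
    # r^2: fraction of variation explained; with this noise level it lands
    # near the 0.83 the post quotes for its own dataset.
    print(round(np.corrcoef(start, final)[0, 1] ** 2, 2))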

Applying that same trendline to the data for the fertilized plants yields an "aha!":
In this case, all the fertilized plants lie above the 27% growth trendline, suggesting that they all grew more than 27%.  Again, we can look at the residuals:


In this case, the average residual is 1.93, confirming what we see on the plot:  the residuals seem clustered about 2cm above the 0 line.  This result suggests that there is another factor at work, possibly the fertilizer.  

If we add a new trendline to quantify the growth of the fertilized plants, we get a model whose mean residual is nearly 0:

The slope of 1.76 suggests that the typical fertilized plant grew by 76%.  So we might say:  fertilized plants grew almost 3 times as much, relative to their original sizes, as unfertilized plants!

The quant way of describing this experiment is in three parts.  In the first part, we acquired some data on how unfertilized plants grow.  In the second part, we used these data to develop a model for plant growth:  a typical plant grows about 27%, and, because r² = 0.83, we conclude that about 83% of the variation in the final plant heights can be attributed to the starting plant height.  The other 17% is the randomness we see in the residual plot.  The model doesn't tell us anything about that other 17% of variation, or even about why the plants grow proportional to their original height--it just describes what seems to happen.  In the third part, we used the model to develop a conclusion about the fertilized plants:  the fertilized plants all grew more than the model predicted, so (we conclude) the fertilizer must be effective.

Because we have all these numbers about effectiveness, we can even crunch them together to get an effectiveness score.  We can crunch the growth rates:  1.76-1.27, or 1.76/1.27, or 0.76/0.27.  We can use the mean residual of 1.93 for the fertilized plants against the unfertilized model: "The typical fertilized plant grew 1.93 cm more than unfertilized plants."  Notice, though, that the "effectiveness score" quickly stops being meaningful outside of the context of how we computed it: it's just a way of summarizing the relationship between one set of data and another set of data.  Because every plant grew a different amount, we can't point to a single number as being completely representative of the unfertilized plants, or of the fertilized plants, and so we can't just crunch those not-incredibly-meaningful numbers together to get something that's more meaningful.
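The crunching itself is one-liner territory, using the slopes from the plots above:

    unfert_slope = 1.27   # typical unfertilized plant: grew 27%
    fert_slope   = 1.76   # typical fertilized plant: grew 76%

    print(round(fert_slope - unfert_slope, 2))              # 0.49
    print(round(fert_slope / unfert_slope, 2))              # 1.39
    print(round((fert_slope - 1) / (unfert_slope - 1), 2))  # 2.81: "almost 3x the growth"

Three defensible "scores," three different numbers--which is exactly the point about context.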

There's lots of room to argue about why this is a lousy approach to measuring educational performance, but that's not our goal today.  I want to leave with one last set of ideas about modeling.

First, what about the variation in the data--the original dataset, and the 17% "missing" variation we saw in the residuals that we said was due to "other factors"?  If I'm trying to test fertilizers, I do my best to either eliminate other factors or spread them out so much that they cancel each other out.  In the first approach, I might plant all my plants in the same field, or make sure that they are watered completely evenly.  But of course there might be minute but significant variations in soil, sunlight, etc., which I can't physically control to be exactly the same across all plants.  I probably won't level a mountain, or tear down my neighbor's barn, to make sure that all the plants get the same amount of light.  So if I'm being really tricky, I divide up my growing space into a hundred or more small squares, plant one plant per square, and then randomly determine which squares get fertilizer.  While one particular swath of land might get more sunlight, or less standing water, that patch will have both fertilized and unfertilized plants growing on it.  Another advantage of randomizing conditions in this way is that I might be able to get valid results across a wider range of conditions: it doesn't do a farmer in a flat field of acidic soil in North Dakota any good to know that my fertilizer works really well on hilly alkaline fields in Arizona.  

Of course, that kind of experimental control is really hard to obtain in education.  In education, most of the variables are completely outside a researcher's control.  Many are clustered:  the students who are low-income or high-income come with other baggage that affects their education.
So I might use a third approach:  adopt a more complex model that takes more factors into account.  My original meta-model was very simple:  growth is a function of original height.  But if I'm trying to measure the effects of fertilizer on 50-year-old trees, I can't just plant a bunch of trees in small random plots and wait 50 years to start measuring effects.  So what I do is I think of every factor that could affect tree height, measure those factors for each tree, and then come up with a more complex model that predicts height based on all these factors: soil acidity, hours of direct sunlight, proximity to sidewalks, whatever.  But then when I've finished fertilizing and growing, I can do the same basic procedure I did in the simpler case:  apply the model I've developed to the fertilized plants and see whether their growth pattern is substantially different from what the complicated model predicts.  In education research, we might have a model for student performance that takes into account class size, school socioeconomic statistics, etc., and then we'd be trying to see how this model predicts student performance for the students we're trying to study (because they have a particular teacher, or are using a particular curriculum, or ...).
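Here's a sketch of that "more complex model" approach, with hypothetical factors and coefficients (numpy least squares; nothing here comes from an actual study):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    start    = rng.uniform(2.0, 8.0, n)    # initial height (cm)
    sunlight = rng.uniform(4.0, 12.0, n)   # hours of direct sun per day
    acidity  = rng.uniform(5.0, 8.0, n)    # soil pH

    # Pretend this is what nature actually does (plus noise):
    final = (1.1 * start + 0.3 * sunlight - 0.5 * acidity + 3.5
             + rng.normal(0.0, 0.3, n))

    # Fit the multifactor model by least squares.
    X = np.column_stack([start, sunlight, acidity, np.ones(n)])
    coeffs, *_ = np.linalg.lstsq(X, final, rcond=None)
    predicted = X @ coeffs

    print(np.round(coeffs, 2))                   # roughly [1.1, 0.3, -0.5, 3.5]
    print(round((final - predicted).mean(), 6))  # ~0 for the untreated group

A treated group then gets scored exactly as before: feed its factors through the fitted model and look at actual minus predicted.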

The last caveat is that, of course, the correlation we've observed doesn't tell us much about cause. We don't know why plants that start out taller tend to grow more, or why the fertilizer is effective, or even--unless we've carefully controlled or randomized other variables--whether it's the fertilizer that is making the difference.  There's a great XKCD that makes this point:


I've tried to write this so that it sounds pretty reasonable.  Next time we'll talk about what goes wrong when this approach is applied to measuring teacher quality in education, and the gloves will come off.  I promise.





Monday, July 9, 2012

More Kakaes Followup

In his blog today, Dan Meyer skewers (rightly) Kakaes's basketball metaphor.  Kakaes writes
Math and science can be hard to learn—and that’s OK. The proper job of a teacher is not to make it easy, but to guide students through the difficulty by getting them to practice and persevere. “Some of the best basketball players on Earth will stand at that foul line and shoot foul shots for hours and be bored out of their minds,” says Williams. Math students, too, need to practice foul shots: adding fractions, factoring polynomials. And whether or not the students are bright, “once they buy into the idea that hard work leads to cool results,” Williams says, you can work with them.
and, as Dan points out,
  1. Drills aren't a basketball player's first, only, or most prominent experience with basketball.  Drills come after a student has been sufficiently enticed by the game of basketball — either by watching it or playing it on the playground — to sign up for a more dedicated commitment. If a player's first, only, or most prominent experience with basketball is hours of free-throw and perimeter drills, she'll quit the first day — even if she's six foot two with a twenty-eight inch vertical and enormous potential to excel at and love the game.
  2. Basketball players aren't bored shooting foul shots.  Long before "math teacher" was on my resume, I was a lanky high school basketball player trying to get his foul shooting above 50%. I'd shoot for hours but I wouldn't get bored, as Williams suggests I must have been. That's because I knew my practice had a purpose. I knew where that practice would eventually be situated. I knew it would pay off in a game where I'd be called to the line for a shot that had consequences.
Dan's right on target on both points, but I don't think he goes far enough.

  1. Our (national) approach to teaching math is to avoid doing anything requiring actual thought or creativity until we've convinced as many students as possible that there's nothing worth thinking about in math;  eventually, the few "survivors" get to do actual mathematics.  If we taught English that way, it would be all grammar and spelling until senior year, when a lucky few would get to read actual poetry.  Right now, the problem isn't that the U.S. curriculum doesn't have enough skill practice; it's that it doesn't consist of much besides skill practice.
  2. As Dan suggests, what *makes* skills important is their placement within the big picture of doing actual mathematics.  Being able to multiply accurately isn't worth a darn--especially in the age of calculators--if you don't have good ideas about when and what to multiply (and when and what not to).  We can be excited when kids know their times tables, the way we might be excited about a kid being able to spell really well, or lift something really heavy, but by itself, multiplying is not a really useful skill except in the context of multiplication tests.

If we taught athletes the way we taught mathematics, there would be no Kobe Bryant, although there would be a handful of strikingly eccentric bodybuilders who would get together to run around, lift heavy things, and engage in odd activities that make no sense to the rest of us couch potatoes.

Tuesday, June 26, 2012

Another idiotic "calculators = bad" article

The Slate article, "Why Johnny Can't Add Without a Calculator" is so poorly argued that I even hesitate to cite it here, but it's getting so much play that somebody has to rebut it.

Konstantin Kakaes's argument essentially boils down to three elements:

  1. Many math teachers who use technology do so ineffectively.
  2. Many students who are taught mathematics with calculators don't have a good grasp of basic arithmetic, or other "traditional" mathematics.
  3. A few teachers teach math successfully without calculators to some students, where "successfully" is defined as "according to traditional criteria."

    Therefore, teaching math with calculators results in students who can't do math.
Although I would agree with #1, I would take it further:  for as long as there have been math teachers, there have been many ineffective math teachers, with or without technology.  As Kakaes himself acknowledges late in the article (when he claims that software won't be able to teach children "any time soon"), teaching (anything) is a complex process.  Teaching math requires actually understanding math, and people who understand math have always been in short supply, in and outside of the teaching profession.  So a different, simpler explanation for the failure of students to learn math is that there aren't a lot of excellent (or even mediocre--see my previous post) teachers out there teaching math.

A bigger problem with Kakaes's argument is that, if it were true, we would expect to see declining math achievement in the U.S.  In fact, the opposite is true.  TIMSS and NAEP scores have been rising steadily for the last twenty or so years--the exact time period in which calculators became standard equipment in high school mathematics classrooms.  To mention one statistic, the number of students in the U.S. passing the AP Calculus BC exam each year--half of which is no-calculator--is now more than five times the number who even took the exam in any year in the 1980's.

In fact, I would argue that calculators have made possible one of the great sea changes in mathematics education in the western world.  In 1960, the dropout rate was 27%, and of the 73% of U.S. students who graduated from high school, very few took any math beyond geometry or trigonometry, which was still a course offered at many colleges.  In 2009, the dropout rate was 8.1%, and of the 91.9% of U.S. students who graduated from high school, something like 50% (77% in 2004) had trigonometry or higher.  Put differently: we are now in a world in which about half of U.S. students are expected to learn substantial amounts of advanced algebra and trigonometry before graduating from high school.  Technology makes it possible, as great teachers like my friends John Benson and Natalie Jakucyn, to name two, have shown, to increase students' access to higher mathematics.  With technology, it's possible for a student who doesn't know how to add fractions to learn what a derivative is, what it means, and what you can do with it--and how to let a computer do the computations that he needs to use the derivative in an actual application.

Finally, Kakaes never engages what is, to me, the central question technology poses to the mathematics teacher, namely, what of the traditional pencil-and-paper mathematics is worth teaching?  Kakaes writes:
If you learn how to multiply 37 by 41 using a calculator, you only understand the black box. You’ll never learn how to build a better calculator that way.
Besides the inaccurate alarmism of his example--even calculator-active elementary school curricula like Trailblazers and Connected Math expect that students will be able to multiply two-digit numbers by hand (and explain their computations, a higher cognitive skill than was demanded in my day)--he proves too much.  If it were necessary to teach everyone a skill to ensure the supply of programmers able to create machines in the future, we would presumably also teach the following:
  • Computation of decimal approximations of square roots, using the "two digits at a time" method found in old textbooks, or continued fractions, or the Babylonian method (sketched in code below).
  • Approximations of transcendental functions using Taylor series.
  • Approximations of trigonometric functions using matrix multiplication (faster and better for most angles, actually).
  • Approximations of transcendental functions using tables and linear interpolation.
But while there's a pedagogical value to each of these (my advanced students think that continued fractions are pretty cool, as any number theorist will attest), we just don't teach them anymore.  Why would we?  It's inconceivable that anyone would need to know these values accurately without a calculator, and while Kakaes is correct that many university math departments are stuffed full of old-fashioned mathematicians, even they use calculators (actually, Mathematica or Maple) to do these problems--and expect their students to do the same.  My point is just that we all agree that there's a line to be drawn between what math students should be able to do by hand and what they can (and should) use a calculator for--we're only arguing about where that line is.  
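For anyone who's never seen it, here's the Babylonian method from the list above as a few lines of Python (a sketch for positive inputs): guess, then repeatedly average the guess with n divided by the guess.

    def babylonian_sqrt(n, tolerance=1e-10):
        """Approximate sqrt(n) for n > 0 by repeated averaging."""
        guess = n if n >= 1 else 1.0
        while abs(guess * guess - n) > tolerance:
            guess = (guess + n / guess) / 2.0
        return guess

    print(round(babylonian_sqrt(2.0), 6))   # 1.414214
    print(round(babylonian_sqrt(10.0), 6))  # 3.162278

It converges startlingly fast--the number of correct digits roughly doubles each pass--which is part of why nobody computes square roots by hand anymore.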

That line is porous at best.  As Zalman Usiskin has pointed out (in his NCTM Yearbook article on technology), even paper-and-pencil algorithms are technology, every bit as much as computer software.  Old-fashioned long multiplication, as I've pictured it at right, is one:
This "killer app" version is fast and correct if you actually do it right--but many students find it hard to understand, hard to apply consistently, and--in practice--extremely inaccurate, because the most common errors (not shifting the second row over, for example) actually have huge effects on the results.  Other algorithms (partial products, estimation with corrections, etc.) are not as fast, and (sometimes) only produce approximate answers.  But students can actually understand and explain them, and apply them correctly.  And if you really need a completely accurate answer quickly, in a grocery store or at a worksite, wouldn't you do what I do -- that is, pull out the calculator on your phone?


Kakaes does raise some valid points.  Technology by itself isn't the sole indicator of high-quality math instruction:  there's lots of low-quality math instruction with technology (just as there's lots of low-quality math instruction without technology).  Promethean boards do not raise outcomes by themselves.  And (as Sugata Mitra says, in his argument for why technology can be transformative for the poorest children), for kids in affluent districts (which, by his standards, is much of the U.S.), the marginal impact of any given new technology might be quite low.  But as Zalman argues, the question is never "should we teach students to use technology?", but "which technologies should we teach students to use?"  That question--not this fake "can Johnny learn math with a calculator?" question--is where the discussion should start.


Update:  A version of this article is now a posted response on Slate!  

Monday, June 25, 2012

We Don't Need More Great Teachers

First, hats off to my friend Peter DeCraene, currently in D.C. for the Presidential Awards Recognition Program.  This post is kind of in his honor.

I've been doing a lot of thinking about what would improve outcomes, on a large scale, for the truly underserved kids from around the U.S.  I've had the good fortune to know many truly great teachers, and to have been part of a department that developed several of them (with more in the pipeline!).  One of my students once compared being in one of my classes to watching a perfect game in baseball, and--except for the "watching" part--I have to say I've had few nicer compliments.  But I've come to think that the emphasis on great teaching in our nation's current dialogue about the importance of education is at best unhelpful and at worst counterproductive.  We don't need more great teachers.

Visiting China, I was struck by how rigorous the mathematics is at the nation's best schools, but also--and I was really surprised by this--how decidedly vanilla the teaching is.  A typical classroom--at a top school--looks like this:


Notice:  fifty or so students listening to a lecture by one teacher at the front, on a chalk board (no technology), and--this doesn't come through on the still photo--little to no actual student input.  On this day, students were going over questions on a practice test for their regional end-of-year exams (already an activity I'm not sure I'd spend time on, certainly not with me doing the presenting), and for forty or so minutes, literally all that happened was that students were told how to solve problems they had gotten incorrect, and marked the correct methods in their test booklets.  But the problems were something else--deep, challenging, multidimensional.  For example, the problem below (from the same class's papers) asks about what happens when a trapezoid is folded into a solid:


It's almost inconceivable that even an honors geometry class in the U.S. would ask a question this complicated.

The disjunction here poses a real question: how is it that Chinese students get to the level where they can do and appreciate such challenging problems, without getting excellent instruction?  And after a lot of asking and thinking and soul-searching, I think I have the answer:  there are very few bad teachers.  The typical middle-to-strong Chinese student gets, so far as I can tell, much more consistent instruction than a similar student in the U.S.  Ask any student at my school--selective, very high-level--and you'll get the story of fifth grade, when they didn't really do any math in math class.  (Or sixth grade, or third grade, or whatever.)  So far as I can tell, this almost never happens in China.  In China, there are some good teachers, a lot of mediocre teachers, but almost no bad teachers.  Go into any Chinese math class on a given day, and my guess is, you'll see kids getting fair-to-middling teacher-led instruction in mathematics that is reasonably clear and factually correct.  I'm claiming that Chinese students--unlike my students--don't get told that "zero isn't even or odd, it's special" or that you can't subtract 7 from 3, or that "there's a formula for solving polynomials of degree five or higher, we just haven't found it yet."  And then this instruction is supported by a consistent experience of solving rich problems on homework and on tests.  Eleven consecutive years of this kind of solid, not particularly imaginative teaching produces literally tens of thousands of students who can tackle very challenging math problems--which is the point.

I've observed a lot of teachers and tried my hardest to help the teachers in my department improve their own practice, as I'm always trying to improve mine.  But I don't know what to tell a teacher to make them into another Peter, or John, or Ray, or Natalie, or even me.  I'm not sure it's possible to communicate to one person how they can become a great teacher, because one of the things about truly great teaching is that it's idiosyncratic: what John does has influenced me, but I can no more do John's teaching than I can do Groucho Marx's repartee.

On the other hand, I think it's possible, and probably not even that difficult, to delineate what it takes to be a reasonable, middle-of-the-road math teacher who produces a solid year of growth in the vast majority of his or her students.  If we could have more of those, we wouldn't have to play catch-up--which is hard even for terrific teachers, not to mention the mediocre ones.  Almost all of our students would, like the Chinese, finish eighth grade with some working knowledge of algebra and geometry--not just a collection of area formulas jumbled together--and the ability to tackle multi-step problems.  In high school, kids finishing trig would actually know enough trigonometry to apply it in precalculus and calculus, because they wouldn't have spent trig class relearning facts about functions, equations, and geometry that they should have learned a year or more previously.

I'll finish this rant post with a brief list of items I'm looking for in the next generation of mediocre teachers.  The expectations may not strike you as very high: but imagine what would happen if we could really expect them every day, every year from kindergarten through 12th grade.
  1. Except for testing days, each day's class has an objective: something students are to know, understand, or be able to do that they didn't know or understand, or weren't able to do nearly as well the previous day.  Content is not simply repeated from year to year or even day to day.
  2. The day's objective is clearly related to overall course goals, to local and national standards, and to what the students already know.
  3. Assessment is frequent and individual:  at least a couple of times per week, students' work is collected (or assessed in class) individually to find out what they know, to give them feedback on what they need to improve, and to adjust instruction.  Assessment tasks are nontrivial, especially on formative assessments.
  4. The mathematics presented each day is correct.
  5. The mathematics is presented each day in a reasonably logical order.  When asked, a teacher can explain the motivation for each step, not just what the step is.
  6. The time allocated to mathematics is spent actually doing mathematics, not graduation practice, watching a non-math movie, or taking a break.  (I don't make these up, but please don't ask me to name names.)
  7. The time allocated to mathematics is spent with the students either (a) doing mathematics, (b) listening to brief explanations about how to do mathematics, or (c) asking each other or the teacher questions about mathematics.  ["Will this be on the test?" is not a question about mathematics.]
I'm sure there are more ... leave them in the comments.

Tuesday, June 19, 2012

What if we held professional development workshops to the same standards as our classes?

Every so often, a kind parent says to me "My child really felt that every minute in your class was valuable."  Of course, I don't think that's literally true, but I'm glad that this family understood my most important goal:  to make every minute valuable, in fact totally crucial.  I believe that anything else is disrespectful. Think about it:  by law, students are not just asked, but compelled to be in my classroom for (46 minutes, 90 minutes, whatever) each day.  How can you justify forcing someone to be someplace where you then waste their time?

So I hold my classes to high standards:
  1. If everyone already knows it, we don't cover it.  If most but not everyone knows it, we don't cover it as a class;  I provide an opportunity to review or relearn the idea either as a pull-out, or as part of a larger task, or as one option among many activities.  If a few people know it, I give them something else to do while the rest of the class learns.
  2. I figure out ahead of time and at the time how many people can already do what I want them to do, and how well, so I can do item #1.
  3. I help students connect each day's lesson to course themes and to material from other courses (and also to real life).  I make sure they know why that day's lesson is important. 
  4. Class time is for work that can't be done at home: because it involves high-level problem-solving, demands that they share ideas, requires higher-level thinking that they can't do independently, or because they need guided practice or reinforcement that isn't available online or with a worksheet with answers.
  5. Class time is not for watching movies, reading, lecture, or even whole-class discussion, unless I expect ideas to build on each other, students to critique each other's ideas, etc.  In particular, we don't "report out" results unless there's something to do or discuss from the reports.  Time I spend talking is, as far as I can tell, mostly time wasted.
  6. When the assigned work is done, I always have more math for students to work on, so that the ones who get done early don't sit around getting bored.  This strategy also decreases the incentive for students to rush through the material without thinking carefully.
Items 4-6 can be summarized simply: class time is for doing mathematics, not for watching other people do mathematics.

Now let's turn to the typical professional development session:
  1. "Who here knows about Gardner's Multiple Intelligences? [or Bloom's Taxonomy, or the Common Core Standards for Mathematical Practice, or ... ]"  The teachers are all over the place, but it's hard to tell exactly what each teacher knows, because "Who knows about ____ ?" is not exactly a fine-grained assessment.
  2. "Everyone do this worksheet reviewing the different Intelligences [levels of Bloom's//Standards for Mathematical Practice//etc."  Now there's no opportunity for choice or differentiation.  When you're done, you just wait around until everyone else is finished.  There's no immediate followup task.
  3. "Let's watch this TED talk about ___ ".  Or:  "Read this article about ___ " I could have done this at home.  In fact, I love watching TED talks at home, so I'd be happier watching it at home and using the class time productively.  Also, what am I supposed to get out of the TED talk or reading?  Why not tell me up front?  Occasionally, the TED talk actually shows a process or strategy that would be hard to summarize, like this one by Dan Meyer.
  4. "Let me tell you about ... " What is my take-away?  What do I need to get out of this?  Could I just read what you're planning to say?  and then spend group time doing some task related to the take-away?
  5. "Well, we can wind up at many different places with this ... " Obviously, we're all professionals, and so it's hard to tell someone they're flat-out wrong.  But it is important to have standards and to communicate them clearly.  If the point of the activity is to rewrite a textbook activity to achieve a certain aim, and the proposed rewrite doesn't achieve it, then whom does it help to let the activity slide by?
In this area, I think we teachers are our own worst enemies.  In my classes, one norm is that everyone is wrong at least sometimes, and that correcting an error or misconception is an important job for everyone.  But how often do we sit in PD and watch someone say something that is clearly incorrect without challenging it?  Maybe one reason why in-school or departmental PD is more effective (at least for me) than inter-school PD is that we're only willing to challenge people we know well and trust.
Tony Wagner's article Rigor on Trial lists seven questions he poses to students during a lesson to assess the level of rigor; note #6 and #7.
  1. What is the purpose of this lesson?
  2. Why is this important to learn?
  3. In what ways am I challenged to think in this lesson?
  4. How will I apply, assess, or communicate what I've learned?
  5. How will I know how good my work is and how I can improve it?
  6. Do I feel respected by other students in this class?
  7. Do I feel respected by the teacher in this class?
He asks whether these questions could "be used as a set of standards for planning and assessing both adult and student learning across a district?"  It's hard to imagine how much things would change if they--and the other standards to which we hold our own classes--were implemented as basic principles of PD.

Update:  In this morning's PD, taking my own maxim to heart, I challenged a teacher who said that you have to go over every homework problem and every answer to in-class tasks.  I said that what I see is that when the teacher "goes over" problems and answers, the energy level and engagement drop dramatically, and that time spent going over homework is mostly wasted.  Immediately another teacher said "Where do you teach?  Oh, Payton."  I stuck to my guns, pointing out--as we've discussed on this blog--that no matter what high school we're at, if more than 20-30% of the students can't do a particular homework problem, then that problem probably wasn't appropriate for independent work, and that if that's the situation for many problems on the assignment, then the assignment itself was too hard.  But they'd already stopped listening....

Tuesday, May 15, 2012

ISEF Question: Do math contests decrease math research?

I'm way behind in my postings, so this is a quick one to get back in the groove ...

I'm at Intel ISEF, the world's largest HS math, science, and engineering fair, as part of a team from Chicago trying to increase the number of students doing math research in high school.  On Monday, I walked down the math aisle and talked to the five or six kids I found setting up their projects--cool ideas, like using fractal dimension to quantify the distinction between cancerous and noncancerous cells, or linking quadratic residues to the number of digits in the base b expansion of 1/p.  And I asked them three questions:  Do you do a lot of math contests?  Have you ever done a summer math program?  Are you part of a math circle?  All eighteen answers:  no.

Now I like to think that doing these extracurricular math activities makes kids more interested in math and more likely to investigate mathematical ideas on their own, but this makes me wonder.  Some hypotheses in search of more data...let me know what you think and I'll report back after more extensive conversations on Thursday.

  1. Maybe the kids who do math research are doing it because they don't have any other outlets for their math interest, as a sort of last resort.
  2. Maybe the kids who do lots of other math stuff simply don't have the time or energy to do math research, because the other math stuff they do consumes all that time and energy.
  3. Maybe the kids who do lots of other math stuff are also the kids driven (literally) to lots of other "Race to Nowhere" activities, so that they don't have time or energy to explore and play, not because of the math they do, but because of everything they do.
  4. Maybe the kids who are driven to do high-quality research are exactly the kinds of curious loners unlikely to be attracted to math contests and summer math programs (ugh! other people!) in the first place.
Hypotheses #3 and #4 are the most benign; #1 and #2 suggest--to my chagrin--that part of the reason more kids aren't doing the most authentic math--the "I'm wondering about..." kind of math--is that many of them are doing math contests instead.  And that seems oddly backward.

Thoughts?

== pjk

Sunday, April 15, 2012

Target group

To whom do you aim the problems you are using to teach the lesson?
This question assumes that you teach by asking interesting questions and allowing students to figure out the math. If you do not teach in this mode, I am not sure this question applies. I am also not sure I have much to offer, because I think good teaching starts by recognizing that our job is to ask interesting questions and to help students figure out the math behind the questions. I think a teacher helps by watching, listening and letting the students do the work. Telling a student how it works does not work.
But I have talked about that before. This idea is new, I think. P.J. reminded me of this yesterday so I thought I would write about it before he did.
The problem should almost always be aimed at the top group of kids, say the top 25%. There is a myth that says teach to the middle. Do that, and over half of your students learn nothing. The ones above the middle already know it so you are wasting their time.
The problem needs to be accessible to everyone, but difficult enough, challenging enough, that no one can just solve it. That means it needs to be designed so the middle kids and below always have to stretch a lot, while the best kids are still challenged.
One thing to consider is that the top 25% one day will be a different group of students another day. The problems should not be aimed at a specific student but rather at a specific level of challenge. Problem solving has to happen for every student every day, or students will not learn how to solve problems.
"Oh, but students will give up, because it will be too hard for them," you say. I say, you are with them, you are walking around watching them work, and listening. You can push them in the right direction if they have a good thought, and redirect them if they don't. If everyone is about to give up, you will know it, and you can immediately fix it by intervention.
Differentiation does not mean make it really easy. It means teach students how to think. Good teaching means helping students learn what to do when they don't know what to do. That can only happen face to face in your classroom.
It is our job.

Sunday, March 18, 2012

A True Story

Back in the year 2000, there was a lot of fuss about "Reform Calculus." The Advanced Placement committee had announced changes in the AP test, and there were several calculus books being offered that were quite different from the Thomas book that most high schools had been using for years and years. After much deliberation, Evanston decided to make the change, and we adopted the Ostebee Zorn calculus book for BC Calculus. There were two of us teaching the class that year: Ron Selke and myself. We approached the year with excitement and fear.

As we worked through chapter one with the students, we both learned a lot about how to make calculus meaningful and understandable to our students. We had decided to collaboratively write tests, and so we did. The first test covered the ideas in chapter 1 rather well, we thought, and we were eager to see how students performed.

To say the first test was a disaster would be an understatement. There was not one student who even tried to work all of the problems. Many students left three or four blank. Ron and I looked at the test, and it measured what we thought was important, but because of the difficulty it measured little or nothing and created considerable discontent among our students. And these were the best students we had. We adjusted the grading scale on the test, admitting that we had totally failed to create a fair test, and promised that we would do better for the next chapter.

In considering how to fix the problem, we had several ideas. One was to break the chapters into two tests. We rejected that because it would mean giving up too much instructional time for formal assessment; we would spend the year writing and grading tests. Another idea was to only test the easy stuff. We rejected that approach as not being in the best interests of our students. We were committed to making the class a rich mathematical experience that matched the wonderful way Ostebee and Zorn were allowing the course to unfold. Then one morning Ron came to me with one of the best ideas I have encountered. And I resisted at first. I offered reasons why it was a bad idea. After all of that, I agreed to try it. I have never looked back.

Ron's idea was to make a collaborative problem part of the test. Under the original plan, the day before the test we would give each student a collaborative problem. This collaborative problem would consist of several parts and would encompass the main ideas of the chapter. The problem would allow us to address some of the subtle concepts or more complicated aspects of the material covered in the chapter. Each student would be required to work with at least two other people, and each person would turn in one copy of the team's perfect, well-organized, well-written solution when the student came in to take the in-class part. It would count as about one seventh of the test grade.

After a couple of tries, we modified the conditions a bit. In particular: we gave students the collaborative problem several days before the test, and always so it would be in their hands over a weekend. We posted the collaborative problem on our websites to allow absent students to access it. We helped shy students find collaborators. We encouraged collaboration with students in the other sections of BC Calculus.

The collaborative problem turned into one of the best educational experiences in my career. Most of the work was correct, making the papers easy to grade. The in-class part was now manageable, but we were assessing all of the material. More importantly, students were learning mathematics while taking a test. What an amazing experience! Ron and I listened to their conversations as they worked the problems in the math lab, we heard them talking at the beginning of class, and we flat-out asked them about their experiences with the collaborative problem. All of what we heard was exciting. There was an outcry when we did not offer a collaborative problem for a test we gave on a half chapter. We had found a way to help them consolidate their ideas before they took the in-class part, so the in-class tests were also done better.

We began to notice that the collaborative groups became entities in themselves. The students started to get together just to study. Some of them met during a common free period every day in the math lab or the cafeteria and went over homework questions. Parents praised us for the learning they saw taking place in their homes as students gathered. Collaborative groups compared results with other collaborative groups. Students made friends and learned that learning is not an isolated activity.

I also taught a class in Multivariable Calculus and another in Linear Algebra for those students who had finished BC Calculus but had not yet graduated. There was a clamor for a collaborative problem in those classes, so I happily agreed to their demands.

The second year, we realized that we did not have to rewrite the collaborative problems except to fix questions that didn't quite go where we thought they would. By now these questions are established. The students understand their value as learning tools, and so there is little evidence that they are looking at old tests.

One moral to the story: when you try something new, it rarely works the way you intended. But the thing to do is not to throw it out--then you wind up doing the same things you were dissatisfied with before, the very things that led you to make the change. Rather, you need to identify what's not working and fix that piece. By iterating several times, you come up with a new strategy that does accomplish your goals--and even gets you places you hadn't realized you wanted to be!


By the way, our students performed better than ever on the AP test, and they have ever since. It is nice when one's observations are validated by an outside source.

Monday, February 6, 2012

Grading and Formulas

John, I agree with everything you said, except that I have found that grading eventually just wears me down.

An idea I took from George Milauskas takes the "only two points for a correct answer" idea one step further.  George would give his students "magic dots" (= the circles punched out from paper with a 3-hole punch), which they could redeem--at the cost of a single point--for any formula required on a test.  That is, a student could ask "What's the midpoint formula?" (even though requests for that particular formula make me cringe) and get it, for a mere one point.  George's reasoning, which persuaded me instantly, was as follows:

  1. If a student just writes down the correct formula--but does no other work--he or she will usually get one point of "partial" credit.  Most of the problem consists in using or thinking about the formula, not regurgitating it.  (That leads to a whole nother conversation, about problems that are just about formulas.)
  2. Giving a student the formula allows the student to demonstrate what he or she can do with it.
  3. Leaving the student stranded without the formula means that he or she can't demonstrate anything.
Thus, giving the student the correct formula for a one-point deduction allows me to assess what else he or she knows and can do with it in a fair and reasonable way.
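To put rough numbers on that trade, here is a toy sketch in Python; the 12-point problem and the one-point value of bare formula recall are my hypotheticals, not George's.

    # A hypothetical 12-point problem, scored under the magic-dot policy.
    FULL_CREDIT = 12
    DOT_COST = 1  # a magic dot plus a one-point deduction buys the formula

    recall_only = 1                    # writes the formula, does nothing with it
    stranded = 0                       # forgets the formula, can't start: nothing to assess
    best_after_buying = FULL_CREDIT - DOT_COST  # buys the formula, works it perfectly

    print(recall_only, stranded, best_after_buying)  # 1 0 11

The dot costs about what bare recall was worth anyway, so the deduction leaves everything except the memorization available to be assessed.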

Before you ask how they do on "real" tests outside my class, remember that AP, ACT, SAT all include lists of common formulas.  So my students have done just fine.

John, thanks for reminding us how important this work is.

== pjk

Monday, January 30, 2012

Grading Papers

Without exception, retired teachers I have spoken to agree that the best part of retirement is not having to grade papers, especially on Sunday night. And yet there was something very satisfying and rewarding about looking at student work in detail. Grading a student's paper was like spending a few minutes inside of that student's mind. It took a lot of time and it was hard, but carefully grading papers was an important part of the teaching and learning experience.

My last blog was about how grades influence the learning experience. This one is about how the details of arriving at one of those grades are equally important. Papers must be graded fairly. A graded paper should contribute to a child's education in a meaningful way, regardless of the grade attached to it. I would like to share some of my thoughts about the process of grading student work.

At the top of every paper I intend to grade is the sentence:

"You must show enough work so that I can reproduce your results."

I have found that this phrase solves a lot of practical problems about how much work a student needs to show. It also helps when a student solves a problem in an unexpected way. It allows a student to use technology intelligently, as long as I am given enough information to get the same result using the same technology. It enables me to effectively evaluate the error a student has made and give the appropriate amount of credit for the work.

This leads to another aspect of grading a student's work, one that developed gradually over the thousands of problems I graded. I give points for correct mathematics. I do not take off points. When a student looks at the number of points earned for a particular problem, the student will see a +10, not a -2. The student got the 10 for doing several things correctly that would have led to the correct answer. Unfortunately, the student made an arithmetic error when computing part of the answer and so did not earn the 2 points allotted for determining the correct answer. It should be noted that one consequence of this grading policy is that a bald answer without supporting work will get 2 of 12 possible points.
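For readers who like to see the bookkeeping, here is a minimal sketch in Python of this positive-credit scoring; the rubric items and their point values are invented for illustration, with 2 of the 12 points reserved for the correct final answer.

    # Positive-credit scoring: points are added for each rubric item done
    # correctly; nothing is ever subtracted. (The rubric is hypothetical.)
    RUBRIC = {
        "sets up the equation": 4,
        "solves for the unknown": 4,
        "checks the result in context": 2,
        "correct final answer": 2,
    }  # 12 points total

    def score(correct_items):
        """Sum the credit for the rubric items the student got right."""
        return sum(RUBRIC[item] for item in correct_items)

    # Sound process, arithmetic slip at the end: the paper shows +10, not -2.
    print(score(["sets up the equation", "solves for the unknown",
                 "checks the result in context"]))   # 10

    # A bald answer with no supporting work earns 2 of the 12 points.
    print(score(["correct final answer"]))           # 2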

We are teaching mathematics. We are assessing the quality of mathematical reasoning a student is capable of. That means we need to see the process used to arrive at the solution, and it is the process we are evaluating. It is inappropriate and short-sighted to require students to use the process we expect, but we cannot evaluate what we cannot see. So, if a student uses a guess-and-check method, I need to see the guesses and the checks. If the student uses intuition and evaluation, I need to have the intuition explained, and I need to see the evaluation.

Consequently, it takes a long time to grade a set of tests. The effects, however, make it worth the effort. I have learned a lot of mathematics by following the work of a student who took an unexpected path. But more importantly, grading lots of papers teaches the grader what sort of misunderstandings students have, and that in turn enables the grader to find ways to eliminate those errors next time around. At the very least, one learns to warn students about a mistake they often make on a particular kind of problem. At most, the teacher can modify the problems used to teach the concept so the class makes the mistakes early on, exposing and hopefully eliminating the potential for those mistakes to occur.

Another consequence of grading this way is that the grader learns a lot about the particular habits of mind of the student. This information in turn is very helpful when the student or the parent wishes to know what can be done to improve. The teacher is aware of a lack of organization, or not checking work, or poor computation skills, or careless marking of a diagram, or any of the other habits that interfere with success, and can communicate those habits to the student and parent.

Comments to the student congratulating a clever move will have more impact than criticism about a bad move. A teacher can take the opportunity to point out specifically to a student what might have resolved the error. The feedback is up-close and personal and has impact.

But perhaps the most important aspect of grading papers this way is that it conveys a sense of value to students. It implicitly tells them that the important part of mathematical work is the process. They will get points for failed attempts if those attempts are appropriate and reasonable. They will get more points for a clever observation than for remembering a few steps from a previous problem. The students will learn that mathematics is about logical reasoning and making connections more than mathematics is about remembering rules and following them carefully. And that is a big deal.

Wednesday, January 11, 2012

Grades

Grades are the elephant in the room when it comes to learning and teaching. They are an ever-present part of the relationship between teacher and student in most educational situations. Grades can interfere with learning if students believe that no matter how hard they work, their grade will not improve, or if students believe that they can get an A without doing much work. Grades can destroy that relationship if students think the teacher is unfair. Grades consume a large percentage of a teacher's time.

Perhaps the worst part is that grades frequently become the goal and overshadow learning. Teachers sometimes use grades to coerce students to comply. Teachers do things like not giving credit for math work done in pen, even if the work is exemplary. Students use grades as an excuse for not doing work. Students will not investigate a problem further because they know it will not be tested.

And yet, grades persist. I think they can play an important role in education. Students and parents need feedback about their accomplishments, they need advice about how to improve, and they benefit from legitimate praise and criticism.

Allow me to share two very personal experiences that are vivid memories of my grade school days. For two years in a row, I was told by the music teacher that I could not sing. (She was right.) I was then instructed to mouth the words as the other children sang. As I grew up, I realized that I love music, but I can't sing, so I don't try. I have been told by several people that anyone can learn to sing. I don't believe them, because I was told at a very young age that if I sang, it would interfere with the singing process in class. I can't do singing. The second experience happened in eighth grade when my social studies teacher was trying to explain inflation. I raised my hand and asked a question about the consequences of inflation. He looked at me and said, "You must be really good at math." Fifty years later, I remember that moment in class; I have devoted my life's work to mathematics.

Both teachers were assessing my work. Neither assessment had anything to do with grades. My grades in school were never very good because I frequently didn't comply with the teacher's wishes about how to do the work (I really liked doing math with a pen, for example). But I did learn and so consider myself to have had a good education in spite of all those C grades.

When I started teaching, I had the good fortune to be in situations where I had the freedom to decide how grades were going to be given. I spent a lot of time thinking about it, tried many plans, and eventually hit on one or two that worked. I think my grading schemes helped me become a better teacher. I would like to share some of my thoughts about grades in the next few blogs.

I have always thought that in order to earn a good grade, a student should demonstrate knowledge of the subject and the ability to apply that knowledge in a variety of situations. That means that each assessment should include some routine exercises to see if the student has learned the basic material--problems right out of the book with different numbers. The student should also be expected to do problems similar to some of the really hard problems that we did in class. And the student who wishes to earn an A ought to demonstrate the ability to apply the information from the unit, as well as from the entire course studied so far, to a new situation. So, my tests are usually about 50% routine problems, 25% difficult problems that are similar to problems they have worked on, and 25% original problems. I give them one period to work on the problems unless they are legally entitled to more time. I carefully look at their work and give credit for correct mathematics relevant to the problem.

Solving a difficult problem takes time, and there is often a certain amount of luck involved. A promising approach may lead to a dead end through no fault of the problem solver, while an equally promising approach may work just right. If we intend to assess our students' success as problem solvers, we must ask them to solve problems on tests, not just do exercises. That in turn influences how we associate a grade with the work done.

I would like to know who decided that 95% was the benchmark for excellent work, and what sort of work they were thinking of. Nothing I can think of that is reasonably difficult can be done correctly 95% of the time. The best baseball players who ever lived were considered successful if they got on base 40% of the time. Most players don't even come close, because hitting a baseball is very hard to do. A 30% success rate is outstanding.

One of the national standards of excellence in the U.S., the Advanced Placement test, gives only five grades: 5, 4, 3, 2, or 1. In order to get a 5, a student needs to get approximately 72% of the test correct. That level of excellence will often earn college credit for the course in question.

A score of 100 or more out of 150 is considered outstanding on the National Mathematics Exam offered by the Mathematical Association of America every year. That score qualifies a student to move to the next level of competition and often means that the student was in the top 1% of students taking the exam.

P.J. reminded me about Dr. Paul Sally's rubric: "If you're getting 50%, you're doing well." Dr. Sally taught Honors Analysis at the University of Chicago. No one ever accused Dr. Sally of having low standards.

That brings me to another thought about grades. For my first two years, I computed grades two ways. I kept track of total points earned by students, and I also assigned a letter grade to each assessment and then used the letter grades to determine a final grade. It became apparent that the letter-grade method was far superior in two respects. First, students always seemed to have a feeling for where they stood. Second and more important, the letter system was fairer: the grade was less influenced by a really bad test, and it allowed me to assign points to problems without regard to making the total come out to a pre-specified number.
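A quick sketch in Python, with invented scores on the usual 90-80-70-60 scale, shows the second point concretely: one disastrous test drops the points average a full letter, while the letter-by-letter average stays a B.

    # Two ways to combine the same five test scores into a final grade.
    # The scores are invented; the scale is the usual 90-80-70-60.
    scores = [95, 94, 20, 96, 93]  # four A's and one disaster

    def letter_value(score):
        """Map a percent score to a 4-point letter value (A=4 ... F=0)."""
        for cutoff, value in [(90, 4), (80, 3), (70, 2), (60, 1)]:
            if score >= cutoff:
                return value
        return 0

    # Method 1: total points. The single 20 drags the average to 79.6 -- a C.
    points_average = sum(scores) / len(scores)

    # Method 2: letter each test, then average the letter values.
    # 4, 4, 0, 4, 4 averages to 3.2, which is still a B.
    letter_average = sum(letter_value(s) for s in scores) / len(scores)

    print(points_average, letter_average)  # 79.6 3.2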

Another principle I followed without exception: every problem was worth the same number of points. I didn't want students to have to worry about how much time to spend on this one or that one because one problem was worth more points than another. I did want my students to look the problems over and work the ones they were most confident about first. Since I established the cutoff points myself, this was a very good and fair policy.

The point is that the elephant is there, and it matters. Grades influence our effectiveness as teachers, and we must spend considerable time and effort working out systems that enhance our teaching, emphasize the things we think are important, inform parents and students about the quality of their work, and are even-handed and fair.

Next time I will share with you some specific things that worked for me. Until then, please reflect on the grading policy that you are using and how it alters your ability to teach mathematics. No matter what you think, it does make a difference.

More later.