One Year of Specs Grading: the Good, the Bad, and the Ugly

Spring semester 2023 is in the books. It actually has been in the books for a few days, though I have spent the time since working on wrapping up its ragged ends.

In truth, this was a second consecutive brutally difficult semester, making the 2022/23 school year one of the most difficult of my career. In addition to a series of crises external to what happened in the classroom, I was also teaching three new classes: upper level surveys on Ancient Rome and Persia, and a first year seminar. The demographics of that first year seminar were particularly challenging such that I had to functionally re-write the syllabus midway through the semester and I had one student so difficult that I came to dread walking into that classroom.

I wrestled the semester into submission eventually. My Persia class might be my favorite class I have ever taught, in large part because of the mix of students, and my revised first year seminar syllabus along with a slightly different approach to discussion allowed my students to pick up on the themes and skills that are most important for the course. In each of these three classes I was also able to build enough trust with the students that we were able to largely weather the techpocalypse ransomware attack that took down the network two weeks before the end of the semester. The outpouring of comments from students in the last few weeks was enormously moving, but I also want to recognize how hard I had to work to get there.

However, by way of semester retrospective, I want to focus on one academic year using Specifications Grading. I adopted this system because it promised to make my life easier, and my spring changes, like an UnGrading system to assess participation and taking attendance every day, worked, but, one year in, I am left wondering whether a specs model is the right fit for most of my classes.

The Good

My favorite part of specs grading is not assigning grades to assignments. The obsession with grades is deeply rooted in students, but grades themselves are often a poor match for learning. Specifications, by contrast, clearly establish my expectations and, at least in theory, give the students guidance on how they can earn credit for an assignment. This is still a form of grading, but the expectations provide a framework within which the students can learn and my feedback can focus on whether the student has met the expectation for that assignment. Moreover, the grades are earned across categories, meaning that the students have to engage with each part of the course and the clear expectations for each grade tier can allow students to prioritize their efforts if, for instance, they have met the requirements for their target grade in my class and need to focus instead on passing a different one.

Moreover, by modifying the expectations up or down for either the overall grades or for individual assignments I can adjust what my expectations are for the students. Thus, when our tech issues struck, I could easily fulfill every learning objective and still lower the expectations for several graded categories in my classes, much to the relief of my students.

I particularly found specifications grading effective for relatively small, repeated assignments like journals where partial credit is particularly arbitrary and missing the rubric on one or two assignments both teaches an important lesson about following the assignment guide and has a relatively minimal overall effect on the final grade. Whether or not I continue with Specifications Grading as an overarching grading scheme, I will definitely carry these aspects forward into what comes next.

The Bad

More of my students this semester than in the fall term seemed to embrace the spirit of the specs grading and understood how the grade tiers worked, but this still left me with some students who struggled to see the connection between the work that they were completing and the grade tiers in the syllabus. A couple of these were unique cases with a confluence of circumstances, but others were more persistent and connected to another issue that frustrated me last semester.

One of the keys to Specifications Grading is transparency. Every assignment guide came with a detailed rubric that spelled out exactly how to earn credit for that assignment. These rubrics were prescriptive in that they articulated the formal characteristics that I was grading on, but they were deliberately open-ended so that the students could work within the guardrails to express themselves. For instance, the journal assignment specified a length, a mandate to include a date, title, and word count, and a set of prompts like “what was the most interesting thing you learned from class this week” or “how would something you learned this week change a paper you wrote earlier in the semester.” For responses to a class movie, the rubric might be that you need to answer each question with at least 2 complete sentences appropriate for the movie.

However, I often got the sense that the students weren’t checking their work against the rubric before submitting it. In the small repeated assignments, being told once or twice that an assignment wasn’t accepted put the students back on the right track, but then in some of these cases the students would trip up in exactly the same way on the next assignment.

Even more worrying was that this also happened on bigger assignments like papers where students turned in sometimes two or more drafts that seemed to rely on little more than hope that it fulfilled the rubric, even after having the students use this exact rubric for the purposes of peer review. I allow students to revise their papers both as a matter of praxis for teaching writing and because not doing so would be too draconian a policy for a specs system (see below), but nevertheless getting rounds of papers that simply ignored the guidelines, and, in at least one case, introduced new ways that the paper missed the rubric on revision, made me ask in frustration why I provide the rubrics in the first place.

But for all of these frustrations, these are not the reasons I’m considering whether to keep a specifications model or adopt some sort of hybrid system.

The Ugly

Two semesters into using Specifications Grading, my biggest question is whether it is a good match for writing-enhanced classes.

I really like the rubric I designed for grading essays in this system. Unlike most specs rubrics that use a proficient/not-proficient binary, my rubric has two “pass” tiers, one for basic proficiency and another for advanced. The advanced tier I calibrated at roughly a low-A. Earning a C in this course required revising one of three papers to the advanced tier and just the first tier for the other two, a B required revising two, and the A required all three.
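For readers who think in code, the paper requirements above can be sketched as a small lookup. This is only an illustration of the paper category: the names `Tier` and `paper_grade` are hypothetical, and treating any paper below the basic tier as falling short of a C is my reading of the description rather than a stated rule. What happens below a C is not specified here, so the sketch returns `None` in that case.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Outcome of one paper on the two-tier essay rubric (names are illustrative)."""
    NOT_YET = 0   # has not met the basic-proficiency tier
    BASIC = 1     # meets the first "pass" tier
    ADVANCED = 2  # meets the advanced tier, calibrated at roughly a low A


def paper_grade(tiers):
    """Return the letter grade supported by three paper outcomes, or None.

    Assumption: every paper must reach at least BASIC for any of the
    C/B/A tiers; grades below C are not described, so they map to None.
    """
    if len(tiers) != 3:
        raise ValueError("the scheme described uses exactly three papers")
    if any(t < Tier.BASIC for t in tiers):
        return None
    advanced_count = sum(1 for t in tiers if t == Tier.ADVANCED)
    # One advanced paper -> C, two -> B, three -> A.
    return {1: "C", 2: "B", 3: "A"}.get(advanced_count)
```

In the full scheme the final grade also depends on the other assignment categories, so a function like this would give only the ceiling set by the papers.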

Despite the promises of specs grading, I have not found that this system saves me any time at all, especially when grading papers on the learning management system, which I do as a matter of equity (e.g. costs of printing), scheduling (e.g. not having things due at class time), and convenience (e.g. I can toggle between versions). Simply put, I found that a lot of students would not be able to write well enough to fulfill the advanced tier of the rubric on one paper, let alone three. Even when they looked at the scored rubric, which was not always the case, I felt like I had to give lots of direct and actionable feedback in the paper itself, in the rubric comments, and in the summary comments on the paper. Otherwise, I feared, the students might not be able to make the connections between whatever they wrote and the rubric scores.

Let me be clear here: the system works. As I told my students, my goal at this point in their college career is to help build good writing skills and habits so that every student knows that they can revise a (relatively short) paper to a high quality before they get to the two research-centric classes that they take in their junior and senior year. I am also comfortable with the rubric calibration because each semester I had a few students who fulfilled the rubric with no or minimal revisions to their paper, and nearly every student improved dramatically from the start of the semester to the end.

But there were also some days when I felt like I was dragging two classes’ worth of students (46, at final count) toward writing proficiency, on top of being responsible for the course content, two tag-along non-WE sections of these courses (6 students), and the first year seminar. It was a lot. Having two sections of this process of course magnified all of the issues, but it also left me wondering whether continuing down this path toward completely spec-ified writing-enhanced courses is sustainable. I don’t relish the prospect of going back to traditional points-based grading either, which makes me wonder if I can imagine some sort of hybrid grading scheme that does what I want it to do.

Specsitol: a semester reflection

I submitted grades a little over a week ago and promptly withdrew, exhausted, into a little fort with curtain walls made of novels. At least that’s what it felt like. The specific details might be exaggerated.

Several times I tried to break through the fog that had settled over my mind, but succeeded only in producing a silly post about pizza TV shows and the weekly varia post that I start compiling as soon as the previous week’s goes up. I could barely think about the semester that had just ended, let alone put those thoughts into any sort of coherent discussion.

Simply put, I had an exceptionally difficult semester, and one that rates among the very toughest I have ever experienced. Some issues stemmed from causes external to my classes (e.g. not getting some much-needed rest this summer and early semester indexing and proofing a book manuscript that put me perpetually behind), while others stemmed from things that happened in the classes, most of which I don’t want to talk about in this space because I don’t like talking about specific student activity in a public forum even when identifying details have been redacted, especially when there is nothing of universal value that can be gleaned by doing so.

Not every problem stemmed from these issues, of course.

Dissatisfied with traditional forms of grading, I dove headlong into the world of Specifications Grading for most of my courses this semester. To stick with the metaphor, I liked these waters but they also sent me crashing into the rocks.

The formula varied a little bit, class by class, but, in general I came up with a system where the students earned credit across four or five categories of assignments (e.g. journal entries, small assignments/participation, papers). Every assignment was graded using a bespoke rubric and it either met the standard and thus earned credit, or it did not. More work, and higher quality essays (the essay rubric had two tiers, one for basic competency and another for advanced) earned higher grades. To meet these higher standards, I allotted virtual tokens that the students could use either to revise their papers or turn work in late, pegging the number of tokens to the number of papers.

I entered the semester thinking that I had worked out a reasonably simple system that would give students the agency to decide what grade they were aiming for, make my expectations for each grade level clear, and provide in-semester flexibility that would allow students to do their best work. However, I had not anticipated that putting these assignments and expectations up front in the course would lead to cognitive overload for a significant number of students. In fact, I had a conversation in the final week of class with a student who said that this semester was much harder than the course they had taken the semester before even though the workload in the two courses was identical except that I had swapped one short weekly assignment for another. While there are other explanations why this student might have struggled with my course, I’m inclined to take the sentiment at face value because I saw evidence of the same struggle from other students who were struggling to interface with the information that I had provided in a way that made it harder to complete the work itself.

The core of this problem, I think, is that many students were used to traditional grading schemes that allow students to muddle through to a passing grade without too much effort. By contrast, the system I devised required students to complete assignments in each category to a specified level in order to earn the grade. Passing my general education courses last semester did not require too much work, unless you simply neglected a graded category.

I am treating this as a messaging problem for now. Traditional grading schemes remain stupid and I’m not ready to abandon my attempt to find something better just yet.

However, the issue of students neglecting grade categories dovetailed with the tokens and flexible deadlines to create absolute chaos on my end. Here there were several intertwined issues.

Several semesters ago I developed a system for deadlines where students could receive an automatic extension by filling out a Google form before the due date. This policy has proven incredibly popular with my students. However, while I intend to keep it intact in some form, I am starting to question whether the system is having the intended effect. Rather than providing students the space to do their best work, I am finding that whatever grace I provide is filled by other classes with stricter deadlines such that my students wind up writing their papers at the last minute anyway, just several days later, and I had so many students taking the extension that it became a challenge to return papers in a timely fashion.

However, it was the tokens that turned this semester into a logistical nightmare. I set up the tokens anticipating that most would be used for revisions, knowing full well that revisions coming in at any point would cause some chaos. What I did not anticipate is that some portion of students would use most or all of their tokens to turn work in late. This meant that I had not only revisions, but also new work being turned in on no particular schedule throughout the semester, and I had difficulty keeping tabs on students who hadn’t turned in assignments, some of whom I knew were working on things and some of whom I did not.

Compounding these issues was, I think, a consequence of having a significant number of first year students. Anecdotally, from talking with friends who teach in high school, some students have been conditioned to think that flexible deadlines and the like mean that an assignment is optional. Or that whatever make-up assignment gets offered will be easier than the original assignment. As one explained:

“I’ll allow X to be redone/revised/resubmitted” is increasingly being taken as “I don’t need to do X, I’ll do the makeup Y later which will be easier anyway.”

This was obviously not what had been intended, but this collision of expectations and conditioning meant that I spent a significant amount of time amid the chaos of trying to grade everything just trying to track down missing work so that the students wouldn’t fail on those grounds. Oh, and I had 50% more students than I had in either semester last year.

Then there was the grading itself. I adopted a specifications system because it promised to offload some portion of the grading onto explicit rubrics where I could check the appropriate box. I loved not assigning grades to papers, but I quickly discovered several things that meant the system created just as much work as the mystery black box of traditional grading, if not more. The issues started because, I discovered, many students simply did not complete the assignments with the rubrics in mind and did not use the rubrics to check the work before submission. This meant that I often received work that did not fulfill the simplest rubrics.

These problems were particularly acute on the written assignments with their long, detailed rubric that should have provided guidance for the papers. I quickly realized that many of my students did not have the writing background to achieve the higher proficiencies, so simply checking the rubric box was not going to provide adequate guidance or encouragement. At the same time, while some students were not going to be aspiring to those grade tiers, I also couldn’t in good conscience provide detailed feedback for some students and not for others until the very end of the semester when the possibilities of revision had passed. By the last two weeks of the term it was clear that I would not be able to get caught up, so I offered that any student who wanted to revise their work could come to office hours and have their paper(s) marked in person so that they could receive feedback on how to meet the next tier. These meetings meant that (I think) any student aiming for higher grade tiers reached them, but they also meant that those weeks were a whirlwind of paper conferences.

Finally, my small assignments policy put a cherry on top of this disaster sundae.

The policy was simple. There were some number of small papers, in-class activities, exit-tickets, one-minute essays, and other activities that took place in class. If you weren’t there, you couldn’t make up the work. Unless you were an athlete at a competition. Or you got sick. Or had other “excused” absences. Right from the start, I found myself litigating what counts as a legitimate absence, which is one of my least favorite parts about taking attendance. Then, like with non-completion of work, I found myself around the middle of the semester worried about the number of students who seemed liable to fail (or otherwise drop grade tiers) because they had failed to adequately participate in the class. Since the opportunities for these points often did not come at regular intervals, I found myself inventing “optional extra” opportunities that would allow the students to bring their grade in that category up, which, in turn, created confusion about what assignments students actually needed to complete. Often, the students who completed the optional assignments were not the ones I had in mind when I created them. And, of course, adding all of these small assignments created a flurry of paperwork that I had to manage.

Chaos.

I should point out that for a non-negligible percentage of my students this system worked exactly as I envisioned, giving them agency to achieve grades based on their goals for the semester. Had I not felt compelled to give the students aiming for the “C” the same level of feedback I gave to those aiming for an “A,” my grading might have even been manageable—but, of course, almost everyone said that they were aiming for an “A” back in August.

I am not ready to abandon this grading mode just yet, but it needs to be modified in critical ways for it to become sustainable and productive. The changes I have in mind to this point are:

  • Streamline my messaging and expectations. This means not only being clear about my expectations in terms of earning credit across multiple categories, but also clarifying that this is a labor-based grading scheme. It is designed to be transparent and achievable, but not necessarily easy. At the same time…
  • I want to submerge the mechanics of the participation grade. Some of the chaos this semester was created by the various points that students earned for doing in-class activities, which meant that this was something I had to track. I am not planning to change the activities that I do for small assignments, but my current thought for this category is to take a page out of the “ungrading” playbook. Instead of me assigning grades, the students will complete three reflections, one at the start, one at the middle, and one at the end of the term. The first one will set expectations and think about where they are at the start of the course. The middle and final reflections will both have the students assign themselves a percentile grade for their own engagement with the course material. I will then plug the final percentile grade into a formula that adds or subtracts points based on attendance and maybe what percentage of small assignments they complete, where perfect participation and attendance adds to the score, a range results in no change, and excessive missed classes and activities results in lost points. I see a number of ways that this could go horribly wrong and I’m still working out the kinks, but it would also relieve the demand for me to track so many different assignments or create “optional” work.
  • I am going to rewrite the longer rubrics both to make them easier to follow and so that the students can explicitly use them as checklists. Similarly, I am going to print these rubrics and distribute them directly to my students.
  • Ditto for handouts on things like writing. I provide a lot of resources for the skills that I ask the students to master in these classes, but I find that even when directing students to them via presentation in front of the class, they are not being used because most students forget that they are there. I remember sticking handouts into my backpack never to be seen again, but at least having been handed a physical copy of something might help jog memories.
  • I am changing the token system. Tokens will only be used for turning in assignments late and probably limited to just 2, with a reward to the participation grade for every token left unused. Revision will be limited to the papers, but allowed for every paper, albeit probably with firmer deadlines for when a first round of revisions needs to be complete.
  • Since none of this addresses how much time I spent responding to individual papers this semester, I am also likely going to lean more heavily on the language in the rubric and invite students looking to revise their papers to higher levels of achievement to come for conferences earlier in the semester.
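
As a rough sketch only, the participation formula described in the second bullet might look something like the function below. Every number in it is a placeholder: the bonus, the penalty, and the attendance cutoff for the "no change" range are hypothetical values chosen for illustration, since the list above does not fix any of them, and the function name `participation_score` is likewise invented. It also uses attendance alone and leaves out both the small-assignment completion rate and the reward for unused tokens.

```python
def participation_score(self_assessed, attendance_rate, *,
                        bonus=5.0, penalty=10.0, neutral_floor=0.8):
    """Adjust a student's self-assessed participation grade by attendance.

    self_assessed   -- the grade (0-100) from the student's final reflection
    attendance_rate -- fraction of class meetings attended (0.0-1.0)
    bonus, penalty, neutral_floor -- HYPOTHETICAL values, not from the post
    """
    if not 0 <= self_assessed <= 100:
        raise ValueError("self_assessed must be between 0 and 100")
    if not 0 <= attendance_rate <= 1:
        raise ValueError("attendance_rate must be between 0 and 1")

    if attendance_rate >= 1.0:
        adjustment = bonus          # perfect attendance adds to the score
    elif attendance_rate >= neutral_floor:
        adjustment = 0.0            # the middle range results in no change
    else:
        adjustment = -penalty       # excessive absences lose points

    # Keep the result on the same 0-100 scale as the self-assessment.
    return max(0.0, min(100.0, self_assessed + adjustment))
```

Writing it out this way shows how much still has to be decided before the syllabus can explain it: where the neutral range starts, how large the adjustments are, and whether small-assignment completion gets its own term.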

Looking over these changes, there are still parts of this system I am concerned about. The ungrading formula, for instance, is an awkward beast to explain in the syllabus and it could lead to uncertainty about how the various non-paper assignments contribute to their grade. But I also think that there is a real possibility that these changes might be able to preserve what I liked about last semester while also steering into the sorts of written and metacognitive exercises that I find particularly valuable for students in a way that will make it a more sustainable and productive learning environment for everyone involved.