Grade Inflation ... Why It's a Nightmare*

by Jonathan Dresner

Mr. Dresner is Assistant Professor of East Asian History at the University of Hawai'i at Hilo, and a member of HNN's group history blog, Cliopatria.

My institution successfully passed through accreditation review, a multi-year process which examines everything from physical plant to institutional identity, missions and standards, goals and how we measure progress towards those goals, governance structures, faculty culture, and every other thing they can think of. The reviewers' initial recommendations included strong support and encouragement for student learning assessment, more effective coordination (i.e., centralization) of governance to speed up the process of improving educational effectiveness, and data-driven allocation of resources. It was during their last visit that I realized that there is a connection between grade inflation, accrediting agencies and the drive for standardized curricula and quantitative learning assessment.

"Learning Assessment" is one of the hottest topics in educational administration, under the rubric of the No Child Left Behind legislation mandated testing. It is also making its way into higher education, starting with public institutions' core education courses, but departments are being called upon to monitor and document the progress their majors make at upper levels as well. My department is engaged in setting up assessment for our courses and majors at all levels, not because we believe in it, but because our chair is a savvy and forward-thinking veteran. We developed quantitative learning assessment for our World History surveys. We are working on a system for on-line portfolios for history majors, and we videotape our senior symposium. We've talked about ways of assessing upper-division courses. It may seem odd for us to do this, when we don't see much point, but my chair is right: accrediting agencies and department reviewers consider these exercises "state of the art" and without them we will undoubtedly come in for criticism.

Our forward-thinking approach made us one of the stars of our accreditation reports; we were held up as a model department. Now I do think that my department is a good one: we take teaching and mentoring seriously, we do research which people in our respective fields find interesting, and we work together quite well. Our average grades are well below those of most humanities departments, below even those of most social science departments, and compare well with those in the natural sciences, so grade inflation is not a pressing issue with us. Faculty from other departments, employers in the community and graduate schools all seem quite satisfied with our majors, and our major numbers and enrollments have been rising for about four years. Still, we're 'assessing' everything, and I'm inclined to think that my chair is right about the need to do this sooner rather than later, on our own schedule and in our own way instead of waiting for mandates and deadlines from above.

Because those mandates and deadlines will come. When is your next departmental review? When is accreditation renewal? Find out now, and plan accordingly. This is a multi-year process: it took us a year's worth of biweekly meetings to work out our World History instrument; it'll take months to get our on-line portfolios up and running; then we have to gather at least four or five years of data before we have enough to take a good look at. Maybe we could have done it more quickly if we had borrowed ideas and tests and such from other departments? We did borrow, and read and learn, but our program -- like yours -- has quirks and particularities that required us to tinker with what we borrowed.

What's the connection between grade inflation, accreditation and review, and assessment? Grade inflation (and its primary/secondary equivalent, social promotion) has made grades and advancement difficult to rely on as a measure of academic success. Stakeholders are looking for alternative ways to gauge the quality of our product, and tools to aid and inspire us to more effective teaching. Since the institutions themselves have not committed to a solution, governing bodies, including accreditation agencies and government, are seeking to impose one. For primary and secondary education, this has come in the form of high-stakes testing, including NCLB assessments and Massachusetts-style graduation tests. If we are going to avoid similar 'solutions' being imposed on post-secondary education, we need to develop alternatives which credibly address the problem.

Grade Inflation

First, we have to acknowledge that grade inflation is a reality, and more pronounced in some fields than others. At my own institution, the highest grades seem to come from pre-professional programs (nursing, education, agriculture, management, communications), artistic fields (drama, dance, music), and cultural studies (women's studies, Hawaiian studies). Other departments with lower averages might still have a grade inflation problem, depending on the average quality and work of their students. History's average over the last decade has been around 2.9 on a 4-point scale, on the B/B- cusp, one of the lowest in Social Sciences; our internal discussions suggest that our survey World History average is on the high C+ side. We don't have a stated standard of grading in the department, but our average and median grades tend to come pretty close, and we rarely disagree seriously on prizes or theses, where multiple readers grade the same work. That doesn't mean that we don't have a grade inflation problem, but it could be worse.

Grade inflation has three primary causes: student culture, pedagogical culture and institutional culture. The expansion of the student body since WWII has brought students with a wider range of abilities to college, and has also drawn in the best students from previously under-represented groups. It has also widened the gap between colleges themselves: there are now significant differences in the average quality of students at various institutions, differences enshrined in things like the Peterson's Guide 'tier' rankings. Because of the view of the bachelor's degree as a baseline credential for professional employment, many of these students are unengaged with their educations, and consider college an extended form of high school, where attendance and endurance matter more than engagement. This is particularly true of pre-professional students, who may take their major courses seriously but who don't engage with general education or distribution courses, though anyone with experience teaching intro-level courses recognizes the phenomenon. Plus, students take grades very personally: the grade is about them, not about their work. So differing standards seem unfair, and students respond poorly to the implicit criticism of low grades, particularly when they get accustomed to unearned high grades at earlier levels or in other courses. The ideology of 'student as consumer' has changed the power relationships within the academy, placing satisfaction higher than intellectual growth as a measure of success.

This is reflected in, and exacerbated by, the abuse of quantitative measures of teaching effectiveness. There is considerable research on these instruments, most of which shows strong influence from appearance, class format, even class time, but the only studies I'm aware of which claim that students are generally good judges of teachers are the ones that assume it as a proposition. Our own instrument is at least honestly titled the Perceived Teaching Effectiveness Form, but it is used in a mindlessly straightforward fashion by tenure/retention committees and administrators: for an untenured faculty member, failure to score above the norm is considered a career-threatening flaw. Technically, PTEF results are confidential, but failure to disclose them in contract dossiers is considered prima facie evidence of poor quality. Lip service aside, other evidence of teaching effectiveness, including creativity, technology use, syllabus adherence, and high quality content, is not even secondarily important. So teachers are strongly motivated to produce high scores, and one of the easiest ways to produce high scores is by demanding little and giving easy, high grades. The situation is complicated by the increased demands being placed on teachers: pedagogical innovation and new technology; higher publication standards; higher teaching loads and larger classes. The need to bring in majors and raise enrollments is another factor that makes raising standards difficult. Unless standards are raised in a uniform fashion, students will shift to 'easy' classes, and those faculty and departments who raise standards will face the wrath of administrators and budget committees. Student retention and graduation rates are used as measures of institutional effectiveness, which militates against failing (or even discouraging) even the most unprepared students.

Finally, partially as a result of the above-mentioned forces, and partially as a result of intellectual currents usually grouped under the term 'relativism', there has been a shift away from hard-and-fast standards, absolute grades, and critical responses to student work reflected in grades. Some of this is a result of experimental pedagogy: intrinsic rather than extrinsic rewards; self-directed curricula; self-esteem building. Some of this is a result of new ideas about knowledge: post-modernism, feminism, relativism and multiculturalism have added dimensions and reduced certainty. These are not fundamentally bad ideas, but their inconsistent application and misapplication, along with the student and institutional issues above, have degraded both the authority of faculty to set standards to which students feel obligated to adhere and the willingness of faculty to use grades as both reward and punishment.

Why is Grade Inflation a Problem?

This is something which is more often assumed than explained, but a clear understanding of the problems associated with grade inflation is essential. The problems go beyond a vague sense of moral or intellectual decline and have practical, long-term implications. Inflated grades interfere with teaching and learning, with hiring and tenure, with the quality of our work environment and with the academy's relationship with the wider community.

The first and most obvious effect of inflated grades is that it becomes harder to use grades as a shorthand form of communication with any nuance. Sure, individual teachers can explain "what grades mean" semester after semester, but when minimally acceptable work is worth a C, or a B or an A, depending on the course, it is hard for students to keep track. Narrative responses to work help, but, unless an assignment involves revision, students tend to ignore anything except the grade; conversely, narrative responses without a grade will tend to be interpreted in the most positive possible light, so the ultimate grade comes as more of a shock if it is not as high as expected.

The disjunction between graduate training institutions and student expectations at the institutions at which most Ph.D.s get hired makes it likely that faculty starting out will have difficulty connecting with their students and will have standards somewhat higher than the norm for their hiring institutions. Harvard's Career Counselors refer to the "H effect", the assumption by interviewers that a Harvard-educated Ph.D. will be disappointed by the quality of local students and have difficulty teaching at their level. To some extent the assumption is justified, particularly since new faculty mentoring is rarely structured or effective, and the disjunction results in an elevated rate of dismissal among first hires. These dismissals are rarely reflected in official 'tenure rate' figures, as those refer only to faculty who apply for tenure, whereas most institutions will dismiss untenurable or borderline candidates at earlier stages of review, which do not count. If I have one word of advice for newly hired faculty under our current regime, it is: do not admit that you have difficulty with any aspect of teaching, because even an honest attempt to grow and improve will be taken as evidence that you have serious problems. This also distorts our sense of the academic market, as the turnover creates more openings than would exist under a more humane system, thus making it seem like there are more jobs available for the new Ph.D.s.

The corollary to the disjunction is the breakdown of morale and collegiality which comes from struggling against what feels like constantly falling standards. New Ph.D.s trained to high levels of professionalism discover that their efforts to 'raise standards' are met with hostility by students (who don't want to work that hard) and suspicion by fellow faculty (who understand the implicit criticism). The very real differences between departments in grading harden into factions, and the sense of a threat to academic freedom by standards imposed from outside makes nearly all academics bristle and stiffen. So, instead of being addressed directly, the question becomes a festering issue that won't be discussed, and the only solution is for departments with high standards to grit their teeth and bring them down to the norm in order to effectively compete for students, and therefore resources.

Finally, grade inflation has led to public dissatisfaction with educational results. The same forces that have driven the primary/secondary assessment movement seem to be pushing into higher education as well. Granted, much of the critical reportage about higher education is poor quality, anecdotal, and political. But there remains a steady and credible strain of business and political and social organizations concerned about the process and results of higher education. And it is these groups, through their influence on state and national legislators and, through the US Dept. of Education, their influence on the regional accrediting agencies, that are pushing us towards assessment, and will continue to push until we, or they, find a solution to the problem.

Solutions Already Being Tried

There are a few active attempts to solve the problems of grade inflation and educational effectiveness. Some of them are at the level of the individual school; more come from 'suggestions' of accrediting agencies; post-graduation testing is already standard in graduate school admissions and certain professional arenas.

Colleges and universities have tried a variety of techniques to deflate grades. Some have adjusted their grading systems: Princeton instituted a limit to A-level grades. Harvard adjusted its GPA calculation to narrow the A-/B+ gap and that has reportedly been effective in reducing the A-level overload slightly. Most institutions don't go much further than passing around department-level data on grade averages, though a few institutions have followed up with enough pressure and discussion to bring the outliers closer to norm. Some have tried acculturation through discussion, but without hard data there is mostly a chorus of 'it doesn't work that way in our department' and the discussion ends. Tenure, for all its charms, is a serious barrier to making progress at the institutional level: it both insulates its possessors from pressure to change and provides strong motivation for grade leniency to the untenured. Academic freedom, precious though it is, is used to insulate faculty against discussions of content, workload, grading or pedagogy.

The accreditation agencies have their own ideas. They use their accreditation review to promote the scholarship of learning and integration of current 'best practices.' Many of the themes of these best practices are encapsulated in the push for the development of 'Master Syllabi' for both multi-section courses and for departmental curricula, that would clearly lay out learning goals, particularly those learning goals which could be demonstrated, assessed, evaluated in some kind of graded fashion. Interestingly, they do not seem terribly interested in grade inflation. Perhaps they've given that up as a losing battle, but instead they focus on 'learning assessment' using metrics separate from those used to evaluate students for grades. Pre/post-testing, portfolios developed over time, post-graduation interviews and graduate tracking are emphasized. There is little discussion of how 'best practice' applies to different disciplines, or different levels; we're supposed to figure that out ourselves, but without deviating significantly from the 'standards of best practice' that they articulate.

Syllabi seem to be very important to these agencies. Collecting syllabi was an important part of the accreditation review, and they pushed to make syllabi more public and accessible through internet publication. Syllabi have grown, as others have noted, to articulate clear goals and standards for students, contain an outline of the course that goes well beyond a 'reading and assignments schedule' and introduce students to the discipline, where the course fits in the discipline, and to general academic practice through discussion of how to handle reading and writing assignments, labs, discussions, etc. This, in addition to a growing collection of boilerplate text: disability accommodation; advising; civility; academic honesty; offensive material disclaimers. Any ambiguity or reservation about the idea of 'syllabus as contract' seems to be over and done. How this is supposed to be superior to addressing these issues in a course catalog or in class is, honestly, beyond me, but my syllabi have been selected as 'model syllabi' several semesters running, so I must be doing something right.

One consistent strain running through our accreditation, and others I have heard of, is pressure to strengthen centralized institutions of governance. I got to meet with the accreditation team on their last visit, because of my position on the CAS Curriculum Review Committee. They were quite concerned about the way in which general education standards were set and enforced, particularly about the independence of the individual college governance bodies from the University-wide Congress and its committees. Several of their recommendations included weakening or eliminating separate college governance of curriculum. They were also clearly concerned about the Curriculum Review Committee's lack of mandate to review the workload and pedagogical aspects of new or revised courses. While they did not directly address the questions of tenure and academic freedom, it was pretty clear that a more centralized, less 'free for all' approach was preferable. 'Post-tenure review' with an eye toward continued teaching effectiveness is already being put in place or seriously discussed throughout the American academy, and some have argued that tenure is, or will soon be, both obsolete and toothless.

A few institutions have largely abandoned grades as a measure of the success or ability of college graduates, or found ways to supplement those grades with standardized norms. Ironically, the most widespread form of national post-graduate testing is graduate admissions tests. Lip service is paid to grades, recommendations are carefully read for faint praise, and personal statements give admissions officers some way to tell applicants apart. But the existence and ubiquity of the use of these standardized tests is perhaps the most damning form of self-criticism possible: the very academy which grants grades cannot rely on them as a measure of quality or achievement. Professional accreditation in several fields is test based (nursing, teaching and accounting come to mind immediately), recognition that completion of the relevant bachelor's degree may not, in fact, indicate technical mastery of crucial material. The tests, of course, influence the curricula: some departments have gone so far as to include a 'preparation for the test' course as a component of the major.

What's Next?

My suggestions, which most readers will cheerfully ignore in favor of their own, focus largely on the nexus between grade inflation, student evaluation of teachers, and tenure review. In the short term, some form of open grade norming -- perhaps as simple as putting the class or department median on transcripts along with the student's grade -- would reduce the opacity of grades. In the long run, outlier departments must be called to account, and discussion of grades, standards and norms must be ongoing, data-driven and interdisciplinary. Reform of social promotion and grade inflation at the primary and secondary level would help immensely.

The training of Ph.D. students also needs to be shifted in more practical and professional directions, starting with an emphasis on teaching as a skill in graduate school. Not just tossing TAs in sections, but mentoring, review, professionalization; also, graduate coursework should include not just dissertation-related topics but general education in areas which students will most probably have to teach. I, for example, got through graduate school without taking a single course of Chinese or Korean history, though as a modern Japanese historian in a small department I spend a great deal of time teaching China, along with World History (at a previous post I taught East Asian Civ and Western Civ), and only about a third of my teaching time on Japan. General education and teacher training would not be useful only for academia-bound students: the ability to structure a presentation, to impart useful information clearly, to see both the broad sweep and sharp details of an issue, would benefit people in many professional fields.

After hiring, a thorough reform of the institutional culture is necessary, and though that seems daunting, it can be done effectively at a departmental level before being done at an institutional level. One essential component is an environment in which teaching techniques and issues can be discussed without fear that sharing concerns or difficulties will be used against you in retention and tenure. Faculty need some form of confidential mentoring, or some form of mutual discussion which allows everyone to display strengths and be critiqued (instead of creating an artificial division between 'master' and 'student' teachers). Tenure/retention review should include both quantitative and qualitative material, and problems, if noted, must be followed up with mentoring and support. Such review should not stop with tenure, and I am one of those who feel that it would be possible to design post-tenure review that would allow the most egregiously bad faculty to be removed from the classroom without threatening academic freedom. But these reviews and discussions must be sensitive to disciplinary differences and to variation in the student population in order to be meaningful: the techniques which work with upper-division English courses will probably run into problems in world history surveys, and lab techniques don't translate well into philosophy; and sometimes lecture really is the best way to impart information and understanding, though it's terribly old-fashioned.

If these or similar methods are not adopted, if grade inflation continues and no strong articulation of standards is forthcoming, the worst-case scenario is easy to project. National standards for college curricula, enforced by NCLB-style testing in non-professional subjects, have already been discussed by national legislators. Accrediting agencies and federal funding would force schools to address their curriculum to these tests, which would entail the functional loss of academic freedom with regard to syllabi and classroom activity. Faculty who failed to follow institutional guidelines (which would be very closely modeled on national guidelines and adjusted to the tests) would be penalized, probably with dismissal, and tenure would be obsolete. Students would be forced to take more general education courses, but would have fewer choices regarding how to fulfill their requirements. At this point, college really would become an extension of high school.

We are faced with change: things will not simply continue as they are for very long. We must decide what sort of change we prefer. I would prefer that we be accountable to ourselves, individually and as an intellectual and teaching community, and that others respect that system because it produces high quality results. If we cannot demonstrate those results, and that accountability, it will be imposed on us in a form which we may not recognize or appreciate.

*This article is dedicated to the Invisible Adjunct.
