Evaluation using a Marking Matrix: The Pedagogy of Assessment
James Sheptycki, Division of Social Science, Faculty of Arts
Volume 13 Number 3 (April 2004)

The woes of the double cohort and the burden of mass marking have become a familiar refrain among faculty in Ontario universities. It is well known, however, that universities across the world are having to adapt to increasing numbers of students. One of the central problems, especially as the end of the academic year draws ever closer, is how to mark student work and provide useful feedback to large numbers of students. In social science and humanities disciplines that emphasize research and writing skills, the student essay can present particular challenges to assessors, especially when there are large numbers of papers to read and evaluate.

Some of us have been around long enough to remember a time when it was possible to take a pile of student essays home for the weekend and treat each one as an exercise in marking "the whole paper". How much more difficult is that to do when the "whole essay" being marked is the seventy-third in a pile of 250? Once one has read over a dozen undergraduate essays on the topic of "the effects of the industrial revolution on the politics of class consciousness", it becomes difficult to treat each as a unique expression. All the essays begin to blur into each other. Under such conditions it is quite easy, especially for those of us who are long practiced at it, to organize the pile of papers according to an ordinal measure (i.e., rank the papers from best to worst). Such a measure may be quite reliable, but how valid is it, and how do we show the validity of the measure? Ranking student papers on the basis of where they stand in comparison to the competition does not help us to provide good, relevant, structured and systematic feedback. One way to structure the process of marking student essays is to make use of a "marking grid" or "matrix". Below is an example of such a marking matrix, which I am currently using to mark second-year and fourth-year take-home essays.

Essay Evaluation Matrix

Substance

    Criteria                                  Ranking
    Originality of approach                   1  2  3  4  5
    Relevance to question                     1  2  3  4  5
    Coherence of argument                     1  2  3  4  5
    Depth of analysis                         1  2  3  4  5
    Range of relevant literature covered      1  2  3  4  5
    Use of evidence                           1  2  3  4  5

Presentation

    Criteria                                  Ranking
    Literacy                                  1  2  3  4  5
    Accuracy                                  1  2  3  4  5
    References and bibliography               1  2  3  4  5

Ranking scale: 1 = Excellent; 2 = Good; 3 = Satisfactory; 4 = Poor; 5 = Unsatisfactory

This matrix is a form of structured subjectivity. The criteria on which I base my marking fall under two basic headings: presentation and substance. I place a high value on well-presented papers with good scholarly style and proper referencing, and my students know this ahead of time. This matrix measures presentation using three criteria: literacy (i.e., grammar, use of metaphor, alliteration, and other elements of writing style); accuracy (i.e., spelling mistakes and other proof-reading issues); and references and bibliography. But, while most of us would probably agree that presentational issues are important, we are all relatively more concerned with matters of substance.

Considering this, we would probably want to weight criteria relating to substance differently from those relating to presentation. In this marking matrix substance is considered using six criteria, the first of which is "originality of approach". That ought to be balanced against "relevance to the question"; a student can, theoretically at least, be highly original and totally irrelevant. I use the criterion "coherence of argument" as a measure of how well structured the paper is; a very poor ranking would go to a "stream of consciousness" or "one damn thing after another" essay written in haste the night before. The criterion "depth of analysis", at least the way I employ it, is concerned with how well students marshal their theoretical vocabulary in explanation. "Range of relevant literature" can be used as a measure of the student's efforts in the library; what I look for here is evidence of extra reading beyond the course textbooks and recommended reading list. I use the last criterion, "use of evidence", as a way of getting at how well the student links theoretical ideas to empirical data.
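For those who do want to turn the rankings into a single weighted summary, the short Python sketch below illustrates one way the differential weighting might be made explicit. It is only an illustration: the particular weights, the sample rankings, and the idea of averaging at all are assumptions of mine, not part of the matrix described above.

    # Illustrative sketch only: the weights and sample rankings below are
    # assumptions, not prescribed by the marking matrix itself.
    # Rankings follow the matrix scale: 1 = Excellent ... 5 = Unsatisfactory.

    SUBSTANCE_WEIGHT = 2.0      # assumed: substance counts double
    PRESENTATION_WEIGHT = 1.0

    substance = {
        "Originality of approach": 2,
        "Relevance to question": 1,
        "Coherence of argument": 3,
        "Depth of analysis": 2,
        "Range of relevant literature covered": 3,
        "Use of evidence": 2,
    }
    presentation = {
        "Literacy": 2,
        "Accuracy": 1,
        "References and bibliography": 2,
    }

    def weighted_average(substance_scores, presentation_scores):
        """Combine all rankings into one weighted average on the 1-5 scale."""
        total = (SUBSTANCE_WEIGHT * sum(substance_scores.values())
                 + PRESENTATION_WEIGHT * sum(presentation_scores.values()))
        weight = (SUBSTANCE_WEIGHT * len(substance_scores)
                  + PRESENTATION_WEIGHT * len(presentation_scores))
        return total / weight

    print(f"Weighted ranking: {weighted_average(substance, presentation):.2f}")

Whether such an arithmetic summary is wanted at all is a separate decision; the matrix can equally well be used purely as a feedback device, as described below.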

Other criteria can be substituted for those used in the above matrix. The point is that the criteria should be set out prior to marking the papers, indeed prior to giving the students the essay questions. Structured criteria can then form the basis of a dialogue, or series of dialogues. First of all, using a marking matrix such as this one provides a basis for dialogue between markers. This can be particularly important for those of us managing teaching assistants. The advantage should be obvious: how many of us have had to field questions or complaints arising from perceptions that different TAs apply different marking standards? Marking matrices can also provide a tool for structured dialogue between faculty teaching on different courses, which can be useful in a variety of ways. A marking matrix also usefully structures dialogue between teachers and individual students. Students wanting to know where they went wrong, or what they can do to improve, can see clearly what the issues are, and advice can be tailored to those specific issues. Using a marking matrix is not a substitute for giving written feedback, but it can help us structure the feedback we give. The five-point scale automatically gives a summary of both the positive and negative aspects of the paper, but written feedback can help by saying precisely what the student did right or wrong, what is worth keeping and what is worth trying to improve.

Students can be heard to remark that assessment is not an objective process, and they are right. But subjectivity need not be capricious, which is usually what students imply when they lay the charge of "subjectivity". Using a marking matrix is a way of structuring subjectivity so that it is uniformly applied to all of the essays in that great big pile you have to take home. It is one way of structuring the subjectivity of your TAs, so that all members of the marking team are using the same set of criteria. Lastly, providing the student with a feedback sheet that includes not only the marking matrix but also some written comments can be pedagogically sound, provided, that is, that the criteria are themselves pedagogically relevant and clear from the outset. The marking system will still be biased, but it will be biased in terms of factors relevant to the assignment.

The main reason I have for advocating the use of a marking grid and structured feedback is that it makes assessment part of an active learning process. Students learn not just that they got 78%, or that they were in the top 20 per cent of the group; they also learn the reasons for their individual mark. There is another, less noble, reason for adopting this technique as a marking strategy: it is a coping mechanism for handling large numbers of students. There are efficiency gains to be made in how much time we devote to marking "the whole essay". By structuring our approach to student assessment and feedback we can make better, more efficient use of our time and still work to ensure that our students are getting the benefit of constructive evaluation.