Most schools I’ve seen have an expectation that teachers will, in some way, analyse HSC results of their students. But what that means and looks like varies.
I get asked about this a lot, so I want to share some thoughts I have about how I might analyse HSC results in a way that is meaningful, but not so burdensome that it’s not worth the effort.
The Ultimate Question
There’s no point doing anything with data unless you get to the point of answering this question:
What are we going to do next?
It’s obvious. But if that’s not the end point (or something like it), there’s no use doing the work. At the same time, ‘What’s next?’ is different for different roles. A classroom teacher needs to think about this differently to a head of department, who needs to think about this differently to the senior leadership team.
What’s next, in the context of your role, is the question you ultimately need to answer. It’s the goal of the whole exercise, and what it means for you needs to be defined before you start.
The Two Big Questions
After analysing HSC results, there are two main questions I want to be able to answer. They are:
What are we teaching well?
Who are we serving well?
There is, I think, more to it than just this, but if you can answer these two questions in some detail, answering what’s next is probably going to be pretty straightforward.
What Are We Teaching Well?
Because of the way moderation in the HSC works, unfortunately, what I really mean here is:
‘What are we teaching well that’s being translated to marks students achieve in what is externally marked?’
It’s important for students to understand that their individual HSC is 50% in-school assessment and 50% externally marked exams/projects/performances. But as far as cohorts and schools are concerned, all the marks available for a cohort come from what’s externally marked. This is not a nice reality of the HSC, but it is a reality.
Because of this, how students achieve in the written exams and other externally marked components matters. More than I’m comfortable with, but it matters.
The best thing NESA have made for HSC analysis, I think, is the Item Analysis in the RAP. This is the best place to go for the ‘What are we teaching well?’ question. When I’m conducting an item analysis I want a paper copy of the exam in front of me so I can reference the questions in the Item Analysis. Before I begin working in the RAP, though, I want to note the questions my students should have been well prepared to answer well: the ones I know we covered well in class. I think it’s helpful to do this before you see the marks students actually received on those questions.
When conducting an Item Analysis, I want to get a good sense of the following:
In general:
Which questions have students achieved marks as high as I would have hoped and expected?
Which questions did my students, knowing them and what we covered, not answer as well as expected?
What types of questions were answered particularly well or not? (e.g. multiple choice, short answer, extended response and so on)
Which content areas were answered particularly well or not, compared to what I would expect, knowing my students and our context?
I summarise this in a table like this.
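A summary table like that can also be built in code once the per-question data is out of the RAP. A minimal Python sketch follows; the question numbers, marks and column names are all invented for illustration, and the ‘expected strong’ flag is the pre-marked list of questions you noted before opening the RAP:

```python
# Sketch of an item-analysis summary table. All questions, marks and
# expectations below are invented for illustration.
items = [
    # (question, type, max_marks, school_mean, state_mean, expected_strong)
    ("Q1",  "multiple choice",   1, 0.85, 0.80, True),
    ("Q21", "short answer",      4, 2.10, 2.60, True),
    ("Q32", "extended response", 8, 5.40, 4.90, False),
]

def summarise(items):
    """Return one summary row per question, as a percentage of max marks."""
    rows = []
    for q, qtype, max_marks, school, state, expected in items:
        school_pct = 100 * school / max_marks
        state_pct = 100 * state / max_marks
        rows.append({
            "question": q,
            "type": qtype,
            "school %": round(school_pct, 1),
            "vs state": round(school_pct - state_pct, 1),
            # Flag questions we expected to answer well but came in under state.
            "flag": expected and school_pct < state_pct,
        })
    return rows

for row in summarise(items):
    print(row)
```

The flag column is where the ‘knowing my students’ judgement lives: it only fires when a question you expected to go well didn’t.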
It may be helpful, after looking at generalities, to split the item analysis up into groupings of achievement. There are a couple of ways to do this. You can do it by making classes in the RAP and then conducting an item analysis on that class, or you can pull all the data out of the RAP yourself and then split it up. I prefer to pull the data out, but it really depends on which you’re comfortable with.
I like to summarise this with a simple table. Something like this:
Groupings like this only make sense when they make sense for your context. If they don’t make sense, don’t use them. It may also be that the groupings make sense, but you just don’t need them. If you have enough information otherwise, don’t make more work than you need to.
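If you do pull the data out yourself, the splitting-up step is small. A sketch in Python, with invented names, marks and band cut-offs (adjust the cut-offs to whatever makes sense in your context):

```python
# Sketch of splitting a cohort into achievement groupings before re-running
# an item analysis on each group. Names, marks and cut-offs are invented.
students = [
    ("Student A", 91), ("Student B", 84), ("Student C", 72),
    ("Student D", 69), ("Student E", 55),
]

def band(mark):
    """Assign a rough achievement band; cut-offs are illustrative only."""
    if mark >= 80:
        return "high"
    if mark >= 65:
        return "middle"
    return "lower"

groups = {}
for name, mark in students:
    groups.setdefault(band(mark), []).append(name)

print(groups)
```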
Who Are We Serving Well?
By this I mean: how are students achieving in our subject compared to their other subjects? You can’t do this with HSC scores alone. That’s why I’ve made my ‘What Does it Mean?’ tables. This year UAC have given schools a lot of their students’ ATAR scores. If you have those, you can start asking how students are achieving in your subject compared to their other subjects. Because of the way ATARs work, they’re basically the average of a student’s best 10 units of ATAR contributions (ATAR contributions as defined by me, anyway).
Doing this is confronting. There are a few things to remember.
This is a relative measure.
This only tells you how a student has achieved relative to their other subjects. It says nothing about their achievement relative to their ability or level of understanding in a subject.
In and of itself, no value judgements can be made from this. So I can’t do this for a school and just automatically say whether something is good or bad.
Perhaps something like this in Excel, to show relative achievement.
In this chart:
The ATAR is a student’s actual ATAR score (I know ATARs are in 0.05 increments. This is just for illustration)
The ATAR Contribution is a student’s ATAR contribution for that one subject, if they could study just that subject and still receive an ATAR.
The Gap is the ATAR Contribution minus a student’s actual ATAR. So Halo Choi achieved 7.42 ATAR Contribution points more from this subject than from their other subjects on average (or thereabouts). The bar is just the gap as a conditionally formatted bar.
I’ve coloured the range from -5 to +5 grey because every student will show some variation in achievement across subjects, and at most levels of achievement I think that range is normal, expected variation. Very high achieving students should expect smaller variation across their subjects.
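The Gap column reduces to a small calculation. A minimal Python sketch of it, with invented names and numbers, using the same ±5 grey zone as the chart:

```python
# Sketch of the Gap calculation: a student's ATAR Contribution for this one
# subject minus their overall ATAR. Names and numbers are invented.
students = [
    # (name, atar, contribution)
    ("Student A", 84.30, 91.72),
    ("Student B", 92.10, 90.05),
    ("Student C", 71.00, 62.40),
]

for name, atar, contribution in students:
    gap = contribution - atar
    if gap > 5:
        verdict = "well above their other subjects"
    elif gap < -5:
        verdict = "well below their other subjects"
    else:
        verdict = "within normal variation"
    print(f"{name}: gap {gap:+.2f} ({verdict})")
```

Student A here mirrors the 7.42-point example above: this subject contributed about 7.42 ATAR points more than their other subjects did on average.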
Once I have something like this sorted I can start to ask questions around what and why.
I ask about it in a document like this.
It doesn’t matter what it looks like. But if it can be summarised simply somewhere, with a straightforward graphic to show what you mean, that’s really helpful for everyone, I think.
When asking about how well students are served, it’s not all about improvement. There may well be areas where things are going well. Not everything needs to be fixed. Identifying what doesn’t need fixing is important.
Freedom of Process
In every school I’ve seen a wide range of ability and interest when it comes to analysing student achievement data. Some people find it very helpful to have a step-by-step process to work through when analysing results. Others just need to know the end deliverable and the freedom to chart their own journey there.
There’s not going to be one correct process for this. Tasks left to teachers should still be overseen by HoDs, who should know the processes their teachers are following even if they don’t mandate what those processes have to be. Support where support is needed. Give freedom where it can be given. Like teaching.
So, I guess I think what is delivered in the end should be mandated. The steps to get there should be scaffolded where necessary, but may not need to be the same for everyone.
Helpful vs Interesting Data
Every time I visit a school I hear stories. Stories of classes, students, teachers, learning and a lot more. That is good and proper. Education is a human endeavour. Analysing student achievement data is a human endeavour. The people behind the data are in no way irrelevant, unimportant or anything else. In fact, understanding the people behind the data is fundamental to being able to make value judgements about the data.
BUT…
We do have to be careful to prioritise what’s helpful over what’s interesting.
In most circumstances, what’s true about a group is more helpful to future cohorts than what’s true about an individual. Why one person achieves in a certain way in a certain situation can have functionally infinite complexity. Why a group achieves things in common will usually have fewer, more identifiable reasons.
All I mean is that people are interesting. People we know and care about are interesting and worthy of time and energy. When analysing results, though, understanding how students have achieved in groups and seeing if there are things in common with that achievement is a helpful process.
Small Cohorts
Small cohorts make data analysis more difficult. And there are an awful lot of small cohorts around. When you have fewer than 10 students in a subject, it can be difficult to see patterns in achievement. Perhaps there are no discernible patterns to see. Then what do you do?
The smaller a cohort is, the more we need to leave room for uncertainty.
Being able to identify things you think are true and then leaving plenty of room for other possibilities is important here.
There are, perhaps, a few things I’d note about small cohorts:
It’s important not to conclude that, because it’s a small group, there’s nothing to learn.
It’s important not to conclude that, because something was true of the small group, it’s universally true.
Where students haven’t achieved how you’d have liked or expected, knowing them, it’s worth thinking about whether you can better serve students like them in the future, or if it was more circumstantial than that.
Don’t decide to change nothing just because small group statistics are unreliable. But don’t allow one or two students to derail something you’ve done that’s working, just because it didn’t work for them on the day.
ALWAYS talk to someone else about what you think and why. What you want to change and why. What you want to keep the same and why.
Buy some exam scripts. They’re not too expensive. One problem with the Item Analysis in the RAP is that if you have too few students you can’t see anything. But when the cohort is small anyway, any summary statistic is so problematic as to be largely irrelevant. Go to the source. See what your students wrote.
Be prepared to be wrong. People are messy. Small data is messy. Our inferences are prone to error. That’s ok. Just be willing to admit it and make changes accordingly.
Some Other Things to Think About:
There are all the usual things that schools look at in HSC analyses. I’ll just make a few notes on some of these.
School vs State Achievement
I guess a part of the reason why this is so popular is that the tools to make these comparisons are so readily available. In the Item Analysis, for each question, you can see how students have achieved against the state. There are a number of other charts and measures throughout the RAP that show student achievement compared to the state.
Comparisons to the state are fine. But they’re also fraught. Comparing student achievement to the state across subjects is, I think, going to muddy the waters in ways that can be quite difficult to clear up. I will always remember a conversation about how well a school was doing in D&T, because the school’s z-scores in the RAP were so high, and how poorly it was doing in Maths Extension 2, because that was the only subject in the RAP whose average z-score was below zero. The final determination may well have been true. But not because of the z-scores. Using z-scores in this manner is simplistic.
I recently had a conversation with someone about using z-scores well and I think it hit home for me how much I didn’t like them when each time we looked at an aspect of them, I found myself going to another measure to see the depth and breadth of the context.
If you have two similarly achieving year groups and there’s significant difference in achievement against the state, it may be worth asking questions about why.
Seeing student achievement in a subject vs the state may be a good flag raising tool.
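For anyone less familiar with the z-scores being discussed, the comparison is roughly the one sketched below: how far the school’s mean sits from the state mean, measured in state standard deviations. This is my rough rendering, not the RAP’s exact calculation, and the marks are invented:

```python
import statistics

# Rough sketch of a school-vs-state z-score comparison: distance of the
# school mean from the state mean, in state standard deviations.
# Marks and state figures below are invented for illustration.
school_marks = [78, 82, 65, 90, 71]
state_mean = 70.0
state_sd = 12.0

school_mean = statistics.mean(school_marks)
z = (school_mean - state_mean) / state_sd
# Note: a single z-score like this carries no information about cohort size
# or spread, which is part of why it's simplistic on its own.
print(f"school mean {school_mean:.1f}, z-score vs state {z:+.2f}")
```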
Summary Statistics
Every school, no matter its size, is a small data set. This creates problems for data analysis I want to note.
We all like to use summary statistics. Mean (average) and median scores abound in reports and summaries of achievement. I use them as well. But they always make me uncomfortable. Sometimes I get asked what I do with outliers. The answer is, usually, nothing. When a data set is small, it’s difficult to know whether an outlier really is an outlier. And when we’re talking about understanding people, outliers are not to be discarded. They’re to be understood.
The smaller a group is, the less likely its mean is to be representative of the underlying cohort, and the more a single student can move it. The same is true of the median.
Summary statistics are a fine place to start, but I always prefer to see ranges of scores. Find ways to visualise data so you can see all scores together at once. I like to colour code them according to quartiles, if there are enough scores, and then compare achievement in subjects over time with what I hope is some more meaningful context.
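The quartile colour-coding idea can be sketched in a few lines of Python: tag every score with its quartile so the whole range is visible at once, rather than a single mean. The marks here are invented:

```python
import statistics

# Sketch of quartile colour-coding: label every score with its quartile so
# the full range can be seen at once. Marks are invented for illustration.
marks = [91, 84, 72, 69, 55, 88, 77, 60]

q1, q2, q3 = statistics.quantiles(marks, n=4)  # quartile cut points

def quartile(mark):
    """Label a mark by the quartile it falls in."""
    if mark <= q1:
        return "Q1 (lowest)"
    if mark <= q2:
        return "Q2"
    if mark <= q3:
        return "Q3"
    return "Q4 (highest)"

for mark in sorted(marks, reverse=True):
    print(mark, quartile(mark))
```

In a spreadsheet the same labels would drive the conditional formatting colours.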
Correlations
This is a bit of a distasteful aspect of HSC analysis, but it does matter. It’s not the be all and end all. It’s not a huge thing, but it is one thing.
It’s helpful to be able to answer:
‘How good a predictor of HSC exam achievement is my in-school assessment program?’
By this I don’t mean the numbers sent off to NESA. It doesn’t matter if they don’t match up. I mean, to put it simplistically, the rankings.
The grim reality of this is that a cohort is best served when, in general terms, an in-school assessment program is a pretty good predictor of HSC exam achievement. When asking this question, I think it’s important to be as vague as possible.
Is it usually true that high achievers in the assessment program are also high achievers in the exam? Are middle achievers usually middle achievers and are lower achievers usually lower achievers?
If the answers to these questions are generally yes, then the assessment program, in this respect, is doing its job.
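That deliberately vague check can be sketched without any correlation statistics at all: put each student in the top, middle or bottom third by assessment rank and by exam rank, and count how often the thirds agree. A minimal Python version, with invented names and ranks:

```python
# Sketch of the 'vague' predictor check: do high/middle/low achievers in the
# in-school assessment program stay high/middle/low in the exam?
# Names and ranks below are invented for illustration.
students = [
    # (name, assessment_rank, exam_rank) — rank 1 is highest
    ("Student A", 1, 2),
    ("Student B", 2, 1),
    ("Student C", 3, 3),
    ("Student D", 4, 5),
    ("Student E", 5, 4),
    ("Student F", 6, 6),
]

def third(rank, cohort_size):
    """Place a rank in the top, middle or bottom third of the cohort."""
    if rank <= cohort_size / 3:
        return "top"
    if rank <= 2 * cohort_size / 3:
        return "middle"
    return "bottom"

n = len(students)
agreements = sum(third(a, n) == third(e, n) for _, a, e in students)
print(f"{agreements}/{n} students landed in the same third in both")
```

If most students land in the same third both times, the assessment program is, in this respect, doing its job; movement within a third (like Students A and B swapping ranks) doesn’t count against it.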
The HSC has quite a few not so nice parts about it. This is one of them.
Making Value Judgements
One thing I hope I’m always aware of in my work is that it’s not my place to make value judgements about student achievement. Teaching is a person centred endeavour.
Making the leap from what’s true to why it’s true is to move from observable facts (whether they’re useful or relevant facts is another matter) to value based judgements. It’s important to make value judgements. It’s important for the right people in the right context, guided by the best information, to make value judgements. The best value judgements require knowing the school, the students, the teachers and the community. The Excel table I’ve included above that shows the green and red bars is not determinative of whether students have been served well. What it should do is inform the opinions of those who know the school and students as to whether students have been served well in that subject.
Some of my resources and templates I use for all this are here: