On Monday and Tuesday of this week, I had the pleasure of being invited to participate in the National Assessments in the Age of Global Metrics symposium at Deakin University. The symposium was organised by Deakin University’s Research for Educational Impact (REDI) in collaboration with the Laboratory of International Assessment Studies. Its stated aim is to “bring together scholars and practitioners from around the world to examine models of national assessments and explore how they are affecting the policy discourse and the practices of education in different parts of the world.”
More specifically, the symposium set out to address the following questions:
- How are national and sub-national assessments evolving in the age of global metrics?
- What is the relationship between national assessments and international large-scale assessments (ILSAs)?
- What effects are they having?
- What can we learn from the experiences over the past couple of decades?
What I liked about this event was that it aimed to bring together academics from diverse backgrounds to engage in dialogue, and maybe even learn from each other, in the field of large-scale international assessments. And it was a bit of a star cast, with presentations from Ray Adams (ACER), Sara Ruto (PAL), Anil Kanjee (Tshwane University of Technology), Sue Thompson (ACER), Hans Wagemaker (ex-IEA), Sam Sellar (MMU), and Barry McGaw (ex-ACARA). Ray Adams’ presentation was particularly interesting, making the case for homogenising ILSAs using criteria that would enable a form of meta-standardisation; I may blog on it at some stage once I have thought it through further.
On the Tuesday morning there was a panel discussion that addressed the question ‘What’s the point of national assessments?’ One of the participants was Barry McGaw, who was one of the architects of Australia’s NAPLAN and MySchool intervention, an area I have done a fair bit of work in. I must admit, during the presentation I was a bit annoyed, and when there was a chance for discussion, I asked a few questions. Because this was live-streamed, a number of people tweeted that I’d asked some questions, and I got lots of requests asking what they were. Here’s my list of questions:
- If NAPLAN is impactful, and I think on this we agree, why is it only ever impactful in positive ways such as in the anecdote that you shared? Why aren’t we equally interested in the negative impacts including trying to understand all of those schools that have gone backwards?
- Given the objective of this event, I am wondering which qualitative researchers you have read on the effects of NAPLAN that informed your attempts to make the assessments better through designing responses to the unintended consequences of the assessment?
- Results across Australia have flatlined since 2010*; how do you justify that NAPLAN has been a success in its own terms?
- I’m always concerned when people mischaracterise the unintended consequences of tests as simply ‘teaching to the test’. It would be better to see a hierarchy of unintended consequences, ranging from:
- making decisions about people’s livelihoods such as whether to renew contracts for teachers based on NAPLAN results
- making decisions about who to enrol in a school or a particular program based on NAPLAN results
- a narrowed curriculum focus where some subjects are largely ignored, or worse, not taught at all so that schools can focus on NAPLAN prep
- teaching to the test, which may or may not be a problem depending upon how closely the test aligns with the curriculum, etc.
- The problem with the branched design for online tests is not whether students will like it; it is whether schools have the computational capacity to run the tests: whether BYOD schools advantage or disadvantage some students depending upon the type of device they use, problems of internet connection in rural and remote schools, bandwidth in large schools, and so on. I am interested in how you characterise this as a success?**
I was unimpressed with the answers I got, but I imagine that’s my problem. I think that psychometricians do rigorous research and have important insights into education systems that need to be taken seriously, but I equally think that qualitative fieldwork is desperately needed to advance the validity of these assessments, and when you shut that insight down you only damage your own assessments in the long run.
* At the end of the session, John Ainley from ACER came over and suggested to me that there had been significant improvement in Year 3 Reading and Year 5 Numeracy, with a bump in 2016 and 2017. I conceded the point; I stopped researching NAPLAN in 2015, so I hadn’t updated my trendlines. Across the other domains, however, results have remained fairly stable since 2010. This is known as the ‘washback effect’ in the assessment literature.
** I had this question down to ask, but felt I had gone on too long so didn’t ask it.