In the first of a short series of articles on evaluation, Martin Schmalenbach looks at the work of Donald Kirkpatrick and others in setting a framework for appraising the effectiveness of training.
A lot seems to have been written about evaluating training. This may be in part because there seems to be a fair amount of confusion around, and, for some, a sense of something missing, of not being quite ‘right’. In this series of articles I’ll try to clarify things by asking some key questions, and not straying too far from them!
So, let’s start with some key questions. I’ve taken the liberty of borrowing some from Fred Nichols as he asked these questions with impact some time ago, back in the early 1990s:
“Evaluate? Evaluate what? Training? What do we mean by training? What’s to be evaluated? A particular course? The trainees? The trainers? The training department? A certain set of training materials? Training in general?
“More to the point, why evaluate it? Do we wish to gauge its effectiveness, that is, to see if it works? If so, what is it supposed to do? Change behaviour? Shape attitudes? Improve job performance? Reduce defects? Increase sales? Enhance quality?
“What about efficiency? How much time does the training consume? Can it be shortened? Can we make do with on-the-job training or can we completely eliminate training by substituting job aids instead?
“What does it cost? Whatever it costs, is it worth it? Who says? On what basis? What are we trying to find out? For whom?”
These are all killer questions. In these articles I’ll answer some of these questions, and others, including: when to evaluate, and when not to; who is responsible for which bits; when “quick & dirty” is OK, and when it isn’t; which approach to use and when; and what the bottom line or ROI contribution is.
Let’s start with some definitions. There seem to be two different focuses for evaluation: focusing on the actual process of training or performance improvement (what’s known as formative evaluation), and focusing on the final product or outcome of the process (what’s known as summative evaluation). Evaluation seems to mean different things to different people, hence some of the confusion.
It’s suggested by some of the killer questions that the answers and meanings depend upon perspective – who is asking the questions and why.
For shareholders and managers of an organisation the need to evaluate is to help answer the related questions of “will training fix my problem and/or help achieve our goals?” and “will it be worth it or should I invest my resources elsewhere?” The final question is now more obvious: “was it worth it?”
For the trainer perhaps evaluation is driven by the need to answer different questions, such as “was the training effective?” and “did it achieve its objectives?”
And for the employee the evaluation questions are likely to be “Will it help me do my job better/easier/faster?” “Will it help my career development?” “What am I doing here?” and “What’s in it for me?”
Given that most of the thinking on evaluation seems to have been done by those in the training world, is it any wonder that there is some tension between each of these three groups when evaluation comes up for discussion?
In setting up systems and methods to answer the questions for just one group, it is quite possible that answering these key questions for the other two groups becomes difficult at best. These are all valid questions for each audience. Perhaps the most famous early attempt to address these issues was made by Donald Kirkpatrick in the late 1950s with his now-famous 4 levels*:
Level 1 – Reaction – what is the reaction of the learner to the learning experience?
Level 2 – Learning – what has the learner actually learnt as a result of the learning experience?
Level 3 – Behaviour – to what extent have the behaviours of the learner changed as a result of the learning experience – sometimes referred to as transfer of learning to the workplace?
Level 4 – Results – how much better is the organisation performing as a result of the learner’s experiences in the learning programme?
In his 1994 book “Evaluating Training Programs: the Four Levels”, Kirkpatrick suggests that evaluating at successively higher levels requires a growing amount of effort and resource: it is relatively easy and cheap to evaluate at Level 1, but much less so at Level 4. This is the argument (made by Kirkpatrick himself) for evaluating some 95% of training at Level 1 but perhaps only 5-10% of training at Level 4.
What is not so obvious is that Kirkpatrick’s model only prompts you to evaluate after the fact – ie once the training has been delivered. It doesn’t enable evaluation, as it doesn’t suggest how to do anything.
In this sense it does not allow one of our three key groups, the shareholders & managers, to make an informed decision about investing limited resources in training before actually committing those resources. All it can facilitate is answering the question “was it worth it?” If the answer is ‘No’ it’s too late – the deed is done and the resources spent. This is as true for ‘hard skills’ training as for ‘soft skills’ training – it’s just that the benefits of ‘hard skills’ training are usually easier to determine in advance.
Setting aside the issues of complexity and overhead for evaluating only 5-10% at Level 4, surely for the shareholders and managers, any and every activity that may take employees away from their usual tasks must be evaluated in some way, to some level, in order to make the best decision about whether to actually engage in this additional activity or not.
This is the argument for evaluating everything. The danger is that in evaluating everything there is no time to do ‘the day job’! Clearly there needs to be some balancing, and this may vary from situation to situation.
It seems that Kirkpatrick’s 4 Levels are well suited to helping trainers in particular answer their key questions about how well the training met its objectives: did the training do “what it said on the tin?” It can go some way to helping employees answer their own key questions – but only after the fact.
Arguably Kirkpatrick’s Level 4 doesn’t readily address the question of whether it was worth it. In 1991 Jack Phillips added a 5th level to the Kirkpatrick approach, called ROI or Return On Investment. The question asked here is “did the training pay for itself and then some?”
The units of ‘currency’ don’t have to be financial, though they often are. This 5th level introduces for the first time the need for the evaluator to appreciate the finer workings of the organisation and also employ some skill in determining costs and benefits. Moreover, Phillips also developed what is more readily recognisable as a methodology or process that is repeatable and so can be more easily taught and deployed across many organisations.
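Phillips’ Level 5 is normally expressed through the standard ROI formula – net programme benefits divided by programme costs, as a percentage. The sketch below illustrates the arithmetic only; the figures are hypothetical, and in practice the hard part is monetising the benefits credibly, not the calculation itself:

```python
def roi_percent(benefits, costs):
    """Phillips-style ROI: net programme benefits as a percentage of costs.

    ROI (%) = (benefits - costs) / costs * 100
    """
    return (benefits - costs) / costs * 100

# Hypothetical example: a programme costing 20,000 that yields
# 50,000 in monetised benefits returns 150% on the investment.
print(roi_percent(benefits=50_000, costs=20_000))  # 150.0
```

Note that an ROI of 0% means the training exactly paid for itself; “paid for itself and then some” requires a positive figure.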
Indeed there is a certification programme for practitioners, popular in the USA in particular.
Some thinkers on evaluation have reacted against this additional 5th level, partly because ROI is a term from the world of finance and there is an inference in some minds that training must pay for itself in financial terms or it shouldn’t happen at all.
There is some sympathy for this view, especially if you happen to be a shareholder or manager. The additional inference is that a lot of training that currently takes place, and is seen as useful by employees but is very difficult to quantify in hard financial terms, may now not pass the ROI test. But Phillips’ addition to the original Kirkpatrick model doesn’t eliminate the issues highlighted earlier: the training still has to be done first, before the evaluation can be carried out with any certainty. This 5th level goes some way to addressing the needs of the shareholders and managers, but perhaps not enough.
Other models have evolved that are designed to ensure that the evaluation process casts a wider net in looking at inputs, outputs, the context in which the training is carried out or needed, the product of the training, and the processes involved. Examples include Tyler’s Objectives approach, Scriven’s focus on outcomes, Stufflebeam’s CIPP, CIRO, Guba’s Naturalistic approach and Bruce Aaron’s V model.
Other thinkers in the field, notably Paul Kearns in the UK, have sought to keep things simple and practical. Kearns suggests that however you do the finer detail of evaluation, it’s the questions you ask that are important, and that you must have a baseline (ie credibly compare ‘before’ and ‘after’). He also distinguishes three kinds of training: ‘must have’ training, such as health & safety or that required by law; ‘luxury’ training, where not doing it will not hurt organisation, team or individual in any professional or performance area; and ‘value adding’ training, whose primary purpose is to enhance the performance of the organisation in whatever aspect is deemed important.
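Kearns’ insistence on a baseline can be illustrated with a simple before/after comparison. The metric and figures below are hypothetical (average weekly sales per employee, measured before and after a sales course); the point is that without the ‘before’ figure there is nothing credible to claim an improvement against:

```python
def improvement_over_baseline(before, after):
    """Compare an 'after' measurement against its 'before' baseline.

    Returns the absolute change and the percentage change relative
    to the baseline.
    """
    delta = after - before
    return delta, delta / before * 100

# Hypothetical figures: 40.0 units/week before training, 46.0 after.
delta, pct = improvement_over_baseline(before=40.0, after=46.0)
print(delta, pct)  # 6.0 15.0
```

Of course, a before/after delta alone doesn’t prove the training caused the change – other factors may have moved the metric – but without the baseline even that conversation can’t start.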
In the next part of this series we take a look at some of these different approaches and models to evaluating training.
* Kirkpatrick, Donald L. (1994). Evaluating Training Programs: The Four Levels. San Francisco: Berrett-Koehler Publishers.
* Kirkpatrick, D. L. (1987). Evaluation. In R. L. Craig (Ed.), Training and Development Handbook (3rd ed.). New York: McGraw-Hill.
About The Author
Martin Schmalenbach has been enhancing performance through change and training & development for more than 10 years, in organisations ranging from the RAF and local government through to manufacturing and financial services. He has degrees in engineering and in management training and development. For the past three years he has focused on developing and implementing a rigorous, robust and repeatable process for ensuring interventions contribute to the bottom line. You can find out more at 5boxes.com and How Did I Get Here?