Thursday, May 27, 2010

Traci Sitzmann - happy sheet killer

Do happy sheets work? Ask Traci Sitzmann, who has done the research. Her meta-analyses, covering 68,245 trainees across 354 research reports, attempt to answer two questions:

Do satisfied students learn more than dissatisfied students?

After controlling for pre-training knowledge, reactions accounted for:

  • 2% of the variance in factual knowledge
  • 5% of the variance in skill-based knowledge
  • 0% of the variance in training transfer

The answer is clearly no!
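To put those percentages in perspective: a share of variance is a squared correlation, so the implied correlation is simply its square root. Here is a minimal Python sketch, assuming the three figures above are incremental shares of variance (which would make the result a semipartial correlation). The function name is illustrative only and is not taken from the research.

```python
import math

# Shares of variance attributed to reactions in the list above,
# after controlling for pre-training knowledge.
variance_explained = {
    "factual knowledge": 0.02,
    "skill-based knowledge": 0.05,
    "training transfer": 0.00,
}


def implied_correlation(share):
    """Return the correlation implied by a share of variance (its square root)."""
    return math.sqrt(share)


for outcome, share in variance_explained.items():
    print(f"{outcome}: r = {implied_correlation(share):.2f}")
# factual knowledge: r = 0.14
# skill-based knowledge: r = 0.22
# training transfer: r = 0.00
```

Even the largest of the three corresponds to a correlation of only about 0.22.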

Are self-assessments of knowledge accurate?

  • Self-assessment is only moderately related to learning
  • Self-assessment captures motivation and satisfaction, not actual knowledge levels

Self-assessments should NOT be included in course evaluations

They should NOT be used as a substitute for objective learning measures

Additional problems

Ever been asked at a conference or end of a training course to fill in a happy sheet? Don’t bother. It breaks the first rule in stats – randomised sampling. It’s usually a self-selecting sample, namely those who are bored, liked the trainer or simply had a pen handy. Students can be ‘Happy but stupid’ as the data tells you nothing about what they have learnt, and their self-perceptions are deceiving (see Traci’s research).
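The self-selection problem is easy to see in a toy simulation. This is a sketch with made-up numbers, not real data: it assumes true satisfaction is spread evenly from 1 to 5, and that happier attendees are more likely to hand the sheet back (one possible form of self-selection; the bias could run the other way). The function name and the return probabilities are hypothetical.

```python
import random


def simulate_happy_sheets(n_attendees=10_000, seed=0):
    """Compare the true mean satisfaction with the mean of returned sheets."""
    rng = random.Random(seed)

    # Assumption: every attendee has a true satisfaction score from 1 to 5.
    scores = [rng.randint(1, 5) for _ in range(n_attendees)]

    # Assumption: the chance of returning a sheet rises with satisfaction,
    # from 10% for a score of 1 up to 70% for a score of 5.
    returned = []
    for score in scores:
        p_return = 0.10 + 0.15 * (score - 1)
        if rng.random() < p_return:
            returned.append(score)

    population_mean = sum(scores) / len(scores)
    respondent_mean = sum(returned) / len(returned)
    return population_mean, respondent_mean


population_mean, respondent_mean = simulate_happy_sheets()
print(f"All attendees:   {population_mean:.2f}")
print(f"Returned sheets: {respondent_mean:.2f}")
```

Under these assumptions the returned sheets paint a rosier picture than the room as a whole, even though nobody lied on the form.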

Sitzmann, T., Brown, K. G., Casper, W. J., Ely, K., & Zimmerman, R. (2008). A review and meta-analysis of the nomological network of trainee reactions. Journal of Applied Psychology, 93, 280-295.

Sitzmann, T., Ely, K., Brown, K. G., & Bauer, K. N. (in press). Self-assessment of knowledge: An affective or cognitive learning measure? Academy of Management Learning and Education.


Unknown said...

"Ever been asked at a conference or end of a training course to fill in a happy sheet? Don’t bother."

This is bad advice.

Smile sheets can provide much useful information for instructors and developers. It isn't up to the audience to preempt the use of this kind of data collection because they think someone may make poor generalizations from it or that the instrument is ill designed.

Mark Frank said...

Interesting link - but you have been pretty selective about what you report from the paper.

If you were to ask Traci Sitzmann, presumably her response would be somewhat similar to the conclusion of the 2008 paper:

In terms of outcomes, reactions have their largest relationships with changes in affective learning outcomes.

As for the link between reactions and cognitive learning outcomes, these results challenge the established view that reactions are not associated with learning (e.g., Hook & Bunce, 2001).

That said, the results also suggest that post-training self-efficacy is a much better predictor of cognitive learning outcomes than are reactions.

From her recommendations to practitioners:

Consistent with Kraiger (2002), we suggest the goal of an evaluation effort should guide the selection of outcome measures. Within this context, reactions are one potentially meaningful outcome, depending on the goal of training.

Then she goes on to suggest when reactions are useful and when they are not. I don't think the answer is "clearly no".

How come her comments differ from your interpretation? Several reasons:

1) You have concentrated on just one hypothesis out of 12. In particular, there is evidence that reactions are a good measure of affective outcomes, and the measure of declarative and procedural outcomes was taken after controlling for affective outcomes (as well as prior learning). Put simply, if you get people motivated they tend to learn more, and motivation explains a lot of the variance in happiness sheets.

2) Declarative and procedural gain may explain little of the variance in happiness sheets, but readers need to understand what this means. Besides affective outcomes, the other important explanatory variables were related to the learning environment, e.g. instructor style, human interaction and organisational support. If these factors are kept constant then the role of declarative and procedural gain in predicting the variance in reactions will be much larger. In the classroom this may or may not be easy to do. With technology-based training this becomes practical. As she says:

Our study also suggests that reactions had their strongest relationship with post-training motivation, self-efficacy, and declarative knowledge when technology was used to deliver training. Thus, it makes sense to be particularly sensitive to reactions in technology-delivered courses.

Donald Clark said...

Ed - have to disagree - they say little about the 'learning', are biased in terms of sampling and are often badly designed (that's why I don't fill them in). There are far better ways of solving bad training.

Mark - this was all taken directly from Traci's slides, which she sent to me after I saw her speak last year, including 'the answer is clearly no!' (she even had the exclamation mark). The point about 'affective' learning is well made, but this is rarely a part of corporate training or conference talks for that matter. The 'motivation' variable is a loose cannon here, as it is so easily manipulated by jokes, anecdotes etc. As I said, there's plenty of happy, motivated stupidity out there.

Whatever the data, the whole thing is blown apart if happy sheets are a biased sample, and at conferences and in training courses I've observed, they almost always are.

Mark Frank said...


I am surprised that her slides don't gel with the paper. Are you in a position to ask her? (Of course she may have been trying to get a good score on the happiness sheets at her presentation!)

I am surprised by what you say about samples. In my experience for most serious training all delegates are asked to fill in happiness sheets and the vast majority do.

Clive Shepherd said...

Hi Katherine. I don't believe there is another link to this video, but if you contact me directly, I can put you in touch with the people from learndirect.

jay said...

Adios, Level 1.


BunchberryFern said...

As ever, it's not a simple question of, "Does it work or not?"

Yes, smile sheets 'can' provide much useful information. But is it a productive use of our time?

What on earth would possess anybody to allow a developer or an instructor to be in charge of their own assessment? Asking, e.g., a trainer to compile a list of statistics on their own performance is clearly silly. If somebody wants 'useful information' on how an event went, they should get somebody who at least has a semblance of independence to design and dish out the happy sheets.

After two decades of happy sheets, I've never received a 'score' of less than 80%. I've tried everything to get a greater range of scores but without success. And, as you can imagine, in the last twenty years I've presided over some stinkers. No amount of good design will overcome people's reluctance to offend somebody they've spent a day with. Nobody likes a grass.

This is not to say I've never learned something useful from the sheets. But you occasionally learn something useful from your horoscope if you read it in the right frame of mind.

Training Departments like happy sheets. It shows they're good at commissioning the right people.

After a particularly disastrous workshop (on a controversial issue related to potential disciplinary matters) I got the usual 80% score (I'm wise enough to know this means it was a stinker. Unfortunately, there's little difference on paper between the 80% of an absolute stinker and the 87% of a corker.) I followed up with delegates a few days later and asked a different question: what do you think other delegates thought of the event?

The answers were interesting, to say the least. Nobody thought this 'scientific' enough to be recorded.

Happy Sheets allow us to believe that all our training is better than average.

Donald Clark said...

Bunchberryfern - you've nailed it - it's time consuming and the pressure on the evaluator to produce positive results makes the exercise redundant in terms of objectivity. My primary problem with happy sheets (and Kirkpatrick) is that they misdirect valuable time towards the wrong sort of evaluation and data, preventing people from doing the real stuff - independent, sampled data on impact.

Mark Frank said...


I am not sure where you have been teaching/instructing but it is not always like this. I don't think level 1 is the bee's knees but it has its uses and in many cases it is a very small effort to get a response from most attendees (so no question of sampling problems).

I have recently been involved with an instruction programme for a large government department. There were many instructors giving a standard introduction to a new IT system across the country. Almost without exception instructors who were new to the programme would get very mediocre level 1 results both for the class and their contribution. The comments also explained why. Too long and full of stuff they already knew. The instructors read the sheets, cut out some of the content, and up went the ratings. It is a weakness of the management of the programme that this guidance never seemed to get to new instructors as they joined the programme.

Did this improve the delegates' performance on the job? You also need level 3 measurement to prove it and it is a fault of the programme that no such measurement existed. But they were certainly more motivated and when it comes to rolling out a new IT system attitude is just as important as knowledge.

The trouble is that so much level 1 measurement is poorly designed and executed, and all this discussion (and indeed the paper) does not take account of this. As Ed says, it has to be designed and used properly. Happiness sheets can be done well or badly – the questions that are asked, the environment in which they are completed and the use that is made of them. Make the questions short, clear, and relevant. Give plenty of space for comments. Make sure the delegates know they will be read and acted on. Allow them time to do it. Act on the sheets and be seen to act on them. Etc.

They may or may not be the best instruments for measuring effectiveness. As the Sitzmann paper says, this depends on a lot of things. But, if done well, they are pretty much always a useful and cost-effective way of learning about why a course or other intervention fails or works.

Donald Clark said...

Mark - my problem with the whole Kirkpatrick thing is that it was a 1959, behaviourist, after-the-event approach. Much better to have conducted a short pilot to iron out these problems than wait until the courses were delivered to see the problems. This is what happens in other walks of life. You don't launch Mars Bars then give everyone a Happy Sheet to see if they liked it. You do some focus group work - but the data collection is smart.
The effort put into this also means that little or no effort is made on the real evaluation of the course.
Does your legal department, marketing department, finance department or any other department use Happy Sheets?

Mark Frank said...

Donald - why can't you have a pilot and test consumer reaction? It is almost routine for any services organisation to both pilot its services and measure client reaction. In your Epic days did you not ask the customer what they thought of your services? (I suspect Mars Inc. does both for Mars bars as well - but that is out of my expertise.)

"Does your legal department, marketing department, finance department or any other department use Happy Sheets?"

I am a one man band - so I know my legal and marketing departments are crap! But in any large organisation it is quite common for one department to conduct surveys on the service it provides to the rest of the business. But I am sure you know that!

BunchberryFern said...


I tried to be careful in pointing out that you can learn something from Happy Sheets.

But this approach has its own costs, as Donald points out.

The Kirkpatrick-influenced approach is bonkers. For a start, any sane evaluation would begin with Level 4 and work down. I have never yet been commissioned to entertain people - yet this is what I'm asked to measure.

And at each Kirkpatrick level we have this weird 'and then a miracle occurred' process of transubstantiation. A positive reaction sublimates into learning which sublimates into etc etc

Everybody - across the board, whether traditional chalk-and-talker or new-age connectivist unlearning 2.0 - calls them 'Happy Sheets' (or 'Smile Sheets'). Surely this is an indicator not of familiarity but a cheerful contempt?

You can't do a bad thing well.

A.D. said...

Donald, I attended a recent presentation where you expounded why we should "kill Kirkpatrick"! You seemed kind of passionate about it, so I thought you would appreciate this:

Buy a copy of Don KP's unedited 1954 dissertation

Wow. One for your wish-list? :-)

Julian K said...

The ASTD did work and established there is a correlation between the question "Will you use this training in the work place?" and Level 3 - a behavioural change.

There is also an enormous amount of work relating student ratings to performance in higher education. Nuhfer (2009) writes: "This is primarily because there is a very general trend for highly rated teachers to be associated with students who achieve well (McKeachie, 1986)."

To summarise a 26 page paper.

I think with much of this we can find a respectable paper to support what we want to believe. The trick is reflecting on both sides of the argument and forming a balanced opinion. Reducing human behaviour to a Yes or No is a bit too simplistic for me.

Donald Clark said...

Do you have a reference? ASTD work tends to be survey based and therefore not very reliable. ASTD's work also confirms that few get beyond Levels 1&2 to Levels 3&4 - that's the real problem. People are so obsessed with 'Happy Sheets' that they never really get round to doing real evaluation.
Nuhfer's Knowledge Surveys are very different from general Level 1 Kirkpatrick evaluation sheets. They are more akin to assessments, with direct questions about the subject matter and around 200 questions. Note that Sitzmann's meta-study research is much more recent.

Anonymous said...

Reference!!!! No, can't find it, but it was at a conference and then I followed up (I'll keep looking for the reference). Laurie Bassi was the researcher. She has a website.

I'm not sure I agree with your assessment of Nuhfer's assessments. He is aligning learning to learner opinion of an educator.

Rodney Baker said...

A little late in responding but an interesting topic!

Bad questions beget bad answers. If the questions asked are appropriate they will yield effective answers.

To complicate the matter, however, even if people could write effective satisfaction surveys, what are the odds that they would know what to do with (how to interpret) the data? Most people do not seem to know how to score happy sheets.

Also, one must consider that the surveys are not for the trainer but rather for the audience. First and foremost the survey signals that the audience's opinion matters. Conducting a survey and then failing to do what is appropriate is a confirmation that the survey didn't matter.

But we should not fault the audience for the failure of the trainers to seek and process the appropriate data.