We are going to start this topic area by looking at some historic research by Edward Thorndike, one of the first to investigate operant conditioning using a clever device that he invented called the puzzle box. This type of learning is sometimes called trial and error learning or instrumental learning.


The principles of operant conditioning were first explored by Edward Thorndike who placed cats into a ‘puzzle box’. The cats were rewarded with a fish for pulling a string which opened the lid. Thorndike noted how the cats learnt to escape faster and faster and suggested that behaviour was stamped in if it was rewarded and stamped out if it was not rewarded. He called this the Law of Effect.

The American psychologist Burrhus F Skinner investigated this type of trial and error learning in more detail in the 1930s. He revealed that the behaviour of lab rats could be controlled by altering the consequences of that behaviour. All behaviours were voluntarily emitted as the rats explored the cage, but when a rat accidentally trod on a lever, this was either rewarded with a food pellet or punished with a mild electric shock. When lever pressing was rewarded it became more likely (stamped in), but when it was punished it became less likely (stamped out).

This learning is not confined to rats alone. An example of operant conditioning in everyday life is the use of token economies with small children. Tokens can be given to reward desirable behaviour with the intention of improving the child’s behaviour and taken away to punish undesirable behaviour. These tokens (secondary reinforcers) can be exchanged for treats (primary reinforcers) after a set period.

Positive reinforcement

Positive reinforcement means that an individual is given something which he or she finds pleasant following a desired behaviour. Positive reinforcement results in the preceding behaviour becoming more likely. Rewards must be presented fairly swiftly in order for operant conditioning or instrumental learning to take place.

An example of positive reinforcement in the laboratory is when Skinner presented his rats with food pellets when they trod on the lever, which resulted in an increase in lever pressing.

An example of positive reinforcement in real life might be that when a teacher uses praise and encouragement in response to a student’s contribution in class, the student will volunteer to answer more questions. This will only happen, however, if the student finds the attention from the teacher positive and desirable. A student who does not like the attention or believes the response to be empty praise may not respond with increased contribution.

Something is only an example of reinforcement if the behaviour becomes more likely. Thus the terms reinforcement and reinforcer have a circular element to their definition, which makes their credibility questionable.

Negative reinforcement

Negative reinforcement occurs when, following a desired behaviour, an individual has something unpleasant removed from them. The removal of the unpleasant stimulus is rewarding and thus negative reinforcement results in an increase in the preceding behaviour.

An example of negative reinforcement in the laboratory was shown in a study by Seligman. Dogs were put into a pen with a mild electric current running through the floor. The desired behaviour was for the dogs to jump over a small wall to escape the electric current. Jumping the wall had the effect of removing the unpleasant feeling and so wall jumping became more likely. Likewise, Skinner electrified the floor of the Skinner box and when the rats pressed the lever the current was switched off temporarily. This had the effect of increasing lever pressing.

A real life example of negative reinforcement can be observed in parents of young babies. When babies cry this can be unpleasant for the parents. Picking the baby up often stops the crying temporarily. The behaviour which is being reinforced is picking the baby up. If the crying stops, and this increases the likelihood of the parent picking up the baby in the future, then negative reinforcement has occurred and is responsible for the change in the parents’ behaviour.

Another example is when a prisoner gives in to an interrogator. The interrogator is demanding information and the experience is harrowing for the prisoner. When the prisoner responds the interrogation ceases. If in the future the prisoner gives information more readily in similar situations, then negative reinforcement has occurred.

Primary reinforcer

Primary reinforcers are things which meet basic human needs such as food, water and sex and can be used to increase the frequency of certain desired behaviours. Primary reinforcement occurs when these things are provided in response to the desired behaviour, making that behaviour more likely in the future.

An example of primary reinforcement in the laboratory would be providing rats with food pellets for lever pressing in the Skinner box. An example of primary reinforcement in real life would be the use of praise, which meets the basic human need for belonging and esteem (Maslow, 1943), or the use of small pieces of food (cakes, sweets) which are used as rewards with autistic children in the early stages of the Picture Exchange Communication System (PECS). When the children touch the communication pictures they are rewarded and allowed to choose something from the plate of food reinforcers. These are varied and the teacher makes a note of which food the child chooses to ensure that the reinforcers are individualised to that child’s preferences in the future, making them more effective as primary reinforcers.

Secondary reinforcer

Secondary reinforcers are given as rewards for desired behaviour but do not in themselves meet a basic human need. They are rewarding to the individual because they can be exchanged for primary reinforcers.

An example of a secondary reinforcer in real life is money; a person may not enjoy their job or get any sense of intrinsic pleasure from that job, but he or she keeps going back to work because they are paid to do so. If the payment stopped they would not continue with this behaviour.

An example of the use of secondary reinforcers in clinical practice is the token economy, which has been used successfully with clients who have eating disorders. Clients are rewarded with tokens for putting on weight and once a specific number of tokens have been awarded they can be exchanged for treats and privileges.

In real life, programmes such as Supernanny have encouraged the use of sticker charts as a form of behaviour modification for small children, whereby stickers are ‘earnt’ for desired behaviour and act as secondary reinforcers. The stickers can be traded for primary reinforcers at the end of the day, week or month, depending on the child’s needs.


Punishment

Punishment means that something unpleasant is given following an undesired behaviour or in the absence of the desired behaviour. This has the effect of decreasing undesired behaviour, potentially allowing more opportunity to engage in the desired behaviour.

Punishment can also take the form of removal of something pleasant and this is called negative punishment. Punishment is usually not as effective in changing behaviour as reward and some psychologists prefer not to use punishment with humans as it is ethically questionable to inflict something undesirable on another person. Punishment has also been criticised as it fails to show a person what the desired behaviour is.

Experimental psychologists often use mild electric shocks as punishment in the laboratory, and note that when an animal is punished for its behaviour, that behaviour will decrease.

In real life, parents may smack their children as a punishment for dangerous or inappropriate behaviour, in the hope that the behaviour will decrease. More socially and morally acceptable punishments include withholding pocket money (negative punishment) or removing the child to a specific place where they are required to sit without attention for a specified amount of time, which is often dependent upon the age of the child.

With adults, if a person is criticised or humiliated in a meeting for a comment that they have made, this may be punishing and have the effect of making the person less likely to speak up in the future. However, as with rewards, punishers have to be individualised: some people may not decrease their contributions in response to the remarks of others, and may even increase the remarks that they make, possibly because they have been positively reinforced in some way.

  1. Oliver wants to train his dog, Spike, to compete in agility trials and has set up an agility course in his garden. Explain how Oliver could train Spike to compete on an agility course using operant conditioning.
  2. Jack is two years of age and is learning to use the potty/toilet without the need for a nappy. His parents are trying to think of ways to encourage Jack to use the potty/toilet. (a) Describe how Jack’s parents could use operant conditioning to encourage him to use the potty/toilet. (4)
  3. Which one of the following is an example of a positive reinforcement?

A Tina avoids detention by doing her homework

B Tina gains a sticker for her good behaviour in class

C Tina is told off for shouting at the teacher

D Tina sees her friend being given a sweet for tidying the desk

4. Negative reinforcement is when something:

A desired is given after a behaviour

B undesired is given after a behaviour

C desired is removed after a behaviour

D undesired is removed after a behaviour

5. Match the terms from the list below with the correct examples. You must not use the same term more than once: (4)

• Negative reinforcement

• Primary reinforcement

• Punishment

• Secondary reinforcement


  • Rosie gains points on her loyalty card every time she buys some shopping from the local supermarket
  • Jim gets a fine for speeding in his car near a local infant school
  • Fiona’s mother stops shouting at her once she has tidied up her bedroom
  • David’s parents give him some sweets for helping his sister with her chores

I like to buy books at Waterstones bookshop. Waterstones use operant conditioning to encourage people to buy more books. They give their customers loyalty cards where they collect stamps each time they buy a book. When you have 10 stamps you get £10 off your next purchase.

(a) Identify the primary reinforcer and secondary reinforcer in this scenario. (2)
(b) Describe how Waterstones could use two schedules of reinforcement to encourage their customers to buy more books. (4)

