‘When can I stop using food?’ in Clicker Training

This is a question equine clicker trainers get asked often, and it is a really fascinating question for me as a positive reinforcement horse trainer.

I get that it’s a concern for people who are interested in clicker training and those who are exploring the pros and cons. It seems like a hassle, right?

Why is this such an intriguing question?

If you know the principles of training you’ll understand. Let me explain. Basically there are only two ways you can motivate a horse in training.

  1. Strengthen (reinforce) a behaviour by taking away an aversive. An aversive is something the horse wants to avoid or get away from.
  2. Strengthen (reinforce) a behaviour by giving an appetitive. An appetitive is something the horse wants to receive, something he likes.

So if someone asks me ‘When can I stop using food in training?’ it sounds like the person wants to know ‘When can I stop reinforcing behaviour?’ or ‘When can I stop offering appetitives in training?’

I have never heard someone ask a riding instructor ‘When can I stop using my whip?’, or heard an employer ask when he can stop paying his newly hired employees.

It is a legit question

However, I do understand where this question is coming from. It comes from a fear of never, ever being able to do something with your horse without having a treat in your pocket. I get that, but a reinforcer isn’t a bribe that you have to use every time and keep increasing.

Here is what happens if you start using positive reinforcement:

  • Your horse will learn that he can influence the training by his own actions (the right behaviour leads to a click, which leads to an appetitive)
  • Your horse will gain the confidence to try out new behaviours because that increases his chances of getting what he likes (food). He is having fun discovering what leads to a treat and what doesn’t.
  • He will like the engagement with his person, because there is a ‘puzzle’ involved and there is no punishment for ‘wrong answers’. All answers are ‘Good’ or, worst case scenario, ‘Not Reinforced’.
  • In the beginning it will be about the food, yes, but if the trainer uses a marker (the click) to mark the desired behaviour in a consistent way, the horse will shift his attention from the reinforcer (the food) to the marker (the click) and will therefore focus more on the behaviour than on the food.
  • As soon as the marker signal (the click) becomes a reliable predictor of the appetitive, the click becomes as valuable as the food. Now the click has become a secondary reinforcer: something the horse has learned to value. First it meant nothing, now it means ‘an appetitive is coming’.

Reinforcement never stops

In positive reinforcement as well as in negative reinforcement training (traditional training and natural horsemanship methods), reinforcement never stops.

If the reinforcement stops, the behaviour will go extinct (die out), unless it is ‘self-rewarding behaviour’: behaviour that reinforces itself without external interference.

All behaviour must be reinforced in order to stay in the horse’s ‘repertoire’.

Riders will never stop using leg aids (pressure-release), and if the horse's response fades, he will get a reminder (the rider will use reinforcement) to ‘hurry up and respond quicker’ through a stronger leg aid, a tap of the whip or the use of spurs.

Does a (well trained) horse need to be in pain every time you ride him? No, he will learn to respond to a light cue, which has become a reliable predictor of an aversive. It’s this principle that ‘keeps the horse in line’. The horse has learned how to avoid the aversive.

What about positive reinforcement training? Do I have to keep using food forever?

Yes and No.

Please explain!

Yes, you will have to reinforce a learned (trained) behaviour once in a while after it is established. This will prevent extinction. This means you will have to remind your horse that there is ‘still a chance of getting something good’ (food) once in a while for good performance.

No, it doesn’t have to be food!

Once you get more experience as a trainer you can use other reinforcers too that aren’t food. You can even reward behaviour with behaviour.

Yes, you will carry food in almost every training session, but not for the reason you think. Once you have discovered how much fun it is (for you and your horse) to clicker train him and how easily you get new behaviours, you won’t want to stop teaching him more and more.

Food is a powerful primary reinforcer and comes in handy when teaching new behaviours. That is why clicker trainers almost always carry food: they are busy training new behaviours!

No, you don’t have to reinforce well-known behaviour with food every time.

It can take a long time before positively reinforced behaviour goes extinct. Your horse will learn that you equal fun, and he will be willing to do so much more for you even when you don’t carry food. Once your marker becomes valuable, you can replace food with other reinforcers, like scratches or other behaviours.

What about you?

What is your answer to the question ‘When can I stop using food in training?’ Please share it in the comments.

If you think this is a blog that someone can benefit from, please use one of the share buttons below. Or post a comment, I read them all!  Thanks a lot!

Sandra Poppema, B.Sc.
My mission is to improve human-horse relationships. I reconnect horse women with their inner wisdom and teach them the principles of learning and motivation, so they become confident and skilled to train their horse in a safe and effective way that is a lot of FUN for both human and horse. Win-win.
Sign up for HippoLogic’s newsletter (it’s free and it comes with a reinforcer) or visit HippoLogic’s website and discover what else I have to offer.

Tips to re-train undesired behaviour in horses

In another post I explained the power of a variable reward schedule and how to use it to your advantage. A variable ratio schedule is the most powerful reward schedule because it takes the longest for a behaviour to become extinct. How can you use this information in re-training undesired behaviour?

‘Extinction’ of behaviour

Extinction means that the behaviour will no longer be displayed in a certain situation. There is 0% chance of a reward, so the behaviour has become ‘useless’ in that situation.

This is what we want to accomplish when a horse displays undesired behaviour, like kicking the stall door. We want to ignore the behaviour in order to make clear that this will not get him anywhere.

Why does ignoring undesired behaviour often seem not to work at all? It is because of a natural occurrence in learning that is called an ‘extinction burst’.

Extinction burst

Once the owner decides to ignore this undesired behaviour in order to let it become extinct (0% chance of a reward, so displaying the behaviour no longer has any value for the horse), the behaviour will first show an ‘extinction burst’.

Extinction, extinction burst and spontaneous recovery graph from study.com

During the extinction burst the horse will show an increased amount of effort in the hope of a reward. If one decides to ‘reward’ (read: react to) this undesired behaviour in any way, even by shouting at the horse in an attempt to punish it, chances are that the horse regards this as his reward. After all, it is the receiver (the horse) who determines if something is a reward.

How to handle it

If the horse kicks a door in order to get your attention and he gets what he wants, it is a reward. Every time an extinction burst is rewarded it takes longer for the behaviour to become extinct.

So if you suspect the horse wants your attention, make sure he doesn’t get it. Every time he kicks his stall door, walk out of sight or turn your back. In this way you make sure you don’t give him attention for kicking the door.

Extinct behaviour

If you want to let a behaviour go extinct, the extinction burst is the most important moment not to reinforce.

This is also the moment most people are tempted to react. They interpret the increased undesired behaviour as ‘the horse hasn’t learnt anything’, and because the bad behaviour increased (instead of decreased) they feel the need to interfere in the hope that punishment will solve this.

Spontaneous recovery

Second, smaller extinction bursts can occur over time; this is called spontaneous recovery of behaviour. In the case of our horse kicking the stall door, he might show the behaviour again but less extremely. When the extinction bursts don’t get reinforced, the behaviour will go extinct.

Undesired behaviour

In dealing with undesired behaviour we always want to know what caused the behaviour, so we can work on that too.

Sometimes it is really hard to determine what reinforces a certain undesired behaviour. If the behaviour is ‘self-rewarding’, just ignoring the behaviour won’t work. The horse will get his reward regardless of what you are doing. Then you have to figure out how you can reinforce the opposite behaviour more than the undesired behaviour, or find a way to prevent it.

This kind of self-rewarding behaviour is hard to re-train

Rewarding the opposite behaviour

In the case of door kicking you can ignore the noise and start rewarding the horse for ‘four hooves on the ground’. In this way you communicate what it is you do want from the horse: standing still. Use the reward he wants for the undesired behaviour: your attention or, during feeding time, the food.

This approach works really well, but it takes a lot of effort from the trainer. You must be paying attention when the horse is standing still and is quiet. That can be a bigger challenge than just ignoring the door kicking.

Make sure everybody is on the same page if you want to re-train behaviour like door kicking. Ask everyone to follow the simple rules: go to horses that stand still and look for attention, ignore the door kickers.

Remember this

Every time an extinction burst is rewarded, the behaviour becomes stronger. That is something you want when training desired behaviours, not when re-training undesired behaviours.
