Tips to re-train undesired behaviour in horses

In another post I explained the power of a variable reward schedule and how to use it to your advantage. A variable ratio schedule is the most powerful reward schedule because behaviour reinforced this way takes the longest to become extinct. How can you use this information in re-training undesired behaviour?

‘Extinction’ of behaviour

Extinction means that the behaviour will no longer be displayed in a certain situation. There is a 0% chance of a reward, so the behaviour has become ‘useless’ in that situation.

This is what we want to accomplish when a horse displays undesired behaviour, like kicking the stall door. We want to ignore the behaviour in order to make clear that this will not get him anywhere.

Why does ignoring undesired behaviour often seem not to work at all? It is because of a natural occurrence in learning called an ‘extinction burst’.

Extinction burst

Once the owner decides to ignore the undesired behaviour in order to let it become extinct (0% chance of a reward, so displaying the behaviour no longer has any value for the horse), the behaviour will first show an ‘extinction burst’.


Extinction, extinction burst and spontaneous recovery graph from study.com

During the extinction burst the horse will show an increased amount of effort in the hope of a reward. If you ‘reward’ (read: react to) this undesired behaviour in any way, even by shouting at the horse in an attempt to punish it, chances are that the horse regards this as his reward. After all, it is the receiver (the horse) who determines whether something is a reward.

How to handle it

If the horse kicks a door in order to get your attention and he gets what he wants, it is a reward. Every time an extinction burst is rewarded it takes longer for the behaviour to become extinct.

So if you suspect the horse wants your attention, make sure he doesn’t get it. Every time he kicks his stall door, walk out of sight or turn your back. In this way you make sure you don’t give him attention for kicking the door.

Extinct behaviour

If you want to let a behaviour go extinct, the extinction burst is the most important moment not to reinforce.

This is also the moment most people are tempted to react. They interpret the increase in undesired behaviour as ‘the horse hasn’t learnt anything’, and because the bad behaviour has increased rather than decreased, they feel the need to intervene in the hope that punishment will solve it.

Spontaneous recovery

Second, smaller extinction bursts can occur over time; this is called spontaneous recovery of the behaviour. In the case of our horse kicking the stall door, he might show the behaviour again, but less intensely. As long as the extinction burst(s) don’t get reinforced, the behaviour will go extinct.

Undesired behaviour

In dealing with undesired behaviour we always want to know what caused the behaviour, so we can work on that too.

Sometimes it is really hard to determine what reinforces a certain undesired behaviour. If the behaviour is ‘self-rewarding’, just ignoring it won’t work: the horse will get his reward regardless of what you are doing. Then you have to figure out how to reinforce the opposite behaviour more than the undesired behaviour, or find a way to prevent it.


This kind of self rewarding behaviour is hard to re-train

Rewarding the opposite behaviour

In the case of door kicking you can ignore the noise and start rewarding the horse for ‘four hooves on the ground’. In this way you communicate what it is you do want from the horse: standing still. Use the reward he was seeking with the undesired behaviour: your attention or, at feeding time, the food.

This approach works really well, but it takes a lot of effort from the trainer. You must be paying attention when the horse is standing still and is quiet. That can be a bigger challenge than just ignoring the door kicking.

Make sure everybody is on the same page if you want to re-train behaviour like door kicking. Ask everyone to follow the simple rules: go to horses that stand still and look for attention, ignore the door kickers.

Remember this

Every time an extinction burst is rewarded, the behaviour becomes stronger. That is something you want when training desired behaviours, not when re-training undesired ones.

Sandra Poppema, B.Sc.
My mission is to improve human-horse relationships. I reconnect horse women with their inner wisdom and teach them the principles of learning and motivation, so they become confident and skilled to train their horse in a safe and effective way that is a lot of FUN for both human and horse. Win-win.

Most powerful reward schedule: Variable ratio

In a variable ratio schedule, a desired behaviour (once it is established and put on cue) is reinforced randomly. The horse cannot predict when to expect a reward, and this keeps him motivated to perform well.
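For readers who like to see an idea written out, here is a minimal Python sketch of the difference between a continuous and a variable ratio schedule. It is only an illustration under my own assumptions: the function names and the `mean_ratio` parameter are hypothetical, and rewarding each response with a fixed probability is just one simple way to make rewards unpredictable.

```python
import random


def continuous_schedule(n_responses):
    """Continuous schedule: every correct response is rewarded."""
    return [True] * n_responses


def variable_ratio_schedule(mean_ratio, n_responses, seed=None):
    """Variable ratio schedule (hypothetical helper, not a training rule).

    Each correct response is rewarded with probability 1 / mean_ratio,
    so on average one in `mean_ratio` responses earns a reward, but the
    horse cannot predict which one.
    """
    rng = random.Random(seed)
    return [rng.random() < 1.0 / mean_ratio for _ in range(n_responses)]


# Example: ten correct responses, rewarded on average one time in three.
rewards = variable_ratio_schedule(mean_ratio=3, n_responses=10, seed=42)
print(["click" if rewarded else "-" for rewarded in rewards])
```

With `mean_ratio=1` the variable schedule behaves exactly like the continuous one; the higher the number, the fewer responses are rewarded.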

Benefits of a variable reinforcement schedule

With a variable ratio schedule it will take a very long time before a behaviour becomes extinct. Extinction means that the behaviour will no longer be displayed in a certain situation. There is a 0% chance of a reward, so the behaviour has become ‘useless’ in that situation.

A variable ratio schedule is the most powerful reward schedule. Your horse figures ‘This could be the time my behaviour gets rewarded, so let’s try this again’. No reward? ‘Maybe this time I will get a reward… Let’s give it a bit more effort… Yes! It worked’.

A variable reward schedule is also the reason why most horses keep displaying undesired behaviours. I explain this further in this post.

Extinction burst

If a behaviour is never rewarded (intrinsically or extrinsically) it will go extinct. Just before a behaviour goes extinct there is usually an ‘extinction burst’.

Often, when a previously rewarded behaviour no longer results in a reward, the animal shows a sudden and temporary increase in the behaviour, followed by the eventual decline and extinction of the behaviour targeted for elimination. Novel behaviour, emotional responses or aggressive behaviour may also occur (Miltenberger, R. (2012). Behaviour modification, principles and procedures. (5th ed., pp. 87-99). Wadsworth Publishing Company.)


Extinction, extinction burst and spontaneous recovery graph from study.com

The same principle occurs in a consciously applied variable reward schedule. Just before the horse loses interest in displaying the behaviour he will show a little ‘extinction burst’ as a last attempt to influence the reinforcement (reward). This is the improved behaviour a trainer is looking for and wants to mark and reward.

Withhold the click

If the horse already has a strong positive reinforcement history with a certain behaviour, or with positive reinforcement training in general, he can react differently to a withheld click than when he is at the beginning of the learning stage of an exercise.

A well-used withheld click will induce an improvement in behaviour (extinction burst). It can also help the horse figure out more quickly which behaviour is rewarded and which isn’t. In this way you can give more information about what you want.

Instead of acting like a ‘vending machine’ (put money, i.e. behaviour, in and a reward, i.e. a treat, comes out), the trainer now behaves more like a ‘gambling machine’ with a fair chance of winning.


The horse may become ‘superstitious’ and try to figure out whether there was a difference between the similar behaviour that didn’t get rewarded and the one that did. Just like superstitious people who suddenly pay attention to the colour of their socks in order to influence their chances of winning, the animal will pay more attention to the details of the behaviour in order to influence the chances of a click and reward.

Pitfalls of withholding a click too long

Withholding a click can also trigger impatience, frustration or confusion in the horse, so use this technique with caution. You don’t want to discourage your horse. A little bit of frustration is no big deal, as long as the horse stays in learning mode.

Sometimes a bit of frustration can actually benefit the learning process. It is the trainer’s responsibility to walk this line. If the horse gets too frustrated or shuts down, go back to a continuous reward schedule for a while, make your training steps smaller and lower your criteria.

When you start teaching a new behaviour it is really important to click every improvement and use a continuous reward schedule. The next step in training should be only rewarding the behaviour when you have cued it. Once the cue is established, switch to a variable reward schedule.
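The progression described above can be sketched as a small decision rule. This is only an illustration: the stage names, the `should_reward` function and the default `mean_ratio` of 3 are my own hypothetical choices, not fixed training rules.

```python
import random
from enum import Enum


class Stage(Enum):
    LEARNING = 1      # new behaviour, no cue yet
    ADDING_CUE = 2    # behaviour is known, cue is being established
    ESTABLISHED = 3   # behaviour is reliably on cue


def should_reward(stage, was_cued, improved, rng=random, mean_ratio=3):
    """Decide whether to click and reward one response (hypothetical rule)."""
    if stage is Stage.LEARNING:
        # Continuous schedule: click every improvement.
        return improved
    if not was_cued:
        # From the cue stage onwards, uncued behaviour is not rewarded.
        return False
    if stage is Stage.ADDING_CUE:
        # Still continuous, but only for cued behaviour.
        return True
    # Established: variable schedule, on average one reward per mean_ratio.
    return rng.random() < 1.0 / mean_ratio


# Example: an established, cued behaviour is rewarded unpredictably.
print([should_reward(Stage.ESTABLISHED, True, False) for _ in range(6)])
```

If the horse gets frustrated, dropping back to `Stage.ADDING_CUE` or `Stage.LEARNING` in this sketch corresponds to returning to a continuous reward schedule for a while.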

Fading out the rewards

So once your horse has learned a specific behaviour you can reward less and less and still get the behaviour. This is called fading out the click.

Continuous reward schedules are very easy to use (reward 100%) because you don’t have to think about it. What about a variable reward schedule, are you using this in your training?
