PSYC202 Foundations of Cognitive Neuroscience – Week 15a: Goal-directed behaviour,
habits and addictions
We have all had experiences of intending to do one thing, and then suddenly finding
ourselves doing another, perhaps habitual, behaviour. One predictable feature of setting off on
weekend trips with the family when I was a child was that, whatever the intended destination, as often
as not my father would find himself driving in the wrong direction, towards his place of
work. We often call such behaviour being on “auto-pilot”, and clearly this has its uses,
freeing up the mind to consider things other than ongoing habitual routines. Over the last three
decades of research, key insights into the mechanisms of goal-directed and habitual
behaviour have helped us understand not only such everyday occurrences, but also why
maladaptive habits, such as addictions, may arise.
What is a habit?
The key conceptual and methodological breakthrough that helped define a habit came from
the work of Anthony Dickinson and colleagues at the University of Cambridge, starting in the
1980s (Dickinson, 1985). Dickinson (1985) defined goal-directed behaviour as being action-
outcome behaviour, where the agent is driven to act by a representation of the goal of the
action. Habitual behaviour, on the other hand, is instrumental, stimulus-response behaviour,
whereby reward has strengthened the association between stimuli and a response, but no
representation of the rewarded goal need be present; the eliciting stimuli are enough for the
response to be produced. Dickinson developed the basic procedure for ascertaining whether a
behaviour is based on action-outcome associations or is habitual: devaluing the
goal or reward associated with the behaviour. For example, an animal may “work”, by
pressing a lever, for a food reward when hungry, but may not do so when it is sated on
that food reward. Alternatively, the previously palatable food reward can be rendered unpalatable by an
injection of lithium chloride just after ingestion, which causes some nausea. What he found with this
lever-pressing task, and what was found across many different paradigms (Graybiel & Smith,
2014), is that during the early and middle phases of learning, behaviour is based on action-
outcome contingencies. That is, in the first few days of learning to press a lever for food, if
the food is devalued, rats will simply stop pressing the lever; clearly they were
pressing “for” the goal or reward, and when this is no longer wanted, pressing
falls to baseline levels (the rate of pressing when there is no contingency between pressing
and a reward).
However, with further training, or “over-training”, a habit appears to set in:
even if the food is devalued, rats continue to press the lever at rates similar to those before
devaluation, without then actually consuming the food pellets that appear as reward. Here, it
appears that the stimulus of the lever, within the context of the testing environment, is what
drives the behaviour, irrespective of goal value. Of course, eventually, if no reward is
forthcoming, a habitual behaviour can be altered, but it is surprisingly resistant to change. In
the everyday human example of setting off towards a habitual destination instead
of a less usual one, the “pull” of the stimuli of getting in the car in the home
environment gets the better of the plan of action based on the actual current goal.
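The logic of the devaluation test can be made concrete with a toy sketch. This is an illustration only, not a model taken from Dickinson (1985) or Graybiel & Smith (2014): the function names, the baseline and gain values, and the sharp switch to habit after a fixed number of sessions (`habit_onset`) are all assumptions chosen for clarity. In reality the transition is gradual.

```python
# Toy illustration of the devaluation test (all numbers are arbitrary assumptions).

BASELINE = 1.0   # hypothetical press rate when pressing earns nothing
GAIN = 10.0      # hypothetical scaling from value/strength to press rate


def goal_directed_rate(outcome_value):
    """Action-outcome control: pressing tracks the CURRENT value of the outcome."""
    return BASELINE + GAIN * max(outcome_value, 0.0)


def habitual_rate(sr_strength):
    """Stimulus-response control: pressing tracks the learned S-R strength only."""
    return BASELINE + GAIN * sr_strength


def lever_press_rate(training_sessions, outcome_value, habit_onset=10):
    """Assumed rule: goal-directed early in training, habitual after over-training."""
    if training_sessions < habit_onset:
        return goal_directed_rate(outcome_value)
    # After over-training the outcome value is never consulted.
    return habitual_rate(sr_strength=1.0)


# Outcome value 1.0 = food still wanted; 0.0 = food devalued (sated or made aversive).
early_valued = lever_press_rate(training_sessions=3, outcome_value=1.0)
early_devalued = lever_press_rate(training_sessions=3, outcome_value=0.0)
over_valued = lever_press_rate(training_sessions=30, outcome_value=1.0)
over_devalued = lever_press_rate(training_sessions=30, outcome_value=0.0)

print("early training:", early_valued, "->", early_devalued)
print("over-training: ", over_valued, "->", over_devalued)
```

Under these assumptions, devaluation drops early-training responding to baseline, whereas over-trained responding is unchanged. That insensitivity to devaluation is the operational signature of a habit.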
What are the neural bases of action-outcome and habitual behaviour? Last week, and in
the seminars this week and next, we look at how decision-making occurs through a
thalamocortical-basal-ganglia loop, whereby bids for action are selected based on reward
prediction mediated by dopamine neurons. One key simplification made in that account,
however, was to treat the striatum as one unified entity. In some ways, this is correct, in