Similarly, if the subjective values of specific outcomes change as a result of selective feeding or taste aversion, the value functions for actions leading to those outcomes can be revised without directly experiencing
them (Holland and Straub, 1979; Dickinson, 1985). Therefore, the choices predicted by model-free and model-based reinforcement learning algorithms, as well as their corresponding neural mechanisms, might be different. As described above, errors in predicting affective outcomes, namely, reward prediction errors, are postulated to drive model-free reinforcement learning, including both Pavlovian conditioning and habit learning. An important clue for the neural mechanism of reinforcement learning was therefore provided by the observation that the phasic activity of midbrain dopamine neurons encodes the reward prediction error (Schultz, 1998). Dopamine neurons
innervate many different targets in the brain, including the cerebral cortex (Lewis et al., 2001), striatum (Bolam et al., 2000; Nicola et al., 2000), and amygdala (Sadikot and Parent, 1990). In particular, the amygdala might be involved in both fear conditioning (LeDoux, 2000) and appetitive Pavlovian conditioning (Hatfield et al., 1996; Parkinson et al., 2000; Paton et al., 2006). Induction of synaptic plasticity in the amygdala that underlies Pavlovian conditioning might depend on the activation of dopamine receptors (Guarraci et al., 1999; Bissière et al., 2003). In addition, the ventral striatum also contributes to several different forms of appetitive Pavlovian conditioning, such as auto-shaping, conditioned place preference, and second-order conditioning (Cardinal et al., 2002). Given the increased range of actions controlled by habit learning, the anatomical substrates for habit learning might be more extensive compared
to the areas related to Pavlovian conditioning, and are likely to span both cortical and subcortical areas. Nevertheless, the striatum has received much attention due to its dense innervation by dopamine neurons (Houk et al., 1995). The striatum integrates inputs from almost all cortical areas, and influences the activity of neurons in motor structures, such as the superior colliculus and pedunculopontine nucleus, largely through disinhibitory mechanisms (Chevalier and Deniau, 1990; Mink, 1996). In addition, striatal neurons in the direct and indirect pathways express D1 and D2 dopamine receptors, respectively, and might influence the outputs of the basal ganglia antagonistically (Kravitz et al., 2010; Tai et al., 2012; but see Cui et al., 2013). Dopamine-dependent, bidirectional modulation of corticostriatal synapses might provide the biophysical substrates for integrating the reward prediction error signals into value functions in the striatum (Shen et al., 2008; Pawlak and Kerr, 2008; Wickens, 2009).
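
As a purely illustrative sketch, not taken from any of the studies cited above, the way a reward prediction error might be integrated into a value function can be written as a simple temporal-difference update. The same sketch contrasts a cached (model-free) value with a value computed from knowledge of the outcome (model-based) after the outcome is devalued. The names td_update, alpha, gamma, transition, and outcome_value are hypothetical choices for this example, and the parameter values are arbitrary assumptions.

```python
def td_update(value, reward, next_value=0.0, alpha=0.1, gamma=0.9):
    """One temporal-difference update; returns (new_value, delta).

    delta is the reward prediction error: the obtained reward plus the
    discounted value of the next state, minus the current prediction.
    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    delta = reward + gamma * next_value - value
    return value + alpha * delta, delta


# Model-free learner: caches the value of an action from experienced rewards.
cached_value = 0.0
for _ in range(200):
    cached_value, delta = td_update(cached_value, reward=1.0)

# Model-based learner (hypothetical toy model): knows which outcome an
# action produces and looks up the current subjective value of that outcome.
transition = {"press_lever": "food"}
outcome_value = {"food": 1.0}


def model_based_value(action):
    """Value of an action derived from the outcome it is known to produce."""
    return outcome_value[transition[action]]


# Devaluation (e.g., by selective feeding): the outcome loses its value.
outcome_value["food"] = 0.0

# The model-based value changes immediately, whereas the cached model-free
# value stays high until the devalued outcome is experienced again.
print(model_based_value("press_lever"), round(cached_value, 3))
```

In this toy setting the prediction error shrinks toward zero as the cached value converges on the delivered reward, and only the model-based estimate reflects devaluation without further experience of the outcome.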