Learning

Neurofeedback training involves several learning processes, including operant conditioning and observational learning. Neurofeedback software incorporates powerful operant principles. A movie plays when a child increases low beta and decreases theta activity (positive reinforcement). A score counter stops reversing when a child focuses on a reading selection after several minutes of distraction (negative reinforcement). The training goal becomes progressively more demanding as the child succeeds (shaping).

BCIA Blueprint Coverage

This unit covers Overview of principles of human learning as they apply to neurofeedback (I-C).

Professionals completing this unit will be able to discuss:

Overview of principles of human learning as they apply to neurofeedback
A. Learning theory
B. Application of learning principles to neurofeedback training

Three Main Types of Learning

We can divide learning into three main categories: associative, nonassociative, and observational. Associative learning and observational learning are most relevant to neurofeedback. In associative learning, we create connections between stimuli, behaviors, or stimuli and behaviors. Associative learning aids our survival since it allows us to predict future events based on past experience. Learning that B follows A provides us with time to prepare for B. Two forms of associative learning are classical conditioning and operant conditioning (Cacioppo & Freberg, 2016).

Classical Conditioning

Pavlov demonstrated classical conditioning in 1927 based on his famous research with dogs. Due to faulty English translations, we now use the adjectives conditioned and unconditioned instead of conditional and unconditional. We will use the current terminology to minimize confusion.

Classical conditioning is an unconscious associative learning process that builds connections between paired stimuli that follow each other in time. Through this learning process, a dog in Pavlov's laboratory learned that if A (a ringing bell) occurs, B (food) will follow. The ability to predict the future from past experience is crucial to our survival since it gives us time to prepare.

Before learning to anticipate food in Pavlov's laboratory, dogs salivated (unconditioned response) when they saw food (unconditioned stimulus). An unconditioned response (UCR) follows an unconditioned stimulus (UCS) before conditioning. However, until dogs learned to associate the sound of a bell with the delivery of food, the bell was a neutral stimulus (NS) which did not elicit salivation.

Through repeated pairing of the bell with the arrival of food, Pavlov's dogs learned that the bell reliably signaled the arrival of food. Now, the bell became a conditioned stimulus (CS) which elicited the conditioned response (CR) of salivation. Graphic © VectorMine/iStockphoto.com.

When the association between the CS and UCS is disrupted because B no longer consistently follows A, the frequency of the CR may decline or the CR may disappear. This phenomenon is called extinction. In Pavlov's situation, repeated trials where food did not follow bell ringing reduced or eliminated salivation. Extinction of CRs allows us to adapt to changes in our experience. You learn to use passwords less predictable than "password" after you become a victim of identity theft. You are able to play with dogs again, even though you were bitten as a child.

Pavlov argued that extinction is not forgetting, but evidence of new learning that overrides previous learning. The phenomenon of spontaneous recovery, where the CR (salivation) reappears after a period of time without exposure to the CS (bell), supports his position. Dogs who stopped salivating by the end of an extinction trial in which no food followed the bell often resumed salivating during a break or the next session.

Generalization and discrimination are mirror images of each other. In generalization, the conditioned response is elicited by stimuli (low-pitched bell) that resemble the original conditioned stimulus (high-pitched bell). Generalization promotes our survival because it allows us to apply learning about one stimulus (lions) to similar predators (tigers) without experiencing them.

In contrast, in discrimination, the conditioned response (salivation) is elicited by one stimulus (high-pitched bell), but not another (low-pitched bell). When soldiers return to civilian life, they must distinguish between a CS that signals danger (gunfire) and one that is benign (fireworks). Discrimination is often impaired in soldiers diagnosed with post-traumatic stress disorder (PTSD).

Operant Conditioning

Edward Thorndike's (1913) law of effect proposed that the consequences of behavior determine its addition to your behavioral repertoire. From his perspective, cats learned to escape his "puzzle boxes" by repeating successful actions and eliminating unsuccessful ones.

Operant conditioning is an unconscious associative learning process that modifies an operant behavior, voluntary behavior that operates on the environment to produce an outcome, by manipulating its consequences (Miltenberger, 2016).

Operant conditioning differs from classical conditioning in several respects. Where operant conditioning teaches the association of a voluntary behavior with its consequences, classical conditioning teaches the predictive relationship between two stimuli to modify involuntary behavior.

Neurofeedback teaches self-regulation of neural activity and related "state changes" using operant conditioning via the selective presentation of reinforcing stimuli, including visual, auditory, and tactile displays.

Operant conditioning occurs within a situational context. The identifying characteristics of a situation are called its discriminative stimuli and can include the physical environment and physical, cognitive, and emotional cues. Discriminative stimuli teach us when to perform operant behaviors.

The consequences of operant behaviors can increase or decrease their frequency. Skinner proposed four types of consequences: positive reinforcement, negative reinforcement, positive punishment, and negative punishment. Where positive and negative reinforcement increase behavior, positive and negative punishment decrease it.
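The four consequences can be read as a two-by-two grid. The short Python sketch below is illustrative only: it assumes the standard convention that "positive" means a stimulus is added and "negative" means a stimulus is removed, and the function name and string labels are hypothetical choices made for this example.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name an operant consequence from two observations.

    stimulus_change: "added" or "removed" (hypothetical labels for this sketch)
    behavior_change: "increases" or "decreases" (measured effect on the behavior)
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_change not in ("increases", "decreases"):
        raise ValueError("behavior_change must be 'increases' or 'decreases'")

    # Assumed convention: positive = something is added, negative = something is removed.
    sign = "positive" if stimulus_change == "added" else "negative"
    # Reinforcement increases the behavior; punishment decreases it.
    kind = "reinforcement" if behavior_change == "increases" else "punishment"
    return f"{sign} {kind}"


# A movie starts playing (added) and the trained behavior becomes more frequent.
print(classify_consequence("added", "increases"))
# A popular game is turned off (removed) and oppositional behavior becomes less frequent.
print(classify_consequence("removed", "decreases"))
```

Note that the second argument is the measured change in behavior, which reflects the point that a consequence can only be labeled after observing its effect.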

Due to individual differences, we cannot know in advance whether a consequence will be reinforcing or punishing, since these are not intrinsic properties of a consequence. We can only determine whether a consequence is reinforcing or punishing by measuring how it affects the behavior that preceded it. In neurofeedback, a movie that motivates the best performance might be the most reinforcing for the client, regardless of the therapist's personal preference.

Positive reinforcement increases the frequency of a desired behavior by making a desired outcome contingent on performing the action. For example, a movie plays when a client diagnosed with attention deficit hyperactivity disorder (ADHD) increases low-beta and decreases theta activity.
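The movie example can be expressed as a simple contingency check. This is a minimal sketch, not how any particular neurofeedback software works: the function name, the use of microvolt amplitudes, and the threshold values are all assumptions made for illustration.

```python
def movie_should_play(low_beta_uv: float, theta_uv: float,
                      low_beta_min_uv: float, theta_max_uv: float) -> bool:
    """Return True when both training conditions are met.

    All amplitudes are assumed to be in microvolts; the thresholds are
    hypothetical values a clinician might set for a given client.
    """
    # Reward only when low beta is above its floor AND theta is below its ceiling.
    return low_beta_uv > low_beta_min_uv and theta_uv < theta_max_uv


# Hypothetical readings: low beta is high enough and theta is low enough.
print(movie_should_play(low_beta_uv=6.0, theta_uv=12.0,
                        low_beta_min_uv=5.0, theta_max_uv=15.0))
```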

Negative reinforcement increases the frequency of a desired behavior by making the avoidance, termination, or postponement of an unwanted outcome contingent on performing the action. For example, an athlete's anxiety decreases by shifting from high beta to low beta.

Positive punishment decreases or eliminates an undesirable behavior by associating it with unwanted consequences. For example, a child's increased fidgeting dims a favorite movie and decreases the sound.

Negative punishment decreases or eliminates an undesirable behavior by removing what is desired. For example, oppositional behavior could result in a clinician turning off a popular game.

Speed of Reinforcement

The timing of reinforcement is critical to associating the desired behavior with its consequences. Several early studies have shown that the optimal latency between a voluntary behavior and reinforcement is under 250 to 350 ms (Felsinger & Gladstone, 1947; Grice, 1948). The faster the delivery of reinforcement following the desired behavior, the less time required for skill acquisition.
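As a rough illustration of the timing requirement, the sketch below measures the delay between a behavior and its feedback and compares it with the upper end of the range cited above (350 ms). The function names and the timestamps are hypothetical; real systems would take these times from their own signal-processing pipeline.

```python
# Upper end of the latency range cited in the text, in milliseconds.
OPTIMAL_LATENCY_LIMIT_MS = 350.0


def reinforcement_latency_ms(behavior_time_s: float, feedback_time_s: float) -> float:
    """Delay between the behavior and the feedback, converted to milliseconds."""
    if feedback_time_s < behavior_time_s:
        raise ValueError("feedback cannot precede the behavior")
    return (feedback_time_s - behavior_time_s) * 1000.0


def within_optimal_latency(latency_ms: float,
                           limit_ms: float = OPTIMAL_LATENCY_LIMIT_MS) -> bool:
    """True when the feedback arrived no later than the chosen limit."""
    return latency_ms <= limit_ms


# Hypothetical timestamps (seconds since the session started).
latency = reinforcement_latency_ms(behavior_time_s=10.0, feedback_time_s=10.2)
print(within_optimal_latency(latency))
```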

Reinforcement Criteria

Current research is exploring the optimal reinforcement criteria for neurofeedback training. Client skill acquisition is markedly affected by changing parameters like reinforcement schedule, frequency of reward, reinforcement delay, conflicting reinforcements, conflicting expectations, and alteration of the environment.

While continuous reinforcement, reinforcement of every desired behavior, is helpful during the early stage of skill acquisition, it is impractical as clients attempt to transfer the skill to real-world settings. Since reinforcement outside of the clinic is intermittent, partial reinforcement schedules, where desired behavior is only reinforced some of the time, are important as training progresses. This reduces the risk of extinction, where failure to reinforce a desired behavior reduces the frequency of that behavior.

For neurofeedback, variable reinforcement schedules, where reinforcement occurs after a variable number of responses (variable ratio) or following a variable duration of time (variable interval), produce higher response rates than their fixed counterparts.
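The difference between fixed and variable ratio schedules can be sketched in a few lines of Python. This is an illustrative simulation only: the function names are hypothetical, and drawing the required number of responses uniformly between 1 and twice the mean minus one is just one simple way to make the requirement vary around a chosen mean.

```python
import random


def fixed_ratio_schedule(n_responses: int, ratio: int) -> list[bool]:
    """Reinforce every `ratio`-th response (ratio=1 is continuous reinforcement)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio_schedule(n_responses: int, mean_ratio: int,
                            rng: random.Random) -> list[bool]:
    """Reinforce after a varying number of responses that averages `mean_ratio`."""
    schedule = []
    # Assumed rule: required count is uniform on 1 .. 2*mean_ratio - 1.
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for _ in range(n_responses):
        count += 1
        if count >= needed:
            schedule.append(True)   # this response is reinforced
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
        else:
            schedule.append(False)  # this response goes unreinforced
    return schedule


print(fixed_ratio_schedule(6, ratio=3))
print(variable_ratio_schedule(12, mean_ratio=3, rng=random.Random(0)))
```

Passing a seeded `random.Random` keeps the variable schedule reproducible, which is convenient when comparing sessions.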

Shaping

Shaping, the method of successive approximations, teaches clients new behaviors and increases the frequency of behaviors that are rarely performed. A clinician starts by reinforcing spontaneous voluntary behaviors that resemble the desired behavior and then progressively raises the criteria for reinforcement to achieve the training goal. For example, the clinician can gradually require lower theta-to-beta ratios for a movie to play.
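The theta-to-beta example can be sketched as a rule that tightens the reinforcement criterion only after the client succeeds often enough. The function name, the step size, and the 80% success requirement below are hypothetical values chosen for illustration, not recommended clinical settings.

```python
def next_ratio_criterion(current_max_ratio: float, success_rate: float,
                         target_ratio: float, step: float = 0.25,
                         success_needed: float = 0.8) -> float:
    """Return the theta-to-beta ceiling to use for the next training block.

    current_max_ratio: the movie plays while the client's ratio is below this value
    success_rate: fraction of the last block spent meeting the criterion (0 to 1)
    target_ratio: the final training goal; the criterion never drops below it
    """
    if success_rate >= success_needed:
        # Client is succeeding: demand a slightly lower ratio, but stop at the goal.
        return max(target_ratio, current_max_ratio - step)
    # Client is struggling: keep the current criterion.
    return current_max_ratio


# Hypothetical progression across three training blocks.
criterion = 3.0
for rate in (0.9, 0.5, 0.85):
    criterion = next_ratio_criterion(criterion, rate, target_ratio=2.0)
    print(criterion)
```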

Discrimination and Generalization

Discrimination and generalization are the ultimate goals of neurofeedback training. Discrimination teaches when a desired behavior will be reinforced. The initial discriminative stimuli include the cues provided by the training environment, like animations and tones. Following successful skill acquisition, a clinician may introduce a stressor to "raise the bar." Now, the stressor serves as another discriminative stimulus for performing the desired behavior.

Generalization teaches the transfer of a desired behavior to multiple environments and in response to diverse stressors. While the ability to perform the learned response in many situations contributes to flexibility, it is not always advantageous. Discrimination, based on an understanding of set and setting, helps determine when a response is required and which response is appropriate.

Check out the TED-Ed video, The Difference Between Classical and Operant Conditioning, when you are connected to the internet.

History of EEG Conditioning

Durup, G., & Fessard, A. I. (1935). Blocking of the alpha rhythm. L'Année Psychologique, 36(1), 1-32.

Loomis, A. L., Harvey, E. N., & Hobart, G. (1936). Electrical potentials of the human brain. Journal of Experimental Psychology, 19, 249.

Travis, L. E., & Egan, J. P. (1938). Conditioning of the electrical response of the cortex. Journal of Experimental Psychology, 22(6), 524-531.

Jasper, H., & Shagass, C. (1941). Conditioning the occipital alpha rhythm in man. Journal of Experimental Psychology, 28(5), 373-387.

Knott, J. R. (1941). Electroencephalography and physiological psychology: Evaluation and statement of problem. Psychological Bulletin, 38(10), 944-975.

Kamiya, J. (2011). The first communications about operant conditioning of the EEG. Journal of Neurotherapy, 15(1), 65-73.

Clemente, C. D., Sterman, M. B., & Wyrwicka, W. (1964). Post-reinforcement EEG synchronization during alimentary behavior. Electroencephalography and Clinical Neurophysiology, 355-365.

Albino, R., & Burnand, G. (1964). Conditioning of the alpha rhythm in man. Journal of Experimental Psychology, 67(6), 539-544.

Wyrwicka, W., & Sterman, M. B. (1968). Instrumental conditioning of sensorimotor cortex EEG spindles in the waking cat. Physiology and Behavior, 3(5), 703-707.

Sterman, M. B., LoPresti, R. W., & Fairchild, M. D. (1969). Electroencephalographic and behavioral studies of monomethyl hydrazine toxicity in the cat. Technical Report AMRL-TR-69-3, Wright-Patterson Air Force Base, Ohio, Air Systems Command.

Observational Learning

Observational learning allows us to rapidly modify an existing skill or acquire a new one by observing others. This social learning process is extremely efficient because it bypasses trial-and-error. We don't personally experience negative outcomes since others have done it for us.

Observational learning is interactive. For example, we listen to a teacher play a violin passage, we attempt to play the same notes, and then compare the two performances. We play the passage, again and again, until we "get it right." Feedback allows us to refine our playing until we can match the original sample.

Glossary

classical conditioning: unconscious associative learning process that builds connections between paired stimuli that follow each other in time.

conditioned response (CR): in classical conditioning, a response to a conditioned stimulus (CS). For example, salivation in response to a bell.

conditioned stimulus (CS): in classical conditioning, a stimulus that elicits a response after training. For example, a bell after pairing with food.

discrimination (classical conditioning): response to the original CS, but not to one that resembles it. For example, salivation to a high-pitched bell, but not to a low-pitched bell.

discrimination (operant conditioning): performance of the desired behavior in one context, but not another. For example, increasing sensorimotor rhythm (SMR) activity at bedtime, but not during a morning commute.

discriminative stimuli: in operant conditioning, the identifying characteristics of a situation (the physical environment and physical, cognitive, and emotional cues) that teach us when to perform operant behaviors. For example, a traffic slowdown could signal a client to practice effortless breathing.

extinction (classical conditioning): the reduction of a CR when the UCS no longer follows the CS. For example, less salivation when food no longer follows a bell.

extinction (operant conditioning): a reduction in response frequency when the desired behavior is no longer reinforced. For example, a client practices less when the clinician ceases to praise this behavior.

generalization (classical conditioning): response to stimuli that resemble the original CS. For example, salivation to both a low- and high-pitched bell.

generalization (operant conditioning): performance of the desired behavior in multiple contexts. For example, increasing low-beta activity during both classroom lecture and golf practice.

learning: the process by which we acquire new information, patterns of behavior, or skills.

negative punishment: in operant conditioning, a process that decreases or eliminates an undesirable behavior by removing what is desired. For example, a child’s oppositional behavior could result in a clinician turning off a popular game.

negative reinforcement: in operant conditioning, a process that increases the frequency of a desired behavior by making the avoidance, termination, or postponement of an unwanted outcome contingent on performing the action. For example, an athlete’s anxiety decreases by shifting from high beta to low beta, rewarding this self-regulation.

neutral stimulus (NS): in classical conditioning, a stimulus that does not elicit a response. For example, a bell before pairing with food.

operant behavior: in operant conditioning, voluntary behavior that operates on the environment to produce an outcome.

observational learning: learning by observing others. For example, a fitness client learns to stretch by watching a personal trainer demonstrate the technique.

operant conditioning: an unconscious associative learning process that modifies an operant behavior, voluntary behavior that operates on the environment to produce an outcome, by manipulating its consequences.

positive punishment: in operant conditioning, a process that decreases or eliminates an undesirable behavior by associating it with unwanted consequences. For example, a child's increased fidgeting dims a favorite movie and decreases the sound.

positive reinforcement: in operant conditioning, a process that increases the frequency of a desired behavior by making a desired outcome contingent on performing the action. For example, a movie plays when a client increases low-beta and decreases theta activity.

shaping: in operant conditioning, the method of successive approximations, which teaches clients new behaviors and increases the frequency of behaviors that are rarely performed. For example, a clinician progressively requires lower theta-to-beta ratios for a movie to play.

spontaneous recovery: in classical conditioning, the reappearance of an extinguished CR following a rest period. For example, salivation following its disappearance.

stimulus discrimination: in classical conditioning, when a conditioned response (CR) is elicited by one conditioned stimulus (CS), but not by another. For example, your blood pressure increases during a painful dental procedure, but not during an uncomfortable blood draw.

stimulus generalization: in classical conditioning, when stimuli that resemble a conditioned stimulus (CS) elicit the same conditioned response (CR). For example, when your blood pressure increases during both a painful dental procedure and an uncomfortable blood draw.

unconditioned response (UCR): in classical conditioning, a response to a UCS that occurs without training. For example, salivation in response to food.

unconditioned stimulus (UCS): in classical conditioning, a stimulus that elicits a response without training. For example, food.

Assignment

Now that you have completed this unit, which sounds do you prefer when you have succeeded during neurofeedback training? Which visual displays are more motivating for you?

References

Cacioppo, J. T., & Freberg, L. A. (2016). Discovering psychology. Boston, MA: Cengage Learning.

Felsinger, J. M., & Gladstone, A. I. (1947). Reaction latency (StR) as a function of the number of reinforcements (N). Journal of Experimental Psychology, 37(3), 214-228.

Grice, G. R. (1948). The acquisition of a visual discrimination habit following response to a single stimulus. Journal of Experimental Psychology, 38(6), 633-642.

Miltenberger, R. G. (2016). Behavior modification: Principles and procedures. Boston, MA: Cengage Learning.

Pavlov, I. P. (1927). Conditioned reflexes. Oxford, UK: Oxford University Press.

Thorndike, E. L. (1913). Educational psychology: Briefer course. New York, NY: Teachers College.