Learning and Reinforcement
- What are the best practices that organizations utilize to train employees in new job skills?
A central feature of most approaches to learning is the concept of reinforcement. This concept dates from Thorndike’s law of effect, which, as mentioned earlier, states that behavior that is positively reinforced tends to be repeated, whereas behavior that is not reinforced will tend not to be repeated. Hence, reinforcement can be defined as anything that causes a certain behavior to be repeated or inhibited.
Reinforcement versus Motivation
It is important to differentiate reinforcement from the concept of employee motivation. Motivation, as described in the next chapter, represents a primary psychological process that is largely cognitive in nature. Thus, motivation is largely internal—it is experienced by the employee, and we can see only subsequent manifestations of it in actual behavior. Reinforcement, on the other hand, is typically observable and most often externally administered. A supervisor may reinforce what he or she considers desirable behavior without knowing anything about the underlying motives that prompted it. For example, a supervisor who has a habit of saying “That’s interesting” whenever she is presented with a new idea may be reinforcing innovation on the part of the subordinates without the supervisor really knowing why this result is achieved. The distinction between theories of motivation and reinforcement should be kept in mind when we examine behavior modification and behavioral self-management later in this chapter.
Strategies for Behavioral Change
From a managerial standpoint, several strategies for behavioral change are available to facilitate learning in organizational settings. At least four different types should be noted: (1) positive reinforcement; (2) avoidance learning, or negative reinforcement; (3) extinction; and (4) punishment. Each type plays a different role in both the manner in which and extent to which learning occurs. Each will be considered separately here.
Positive Reinforcement. Positive reinforcement consists of presenting someone with an attractive outcome following a desired behavior. As noted by Skinner, “A positive reinforcer is a stimulus which, when added to a situation, strengthens the probability of an operant response.”
A simple example of positive reinforcement is supervisory praise for subordinates when they perform well in a certain situation. That is, a supervisor may praise an employee for being on time consistently (see (Figure)). This behavior-praise pattern may encourage the subordinate to be on time in the future in the hope of receiving additional praise.
In order for a positive reinforcement to be effective in facilitating the repetition of desired behavior, several conditions must be met. First, the reinforcer itself (praise) must be valued by the employee. It would prove ineffective in shaping behavior if employees were indifferent to it. Second, the reinforcer must be strongly tied to the desired behavior. Receipt of the reinforcer by the employee must be directly contingent upon performing the desired behavior. “Rewards must result from performance, and the greater the degree of performance by an employee, the greater should be his reward.”
It is important to keep in mind here that “desired behavior” represents behavior defined by the supervisor, not the employee. Thus, for praise to be a reinforcer, not only must it be valued by the employee, but it must directly follow the desired behavior and should be more intense as the behavior is closer to the ideal the supervisor has in mind. Praise thrown out at random is unlikely to reinforce the desired behavior. Third, there must be ample occasion for the reinforcer to be administered following desired behavior. If the reinforcer is tied to certain behavior that seldom occurs, then individuals will seldom be reinforced and will probably not associate this behavior with a reward. For example, if praise is only provided for truly exceptional performance, then it is unlikely to have a powerful impact on the desired behavior. It is important that the performance-reward contingencies be structured so that they are easily attainable.
Avoidance Learning. A second method of reinforcement is avoidance learning, or negative reinforcement. Avoidance learning refers to seeking to avoid an unpleasant condition or outcome by following a desired behavior. Employees learn to avoid unpleasant situations by behaving in certain ways. If an employee correctly performs a task or is continually prompt in coming to work (see (Figure)), the supervisor may refrain from harassing, reprimanding, or otherwise embarrassing the employee. Presumably, the employee learns over time that engaging in correct behavior diminishes admonition from the supervisor. In order to maintain this condition, the employee continues to behave as desired.
Extinction. The principle of extinction suggests that undesired behavior will decline as a result of a lack of positive reinforcement. If the perpetually tardy employee in the example in (Figure) consistently fails to receive supervisory praise and is not recommended for a pay raise, we would expect this nonreinforcement to lead to an “extinction” of the tardiness. The employee may realize, albeit subtly, that being late is not leading to desired outcomes, and she may try coming to work on time.
Punishment. Finally, a fourth strategy for behavior change used by managers and supervisors is punishment. Punishment is the administration of unpleasant or adverse outcomes as a result of undesired behavior. An example of the application of punishment is for a supervisor to publicly reprimand or fine an employee who is habitually tardy (see (Figure)). Presumably, the employee would refrain from being tardy in the future in order to avoid such an undesirable outcome. The most frequently used punishments (along with the most frequently used rewards) are shown in (Figure).
|Frequently Used Rewards and Punishments|
|Rewards||Punishments|
|Pay raise||Oral reprimands|
|Praise and recognition||Criticism from superiors|
|Sense of accomplishment||Reduced authority|
|Increased responsibility||Undesired transfer|
The use of punishment is indeed one of the most controversial issues of behavior change strategies. Although punishment can have positive work outcomes—especially if it is administered in an impersonal way and as soon as possible after the transgression—negative repercussions can also result when employees either resent the action or feel they are being treated unfairly. These negative outcomes from punishment are shown in (Figure). Thus, although punishment represents a potent force in corrective learning, its use must be carefully considered and implemented. In general, for punishment to be effective the punishment should “fit the crime” in severity, should be given in private, and should be explained to the employee.
Studies show that nearly 50 percent of employees in the U.S. workforce face bullying at some point in their careers. All types of bullying, not just discrimination or harassment, are important to consider.
Angela Anderson was working for a law school administration council and experienced bullying firsthand. Often her manager would yell at her in front of other coworkers, and it was clear to Angela that she was not well-liked. Unfortunately, Angela was not the only one who felt the wrath of this manager, who often handled interactions with other employees the same way. Many of the employees, including Angela, attempted to appease their bullying manager, but nothing helped. One day Angela was threatened by her manager, and before Angela could reach the HR department, she was fired. This example is an extreme case, but being able to take recourse against unwanted and disruptive employee behavior is important for any workplace manager.
- What steps can you take to ensure that your company deters employees’ bullying behavior?
- What actions should an employee take if they are experiencing unwanted behaviors from another employee or manager?
- What other departments should be involved when developing a plan and policies for how to handle unacceptable workplace behavior?
Sources: “Acceptable and Unacceptable Behaviours,” University of Cambridge website, accessed January 15, 2019, https://www.hr.admin.cam.ac.uk/policies-procedures/dignity-work-policy/guidance-managers-and-staff/guidance-managers/acceptable-and; Hedges, Kristi, “How to Change Your Employee’s Behavior,” Forbes, March 4, 2015, https://www.forbes.com/sites/work-in-progress/2015/03/04/how-to-change-your-employees-behavior/#c32ad4b6732a; and Kane, Sally, “Workplace Bullying: True Stories, Statistics and Tips,” The Balance Careers, January 29, 2019, https://www.thebalancecareers.com/bullying-stories-2164317.
In summary, positive reinforcement and avoidance learning focus on bringing about the desired response from the employee. With positive reinforcement the employee behaves in a certain way in order to gain desired rewards, whereas with avoidance learning the employee behaves in order to avoid certain unpleasant outcomes. In both cases, however, the behavior desired by the supervisor is enhanced. In contrast, extinction and punishment focus on supervisory attempts to reduce the incidence of undesired behavior. That is, extinction and punishment are typically used to get someone to stop doing something the supervisor doesn’t like. It does not necessarily follow that the individual will begin acting in the most desired, or correct, manner.
Often students have difficulty seeing the distinction between avoidance and extinction or in understanding how either could have a significant impact on behavior. Two factors are important to keep in mind. The first we will simply call the “history effect.” Not being harassed could reinforce an employee’s prompt arrival at work if in the past the employee had been harassed for being late. Arriving on time and thereby avoiding the past harassment would reinforce arriving on time. This same dynamic would hold true for extinction. If the employee had been praised in the past for arriving on time, then arrived late and was not praised, this would serve to weaken the tendency to arrive late. The second factor we will call the “social effect.” For example, if you see others harassed when they arrive late and then you are not harassed when you arrive on time, this could reinforce your arriving at work on time. Again, this same dynamic would hold true for extinction. If you had observed others being praised for arriving on time, then not receiving praise when you arrived late would serve to weaken the tendency to arrive late.
From a managerial perspective, questions arise about which strategy of behavioral change is most effective. Advocates of behavioral change strategies, such as Skinner, answer that positive reinforcement combined with extinction is the most suitable way to bring about desired behavior. There are several reasons for this focus on the positive approach to reinforcement. First, although punishment can inhibit or eliminate undesired behavior, it often does not provide information to the individual about how or in which direction to change. Also, the application of punishment may cause the individual to become alienated from the work situation, thereby reducing the chances that useful change can be effected. Similarly, avoidance learning tends to emphasize the negative; that is, people are taught to stay clear of certain behaviors, such as tardiness, for fear of repercussions. In contrast, it is felt that combining positive reinforcement with the use of extinction has the fewest undesirable side effects and allows individuals to receive the rewards they desire. A positive approach to reinforcement is believed by some to be the most effective tool management has to bring about favorable changes in organizations.
Schedules of Reinforcement
Having examined four distinct strategies for behavioral change, we now turn to an examination of the various ways, or schedules, of administering these techniques. As noted by Costello and Zalkind, “The speed with which learning takes place and also how lasting its effects will be is determined by the timing of reinforcement.”
Thus, a knowledge of the types of schedules of reinforcement is essential to managers if they are to know how to choose rewards that will have maximum impact on employee performance. Although there are a variety of ways in which rewards can be administered, most approaches can be categorized into two groups: continuous and partial (or intermittent) reinforcement schedules. A continuous reinforcement schedule rewards desired behavior every time it occurs. For example, a manager could praise (or pay) employees every time they perform properly. With the time and resource constraints most managers work under, this is often difficult, if not impossible. So, most managerial reward strategies operate on a partial schedule. A partial reinforcement schedule rewards desired behavior at specific intervals, not every time desired behavior is exhibited. Compared to continuous schedules, partial reinforcement schedules lead to slower learning but stronger retention. Thus, learning is generally more permanent. Four kinds of partial reinforcement schedules can be identified: (1) fixed interval, (2) fixed ratio, (3) variable interval, and (4) variable ratio (see (Figure)).
|Schedules of Partial Reinforcement|
|Schedule of Reinforcement||Nature of Reinforcement||Effects on Behavior When Applied||Effects on Behavior When Terminated||Example|
|Fixed interval||Reward on fixed time basis||Leads to average and irregular performance||Quick extinction of behavior||Weekly paycheck|
|Fixed ratio||Reward consistently tied to output||Leads quickly to very high and stable performance||Quick extinction of behavior||Piece-rate pay system|
|Variable interval||Reward given at variable intervals around some average time||Leads to moderately high and stable performance||Slow extinction of behavior||Monthly performance appraisal and reward at random times each month|
|Variable ratio||Reward given at variable output levels around some average output||Leads to very high performance||Slow extinction of behavior||Sales bonus tied to selling X accounts, but X constantly changes around some mean|
Fixed-Interval Schedule. A fixed-interval reinforcement schedule rewards individuals at specified intervals for their performance, as with a biweekly paycheck. If employees perform even minimally, they are paid. This technique generally does not result in high or sustained levels of performance because employees know that marginal performance usually leads to the same level of reward as high performance. Thus, there is little incentive for high effort and performance. Also, when rewards are withheld or suspended, extinction of desired behavior occurs quickly. Many of the recent job redesign efforts in organizations were prompted by recognition of the need for alternate strategies of motivation rather than paying people on fixed-interval schedules.
Fixed-Ratio Schedule. The second fixed schedule is the fixed-ratio schedule. Here the reward is administered only upon the completion of a given number of desired responses. In other words, rewards are tied to performance in a ratio of rewards to results. A common example of the fixed-ratio schedule is a piece-rate pay system, whereby employees are paid for each unit of output they produce. Under this system, performance rapidly reaches high levels. In fact, according to Hamner, “The response level here is significantly higher than that obtained under any of the interval (time-based) schedules.”
On the negative side, however, performance declines sharply when the rewards are withheld, as with fixed-interval schedules.
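The contrast between the two fixed schedules can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names, the wage of 100 per period, and the rate of 5 per unit are hypothetical and do not come from the text. It shows that pay under a fixed-interval schedule does not respond to output, whereas pay under a fixed-ratio (piece-rate) schedule rises with it.

```python
def fixed_interval_pay(units_per_period, wage_per_period):
    """Fixed interval: the same wage each period, whatever the output."""
    return [wage_per_period for _ in units_per_period]


def fixed_ratio_pay(units_per_period, rate_per_unit):
    """Fixed ratio (piece rate): pay is proportional to units produced."""
    return [units * rate_per_unit for units in units_per_period]


# Hypothetical output for three pay periods: low, medium, and high effort.
output = [5, 20, 40]

print(fixed_interval_pay(output, 100))  # [100, 100, 100]
print(fixed_ratio_pay(output, 5))       # [25, 100, 200]
```

In the first result, marginal and high performance earn the same reward, which is the weak incentive described for the fixed-interval schedule. In the second, reward tracks performance directly.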
Variable-Interval Schedule. Under variable reinforcement schedules, whether variable-interval or variable-ratio, rewards are administered at random times that cannot be predicted by the employee. The employee is generally not aware of when the next evaluation and reward period will be. Under a variable-interval schedule, rewards are administered at intervals of time that are based on an average. For example, an employee may know that on the average her performance is evaluated and rewarded about once a month, but she does not know when this event will occur. She does know, however, that it will occur sometime during the interval of a month. Under this schedule, effort and performance will generally be high and fairly stable over time because employees never know when the evaluation will take place.
Variable-Ratio Schedule. Finally, a variable-ratio schedule is one in which rewards are administered only after an employee has performed the desired behavior a number of times, with the number changing from the administration of one reward to the next but averaging over time to a certain ratio of number of performances to rewards. For example, a manager may determine that a salesperson will receive a bonus for every 15th new account sold. However, instead of administering the bonus every 15th sale (as in a fixed-ratio schedule), the manager may vary the number of sales that is necessary for the bonus, from perhaps 10 sales for the first bonus to 20 for the second. On the average, however, the 15:1 ratio prevails. If the employee understands the parameters, then the “safe” level of sales, or the level of sales most likely to result in a bonus, is in excess of 15. Consequently, the variable-ratio schedule typically leads to high and stable performance. Moreover, extinction of desired behavior is slow.
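The arithmetic of the sales-bonus example can be checked with a short Python sketch. The function names, the uniform draw between 10 and 20 sales, and the fixed random seed are assumptions made for illustration; the text specifies only that the required number of sales varies around an average of 15.

```python
import random


def variable_ratio_thresholds(n_bonuses, low=10, high=20, seed=0):
    """Draw the number of sales required for each successive bonus.

    Thresholds are uniform integers in [low, high], so over many
    bonuses they average (low + high) / 2, which is 15 by default.
    """
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n_bonuses)]


def count_bonuses(total_sales, thresholds):
    """Count the bonuses earned when each bonus needs the next threshold."""
    earned = 0
    remaining = total_sales
    for needed in thresholds:
        if remaining < needed:
            break
        remaining -= needed
        earned += 1
    return earned


# The example from the text: 10 sales for the first bonus, 20 for the second.
print(count_bonuses(30, [10, 20]))  # 2 bonuses for 30 sales, a 15:1 ratio
print(count_bonuses(29, [10, 20]))  # 1, the second bonus is not yet earned
```

A fixed-ratio schedule is the special case in which every threshold is the same, for example `count_bonuses(30, [15, 15])`, which also yields two bonuses. The difference is that under the variable schedule the salesperson cannot tell in advance which sale will trigger the next bonus.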
Which of these four schedules of reinforcement is superior? In a review of several studies comparing the various techniques, Hamner concludes:
The necessity for arranging appropriate reinforcement contingencies is dramatically illustrated by several studies in which rewards were shifted from a response-contingent (ratio) to a time-contingent (interval) basis. During the period in which rewards were made conditional upon occurrence of the desired behavior, the appropriate response patterns were exhibited at a consistently high level. When the same rewards were given based on time and independent of the worker’s behavior, there was a marked drop in the desired behavior. The reinstatements of the performance-contingent reward schedule promptly restored the high level of responsiveness.
In other words, the performance-contingent (or ratio) reward schedules generally lead to better performance than the time-contingent (or interval) schedules, regardless of whether such schedules are fixed or variable. We will return to this point in a subsequent chapter on performance appraisal and reward systems.
Two additional approaches to learning are found in the work of David Kolb and Mel Silberman. Kolb’s experiential learning style theory is typically represented by a four-stage learning cycle in which the learner “touches all the bases.” The cycle is complete when a person progresses through four stages: (1) having a concrete experience, followed by (2) observation of and reflection on that experience, which leads to (3) the formation of abstract concepts (analysis) and generalizations (conclusions), which are then (4) used to test hypotheses in future situations, resulting in new experiences. Silberman, in his book Active Training, identified eight qualities of an effective and active learning experience: a moderate level of content; a balance between affective, behavioral, and cognitive learning; a variety of learning approaches; opportunities for group participation; encouraging participants to share their expertise; recycling concepts and skills learned earlier; advocating real-life problem solving; and allowing time for re-entry.
Sharon Johnson worked for a publishing company based in Nashville, Tennessee, that sold a line of children’s books directly to the public through a door-to-door sales force. Sharon had been a very successful salesperson and was promoted first to district and then to regional sales manager after just four years with the company. Sales bonuses were fixed, and a fixed-dollar bonus was tied to every $1,000 in sales over a specific minimum quota. However, there was a wide variety of rewards, from praise to gift certificates, that were left to Sharon’s discretion.
Sharon knew from her organizational behavior class that giving out praise to those who liked it and gifts to those who preferred them was an important means of reinforcing desired behavior, and she had been quite successful in implementing this principle. She also knew that if you reinforced a behavior that was “on the right track” to the ideal behavior you wanted out of a salesperson, eventually you could shape their behavior, almost without their realizing it.
Sharon had one particular salesperson, Lyle, who she thought had great potential, yet his weekly sales were somewhat inconsistent and often lower than she thought possible. When Lyle was questioned about his performance, he indicated that sometimes he felt that the families he approached could not afford the books he was selling and so he did not think it was right to push the sale too hard. Although Sharon argued that it was not Lyle’s place to decide for others what they could or could not afford, Lyle still felt uncomfortable about utilizing his normal sales approach with these families.
Sharon believed that through subtle reinforcement of certain behaviors she could shape Lyle’s behavior and that over time he would increasingly use his typical sales approach with the families he thought could not afford the books. For example, she knew that in the cases of families Lyle thought could not afford the books, he spent only 3.5 minutes in the house compared to 12.7 minutes in homes of families he judged able to afford the books. Sharon believed that if she praised Lyle when the average time he spent in each family’s home was quite similar, Lyle would increase the time he spent in the homes of families he judged unable to afford the books. She believed that the longer he spent in these homes, the more likely Lyle was to utilize his typical sales approach. This was just one of several ways Sharon thought she could shape Lyle’s behavior without trying to change his mind about pushing books onto people he thought could not afford them.
Sharon saw no ethical issues in this case until she told a friend about it and the friend questioned whether it was ethical to utilize learning and reinforcement techniques to change people’s behavior “against their will” even if they did not realize that this was happening.
Source: This ethical challenge is based on a true but disguised case observed by author J. Stewart Black.
- What is reinforcement, and how can it be applied to motivation?
- What are the four strategies to use for behavioral change?
- What is the significance of schedules in changing behavior?
- What are the best practices that organizations utilize to train employees in new job skills?
Reinforcement causes a certain behavior to be repeated or inhibited. Positive reinforcement is the practice of presenting someone with an attractive outcome following a desired behavior.
Avoidance learning occurs when someone attempts to avoid an unpleasant condition or outcome by behaving in a way desired by others.
Punishment is the administration of an unpleasant or adverse outcome following an undesired behavior. Reinforcement schedules may be continuous or partial. Among the partial reinforcement schedules are (1) fixed interval, (2) fixed ratio, (3) variable interval, and (4) variable ratio.
- Avoidance learning
- Refers to seeking to avoid an unpleasant condition or outcome by following a desired behavior.
- Continuous reinforcement
- Rewards desired behavior every time it occurs.
- Extinction
- The principle that suggests that undesired behavior will decline as a result of a lack of positive reinforcement.
- Partial reinforcement
- Rewards desired behavior at specific intervals, not every time desired behavior is exhibited.
- Positive reinforcement
- Consists of presenting someone with an attractive outcome following a desired behavior.
- Punishment
- The administration of unpleasant or adverse outcomes as a result of undesired behavior.
- Reinforcement
- Anything that causes a certain behavior to be repeated or inhibited.