A practical look at building and implementing your perfect performance management process.
Rating scales are very common in employee reviews and performance check-ins. They help us measure employee performance quantitatively.
The benefits of employee rating questions are that they allow for simpler comparisons between employees and they can speed up the appraisal process. Faster appraisals can allow organizations to solicit feedback from more people in one review.
The downside of performance rating scales is that a lot of nuance is lost in a simple three-, five- or seven-point scale. It can be hard to boil down all of a person’s strengths and weaknesses to one number.
The other common issue with rating scales is that they are often poorly constructed. This post is designed to help you create the scales for your performance management process. We also provide a ton of examples to borrow from at the bottom.
There is literally a science to rating scales. Social scientists have been using questionnaires to collect real scientific data for many decades.
That means we don’t need to reinvent the wheel here; we can learn from our scientific colleagues.
Stay with me: we should have a quick understanding of the types of data we’re collecting before diving deeper into rating scales. Three types of data are most often collected on employee appraisal forms: Nominal, Binary and Ordinal. Here is what they mean…
Nominal = Categories
Example: “Which of our six company values does this employee most live up to?”
When the answer options have no relationship to each other, in other words they are not ordered and have no numeric relationship, you are asking a question that will generate nominal data. These are not technically rating scale questions, but they are commonly found on review forms.
Binary = Yes or No (either/or)
Example: “Is this employee ready for promotion?”
Binary data is always either/or. The most common example is yes or no. Other examples are exists or doesn’t exist, is or is not, complete or incomplete. Deloitte collects binary data in 2 of the 4 questions on their review form. Google collects binary data on their upward reviews of managers.
Ordinal = Ordered List
Example: “Rate the employee on the following statements using a five-point scale from Strongly Agree to Strongly Disagree.”
Ordinal data is collected when we ask rating scale questions. The answers to a question will be a list of possibilities that have a clear order or ranking. As you move up the scale, the options should clearly be better/more, and as you move down the scale, they should be worse/less.
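The three data types above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of any review tool: the names `CompanyValue`, `Agreement` and `can_order`, the value names, and the sample answers are all hypothetical. The point it shows is that only ordinal answers support ordering and a median, while nominal answers can only be checked for equality.

```python
from enum import Enum, IntEnum
from statistics import median_low


class CompanyValue(Enum):
    """Nominal: categories with no order (hypothetical value names)."""
    INTEGRITY = "Integrity"
    TEAMWORK = "Teamwork"
    CURIOSITY = "Curiosity"


class Agreement(IntEnum):
    """Ordinal: answers with a clear order, lowest to highest."""
    STRONGLY_DISAGREE = 1
    DISAGREE = 2
    NEUTRAL = 3
    AGREE = 4
    STRONGLY_AGREE = 5


# Binary: a plain yes/no answer.
ready_for_promotion: bool = True

# Ordinal answers can be sorted and summarized with a median.
answers = [Agreement.AGREE, Agreement.NEUTRAL, Agreement.STRONGLY_AGREE]
typical = Agreement(median_low(answers))


def can_order(a, b) -> bool:
    """Return True if the two answers support a '<' comparison at all."""
    try:
        a < b
        return True
    except TypeError:
        # Plain Enum members (nominal data) define no ordering.
        return False
```

The median is used here rather than the mean because an ordinal scale guarantees order, not equal spacing between the options.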
There are two common ways to present rating scale answers: Numeric and Descriptive. Here is what they mean…
Numeric - Just numbers (like 1-5)
Example: “Score the employee’s leadership ability between 1 and 5.”
Numeric scales rightfully get a lot of pushback. It can be really hard for managers to understand what constitutes a 4 versus a 5 when it comes to subjective competencies like “assertiveness.”
Descriptive - Ordered descriptions
Example: Everything from Agree to Disagree all the way to Behaviorally Anchored Rating Scales.
Descriptive rating scales include descriptions of what each step up on the scale looks like. This could be as simple as different levels of agreement or it could be as complex as a set of specific actions an employee should have taken to achieve each level.
Likert scales are the most common scales, and ones we’ve all seen before. A Likert scale measures a response to a statement, with the most common options being…
Strongly Disagree - Disagree - Neither Agree nor Disagree - Agree - Strongly Agree.
Well-designed Likert scales will be symmetrical, with an equal number of positive and negative responses. They will also be balanced, with what feels like the same distance between each choice.
Five choices is the most common, but any number is possible. One of the most important decisions to make is whether to offer an odd or even number of choices. An odd number of choices means the central option is neutral, neither positive nor negative. An even number of options, sometimes called a “forced choice,” does not give a neutral option, so the respondent has to pick a side.
Semantic scales are similar to Likert scales but present just two extremes with unnamed options in between. For example, you might ask an employee to rate a recent project between success and failure with 7 options in between.
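The odd-versus-even decision can be made concrete with a small helper. This is only a sketch: `describe_scale` is a hypothetical function, and it assumes the labels are listed in order from most negative to most positive, with the middle option (if there is one) acting as the neutral choice.

```python
def describe_scale(labels: list[str]) -> dict:
    """Summarize a Likert-style scale listed from most negative to most positive."""
    n = len(labels)
    half = n // 2
    has_neutral = n % 2 == 1  # an odd count leaves one option in the middle
    return {
        "points": n,
        "negative": labels[:half],
        "neutral": labels[half] if has_neutral else None,
        "positive": labels[n - half:],
        "forced_choice": not has_neutral,
    }


five_point = describe_scale([
    "Strongly Disagree",
    "Disagree",
    "Neither Agree nor Disagree",
    "Agree",
    "Strongly Agree",
])

# Dropping the neutral option gives an even, forced-choice scale.
four_point = describe_scale([
    "Strongly Disagree",
    "Disagree",
    "Agree",
    "Strongly Agree",
])
```

In both cases the negative and positive halves have the same number of options, which is the symmetry described above. The helper cannot tell whether the choices feel evenly spaced; that still depends on the wording.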
Custom scales are one of the most common choices on performance rating forms. We find that HR teams like to create their own scales to fit their needs. This is a bold move that could lead to unexpected distortions in your data. But if you’re up for it, we’ve provided many real-life rating scale examples below.
The University of California, Berkeley human resources department currently conducts performance appraisals with a 5-level rating scale, ranging from Exceptional to Unsatisfactory. Supervisors who assign a Level 2 (Improvement Needed) or Level 1 (Unsatisfactory) rating to an employee must complete a Performance Improvement Plan for that employee. The plan is designed to improve or correct poor performance and contains timelines that are outlined and monitored to measure the employee’s progress. A Level 5 (Exceptional) rating is said to be achievable but is given fairly infrequently. High-performing employees often receive a Level 4 (Exceeds Expectations) or Level 3 (Meets Expectations) rating.
This company uses a rating system that is both numerical and alphabetical, focused on whether or not employees meet company goals. Their 5-point scale assigns an abbreviation to each numerical rating: 5 = FE (Far Exceeds), 4 = EX (Exceeds Expectations), 3 = ME (Meets Expectations), 2 = DR (Development Required), and 1 = IR (Improvement Required).
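A combined numeric and alphabetical scale like this one maps naturally onto a lookup table. The sketch below uses the five ratings listed above; the `RATINGS` table and the `format_rating` helper are hypothetical names for illustration and do not describe the company's actual system.

```python
# Score -> (abbreviation, description), taken from the scale described above.
RATINGS = {
    5: ("FE", "Far Exceeds"),
    4: ("EX", "Exceeds Expectations"),
    3: ("ME", "Meets Expectations"),
    2: ("DR", "Development Required"),
    1: ("IR", "Improvement Required"),
}


def format_rating(score: int) -> str:
    """Render a numeric score with its abbreviation and description."""
    if score not in RATINGS:
        raise ValueError(f"score must be one of {sorted(RATINGS)}, got {score!r}")
    code, label = RATINGS[score]
    return f"{score} = {code} ({label})"
```

Keeping the number, abbreviation and description in one place means a review form can display the description while reports still sort by the number.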
Harvard makes use of multiple rating scales within their organization, including overall performance ratings of employees, goals, competencies, and direct report ratings. Overall performance ratings are given on a 5-point scale that classifies an employee’s performance as leading (5), strong (4), solid (3), building (2), or not meeting expectations (1).
Goals are also tracked using a 3-point rating scale that measures whether a goal or project was on time, on budget, and accomplished. A rating of 3 means the goal was met, a 2 is given to partially met goals, and a 1 is assigned to an unfinished goal where most or all dimensions were not achieved.
Competency ratings reflect how well an employee demonstrates knowledge of the organization’s core competencies, from thorough to lacking. This 4-point scale ranges from Advanced to Proficient to Developing and, lastly, Does Not Demonstrate.
Direct report ratings are reserved for managers only, and determine whether the ratee’s capabilities are Highly Effective (3), Effective (2), or Needs Improvement (1).
Emory University’s HR team operates an in-depth rating system that is similar to BARS. Each employee is rated against a long list of unique core competencies that the organization abides by. This checklist includes building trust, collaboration, communication, delivering results, problem solving, taking initiative, functional knowledge and skills, and service to others/customer focus.
Each of these categories deals with how well an employee displays honesty, respect, listening and sharing, productivity, decision making, and reasoning. The competencies are rated with a 3-point system ranging from Exceeds Expectations to Meets Expectations to Unacceptable. All ratings apply to supervisors and managers, as well as non-managers.
Unsatisfactory | Needs Improvement | Meets Expectations | Exceeds Expectations | Distinguished
Needs Improvement | Meets Expectations
Does Not Meet | Meets | Exceeds
Below Level | At Level | Above Level
Needs Attention | Satisfactory
Unacceptable | Needs Improvement | Acceptable | Good | Excellent
Did not meet expectations | Met some but not all expectations | Fully met expectations | Exceeded expectations | Significantly exceeded expectations
Area of Deficiency | Inconsistently Meets Standards | Meets Standards | Meets High Standards | Regularly Exceeds High Standards
Needs Improvement | Consistently Meets Expectations | Exceeds Expectations | Strongly Exceeds Expectations | Superb
Unsatisfactory | Meets Most | Fully Meets and Sometimes Exceeds | Consistently Exceeds | Far Exceeds
Never | Sometimes | Often | Always
Not Often Enough | From Time to Time | Most of the Time
Minor Contribution | Important Contribution | Critical Contribution
Low Performer | Developing Performer | Highly Valued Performer | Top Performer
Unacceptable Performance | Partially Successful | Fully Successful | Superior | Distinguished Performance
Poor | Below Average | Good | Very Good | Outstanding