Completing a Program Review

Getting Started

When you first open the Program Report, you may find it helpful to begin by reading Section I, scanning Sections II and III, and then opening the attachments in Section IV. This should give you an understanding of the assessments and how they relate to the standards. You can then go to the Reviewer Worksheet (and/or other SPA documents) and work through each of the questions.

The Reviewer Worksheet is organized around your SPA standards. You are asked to evaluate each of the assessments assigned to a specific standard (from Section III of the Program Report), evaluate the reported candidate data, and then make a decision about whether or not each standard is met.

The following information is intended to guide you as you answer each of the questions on the Reviewer Worksheet. Rubrics will be developed for each of these questions. The rubrics have three levels: target, acceptable, and unacceptable. Each is defined below:

  • Target: Fully meets and exceeds standard

  • Acceptable: Meets standard; weaknesses may be found, but overall the standard is met

  • Unacceptable: Weaknesses are serious and must be addressed prior to positive rating

A. Are the Assessments Aligned with the SPA Standards?

Assessments must be aligned with the standards: there must be a match between the content of the standard and what the assessment is measuring. It is quite likely that a single assessment could address components of multiple standards (as indicated in the chart in Section III of the program report). Here are some questions that reviewers might ask as they evaluate alignment of the assessments:

  • CONTENT - Do the assessments address the same or consistent content topics as those in the standards?

  • RANGE - Do the assessments address the range of knowledge, skills, and dispositions delineated in the standard? Some SPA standards are very comprehensive; others cover smaller elements. In general, is the preponderance of the content of the standard addressed by the assessments assigned to it? If the SPA standard is very dense and covers a number of concepts, it is not necessary to check off every single element; it is better to look holistically at the standard as you compare it to the assessments. SPA resources should be helpful when addressing this question.

B. Do the Assessments Assess Meaningful Cognitive Demands and Skill Requirements at Challenging Levels for Candidates?

Here are some questions that reviewers might ask in making this evaluation:

  • COMPLEXITY - Are the assessments congruent with the complexity, cognitive demands, and skill requirements described in the standards?

  • DIFFICULTY - Is the level of effort required, or the difficulty or degree of challenge, of the assessments consistent with the standards? Is this level reasonable for candidates who are ready to teach or to take on other professional educator responsibilities?

  • VALIDITY - From what you find in the assessment, the instructions, and the scoring guide, is the assessment measuring what it purports to measure?

Other Issues:

  • SPECIFICITY - Are the assessments vague or poorly defined? An assessment might include an entry like “portfolio entries, test results, observations.” What entries? What test results? What observations? These need to be identified as specific experiences. Is the assessment information oblique or confused? Sometimes the response does not actually address the standard.


    • If grades are used as evidence, the program report must follow the instructions on the NCATE website that describe how programs use grades to assess content knowledge. The report should:

      • describe the assessment by providing brief descriptions of the courses and a rationale for why those courses were chosen;

      • show alignment between the course grades and the SPA standards;

      • submit grading policies and a minimum expectation for candidate grades; and

      • present data tables showing, at a minimum, the grade distributions and mean course grades for candidates in the selected courses.

      Institutions cannot claim that an acceptable grade in a course in which an important experience is embedded is sufficient to assume that the specific experience is satisfactory. For example, if a research project in a required course is cited as evidence that candidates meet a SPA standard, the course grade (which includes many measures beyond the research project) cannot automatically be assumed to reflect candidate mastery of the standard. Please see NCATE’s Grades Policy, which went into effect in Fall 2008.

C. Are the Assessments Free of Bias?

From the information provided in the program report, reviewers should be able to infer whether the assessments avoid bias. Assessments should be constructed in ways that avoid bias in both language and testing situations.

Reviewers can consider the following question:

  • Are the assessments and their scoring guides free from racial, ethnic, gender, cultural, or other bias?

D. Are the Scoring Guides Clear, and Are the Levels of Candidate Proficiency They Describe Distinct and Appropriate?

A scoring guide is the tool faculty use to determine candidates’ ratings on specific assessments. Scoring guides should address relevant and meaningful attributes of candidate knowledge and performance related to the standards on an assessment task and should be used to reach meaningful decisions.  Scoring guides can take many forms (such as Likert scales and rubrics) depending on the assessment activity.

Regardless of the form the scoring guides take, they should have written and shared criteria for judging performance that indicate the qualities by which levels of performance can be differentiated. They should be explicit enough to anchor judgments about the degree of success on a candidate assessment.

Many assessments are little more than checklists completed at the end of the student teaching experience.  They do not define what is being sought and the ratings are in some cases mere numbers or words subject to broad interpretation (e.g., 1, 2, or 3; or excellent, good, acceptable).  Such instruments do not provide either candidates or supervisors with substantive guidance as to what is being sought.

To be reliable, assessments must be capable of yielding approximately the same values across raters.  One way to achieve inter-rater reliability is to train raters, but this is difficult to evaluate in this paper review.  A second and more practical approach is to carefully review instruments that are highly explicit as to expectations and ratings.

When evaluating scoring guides, reviewers can consider such questions as the following:

  • Are scoring guides clear and explicit about faculty expectations for candidate proficiencies in relation to standards?

  • Do scoring guides address relevant and meaningful attributes of candidate performance on an assessment task? Do assessments and scoring guides work together so that different levels of candidate proficiency can be clearly distinguished?

  • When rubrics are used, is there specific guidance on what a rater might look for?  

E. Do the Data as Reported Indicate the Extent to Which the Candidates Meet the Standard?

The key summarizing question for reviewers is: does the program present convincing evidence that its graduates have mastered the SPA standards? The primary sources of information for addressing this question are the narratives and data charts in Section IV for each assessment. Together, these should give you a complete picture of the data, how faculty interpret the data, and any contextual issues that might have affected the data.

F. Is the Standard Met?

After answering the previous five questions, you are asked to make a holistic decision on whether or not the standard is met. In general, most of those questions should be answered at the acceptable level for the standard to be met, but this is ultimately a matter of professional judgment. For example, you may deem that the assessments and scoring guide are appropriate but that no data are available yet because the assessment is new and data have not been collected. In this situation, you may determine that the standard is met with conditions. In another situation, the assessments may be appropriate but the scoring guide so weak that the data are essentially useless. In that case, the standard could not be met.

G. Final Program Recognition Decision

After you have made individual decisions for each of the standards, you are asked to look at all of these decisions and then make one recognition decision for the program as a whole. As you do this, there are several things to consider.

Considerations in Determining a Program Rating

  • Number of standards not met

  • Degree of divergence of ratings across standards

  • There may be many ways to reach the same goal.

  • Judgments must be based on standards, not personal opinion.

  • Be reasonable: neither harsh nor gullible.

Final Decision

There are three possible decisions, each defined below. These decisions were approved by the NCATE Specialty Areas Studies Board in October 2008.

National Recognition

Criteria for Making Decision:

  • The program substantially meets standards.

  • The program substantially meets standards, but may have some areas for improvement, which may be related to standards, assessments, scoring guides, or data.

Consequences of Decision:

  • Once the unit is accredited, the program will remain nationally recognized until the next unit accreditation decision is made.

National Recognition with Conditions

Criterion for Making Decision:

  • The program generally meets standards; however, one or more conditions must be remediated within 18 months for national recognition to be extended for the full 5-7 year accreditation period. The response to the conditions may be submitted no more than twice within the same 18 months. Conditions are limited to one or more of the following:

    • Insufficient data to determine if standards are met

    • Insufficient alignment among standards, assessments, or scoring guides

    • Lack of quality in some assessments or scoring guides

    • The SPA-required number of standards is not met

    • The NCATE requirement for an 80% pass rate on state licensure tests is not met

Consequences of Decision:

  • The program must submit a Response to Conditions Report (no more than twice) within 18 months of the original recognition decision in order to maintain national recognition.

  • If the conditions are adequately remediated, the program will receive full national recognition, valid until the next unit accreditation decision is made.

  • If the conditions are not adequately remediated within the 18-month period, the program’s status will change to Not Nationally Recognized. A new program report can be submitted to reapply for national recognition.

Implications for Further Action:

  • Conditions must be explicit and clearly stated in Part G of the National Recognition Report.  If possible, the report will be sent to the original team for review.

  • NCATE will provide training to reviewers, SPA Coordinators, and Audit Teams on writing explicit and specific conditions statements.     

Further Development Required/Recognition with Probation

Criterion for Making Decision:

  • The standards that are not met are serious and more than a few in number OR are few in number but so fundamentally important that recognition is not appropriate.

Consequences of Decision:

  • The unit may submit a revised program report addressing unmet standards no more than twice within a 12-14 month period. [This report will be sent to the original team if at all possible.]

  • The unit may submit a new program report for national recognition within the 12-14 month period. [Note: This report will be sent to a new team of reviewers.]


As you read through the report, pay attention to aspects of the program that are unique and/or that you see as strengths. There is a section on the Reviewer Worksheet for you to note these as you see them. Strengths can be either specific aspects of the program (e.g., diversity of clinical sites) or more global statements (e.g., a major focus on teaching in urban settings). These will be cited in Part A of the NCATE Program Recognition Report, in the text box labeled “Summary of Strengths.”