Chapter 5 Statistical Analysis Plan (SAP) Development


Chapter Objectives

The Statistical Analysis Plan (SAP) is the definitive document that governs how clinical trial data are analyzed and interpreted.
This chapter provides a comprehensive and practical framework for developing a robust, regulator-ready SAP.

After completing this chapter, the reader should be able to:

  • Define primary analysis methods with sufficient operational detail
  • Specify models, covariates, and stratification handling clearly
  • Plan secondary, exploratory, sensitivity, and subgroup analyses
  • Define coherent multiplicity and missing data strategies
  • Support interim analyses and Data Monitoring Committee (DMC) activities, if applicable
  • Ensure all analyses are reproducible and implementable without post-database-lock decisions

5.1 Role of the SAP in Clinical Trials

5.1.1 Why the SAP Is a Critical Document

From a statistical perspective, the SAP serves as:

  • The binding interpretation of the protocol
  • The operational blueprint for statistical programming
  • The primary reference for regulatory review and inspection

Any analysis not prospectively specified in the SAP is vulnerable to being considered post hoc, regardless of scientific plausibility.


5.1.2 Relationship Between the Protocol and the SAP

  • The protocol defines what will be studied
  • The SAP defines how the data will be analyzed

The SAP must be fully consistent with the protocol while providing substantially more detail to remove analytical ambiguity.


5.2 Definition of the Primary Analysis Method

5.2.1 Purpose of the Primary Analysis

The primary analysis directly addresses the primary estimand and supports the main study conclusion.
It must be defined in sufficient detail so that:

  • Independent statisticians would implement the same analysis
  • Results are reproducible
  • No analytical discretion remains after database lock

5.2.2 Model Type Specification

The SAP must explicitly specify the statistical model, including:

  • Model family (e.g., linear model, generalized linear model, Cox model)
  • Link function, if applicable
  • Distributional assumptions

Typical examples include:

  • ANCOVA for continuous endpoints
  • Logistic regression for binary endpoints
  • Cox proportional hazards models for time-to-event endpoints

Key model assumptions (for example, proportional hazards for a Cox model) should be stated, together with how they will be assessed where appropriate.
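
As an illustration of the level of detail intended, an ANCOVA for a continuous endpoint might be written out as follows. The notation is an assumption introduced here, not taken from any particular protocol: Y_i is the endpoint for subject i, T_i is a treatment indicator (1 = test, 0 = control), and X_i is the baseline value of the endpoint.

```latex
Y_i = \beta_0 + \beta_1 T_i + \beta_2 X_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2) \ \text{independently}
```

In this sketch, \beta_1 is the baseline-adjusted treatment difference, and the listed assumptions (linearity in X_i, constant variance, normally distributed errors) are the ones the SAP would commit to stating and assessing.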


5.2.3 Covariate Specification

The SAP should clearly define:

  • Which covariates are included in the model
  • Whether covariates are pre-specified or data-driven
  • How covariates are coded (continuous or categorical)

Covariates typically include baseline measures of the endpoint or other strong prognostic factors identified during study design.
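
One way to make these choices explicit is to record them in a small, validated structure that programmers can read directly. The sketch below is only an illustration: the class names, field names, and allowed values are hypothetical and do not correspond to any standard schema.

```python
from dataclasses import dataclass, field

# Hypothetical vocabularies, chosen for illustration only.
ALLOWED_FAMILIES = {"linear", "logistic", "cox"}
ALLOWED_CODINGS = {"continuous", "categorical"}


@dataclass(frozen=True)
class Covariate:
    name: str
    coding: str          # "continuous" or "categorical"
    prespecified: bool   # False means the covariate is selected from the data

    def __post_init__(self):
        if self.coding not in ALLOWED_CODINGS:
            raise ValueError(f"unknown coding for {self.name}: {self.coding}")


@dataclass(frozen=True)
class PrimaryModelSpec:
    family: str
    covariates: tuple = field(default_factory=tuple)

    def __post_init__(self):
        if self.family not in ALLOWED_FAMILIES:
            raise ValueError(f"unknown model family: {self.family}")
        names = [c.name for c in self.covariates]
        if len(names) != len(set(names)):
            raise ValueError("duplicate covariate names")

    def data_driven(self):
        """Return the names of covariates that are not fixed in advance."""
        return [c.name for c in self.covariates if not c.prespecified]


# Example: a linear model adjusting for two pre-specified covariates.
spec = PrimaryModelSpec(
    family="linear",
    covariates=(
        Covariate("baseline_score", "continuous", True),
        Covariate("region", "categorical", True),
    ),
)
```

Here `spec.data_driven()` returns an empty list, which makes it easy to check that no covariate selection is left to be decided after database lock.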


5.2.4 Handling of Stratification Factors

If stratified randomization was used, the SAP should specify:

  • Whether stratification factors are included as covariates
  • Whether stratified tests or stratified models are applied
  • How sparse or empty strata are handled

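Consistency between randomization and analysis strategies is essential: the analysis should generally account for the factors used to stratify randomization, and any departure (for example, pooling sparse strata) should be pre-specified and justified.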
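
A rule for sparse or empty strata can only be applied consistently if "sparse" is defined operationally. The following minimal sketch flags strata that fall below a threshold; the function name, the threshold `min_per_arm`, and the example strata are assumptions made for illustration, and what to do with a flagged stratum (for example, pooling) remains a decision for the SAP.

```python
from collections import Counter


def find_sparse_strata(subjects, arms, planned_strata, min_per_arm=1):
    """Return planned strata with fewer than min_per_arm subjects in any arm.

    subjects: iterable of (stratum, arm) pairs, one per randomized subject.
    arms: all treatment arms, so an arm absent from a stratum counts as 0.
    planned_strata: all strata defined at randomization, so that completely
        empty strata are also detected.
    """
    counts = Counter(subjects)  # missing (stratum, arm) keys count as 0
    return [
        s for s in planned_strata
        if any(counts[(s, a)] < min_per_arm for a in arms)
    ]


# Example with hypothetical regions as strata and arms "A" and "B".
subjects = [("EU", "A"), ("EU", "B"), ("US", "A"), ("US", "A")]
sparse = find_sparse_strata(subjects, ("A", "B"), ("EU", "US", "ASIA"))
```

In this example `sparse` is `["US", "ASIA"]`: the US stratum has no subject in arm B, and the ASIA stratum is empty.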


5.3 Secondary and Exploratory Analyses

5.3.1 Secondary Analyses

Secondary analyses address pre-specified secondary objectives and support interpretation of the primary results.
They should be fully specified in the SAP but clearly distinguished from the primary analysis.


5.3.2 Exploratory Analyses

Exploratory analyses are hypothesis-generating and descriptive in nature.
The SAP should outline their general analytical approach while clearly labeling them as exploratory.


5.4 Multiplicity Control Strategy

5.4.1 Importance of Multiplicity Control

Testing multiple hypotheses inflates the probability of at least one false-positive finding, which affects the interpretation of statistical significance.
The SAP must describe how Type I error is controlled across:

  • Multiple endpoints
  • Multiple treatment comparisons
  • Multiple time points or analyses

5.4.2 Common Multiplicity Approaches

Multiplicity strategies commonly specified in SAPs include:

  • Hierarchical testing procedures
  • Gatekeeping strategies
  • Alpha-splitting or adjustment methods

The chosen strategy must align with study objectives and be defined prior to unblinding.
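
To make the difference between two of these approaches concrete, the sketch below implements a fixed-sequence (hierarchical) procedure and an equal alpha-split (Bonferroni). The function names and example p-values are illustrative assumptions; a real SAP would also fix the testing order and alpha level in advance.

```python
def fixed_sequence(p_values, alpha=0.05):
    """Hierarchical (fixed-sequence) testing.

    p_values are listed in the pre-specified testing order. Each hypothesis
    is tested at the full alpha, and testing stops at the first failure.
    Returns one True/False rejection decision per hypothesis.
    """
    decisions = []
    still_testing = True
    for p in p_values:
        still_testing = still_testing and p <= alpha
        decisions.append(still_testing)
    return decisions


def bonferroni(p_values, alpha=0.05):
    """Equal alpha-splitting: each hypothesis is tested at alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values for four endpoints, in pre-specified order.
p = [0.01, 0.04, 0.20, 0.001]
hierarchical = fixed_sequence(p)   # [True, True, False, False]
split = bonferroni(p)              # [True, False, False, True]
```

The fourth endpoint has the smallest p-value, yet it is not rejected under the fixed-sequence procedure because the third endpoint fails first. This is why the testing order must reflect study objectives and be fixed before unblinding.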


5.5 Missing Data Handling Strategies

5.5.1 Importance of Pre-Specifying Missing Data Methods

Assumptions about missing data directly affect interpretation of treatment effects.
The SAP must prospectively specify missing data handling methods for each key analysis.


5.5.2 Commonly Used Methods

The SAP should clearly state when and how the following methods are applied:

  • MMRM (Mixed Model for Repeated Measures)
  • Multiple Imputation (MI)
  • Last Observation Carried Forward (LOCF)
  • Non-Responder Imputation (NRI)

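Each method’s assumptions and limitations should be acknowledged; for example, MMRM and standard MI rely on a missing-at-random assumption, whereas LOCF assumes that the outcome stays unchanged after the last observed value.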
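
The two single-imputation rules in the list are simple enough to sketch directly, which also makes their assumptions visible. The function names and data layout (one list per subject, with `None` marking a missing value) are assumptions made for this illustration.

```python
def locf(values):
    """Carry the last observed value forward over later missing visits.

    values: one subject's results in visit order, with None for missing.
    Leading missing values stay None because there is nothing to carry.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def nri(responses):
    """Non-responder imputation: a missing binary response counts as 0."""
    return [0 if r is None else r for r in responses]


print(locf([None, 5.0, None, 6.5, None]))  # [None, 5.0, 5.0, 6.5, 6.5]
print(nri([1, None, 0, None]))             # [1, 0, 0, 0]
```

The first example shows a case the SAP must address explicitly: LOCF cannot fill a value that is missing before any observation exists.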


5.5.3 Alignment With Estimands

Missing data strategies should be consistent with the estimand framework (e.g., treatment policy, hypothetical, or composite strategies).


5.6 Definition of Repeat Assessments and Visit Window Rules

5.6.1 Purpose of Visit Window Rules

Repeated measurements and visit deviations must be handled consistently.
The SAP should define:

  • Visit windows
  • Rules for selecting analysis values
  • Handling of unscheduled or repeated assessments
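
A visit-window definition of this kind can be expressed as a lookup from study day to analysis visit. The window boundaries below are hypothetical values chosen only to illustrate the mechanics, and the function name is likewise an assumption.

```python
# Hypothetical windows: (analysis visit, first study day, last study day),
# with both boundaries inclusive.
WINDOWS = [
    ("Week 4", 22, 35),
    ("Week 8", 50, 63),
    ("Week 12", 78, 91),
]


def assign_visit(study_day, windows=WINDOWS):
    """Return the analysis visit whose window contains study_day, else None."""
    for visit, first_day, last_day in windows:
        if first_day <= study_day <= last_day:
            return visit
    return None


print(assign_visit(29))  # Week 4
print(assign_visit(40))  # None: the day falls between two windows
```

An unscheduled assessment on day 40 maps to no analysis visit under these windows. Whether windows should leave such gaps or be contiguous is one of the decisions the SAP needs to state.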

5.6.2 Statistical Rules for Repeat or Follow-Up Measurements

The SAP should specify:

  • Which value is used when multiple measurements are available
  • Whether averaging or selection rules apply
  • How confirmatory or repeat tests are treated

Clear rules prevent downstream programming discrepancies.
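
As an illustration, the sketch below implements one possible selection rule and one averaging rule for multiple records that fall in the same visit window. The specific rule (closest to the target day, with ties resolved in favor of the later assessment) is an assumption chosen for the example, not a recommendation; the function names are also hypothetical.

```python
from statistics import mean


def select_analysis_value(records, target_day):
    """Pick one record from those falling in the same visit window.

    records: list of (study_day, value) pairs within one window.
    Rule sketched here: closest to target_day; if two records are
    equally close, the later assessment is used.
    """
    if not records:
        return None
    return min(records, key=lambda r: (abs(r[0] - target_day), -r[0]))


def average_analysis_value(records):
    """Alternative rule: average all values within the window."""
    if not records:
        return None
    return mean([value for _, value in records])


records = [(26, 7.1), (30, 6.8), (31, 6.9)]
chosen = select_analysis_value(records, target_day=28)  # (30, 6.8)
```

Days 26 and 30 are equally far from the target day 28, so the tie-break decides the result. Leaving the tie-break unstated is a typical source of discrepancies between independent programmers.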


5.7 Sensitivity Analysis Plan

5.7.1 Purpose of Sensitivity Analyses

Sensitivity analyses evaluate the robustness of the primary analysis to key assumptions.
They are essential for assessing the reliability of study conclusions.


5.7.2 Common Sensitivity Analyses

Examples include:

  • Alternative missing data assumptions
  • Different analysis populations
  • Alternative model specifications

Each sensitivity analysis should be linked to a specific assumption being tested.
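
As one example of linking an analysis to an assumption, the sketch below varies a missing-data assumption by delta adjustment: imputed values in the test arm are shifted by a fixed amount and the treatment difference is recomputed. The choice of this technique, the function name, and the data are illustrative assumptions; the difference in means stands in for whatever primary model the SAP specifies.

```python
from statistics import mean


def delta_adjusted_difference(test_arm, control_arm, delta):
    """Mean difference (test - control) after shifting imputed test-arm values.

    Each arm is a list of (value, was_imputed) pairs. Imputed values in the
    test arm are shifted by delta; control-arm values are left unchanged.
    """
    shifted = [v + delta if imputed else v for v, imputed in test_arm]
    return mean(shifted) - mean([v for v, _ in control_arm])


# Hypothetical data: two of four test-arm values were imputed.
test_arm = [(10.0, False), (12.0, False), (11.0, True), (11.0, True)]
control_arm = [(8.0, False), (9.0, False), (8.5, True), (8.5, False)]

for delta in (0.0, -2.0, -4.0):
    print(delta, delta_adjusted_difference(test_arm, control_arm, delta))
```

With `delta = 0.0` the result reproduces the unadjusted difference. Increasingly negative shifts show how unfavorable the imputed values would have to be before the estimated effect changes materially.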


5.8 Subgroup Analysis Definition

5.8.1 Purpose of Subgroup Analyses

Subgroup analyses explore the consistency of treatment effects across predefined subpopulations.


5.8.2 Pre-Specification Requirements

The SAP should define:

  • Subgroup variables and category definitions
  • Statistical models used for subgroup analyses
  • Whether treatment-by-subgroup interaction tests are conducted

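Because subgroup analyses are typically underpowered and subject to multiplicity, results should be interpreted cautiously and in the context of the overall treatment effect.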
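
For a two-level subgroup, a treatment-by-subgroup interaction test might be specified as in the sketch below. The notation is assumed for illustration: T_i is a treatment indicator and S_i is an indicator of subgroup membership for subject i.

```latex
Y_i = \beta_0 + \beta_1 T_i + \beta_2 S_i + \beta_3 (T_i \times S_i) + \varepsilon_i,
\qquad H_0 : \beta_3 = 0
```

Under this model the treatment effect is \beta_1 in the subgroup with S_i = 0 and \beta_1 + \beta_3 in the subgroup with S_i = 1, so the interaction test asks whether the two effects differ.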


5.9 Interim Analysis and DMC Support (If Applicable)

5.9.1 Interim Analysis Specifications

If interim analyses are planned, the SAP should specify:

  • Timing or triggering criteria
  • Statistical methods applied
  • Alpha spending or adjustment approaches
  • Decision boundaries
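
To illustrate what an alpha-spending specification involves, the sketch below evaluates two widely used Lan-DeMets spending functions at a given information fraction `t`. The function names and the default one-sided `alpha` of 0.025 are assumptions for this example. The functions return cumulative alpha spent, not decision boundaries, which require dedicated group-sequential software to derive.

```python
from math import e, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _check_fraction(t):
    if not 0 < t <= 1:
        raise ValueError("information fraction t must be in (0, 1]")


def obf_type_spending(t, alpha=0.025):
    """O'Brien-Fleming-type spending: 2 - 2 * Phi(z_{alpha/2} / sqrt(t))."""
    _check_fraction(t)
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 - 2 * _STD_NORMAL.cdf(z / sqrt(t))


def pocock_type_spending(t, alpha=0.025):
    """Pocock-type spending: alpha * ln(1 + (e - 1) * t)."""
    _check_fraction(t)
    return alpha * log(1 + (e - 1) * t)


# Cumulative alpha spent at hypothetical looks at 50% and 100% information.
for t in (0.5, 1.0):
    print(t, obf_type_spending(t), pocock_type_spending(t))
```

Both functions spend the full alpha at `t = 1`, but the O'Brien-Fleming-type function spends far less at early looks, preserving most of the alpha for the final analysis.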

5.9.2 DMC Support

The SAP may include or reference:

  • Analysis outputs prepared for DMC review
  • Data handling procedures for unblinded analyses
  • Role separation and access controls

Clear procedures are essential to protect study integrity.


5.10 SAP Quality and Implementation Checklist

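Before finalizing the SAP, confirm that:

  • The primary analysis model, covariates, and stratification handling are fully specified
  • Secondary and exploratory analyses are defined and clearly labeled
  • The multiplicity control strategy is defined prior to unblinding
  • Missing data methods are pre-specified and aligned with the estimands
  • Visit windows and rules for repeat assessments are unambiguous
  • Each sensitivity analysis is linked to a specific assumption
  • Subgroup variables, categories, and models are pre-specified
  • Interim analysis and DMC procedures are documented, if applicable
  • The SAP is fully consistent with the protocol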


5.11 Chapter Summary

The SAP transforms study objectives and protocol concepts into executable statistical analyses.
A high-quality SAP eliminates analytical ambiguity, ensures reproducibility, and provides a defensible basis for interpretation.

Careful, detailed, and prospective SAP development is one of the most critical responsibilities of the statistician in clinical research.