
Create a calibration plot to assess the agreement between predicted probabilities and observed treatment rates. This function wraps geom_calibration().

Usage

plot_calibration(
  .data,
  .fitted,
  .group,
  treatment_level = NULL,
  method = "breaks",
  bins = 10,
  smooth = TRUE,
  conf_level = 0.95,
  window_size = 0.1,
  step_size = window_size/2,
  k = 10,
  include_rug = FALSE,
  include_ribbon = TRUE,
  include_points = TRUE,
  na.rm = FALSE,
  ...
)

Arguments

.data

A data frame containing the variables.

.fitted

Column name of predicted probabilities (propensity scores). Can be unquoted (e.g., .fitted) or quoted (e.g., ".fitted").

.group

Column name of treatment/group variable. Can be unquoted (e.g., qsmk) or quoted (e.g., "qsmk").

treatment_level

Value indicating which level of .group represents treatment. If NULL (default), uses the last level for factors or the maximum value for numeric variables.
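The default rule described above can be sketched in base R. The helper name resolve_treatment_level() is hypothetical and exists only for this illustration; it is not a function exported by the package, and the package's internal logic may differ.

```r
# Hypothetical helper (not part of the package): mimics the documented default.
resolve_treatment_level <- function(group, treatment_level = NULL) {
  # An explicit value always wins.
  if (!is.null(treatment_level)) {
    return(treatment_level)
  }
  # Factors: use the last level.
  if (is.factor(group)) {
    lv <- levels(group)
    return(lv[length(lv)])
  }
  # Numeric: use the maximum observed value.
  max(group, na.rm = TRUE)
}

resolve_treatment_level(factor(c("0", "1", "0")))       # "1"
resolve_treatment_level(c(0, 1, 1, 0))                  # 1
resolve_treatment_level(c(0, 1), treatment_level = 0)   # 0
```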

method

Character; calibration method. One of "breaks" (default), "logistic", or "windowed".

bins

Integer > 1; number of bins for the "breaks" method.

smooth

Logical; for "logistic" method, use GAM smoothing if available.

conf_level

Numeric in (0,1); confidence level for CIs (default = 0.95).

window_size

Numeric; size of each window for "windowed" method.

step_size

Numeric; distance between window centers for "windowed" method.

k

Integer; the basis dimension for GAM smoothing when method = "logistic" and smooth = TRUE. Default is 10.

include_rug

Logical; add rug plot showing distribution of predicted probabilities.

include_ribbon

Logical; show confidence interval ribbon.

include_points

Logical; show points (only for "breaks" and "windowed" methods).

na.rm

Logical; if TRUE, drop NA values before computation.

...

Additional parameters passed to geom_calibration().

Value

A ggplot2 object.

Details

Calibration plots visualize how well predicted probabilities match observed outcome rates. Here the binary outcome is the treatment indicator derived from .group (1 for units at treatment_level, 0 otherwise), so the "observed rate" is the proportion of treated units within each prediction group. For example, among all units with a predicted probability near 0.3, approximately 30% should actually be treated. Calibration is perfect when predicted probabilities equal observed rates, that is, when the points fall on the 45-degree line.
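As a rough illustration of the "around 0.3" example, the following base R sketch simulates predictions that are calibrated by construction and checks the observed rate in a narrow band. The simulated data, the band width of 0.025, and all variable names are assumptions made for this sketch; nothing here comes from the package.

```r
# Simulated data (an assumption for illustration only).
set.seed(1)
n <- 50000
p <- runif(n, 0.05, 0.95)            # predicted probabilities
y <- rbinom(n, size = 1, prob = p)   # binary outcomes drawn with those probabilities

# Units whose prediction is close to 0.3.
near_03 <- abs(p - 0.3) < 0.025

mean_prediction <- mean(p[near_03])  # average predicted probability in the band
observed_rate   <- mean(y[near_03])  # proportion with outcome = 1 in the band

# For calibrated predictions these two numbers should be close.
c(mean_prediction = mean_prediction, observed_rate = observed_rate)
```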

The plot supports three calibration assessment methods:

  • "breaks": Bins predictions into groups and compares the mean prediction with the observed rate within each bin

  • "logistic": Fits a logistic regression of outcomes on predictions; perfect calibration yields a slope of 1 and an intercept of 0

  • "windowed": Uses sliding windows across the prediction range for smooth calibration curves
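The three approaches can be sketched in base R on simulated data. This is a minimal sketch under stated assumptions, not the package's implementation: equal-width bins for "breaks", a regression on the logit of the predictions for "logistic" (so that slope 1 and intercept 0 correspond to perfect calibration), and window centers on a regular grid from 0 to 1 for "windowed" are all choices made here for illustration.

```r
set.seed(42)
n <- 20000
p <- runif(n, 0.05, 0.95)   # simulated predicted probabilities
y <- rbinom(n, 1, p)        # simulated binary outcomes, calibrated by construction

# "breaks": equal-width bins (assumed), mean prediction vs. observed rate per bin.
bins <- 10
bin <- cut(p, breaks = seq(0, 1, length.out = bins + 1), include.lowest = TRUE)
breaks_tbl <- data.frame(
  predicted = as.numeric(tapply(p, bin, mean)),
  observed  = as.numeric(tapply(y, bin, mean)),
  n         = as.numeric(table(bin))
)

# "logistic": regress outcomes on the logit of the predictions (assumed
# parameterization); calibrated predictions give intercept ~ 0 and slope ~ 1.
logit_p <- qlogis(p)
fit <- glm(y ~ logit_p, family = binomial())
calib_coef <- coef(fit)

# "windowed": overlapping windows of width window_size, centers step_size apart.
window_size <- 0.1
step_size <- window_size / 2
centers <- seq(0, 1, by = step_size)
windows <- lapply(centers, function(ctr) {
  in_window <- abs(p - ctr) <= window_size / 2
  if (!any(in_window)) return(NULL)   # skip empty windows
  data.frame(
    center    = ctr,
    predicted = mean(p[in_window]),
    observed  = mean(y[in_window]),
    n         = sum(in_window)
  )
})
windowed_tbl <- do.call(rbind, Filter(Negate(is.null), windows))
```

In each table, plotting observed against predicted and comparing with the 45-degree line gives the calibration curve. Sparse bins or windows (small n) produce noisy estimates, which is what the warnings in the examples below refer to.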

Examples

library(ggplot2)
library(halfmoon)  # provides plot_calibration() and the nhefs_weights data

# Basic calibration plot
plot_calibration(nhefs_weights, .fitted, qsmk)
#> Warning: Small sample sizes or extreme proportions detected in bins 9, 10 (n = 8, 3).
#> Confidence intervals may be unreliable. Consider using fewer bins or a
#> different calibration method.


# With rug plot
plot_calibration(nhefs_weights, .fitted, qsmk, include_rug = TRUE)
#> Warning: Small sample sizes or extreme proportions detected in bins 9, 10 (n = 8, 3).
#> Confidence intervals may be unreliable. Consider using fewer bins or a
#> different calibration method.


# Different methods
plot_calibration(nhefs_weights, .fitted, qsmk, method = "logistic")

plot_calibration(nhefs_weights, .fitted, qsmk, method = "windowed")
#> Warning: Small sample sizes or extreme proportions detected in windows centered at 0.7,
#> 0.75, 0.8 (n = 5, 3, 1). Confidence intervals may be unreliable. Consider using
#> a larger window size or a different calibration method.


# Specify treatment level explicitly
plot_calibration(nhefs_weights, .fitted, qsmk, treatment_level = "1")
#> Warning: Small sample sizes or extreme proportions detected in bins 9, 10 (n = 8, 3).
#> Confidence intervals may be unreliable. Consider using fewer bins or a
#> different calibration method.